Thursday, 31 March 2011
SSH setup - between 2 linux boxes without passwords
On Nagios Server
I created /home/nagios/.ssh/
root@Nagi:/home/nagios# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa): /home/nagios/.ssh/id-dsa
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/nagios/.ssh/id-dsa.
Your public key has been saved in /home/nagios/.ssh/id-dsa.pub.
The key fingerprint is:
d2:f1:3b:2e:8a:8c:3a:db:33:81:ec:57:7e:0a:88:37 root@Nagi
The key's randomart image is:
+--[ DSA 1024]----+
| |
| |
| . |
| . o |
|.. . S . |
|o.o . . . |
|o.Eoo o |
|.+o=.o ... . |
|o++oo.+. .. |
+-----------------+
root@Nagi:/home/nagios#
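The transcript above stops at key generation. What follows is a rough sketch of the remaining steps, not a tested recipe. The remote account superman@10.0.0.110 is an assumption borrowed from the check_disk_remote example, and fix_key_permissions is a hypothetical helper name. Newer OpenSSH releases disable DSA keys by default, so a different key type (ssh-keygen -t rsa or -t ed25519) may be needed there.

```shell
# Sketch only: finish a key-based login for the nagios user.
# KEY_DIR and REMOTE are assumptions; adjust them to your own setup.
KEY_DIR="${KEY_DIR:-/home/nagios/.ssh}"
REMOTE="${REMOTE:-superman@10.0.0.110}"

# ssh refuses private keys that other users can read, so tighten the modes.
fix_key_permissions() {
    dir="$1"
    chmod 700 "$dir"
    chmod 600 "$dir/id-dsa"
    chmod 644 "$dir/id-dsa.pub"
}

# Run these by hand as root on the Nagios server:
#   chown -R nagios:nagios "$KEY_DIR"      # the key pair was generated as root
#   fix_key_permissions "$KEY_DIR"
#   ssh-copy-id -i "$KEY_DIR/id-dsa.pub" "$REMOTE"   # asks for the password one last time
#   sudo -u nagios ssh -i "$KEY_DIR/id-dsa" "$REMOTE" true && echo "key login works"
```

The -i option is needed on both ssh-copy-id and ssh because id-dsa (with a hyphen) is not one of the default key file names ssh looks for.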
Nagios Error: Check command not defined anywhere!
Error: Service check command 'check_disk_remote' specified in service 'check_disk_remote' for host 'Nagios-CPT' not defined anywhere!
Things to check:
** Is the plugin located in /usr/local/nagios/libexec/?
** commands.cfg --> is the command defined?
** services.cfg --> is the service defined and assigned to a host?
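The config-file checks above can be partly scripted. This is a minimal sketch, assuming the object config lives under /usr/local/nagios/etc. find_command is a hypothetical helper, not something that ships with Nagios.

```shell
# Sketch: list every place a Nagios command name is defined or used.
# CFG_DIR is an assumption; point it at your own config directory.
CFG_DIR="${CFG_DIR:-/usr/local/nagios/etc}"

find_command() {
    name="$1"
    dir="${2:-$CFG_DIR}"
    # command_name lines are definitions, check_command lines are uses.
    # The name must be followed by whitespace, "!" (arguments) or end of line.
    pattern='^[[:space:]]*(command_name|check_command)[[:space:]]+'"$name"'([[:space:]!]|$)'
    grep -rnE "$pattern" "$dir"
}

# Example:
#   find_command check_disk_remote
```

If only check_command lines show up and no command_name line does, the command definition is the missing piece.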
Nagios - Examples check_disk_remote
superman@Nagi:/usr/local/nagios/libexec$ ./check_disk_remote -e ssh -H 10.0.0.110 -w 90 -c 95 -v
superman@10.0.0.110's password:
superman@10.0.0.110's password:
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda1 73986264 1480204 68747724 3% /
percent = 3% warn=90 crit=95
none 504728 236 504492 1% /dev
none 508936 0 508936 0% /dev/shm
none 508936 280 508656 1% /var/run
none 508936 0 508936 0% /var/lock
none 508936 0 508936 0% /lib/init/rw
none 73986264 1480204 68747724 3% /var/lib/ureadahead/debugfs
OK: All Filesystems are below threshold (90/95%) | /=3%;;;0;100
Nagios - Smartmon monitoring
CHECK_SMARTMON
Define services
I add the following to
/usr/local/etc/nagios/objects/services.cfg
# SMART ad0
define service {
use generic-service
host_name host1,host2,host3
service_description nrpe_check_smart_ad0
check_command check_nrpe2!check_smart_ad0
}
# SMART ad1
define service {
use generic-service
host_name host2
service_description nrpe_check_smart_ad1
check_command check_nrpe2!check_smart_ad1
}
Edit sudoers
I add the following to
/usr/local/etc/sudoers
on the servers being monitored:
nagios ALL=(ALL) NOPASSWD: /usr/local/libexec/nagios/check_smartmon -d /dev/ad*
nagios ALL=(ALL) NOPASSWD: /usr/local/libexec/nagios/check_smartmon -d /dev/da*
Add to nrpe.cfg on the servers being monitored (the command[...] lines are NRPE syntax, not commands.cfg syntax)
command[check_smart_ad0]=/usr/local/bin/sudo /usr/local/libexec/nagios/check_smartmon -d /dev/ad0
command[check_smart_ad1]=/usr/local/bin/sudo /usr/local/libexec/nagios/check_smartmon -d /dev/ad1
ERROR:
superman@Nagi:/usr/local/nagios/libexec$ ./check_smartmon ?
-bash: ./check_smartmon: /usr/local/bin/python: bad interpreter: No such file or directory
If check_smartmon doesn't run --> fix the Python path in your check_smartmon file: replace the top
line, which points to /usr/local/bin/python, with one that points to /usr/bin/python.
The plugin's own notes say the same:
= Installation =
Adjust the first line to your Python binary (e.g. /usr/local/bin/python or /usr/bin/python) and the path to your smartctl binary (e.g. /usr/local/sbin/smartctl or /usr/sbin/smartctl).
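The interpreter line can be changed with sed instead of by hand. This is a small sketch: fix_shebang is a hypothetical helper name, and the plugin path is the one from the error above.

```shell
# Sketch: point a script's first line at an interpreter that exists.
PLUGIN="${PLUGIN:-/usr/local/nagios/libexec/check_smartmon}"

fix_shebang() {
    file="$1"
    interpreter="$2"
    # Rewrite line 1 only, and only if it is a "#!" line.
    # A copy of the original is kept as "$file.bak".
    sed -i.bak '1s|^#!.*|#!'"$interpreter"'|' "$file"
}

# Typical use:
#   head -n 1 "$PLUGIN"                     # show the current interpreter line
#   command -v python                       # find where python actually lives
#   fix_shebang "$PLUGIN" /usr/bin/python
```

The smartctl path inside the script still has to be checked separately, as the installation note says.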
Install Smartmon
sudo apt-get install smartmontools
Check_smartmon Examples:
Tuesday, 29 March 2011
Nagios Error: Host has no default contacts or contactgroups defined
Warning: Host 'BLALBA' has no default contacts or contactgroups defined!
host definition:
define host{
use generic-host
host_name Scopserve
alias Scopserve
address 172.18.0.30
}
Cause: the generic-host template doesn't have a contact or contact group defined.
Set up a template that defines one and "use" it in the host definition, i.e.:
templates.cfg
define host{
name CPT-PBX ; The name of this host template
use generic-host ; Inherit default values from the generic-host template
check_period 24x7 ; By default, switches are monitored round the clock
check_interval 5 ; Switches are checked every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each switch 10 times (max)
check_command check-host-alive ; Default command to check if routers are "alive"
notification_period 24x7 ; Send notifications at any time
notification_interval 30 ; Resend notifications every 30 minutes
notification_options d,r ; Only send notifications for specific host states
contact_groups admins ; Notifications get sent to the admins by default
register 0 ; DONT REGISTER THIS - ITS JUST A TEMPLATE
}
New host definition using "CPT-PBX"
define host{
use CPT-PBX
host_name Scopserve
alias Scopserve
address 172.18.0.30
}
Nagios error: "Invalid_Max_Check_Attempts"
Error: Invalid max_check_attempts value for host 'CBD-DC'
Error: Could not register host (config file '/usr/local/nagios/etc/objects/windows.cfg', starting on line 2)
Error processing object config files!
Check:
* Does your generic-host template specify a valid max_check_attempts
value? If not, you'll need to add it there or to the host definition itself.
* Host defined in windows.cfg
* Check hostgroups.cfg defined in /usr/local/nagios/etc/nagios.cfg
* Check hostgroup defined in /usr/local/nagios/etc/objects/templates.cfg
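Errors like the ones above come from Nagios' configuration check, which can be run on its own with the -v option before restarting. This is a sketch, assuming the binary is at /usr/local/nagios/bin/nagios (the usual source-install location, adjust if yours differs). preflight is a hypothetical helper name.

```shell
# Sketch: run the Nagios pre-flight check and show only the problems.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"    # assumed path
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

preflight() {
    out=$("$NAGIOS_BIN" -v "$NAGIOS_CFG" 2>&1) || true
    # Print the Error:/Warning: lines, if there are any.
    printf '%s\n' "$out" | grep -E '^(Error|Warning):' || true
    # Succeed only when no Error: line was reported.
    ! printf '%s\n' "$out" | grep -q '^Error:'
}

# Typical use:
#   preflight && echo "config OK, safe to restart"
```

Warnings such as "has no default contacts" are printed but do not make the helper fail; only Error: lines do.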
Friday, 25 March 2011
Nagios - Monitoring Eventlogs on Windows Servers (My Comprehensive Guide)
Monitor DNS events on Windows Servers
- Copy eventlog_agent files to c:\
- Create folder on c:\ called "programme"
- Create subfolder "eventlog_agent"
- Copy the eventlog_agent files (.exe, .bat, .reg) to c:\programme\eventlog_agent\
- Run eventlog_agent.exe (if doing it manually)
- Add "eventlogs.cfg" to nagios.cfg (as a cfg_file= line)
- Add hosts to eventlogs.cfg
define service{
service_description System Eventlog
use generic-service
check_command check_win_eventlog!a!System!.*:+1
max_check_attempts 1
host_name Recruit
contact_groups admins
is_volatile 1
}
define service{
service_description DNS Eventlog
use generic-service
check_command check_win_eventlog!a!DNS!.*:+1
max_check_attempts 1
host_name Recruit
contact_groups admins
is_volatile 1
}
define service{
service_description Directory Service Eventlog
use generic-service
check_command check_win_eventlog!a!Directory Service!.*:+1
max_check_attempts 1
host_name Recruit
contact_groups admins
is_volatile 1
}
define service{
service_description File Replication Service Eventlog
use generic-service
check_command check_win_eventlog!a!File Replication Service!.*:+1
max_check_attempts 1
host_name Recruit
contact_groups admins
is_volatile 1
}
The bits in red need to be filled in correctly.
Errors
If eventlog_agent.exe is not running you'll get this error message:
Current Status: CRITICAL (for 0d 0h 1m 57s)
Status Information: An Error occured before state could be read: Connection refused at /usr/local/nagios/libexec/check_win_eventlog.pl line 145.
If the errors continue, restart the .exe running on the host.
To automate & install the .exe as a service
You will need 'instsrv.exe' and 'srvany.exe' from Microsoft Resource Kit.
Just copy those files together with 'eventlog_agent.exe', 'eventlog_agent.bat' and
'eventlog_agent.reg' into the folder 'c:\programme\eventlog_agent' and run the
batch file. If you want to use a different folder, then you will need to modify
the path in 'eventlog_agent.bat' and 'eventlog_agent.reg'
Autostart
You may put the exe into your system's Autostart folder, but this requires that
someone is logged in.
Uninstall the eventlog_agent
If you run the exe manually or from the Autostart folder (installation methods a and c), you can just delete the files.
If you installed it as a service (installation method b), go into the installation directory
and call "eventlog_agent.bat stop" on the console.
Thursday, 24 March 2011
Using Nagios to monitor Zimbra Servers
Monitoring Zimbra mail queues with Nagios
edit
vi /usr/local/nagios/libexec/utils.pm
remove
$PATH_TO_MAILQ = "/usr/bin/mailq";
Add
$PATH_TO_MAILQ ="/opt/zimbra/postfix/sbin/mailq";
Test
/usr/local/nagios/libexec# /usr/local/nagios/libexec/check_mailq xxx.xxx.xxx.xxx -w 100 -c 150
Error
root@Nagi:/usr/local/nagios/libexec# /usr/local/nagios/libexec/check_mailq 10.0.0.251 -w 100 -c 150
ERROR: /opt/zimbra/postfix/sbin/mailq is not executable by (uid 0:gid(0 0))
Fix Error
edit
vi /etc/sudoers
nagios ALL=(zimbra) NOPASSWD: /usr/local/nagios/libexec/check_clamav.pl
nagios ALL=(zimbra) NOPASSWD: /usr/local/nagios/libexec/check_mail
NRPE Checks
Add to nrpe.cfg on the Zimbra server:
command[check_zimbra_route_lookup_handler]=/usr/lib/nagios/plugins/check_http -H localhost -p 7072
command[check_zimbra_spell_checker]=/usr/lib/nagios/plugins/check_http -H localhost -p 7780
command[check_zimbra_pop3_real]=/usr/lib/nagios/plugins/check_pop -H localhost -p 7110
command[check_zimbra_pop3s_real]=/usr/lib/nagios/plugins/check_pop -H localhost -p 7995 -S
command[check_zimbra_imap_real]=/usr/lib/nagios/plugins/check_imap -H localhost -p 7143
command[check_zimbra_imaps_real]=/usr/lib/nagios/plugins/check_imap -H localhost -p 7993 -S
command[check_zimbra_mailq]=/usr/lib/nagios/plugins/check_mailq -w 100 -c 150 -M postfix
command[check_zimbra_clamd]=/usr/lib/nagios/plugins/check_clamd -H localhost
command[check_zimbra_mysql]=/usr/lib/nagios/plugins/check_mysql -s /opt/zimbra/db/mysql.sock
command[check_zimbra_mysql_logger]=/usr/lib/nagios/plugins/check_mysql -s /opt/zimbra/logger/db/mysql.sock
command[check_zimbra_amavisd]=/usr/lib/nagios/plugins/check_smtp -H localhost -p 10024 -e '220 [127.0.0.1] ESMTP amavisd-new service ready'
command[check_zimbra_lmtp]=/usr/lib/nagios/plugins/check_smtp -H localhost -p 7025 -e '220 zimbra.example.com Zimbra LMTP ready'
command[check_zimbra_postfix_amavis]=/usr/lib/nagios/plugins/check_smtp -H localhost -p 10025 -e '220 zimbra.example.com ESMTP Postfix'
check_clamav.pl
command[check_zimbra_clamd_sig]=sudo -u zimbra /usr/lib/nagios/plugins/contrib/check_clamav.pl -w 3 -c 5
/etc/sudoers
nagios ALL=(zimbra) NOPASSWD: /usr/lib/nagios/plugins/contrib/check_clamav.pl
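Once the command[...] lines are in place on the Zimbra host, they can be exercised from the Nagios server with check_nrpe before any services are defined. This is a sketch, assuming check_nrpe sits in /usr/local/nagios/libexec and the host is zimbra.example.com. run_zimbra_checks is a hypothetical helper name.

```shell
# Sketch: call a list of NRPE commands and report which ones fail.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"   # assumed path
ZIMBRA_HOST="${ZIMBRA_HOST:-zimbra.example.com}"

run_zimbra_checks() {
    failed=0
    for cmd in "$@"; do
        # Exit status 0 means OK; WARNING, CRITICAL and UNKNOWN are non-zero.
        if output=$("$CHECK_NRPE" -H "$ZIMBRA_HOST" -c "$cmd"); then
            echo "ok   $cmd: $output"
        else
            echo "FAIL $cmd: $output"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Typical use:
#   run_zimbra_checks check_zimbra_mailq check_zimbra_clamd check_zimbra_mysql
```

A "Command ... not defined" reply from check_nrpe usually means the name is missing from nrpe.cfg or the NRPE daemon was not restarted after the edit.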
Validate SSL Cert
service_description Zimbra SSL Certificate
command_line $USER1$/check_http -S -H zimbra.example.com -C 10
Check LDAP
service_description Zimbra LDAP
check_command check_ldap_with_HOST!zimbra.example.com!dc=de
Monday, 21 March 2011
Event ID 13508, Source: NtFrs
The File Replication Service is having trouble enabling replication from BELL-AD-PRIMARY to SERVER1 for c:\windows\sysvol\domain using the DNS name Bell-AD-Primary.ambition24.com. FRS will keep retrying.
Following are some of the reasons you would see this warning.
[1] FRS can not correctly resolve the DNS name Bell-AD-Primary.ambition24.com from this computer.
[2] FRS is not running on Bell-AD-Primary.ambition24.com.
[3] The topology information in the Active Directory for this replica has not yet replicated to all the Domain Controllers.
Troubleshooting:
* Check that SYSVOL is shared by running the "net share" command on all servers.
* Is the FRS running on all servers?
* Run dcdiag and netdiag on both servers. To check
replication, run repadmin /showreps >rep.txt.
Sunday, 6 March 2011
Cisco Router Startup
On startup the System Bootstrap (BootStrap) process:
- Runs the POST
- Finds the IOS in flash memory (flash is the default location the router loads the IOS from)
- Loads the IOS and looks for a valid configuration, the "startup config", stored in NVRAM
- Once the IOS is loaded the POST information will be displayed
- If no startup config is found in NVRAM the router will go into "setup mode"
Acronyms
POST - Power on Self Test
IOS - Internetwork Operating System
EEPROM - Electrically Erasable Programmable Read-Only Memory
NVRAM - Non-Volatile Random Access Memory
Cisco Router - Online Simulator
Cisco Online Router Sim
http://www.techexams.net/testsim/techsim.php#