| Cheshire Cat Computing http://steveshipway.org/forum/ |
|
| Heartbeat function http://steveshipway.org/forum/viewtopic.php?f=22&t=1633 |
| Author: | bjoern [ Sat Sep 06, 2008 12:26 am ] |
| Post subject: | Heartbeat function |
Hello all!

We are using the Nagios EventLog Agent, version 1.8.3, to check our daily backups. It works fine! But we have some trouble with the integrated heartbeat function. On the Nagios web interface the service often has this status:

EventLog Agent UNKNOWN 09-05-2008 13:09:08 0d 0h 9m 8s 1/1 UNKNOWN: Check agent is running

But sometimes the service says "OK", and I can see when the monitoring service is restarted. I don't know if this is a serious problem, but it doesn't look very nice.

Sorry for my bad English. Thank you all in advance.

Kind regards
Björn

My configs:

define service{
    service_description     EventLog Agent
    active_checks_enabled   1
    passive_checks_enabled  1
    flap_detection_enabled  0
    check_period            24x7
    max_check_attempts      1
    normal_check_interval   15
    retry_check_interval    1
    check_command           check_dummy!3!Check agent is running
    contact_groups          admins
    notification_interval   120
    notification_period     24x7
    notification_options    c,r
    hostgroup_name          kivbf_standard
}

Nagios.log:

[1220522995] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;a1habs001;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523002] PASSIVE SERVICE CHECK: a1habs001;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523002] SERVICE ALERT: a1habs001;EventLog Agent;OK;HARD;1;HEARTBEAT [INFO #0]: Service running OK
[1220523172] SERVICE ALERT: a1swws002;EventLog Agent;UNKNOWN;HARD;1;UNKNOWN: Check agent is running
[1220523200] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;a1swws003;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523202] PASSIVE SERVICE CHECK: a1swws003;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523202] SERVICE ALERT: a1swws003;EventLog Agent;OK;HARD;1;HEARTBEAT [INFO #0]: Service running OK
[1220523417] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;nagios_serv;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523422] PASSIVE SERVICE CHECK: nagios_serv;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523422] SERVICE ALERT: nagios_serv;EventLog Agent;OK;HARD;1;HEARTBEAT [INFO #0]: Service running OK
[1220523595] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;n1lbrs001;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523602] PASSIVE SERVICE CHECK: n1lbrs001;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523602] SERVICE ALERT: n1lbrs001;EventLog Agent;OK;HARD;1;HEARTBEAT [INFO #0]: Service running OK
[1220523696] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;a1swws001;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523702] PASSIVE SERVICE CHECK: a1swws001;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523702] SERVICE ALERT: a1swws001;EventLog Agent;OK;HARD;1;HEARTBEAT [INFO #0]: Service running OK
[1220523732] SERVICE ALERT: a1swws003;EventLog Agent;UNKNOWN;HARD;1;UNKNOWN: Check agent is running
[1220523752] SERVICE ALERT: n1lbrs001;EventLog Agent;UNKNOWN;HARD;1;UNKNOWN: Check agent is running
[1220523755] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;a1swws003;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523762] PASSIVE SERVICE CHECK: a1swws003;EventLog Agent;0;HEARTBEAT [INFO #0]: Service running OK
[1220523762] SERVICE ALERT: a1swws003;EventLog Agent;OK;HARD;1;HEARTBEAT [INFO #0]: Service running OK
[1220523772] SERVICE ALERT: nagios_serv;EventLog Agent;UNKNOWN;HARD;1;UNKNOWN: Check agent is running |
|
| Author: | bjoern [ Sat Sep 06, 2008 1:02 am ] |
| Post subject: | Re: Heartbeat function |
OK, I think I found the error. I had active checks enabled in the Nagios service definition. That probably caused the UNKNOWN messages. Thanks, all! With kind regards, Björn |
|
| Author: | stevesh [ Mon Sep 08, 2008 11:58 am ] |
| Post subject: | Re: Heartbeat function |
Exactly right - you should keep the 24x7 timeperiod, but disable active checks. The check command, when run, sets the status to 'unknown', but it should only be triggered by the freshness check. This means that, if the heartbeat stops, the freshness check is triggered, which sets the service to unknown. Otherwise, Nagios just uses the received passive check results and does not run the check command. |
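For anyone landing on this thread later, the setup described above might look roughly like the sketch below. It reuses the directives from the original post and only changes active_checks_enabled; the check_freshness and freshness_threshold lines are standard Nagios service directives that were not shown in the thread, and the 1800-second value is purely an assumption (it should be set comfortably above whatever heartbeat interval the agent is configured to use).

```cfg
define service{
    service_description     EventLog Agent
    hostgroup_name          kivbf_standard
    active_checks_enabled   0       ; no regularly scheduled active checks
    passive_checks_enabled  1       ; accept heartbeat results sent by the agent
    check_freshness         1       ; not in the original post: enable freshness checking
    freshness_threshold     1800    ; ASSUMED value in seconds; keep it above the heartbeat interval
    check_command           check_dummy!3!Check agent is running
    check_period            24x7
    max_check_attempts      1
    normal_check_interval   15
    retry_check_interval    1
    flap_detection_enabled  0
    contact_groups          admins
    notification_interval   120
    notification_period     24x7
    notification_options    c,r
}
```

With this in place, check_dummy should only run when no passive result has arrived within the threshold, so a healthy agent keeps the service at OK. Service freshness checking also has to be enabled globally (check_service_freshness=1 in nagios.cfg) for the per-service setting to take effect.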
|
| All times are UTC + 12 hours [ DST ] |