Preventing spike alerts


kolchak
01-18-2006, 06:54 AM
Hi all - we have Oreon set up and running nicely - we even have 12 monitors on the wall of our section of the office to spit out errors, and sports scores when everything is fine :)

One thing I have noticed is that we get a lot of spike alerts. For example, a machine may be processing a query and CPU is running at 99% for 5 seconds, but Oreon will show this alert. I'd like to restrict it to something like 'if there are 3 critical or warning alerts in a row, change the status to warning / critical'. Any advice?

DonKiShoot
01-18-2006, 12:01 PM
The Nagios max_check_attempts value, or something like that, should help you, I guess.

kolchak
01-19-2006, 12:46 AM
Thanks DonKiShoot - but my understanding of max_check_attempts is that after the number in max_check_attempts Nagios will stop throwing errors. Anyone?

cih
01-19-2006, 11:48 AM
Hi kolchak!

I think DonKiShoot is on the right track. max_check_attempts is the number of times Nagios will check a service before it decides the problem is real. We rely on it heavily in our systems, and have it tuned for all of our 600+ services.

Hope it helps!

cih

DonKiShoot
01-19-2006, 11:49 AM
No, it's the maximum number of bad results in a row before an alert is sent (that is, before the state passes from soft to hard).
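
For the "3 bad results in a row" case kolchak asked about, a service definition could look roughly like the sketch below. This is only an illustration: the host name, the generic-service template and the check_cpu command with its thresholds are made up, and the interval directive names are the Nagios 2.x ones, so adjust them to your own setup.

```
# Sketch only: host name, template and check command are hypothetical.
define service{
    use                     generic-service    ; hypothetical template supplying the other required directives
    host_name               db01               ; hypothetical host
    service_description     CPU
    check_command           check_cpu!90!99    ; hypothetical command and warning/critical thresholds
    max_check_attempts      3                  ; third bad result in a row turns the soft state into a hard one
    normal_check_interval   5                  ; time units between checks while the service is OK
    retry_check_interval    1                  ; time units between re-checks while the state is soft
    }
```

With something like this, a single 99% CPU spike only puts the service into a soft state. Nagios re-checks at the retry interval, and if the service is back to OK before the third attempt no notification goes out. A time unit is one minute with the default interval_length of 60.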

cih
01-19-2006, 12:29 PM
Exactly that, DonKiShoot. That's what I was trying to say, but it seems I put it the wrong way ;P

DonKiShoot
01-19-2006, 03:02 PM
Oops!!! We replied to him at the same time :lol: