Very high latency on Nagios/Centreon


  • Very high latency on Nagios/Centreon

    Hello everyone.

    I am currently having issues with high latency in my Nagios/Centreon setup. In Nagios, the Program-Wide Performance Information page reports an average latency of 170s. The nagiostats command reports an average active service latency of 589s and an average active host latency of 444s. This is obviously way too high.

    I am checking approx 4000 services on 250 hosts.
    I am using Nagios 3.0.2 and Centreon 2.0.2 (for project reasons I am unable to upgrade to 2.1.9 at this time)
    The monitoring host has a load average of approx 2.1
    Maximum concurrent service checks = 5000
    Aggregated status updates option = yes

    Could anyone advise how I could dramatically reduce this latency and explain why it might be so high?

    Any advice would be greatly appreciated.

    Many thanks,

    Paul
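
To put the figures above in perspective, here is a rough back-of-the-envelope sketch of the check throughput such a setup has to sustain. The 5-minute check interval and the 2-second average plugin run time are assumptions for illustration only; neither figure is given in the post.

```python
# Rough estimate of the check throughput a Nagios instance must sustain.
# ASSUMPTIONS (not from the original post): every service is checked every
# 5 minutes, and a check plugin takes about 2 seconds to run on average.


def required_checks_per_second(services: int, interval_seconds: float) -> float:
    """Checks that must start each second to keep every service on schedule."""
    return services / interval_seconds


def average_concurrent_checks(checks_per_second: float, avg_exec_seconds: float) -> float:
    """Average number of checks in flight (Little's law: rate x duration)."""
    return checks_per_second * avg_exec_seconds


rate = required_checks_per_second(services=4000, interval_seconds=300)
in_flight = average_concurrent_checks(rate, avg_exec_seconds=2.0)
print(f"{rate:.1f} checks/s needed, about {in_flight:.0f} running at once")
```

Latency in Nagios is the delay between the time a check was scheduled and the time it actually started, so it grows whenever the scheduler cannot start checks at the required rate, for example while it is held up handing results to a slow database.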


  • #2
    Paul,

    I've noticed latency at around 4000+ service checks too. I've tried optimizing both MySQL and the Nagios config to handle this number of checks, but I have not been able to tune it properly. When I stop the ndo2db process, our service check latency drops to under 1 second. It seems to be something in NDOUtils, even though I am on the patched Centreon version. Have you tried disabling the ndo2db process?



    • #3
      Hello,

      I found that in my case the high latency was caused by MySQL not being tuned for InnoDB tables. The centstatus database was constantly being updated, but MySQL was not tuned to handle that workload.

      I'm monitoring about 2000 servers with ~14000 services, using a distributed setup of 1 master and 2 satellite pollers, all connecting to the database running on the master poller, running Ubuntu 9.10. Here is the active service latency reported by nagiostats on each server (min / max / avg):

      Master:
      Active Service Latency: 30.945 / 104.340 / 72.821 sec

      Slave 1:
      Active Service Latency: 9.036 / 73.419 / 29.889 sec

      Slave 2:
      Active Service Latency: 0.006 / 13.735 / 3.658 sec
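
For anyone who wants to track these figures over time, here is a minimal sketch that parses latency lines in the format shown above. The function names and the 10-second alert threshold are hypothetical; only the line format comes from the nagiostats output quoted in this post.

```python
import re

# Matches lines such as "Active Service Latency: 30.945 / 104.340 / 72.821 sec".
LATENCY_RE = re.compile(
    r"^[ \t]*Active (Service|Host) Latency:\s*"
    r"([\d.]+)\s*/\s*([\d.]+)\s*/\s*([\d.]+)\s*sec",
    re.MULTILINE,
)


def parse_latency(nagiostats_output: str) -> dict:
    """Return {'service': (min, max, avg), 'host': (...)} in seconds."""
    result = {}
    for kind, lo, hi, avg in LATENCY_RE.findall(nagiostats_output):
        result[kind.lower()] = (float(lo), float(hi), float(avg))
    return result


def over_threshold(latencies: dict, max_avg_seconds: float) -> list:
    """Names of check types whose average latency exceeds the threshold."""
    return [k for k, (_, _, avg) in latencies.items() if avg > max_avg_seconds]


sample = "Active Service Latency: 30.945 / 104.340 / 72.821 sec"
stats = parse_latency(sample)
print(stats)
print(over_threshold(stats, max_avg_seconds=10.0))  # threshold is an assumption
```

In practice the text passed to parse_latency would be the captured output of the nagiostats command rather than a hard-coded sample.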



      These are the changes I made to the default my.cnf on the master to improve InnoDB table performance in MySQL:

      innodb_buffer_pool_size = 12288M
      innodb_log_file_size = 256M
      innodb_log_buffer_size = 16M
      innodb_thread_concurrency = 32
      innodb_flush_log_at_trx_commit = 2

      Please note that these values are fairly high as I'm using pretty beefy hardware (2x Quad-Core Xeon with 24GB RAM), but you can play with the values as you see fit for your setup.
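
The buffer pool above (12288M on a 24GB machine) works out to half of the RAM. As a rough sketch of scaling that to other hardware, the helper below renders a [mysqld] fragment from the amount of RAM. The function name and the idea of reusing the 50% ratio elsewhere are assumptions for illustration, not a recommendation from this thread.

```python
def innodb_settings(ram_mb: int, pool_fraction: float = 0.5) -> str:
    """Render a [mysqld] fragment with the buffer pool sized from total RAM.

    pool_fraction=0.5 mirrors the 12288M-on-24GB ratio quoted above; it is an
    illustrative default, not a rule that fits every host.
    """
    if not 0 < pool_fraction < 1:
        raise ValueError("pool_fraction must be between 0 and 1")
    pool_mb = int(ram_mb * pool_fraction)
    lines = [
        "[mysqld]",
        f"innodb_buffer_pool_size = {pool_mb}M",
        # 2 = write the log at every commit but flush it to disk only about
        # once per second; an OS crash or power loss can lose roughly the
        # last second of transactions.
        "innodb_flush_log_at_trx_commit = 2",
    ]
    return "\n".join(lines)


print(innodb_settings(ram_mb=24 * 1024))
```

On a host that also runs Nagios and the web interface, a smaller fraction may be the safer choice, since the buffer pool competes with everything else for memory.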



      • #4
        Try this:
        http://www.r71.nl/kb/technical/185-d...se-performance
        I hope it works for you. Thanks.



        • #5
          Thank you guys for the tips, setting this up in my my.cnf under [mysqld] worked splendidly!

          innodb_buffer_pool_size = 1000M
          innodb_log_file_size = 64M
          innodb_log_buffer_size = 16M
          innodb_thread_concurrency = 8
          innodb_flush_log_at_trx_commit = 2

          Worked for ~500 units and 1k services.

          Also I had to delete the files:
          rm /var/lib/mysql/ib_logfile1
          rm /var/lib/mysql/ib_logfile0

          Then I restarted MySQL. The system no longer seems to hang every 5 minutes. Sadly, Centreon still reports latency on the system, but hopefully that goes away after a while?
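
On MySQL versions of that era, changing innodb_log_file_size leaves the existing ib_logfile* files at the old size, which is why they had to go before the restart. Here is a more cautious sketch of that step: it moves the files aside instead of deleting them, which is a substitution for the rm commands above, and it assumes MySQL has already been shut down cleanly. The helper name and the backup directory are hypothetical.

```python
import shutil
from pathlib import Path


def move_innodb_logs(datadir: str, backup_dir: str) -> list:
    """Move ib_logfile* out of datadir and return the moved file names.

    Only call this after mysqld has been shut down cleanly; this function
    does not (and cannot) check that the server is stopped.
    """
    src = Path(datadir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for log in sorted(src.glob("ib_logfile*")):
        shutil.move(str(log), str(dst / log.name))
        moved.append(log.name)
    return moved


# Example call (paths are assumptions; adjust to your installation):
# move_innodb_logs("/var/lib/mysql", "/root/ib_logfile_backup")
```

Once MySQL has started with the new log size and the data looks intact, the backed-up files can be deleted.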



          • #6
            my.cnf

            I think you should add a second server and start splitting your checks.
            I have noticed that Nagios does not perform well beyond about 3000 checks,
            so I recommend moving to distributed monitoring.

            Anyway, MySQL is usually the bottleneck. I had similar problems, especially with graphs.

            This is my configuration:

            [mysqld]
            #datadir=/var/lib/mysql
            #socket=/var/lib/mysql/mysql.sock
            datadir=/usr/local/mysql
            socket=/usr/local/mysql/mysql.sock
            user=mysql
            bind-address=0.0.0.0
            old_passwords=1

            # Proposed improvements
            innodb_buffer_pool_size = 1024M
            # Enable it for vast improvement and it may be all you need to tweak.
            query_cache_type=1
            query_cache_limit=1M
            query_cache_size=32M
            # Reduced it to 32 to prevent memory hogging. Also, see notes below.
            thread_cache_size=32
            wait_timeout=25
            connect_timeout=10
            innodb_flush_log_at_trx_commit = 2

            set-variable = max_connections=512
            log_slow_queries = /var/log/mysql-slow.log
            long_query_time = 4

            [mysqld_safe]
            log-error=/var/log/mysqld.log
            pid-file=/var/run/mysqld/mysqld.pid

            [mysql]
            socket=/usr/local/mysql/mysql.sock

            cheers,
            Felipe
            ________________________________________
            CentOS 5.5 x64 / Nagios 3.2.2 / Centreon 2.1.10
            Monitoring: 467 Hosts / 2109 Services 16th Server
            NdoUtils 1.49, NagiosPlugins 1.4.14, NagVis 1.5.1, Distributed Architecture (howto)

            Nagios/Centreon Custom Scripts / Troubleshooting
            www.felipeferreira.net



            • #7
              brooker

              I have not used check_mk yet, but it looks like a great option for large installations.
              Check it out at
              http://community.nagios.org/2009/08/...tner-check_mk/



              • #8
                Hi,

                Have you checked whether your latency is caused by perfdata processing?

                Try disabling it to see if the latency goes down.
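
In nagios.cfg, performance data processing is controlled by the process_performance_data directive. A minimal sketch of switching it off for a test follows; the helper name and the sample log_file path are hypothetical, and Nagios has to be restarted afterwards for the change to take effect.

```python
import re


def set_directive(cfg_text: str, name: str, value: str) -> str:
    """Set `name=value` in nagios.cfg-style text, appending it if absent."""
    # Commented-out lines (starting with '#') are deliberately not matched.
    pattern = re.compile(rf"^[ \t]*{re.escape(name)}[ \t]*=.*$", re.MULTILINE)
    line = f"{name}={value}"
    if pattern.search(cfg_text):
        return pattern.sub(lambda _match: line, cfg_text)
    return cfg_text.rstrip("\n") + "\n" + line + "\n"


# Sample content; the log_file path is an assumption for illustration.
sample = "log_file=/var/log/nagios/nagios.log\nprocess_performance_data=1\n"
print(set_directive(sample, "process_performance_data", "0"))
```

If latency drops sharply with the directive set to 0, the perfdata commands or whatever consumes their output are the likely bottleneck.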



                • #9
                  For your performance problems, I would not recommend tuning InnoDB, but rather switching to the MyISAM engine, which in my opinion is much more efficient for nagios/centreon/ndo.

                  To test the change, here is a small script that converts every table in a database from one engine to another:

                  # start of script
                  BASE="$1"                # database name, passed as the first argument
                  PASS=PASSWORD-MYSQL
                  USER=MYSQL-USER
                  CHANGETO=MYISAM          # target engine: MYISAM or INNODB

                  # List the tables (skipping the header line) and write one ALTER TABLE per table
                  mysql -u "$USER" -p"$PASS" -e "show tables in $BASE;" | tail -n +2 | xargs -I{} echo "ALTER TABLE {} ENGINE=$CHANGETO;" > alter_table.sql

                  # Apply the generated statements, then clean up
                  mysql --database="$BASE" -u "$USER" -p"$PASS" < alter_table.sql
                  rm -f alter_table.sql
                  # end of script

                  Just run the script, passing as its parameter the name of the database whose tables should be switched from one engine to the other.

                  Normally there is no problem with this script, but a backup beforehand is still a must!



                  • #10
                    Hi,

                    We face a latency issue too, but we are not using the centstorage DB, only RRD, and we have put the centstatus DB
                    in memory (HEAP engine). Do you have 'graphing' turned on, i.e. service perfdata processing?

                    gt.



                    • #11
                      Originally posted by opalanque
                      For your performance problems, I would not recommend tuning InnoDB, but rather switching to the MyISAM engine, which in my opinion is much more efficient for nagios/centreon/ndo.
                      [...]
                      Hello,

                      What is the risk of changing the engine? Is the change really beneficial compared with tuning InnoDB? Does the Centreon team have any recommendations?

                      Thanks
                      Ubuntu server 10.04 LTS 64-bit - High availability, 4 central servers (MySQL replication + VIP + Rsync)
                      Nagios 3.3.1 | Centreon 2.3.9 | Centreon-Broker 2.1.1 | 2000 hosts - 5000 services | 6 Remote Pollers

                      Dev : CES 2.2 - Centengine - Centreon 2.4.1



                      • #12
                        Hello,

                        I asked myself the same question about this change to the database: what are the implications of switching to MyISAM?
                        In my case, a PowerEdge with a Xeon and 2GB of RAM runs at a load of 3/4 with 40 hosts and 840 services (99.9% active checks).

