Host & Service latency

  • Host & Service latency

    Good evening,

    I would like an expert opinion on my host and service check latency values. I don't have any particular performance problem; however, I recently started using a plugin that requires low latency, which led me to check my system, because the service check latency for that plugin was 60 seconds. Checking other services, I found similar values on some and only a few milliseconds on others, so it appears to be random.

    I currently monitor 922 hosts and 2513 services on the central server; the architecture includes 3 remote pollers.

    Here is the output of Centreon's "Optimize" command:

    Central poller:

    Code:
    OBJECT CONFIG PROCESSING TIMES (* = Potential for precache savings with -u option)
    ----------------------------------
    Read: 0.051807 sec
    Resolve: 0.005398 sec *
    Recomb Contactgroups: 0.000760 sec *
    Recomb Hostgroups: 0.004377 sec *
    Dup Services: 0.006291 sec *
    Recomb Servicegroups: 0.002928 sec *
    Duplicate: 0.000158 sec *
    Inherit: 0.001641 sec *
    Recomb Contacts: 0.000000 sec *
    Sort: 0.000001 sec *
    Register: 0.017371 sec
    Free: 0.004340 sec
    ============
    TOTAL: 0.095072 sec * = 0.021554 sec (22.67%) estimated savings
    
    RETENTION DATA TIMES
    ----------------------------------
    Read and Process: 61.756945 sec
    ============
    TOTAL: 61.756945 sec
    
    CONFIG VERIFICATION TIMES (* = Potential for speedup with -x option)
    ----------------------------------
    Object Relationships: 0.036206 sec
    Circular Paths: 0.000210 sec *
    Misc: 0.002267 sec
    ============
    TOTAL: 0.038683 sec * = 0.000210 sec (0.5%) estimated savings
    
    EVENT SCHEDULING TIMES
    -------------------------------------
    Get service info: 0.009798 sec
    Get host info info: 0.003589 sec
    Get service params: 0.000020 sec
    Schedule service times: 0.023686 sec
    Schedule service events: 0.011390 sec
    Get host params: 0.000001 sec
    Schedule host times: 0.008895 sec
    Schedule host events: 0.004380 sec
    ============
    TOTAL: 0.061759 sec
    
    HOST SCHEDULING INFORMATION
    ---------------------------
    Total hosts: 922
    Total scheduled hosts: 921
    Host inter-check delay method: SMART
    Average host check interval: 1651.66 sec
    Host inter-check delay: 1.79 sec
    Max host check spread: 30 min
    First scheduled check: Tue May 3 21:42:04 2011
    Last scheduled check: Tue May 3 22:09:33 2011
    
    SERVICE SCHEDULING INFORMATION
    -------------------------------
    Total services: 2513
    Total scheduled services: 2409
    Service inter-check delay method: SMART
    Average service check interval: 18311.36 sec
    Inter-check delay: 0.12 sec
    Interleave factor method: SMART
    Average services per host: 2.73
    Service interleave factor: 3
    Max service check spread: 5 min
    First scheduled check: Tue May 3 21:43:44 2011
    Last scheduled check: Tue May 3 21:48:43 2011
    
    CHECK PROCESSING INFORMATION
    ----------------------------
    Check result reaper interval: 2 sec
    Max concurrent service checks: Unlimited
    Poller 1:

    Code:
    HOST SCHEDULING INFORMATION
    ---------------------------
    Total hosts: 33
    Total scheduled hosts: 33
    Host inter-check delay method: SMART
    Average host check interval: 270.91 sec
    Host inter-check delay: 8.21 sec
    Max host check spread: 30 min
    First scheduled check: Tue May 3 21:55:36 2011
    Last scheduled check: Tue May 3 21:59:58 2011
    
    
    SERVICE SCHEDULING INFORMATION
    -------------------------------
    Total services: 129
    Total scheduled services: 124
    Service inter-check delay method: SMART
    Average service check interval: 14923.55 sec
    Inter-check delay: 2.42 sec
    Interleave factor method: SMART
    Average services per host: 3.91
    Service interleave factor: 4
    Max service check spread: 5 min
    First scheduled check: Tue May 3 21:56:51 2011
    Last scheduled check: Tue May 3 22:01:48 2011
    Poller 2:

    Code:
    HOST SCHEDULING INFORMATION
    ---------------------------
    Total hosts: 10
    Total scheduled hosts: 10
    Host inter-check delay method: SMART
    Average host check interval: 3960.00 sec
    Host inter-check delay: 180.00 sec
    Max host check spread: 30 min
    First scheduled check: Tue May 3 21:57:00 2011
    Last scheduled check: Tue May 3 22:24:00 2011
    
    
    SERVICE SCHEDULING INFORMATION
    -------------------------------
    Total services: 25
    Total scheduled services: 24
    Service inter-check delay method: SMART
    Average service check interval: 55100.00 sec
    Inter-check delay: 12.50 sec
    Interleave factor method: SMART
    Average services per host: 2.50
    Service interleave factor: 3
    Max service check spread: 5 min
    First scheduled check: Tue May 3 21:58:40 2011
    Last scheduled check: Tue May 3 22:03:27 2011
    Poller 3:

    Code:
    HOST SCHEDULING INFORMATION
    ---------------------------
    Total hosts: 62
    Total scheduled hosts: 62
    Host inter-check delay method: SMART
    Average host check interval: 241.94 sec
    Host inter-check delay: 3.90 sec
    Max host check spread: 30 min
    First scheduled check: Tue May 3 21:57:35 2011
    Last scheduled check: Tue May 3 22:01:33 2011
    
    
    SERVICE SCHEDULING INFORMATION
    -------------------------------
    Total services: 319
    Total scheduled services: 319
    Service inter-check delay method: SMART
    Average service check interval: 301.88 sec
    Inter-check delay: 0.94 sec
    Interleave factor method: SMART
    Average services per host: 5.15
    Service interleave factor: 6
    Max service check spread: 5 min
    First scheduled check: Tue May 3 21:58:25 2011
    Last scheduled check: Tue May 3 22:03:28 2011
    Regarding latency on the central server:

    Check Latency

    Hosts: Min: 0.000 sec, Max: 104.956 sec, Average: 61.372 sec
    Services: Min: 1.799 sec, Max: 105.564 sec, Average: 64.435 sec

    Check Execution Time

    Hosts: Min: 0.000 sec, Max: 10.012 sec, Average: 1.387 sec
    Services: Min: 0.015 sec, Max: 60.010 sec, Average: 2.078 sec

    99% of the services have a polling interval of at least 5 minutes. A few are set to 3 minutes.

    Any advice or comments on these values are welcome.
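
As a cross-check on the scheduling figures in this post, the reported inter-check delays appear consistent with a simple rule: average check interval divided by the number of scheduled objects, capped so that all initial checks fit inside the max check spread. The sketch below is only an illustration of that rule as it is commonly described for the SMART method; the function name is hypothetical and this is not code taken from Nagios.

```python
def smart_inter_check_delay(avg_check_interval_s, scheduled_count, max_spread_min):
    """Estimate a SMART inter-check delay in seconds.

    Assumed rule (not taken from the Nagios source):
    delay = average check interval / scheduled objects,
    reduced if needed so the first round of checks fits in the max spread.
    """
    if scheduled_count <= 0:
        return 0.0
    delay = avg_check_interval_s / scheduled_count
    max_delay = (max_spread_min * 60.0) / scheduled_count
    return min(delay, max_delay)


# Inputs copied from the "Optimize" output in this post:
# (average check interval in sec, scheduled objects, max spread in min)
examples = {
    "central hosts": (1651.66, 921, 30),
    "central services": (18311.36, 2409, 5),
    "poller 2 hosts": (3960.00, 10, 30),
    "poller 3 services": (301.88, 319, 5),
}

for name, args in examples.items():
    print(f"{name}: {smart_inter_check_delay(*args):.2f} sec")
```

Under this assumption the central services delay is bounded by the 5 min spread (300 s spread over 2409 services), which is roughly eight service checks started per second during the initial scheduling window.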
    Ubuntu server 10.04 LTS 64-bit - High availability with 4 central servers (MySQL replication + VIP + Rsync)
    Nagios 3.3.1 | Centreon 2.3.9 | Centreon-Broker 2.1.1 | 2000 hosts - 5000 services | 6 Remote Pollers

    Dev : CES 2.2 - Centengine - Centreon 2.4.1

  • #2
    Hello,

    Do you have any CPU usage statistics from mpstat or sar? Does anything stand out in them?

    Have you tried identifying the slow checks with this:
    http://forum.centreon.com/showthread...3418#post63418

    Beyond that, I have no other ideas offhand :-s

    *edit*: for example, on my side it is the monitoring of printer consumables that drags my performance down, so much so that I am wondering whether I should set up a dedicated poller just for that.
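
One possible way to spot slow checks, independent of the linked thread (whose method is not reproduced here), is to rank the `servicestatus` blocks of the Nagios status file by `check_execution_time` or `check_latency`. The sketch below assumes the usual `key=value` block layout of that file; the function names and the sample data are hypothetical, and on a real system the text would be read from the file named by `status_file` in nagios.cfg.

```python
def parse_service_blocks(text):
    """Yield one dict of key/value pairs per 'servicestatus { ... }' block."""
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus") and line.endswith("{"):
            current = {}
        elif line == "}" and current is not None:
            yield current
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value


def slowest_services(text, metric="check_execution_time", top=10):
    """Return up to `top` (value, host, service) tuples, largest value first."""
    rows = []
    for block in parse_service_blocks(text):
        try:
            value = float(block[metric])
        except (KeyError, ValueError):
            continue  # skip blocks without a usable value for this metric
        rows.append((value,
                     block.get("host_name", "?"),
                     block.get("service_description", "?")))
    return sorted(rows, reverse=True)[:top]


# Made-up sample data, for illustration only.
SAMPLE = """
servicestatus {
    host_name=printer-01
    service_description=Toner
    check_execution_time=9.870
    check_latency=61.200
    }
servicestatus {
    host_name=web-01
    service_description=HTTP
    check_execution_time=0.120
    check_latency=2.300
    }
"""

for value, host, service in slowest_services(SAMPLE):
    print(f"{value:8.3f} sec  {host} / {service}")
```

Passing `metric="check_latency"` ranks by latency instead, which helps to see whether the high values are spread evenly or concentrated on a few hosts.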
    Raphael
    --
    Dual Intel(R) Xeon(TM) CPU 3.06GHz - 3 GB RAM
    Debian
    Nagios® Core™ Version 3.2.1 - Nagios Plugins 1.4.14 - NDO 1.4b9 patched
    Centreon 2.3.4 - Syslog Module 1.3.2 - StatusMap Module 2.5 - NDO Tools Module 0.4 SVN - Nagvis
    Beta tester : centreon-engine - centreon-broker

    • #3
      Hello,

      Thanks for the advice. In fact, I have partly identified the problem. About 20% of my services are down or unknown and 15% of my hosts are down, almost continuously. I disabled checks on a large number of them that I know will not be back up any time soon, and that brought my latency down.

      So nothing abnormal, in principle (other than the percentage being huge, but that is another problem :-))
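
A rough way to see why disabling checks on permanently down objects can help: a check that runs until its timeout occupies the scheduler far longer than a healthy one. The sketch below is a back-of-the-envelope estimate only. The function name and all input numbers are hypothetical, and the 60 s timeout is an assumption suggested by the 60.010 sec maximum service execution time reported earlier in the thread, not a confirmed setting.

```python
def timeout_time_share(total_checks, timing_out, timeout_s, normal_s):
    """Fraction of total execution time spent in checks that hit the timeout."""
    if not 0 <= timing_out <= total_checks:
        raise ValueError("timing_out must be between 0 and total_checks")
    timeout_time = timing_out * timeout_s
    normal_time = (total_checks - timing_out) * normal_s
    total = timeout_time + normal_time
    return timeout_time / total if total else 0.0


# Hypothetical figures: 1000 checks per cycle, 50 of them running into an
# assumed 60 s timeout, the others finishing in about 1 s.
share = timeout_time_share(1000, 50, 60.0, 1.0)
print(f"{share:.1%} of execution time is spent in timed-out checks")
```

With these made-up inputs, 5% of the checks account for roughly three quarters of the total execution time. How much of that turns into check latency depends on the engine's concurrency and host-check settings, which this estimate does not model.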
