[03:58:40] RECOVERY - PHP opcache health on mw1407 is OK: OK: opcache is healthy https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health
[04:26:30] PROBLEM - PHP opcache health on mw1407 is CRITICAL: CRITICAL: opcache cache-hit ratio is below 99.85% https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health
[05:02:22] RECOVERY - PHP opcache health on mw1407 is OK: OK: opcache is healthy https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health
[07:12:44] PROBLEM - Disk space on an-coord1001 is CRITICAL: DISK CRITICAL - free space: / 1766 MB (3% inode=86%): /tmp 1766 MB (3% inode=86%): /var/tmp 1766 MB (3% inode=86%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=an-coord1001&var-datasource=eqiad+prometheus/ops
[07:59:20] good morning
[07:59:26] checking an-coord sigh :(
[08:00:45] morning elukey
[08:15:20] RECOVERY - Disk space on an-coord1001 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=an-coord1001&var-datasource=eqiad+prometheus/ops
[08:30:37] (PS1) VulpesVulpes825: wmf-config/: Adjust MT threshold for Chinese Wikipedia to 70% [mediawiki-config] - https://gerrit.wikimedia.org/r/592479 (https://phabricator.wikimedia.org/T246383)
[08:37:21] (PS2) VulpesVulpes825: wmf-config/: Adjust MT threshold for Chinese Wikipedia to 70% [mediawiki-config] - https://gerrit.wikimedia.org/r/592479 (https://phabricator.wikimedia.org/T246383)
[08:39:13] (CR) Elukey: "@Busecolak thanks a lot for your review, I'll follow up asap with your suggestions and I'll send a new patch" (7 comments) [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)
[08:48:37] (CR) VulpesVulpes825: [C: +1] wmf-config/: Adjust MT threshold for Chinese Wikipedia to 70% [mediawiki-config] - https://gerrit.wikimedia.org/r/592479 (https://phabricator.wikimedia.org/T246383) (owner: VulpesVulpes825)
[09:14:10] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3058 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[09:21:32] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3058 is OK: HTTP OK: HTTP/1.0 200 OK - 22730 bytes in 0.256 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[09:53:06] (CR) Elukey: Refactor the exporter to support metrics specs via config file (5 comments) [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)
[11:58:17] Hi ! For the french Wikipedia, we have a problem with the visual editor : when we add a galer,: for some people all the links added in references end up being only the first repeated for every instance, and for others saving the modifications doesn't even apply.
[11:58:48] an art gallery*
[11:59:34] Thus, it can only be corrected with the classic editor
[12:00:15] Is it the same for every chapters ?
[12:02:18] (PS3) Elukey: Refactor the exporter to support metrics specs via config file [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261
[12:05:22] (CR) Elukey: "@Busecolak changes in the new version:" [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)
[12:10:40] (CR) Busecolak: [C: -1] Refactor the exporter to support metrics specs via config file (3 comments) [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)
[12:31:27] (CR) Busecolak: Refactor the exporter to support metrics specs via config file (2 comments) [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)
[12:59:15] (CR) Elukey: Refactor the exporter to support metrics specs via config file (2 comments) [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)
[13:00:35] (PS4) Elukey: Refactor the exporter to support metrics specs via config file [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261
[13:29:52] (CR) Busecolak: [C: +1] "@Elukey thank you for your effort! I briefly tested your patch, it seem good to me." [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)
[14:59:14] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3058 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[15:10:20] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3058 is OK: HTTP OK: HTTP/1.0 200 OK - 22726 bytes in 0.272 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[16:13:51] Puppet, Wikimedia Meet: Puppetize the jitsi instance - https://phabricator.wikimedia.org/T251040 (Majavah)
[18:02:08] PROBLEM - Hadoop NodeManager on analytics1054 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[18:03:30] PROBLEM - Host puppetmaster1001 is DOWN: PING CRITICAL - Packet loss = 100%
[18:03:36] wow
[18:05:52] RECOVERY - Hadoop NodeManager on analytics1054 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[18:06:13] mgmt serial is frozen for puppetmaster1001
[18:06:43] nothing from getsel
[18:07:52] PROBLEM - Widespread puppet agent failures- no resources reported on icinga1001 is CRITICAL: 0.02164 ge 0.01 https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/yOxVDGvWk/puppet
[18:08:16] !log powercycle puppetmaster1001 - mgmt serial console not usable, no ssh, racadm getsel doesn't show anything
[18:08:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:09:32] RECOVERY - Host puppetmaster1001 is UP: PING WARNING - Packet loss = 80%, RTA = 0.26 ms
[18:10:23] welcome back
[18:43:28] RECOVERY - Widespread puppet agent failures- no resources reported on icinga1001 is OK: (C)0.01 ge (W)0.006 ge 0.005725 https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/yOxVDGvWk/puppet
[20:06:10] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3058 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[20:13:34] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3058 is OK: HTTP OK: HTTP/1.0 200 OK - 22738 bytes in 0.257 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[20:39:52] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3058 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[20:47:08] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3058 is OK: HTTP OK: HTTP/1.0 200 OK - 22749 bytes in 0.256 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[21:41:16] PROBLEM - Host mw1280 is DOWN: PING CRITICAL - Packet loss = 100%
[22:14:52] PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 85, down: 1, dormant: 0, excluded: 2, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[22:15:50] PROBLEM - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 240, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[23:27:22] RECOVERY - Check whether microcode mitigations for CPU vulnerabilities are applied on puppetmaster1001 is OK: OK - All expected CPU flags found https://wikitech.wikimedia.org/wiki/Microcode
[23:44:54] ACKNOWLEDGEMENT - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 240, down: 1, dormant: 0, excluded: 0, unused: 0: CDanis Scheduled Maintenance #: 18531336 - The acknowledgement expires at: 2020-04-27 04:44:30. https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[23:44:54] ACKNOWLEDGEMENT - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 85, down: 1, dormant: 0, excluded: 2, unused: 0: CDanis Scheduled Maintenance #: 18531336 - The acknowledgement expires at: 2020-04-27 04:44:30. https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[23:49:15] (PS1) Zoranzoki21: Update logos for for tiwiki and tiwikt (part I) [mediawiki-config] - https://gerrit.wikimedia.org/r/592506 (https://phabricator.wikimedia.org/T249451)
[23:53:21] (PS1) Zoranzoki21: Update logos for for tiwiki and tiwikt (part II) [mediawiki-config] - https://gerrit.wikimedia.org/r/592507 (https://phabricator.wikimedia.org/T249451)
[23:54:23] (CR) jerkins-bot: [V: -1] Update logos for for tiwiki and tiwikt (part II) [mediawiki-config] - https://gerrit.wikimedia.org/r/592507 (https://phabricator.wikimedia.org/T249451) (owner: Zoranzoki21)
[23:55:17] (PS2) Zoranzoki21: Update logos for for tiwiki and tiwikt (part I) [mediawiki-config] - https://gerrit.wikimedia.org/r/592506 (https://phabricator.wikimedia.org/T150618)
[23:55:25] (PS2) Zoranzoki21: Update logos for for tiwiki and tiwikt (part II) [mediawiki-config] - https://gerrit.wikimedia.org/r/592507 (https://phabricator.wikimedia.org/T249451)
[23:55:35] (PS3) Zoranzoki21: Update logos for for tiwiki and tiwikt (part II) [mediawiki-config] - https://gerrit.wikimedia.org/r/592507 (https://phabricator.wikimedia.org/T150618)
[23:55:47] (PS4) Zoranzoki21: Update logos for for tiwiki and tiwikt (part II) [mediawiki-config] - https://gerrit.wikimedia.org/r/592507 (https://phabricator.wikimedia.org/T150618)
[23:56:57] (CR) jerkins-bot: [V: -1] Update logos for for tiwiki and tiwikt (part II) [mediawiki-config] - https://gerrit.wikimedia.org/r/592507 (https://phabricator.wikimedia.org/T150618) (owner: Zoranzoki21)
[23:58:46] (PS5) Zoranzoki21: Update logos for for tiwiki and tiwikt (part II) [mediawiki-config] - https://gerrit.wikimedia.org/r/592507 (https://phabricator.wikimedia.org/T150618)
[23:59:29] (PS6) Zoranzoki21: Update logos for for tiwiki and tiwikt (part II) [mediawiki-config] - https://gerrit.wikimedia.org/r/592507 (https://phabricator.wikimedia.org/T150618)