[01:02:10] <icinga-wm_>	 PROBLEM - Check health of redis instance on 6381 on rdb2003 is CRITICAL: CRITICAL: replication_delay is 1499562124 600 - REDIS 2.8.17 on 127.0.0.1:6381 has 1 databases (db0) with 9229041 keys, up 2 minutes 2 seconds - replication_delay is 1499562124
[01:02:20] <icinga-wm_>	 PROBLEM - Check health of redis instance on 6379 on rdb2003 is CRITICAL: CRITICAL: replication_delay is 1499562137 600 - REDIS 2.8.17 on 127.0.0.1:6379 has 1 databases (db0) with 9322319 keys, up 2 minutes 15 seconds - replication_delay is 1499562137
[01:03:00] <icinga-wm_>	 PROBLEM - Check health of redis instance on 6380 on rdb2003 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6380
[01:03:10] <icinga-wm_>	 PROBLEM - Check health of redis instance on 6379 on rdb2001 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6379
[01:03:20] <icinga-wm_>	 RECOVERY - Check health of redis instance on 6379 on rdb2003 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6379 has 1 databases (db0) with 9318435 keys, up 3 minutes 16 seconds - replication_delay is 0
[01:04:00] <icinga-wm_>	 RECOVERY - Check health of redis instance on 6380 on rdb2003 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6380 has 1 databases (db0) with 9318649 keys, up 3 minutes 53 seconds - replication_delay is 0
[01:04:10] <icinga-wm_>	 RECOVERY - Check health of redis instance on 6379 on rdb2001 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6379 has 1 databases (db0) with 9321480 keys, up 3 minutes 59 seconds - replication_delay is 0
[01:04:10] <icinga-wm_>	 RECOVERY - Check health of redis instance on 6381 on rdb2003 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6381 has 1 databases (db0) with 9224469 keys, up 4 minutes 5 seconds - replication_delay is 0
[01:07:00] <icinga-wm_>	 PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[01:08:00] <icinga-wm_>	 PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[01:09:20] <icinga-wm_>	 PROBLEM - Codfw HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[01:10:20] <icinga-wm_>	 PROBLEM - Upload HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[01:14:00] <icinga-wm_>	 RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[01:15:20] <icinga-wm_>	 RECOVERY - Codfw HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[01:15:20] <icinga-wm_>	 RECOVERY - Upload HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[01:16:00] <icinga-wm_>	 RECOVERY - Ulsfo HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[03:33:00] <icinga-wm_>	 PROBLEM - puppet last run on mw2222 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIP2-City.mmdb.gz]
[03:33:50] <icinga-wm_>	 PROBLEM - puppet last run on mw2135 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 2 minutes ago with 2 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIP2-City.mmdb.test],File[/usr/share/GeoIP/GeoIP2-City.mmdb.gz]
[04:00:30] <icinga-wm_>	 RECOVERY - puppet last run on mw2222 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[04:01:00] <icinga-wm_>	 RECOVERY - puppet last run on mw2135 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[04:08:40] <icinga-wm_>	 PROBLEM - mailman I/O stats on fermium is CRITICAL: CRITICAL - I/O stats: Transfers/Sec=449.50 Read Requests/Sec=3833.00 Write Requests/Sec=0.20 KBytes Read/Sec=35971.20 KBytes_Written/Sec=12.00
[04:15:40] <icinga-wm_>	 RECOVERY - mailman I/O stats on fermium is OK: OK - I/O stats: Transfers/Sec=55.30 Read Requests/Sec=4.40 Write Requests/Sec=37.80 KBytes Read/Sec=17.60 KBytes_Written/Sec=5955.60
[05:34:40] <icinga-wm_>	 PROBLEM - nova-compute process on labvirt1013 is CRITICAL: PROCS CRITICAL: 2 processes with regex args ^/usr/bin/python /usr/bin/nova-compute
[05:35:40] <icinga-wm_>	 RECOVERY - nova-compute process on labvirt1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/nova-compute
[05:46:30] <wikibugs>	 Operations, DBA, MediaWiki-extensions-WikibaseClient, Performance-Team, and 5 others: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3418483 (Ladsgroup) a:Ladsgroup
[05:50:22] <wikibugs>	 Operations, DBA, MediaWiki-extensions-WikibaseClient, Performance-Team, and 6 others: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3418489 (Ladsgroup) >>! In T164173#3413805, @Krinkle wrote: >   * PageUpdater::purgeParse...
[06:48:30] <icinga-wm_>	 PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0]
[06:50:30] <icinga-wm_>	 PROBLEM - puppet last run on sca2004 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[scap]
[06:50:30] <icinga-wm_>	 PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0]
[06:55:30] <icinga-wm_>	 RECOVERY - Ulsfo HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[06:56:30] <icinga-wm_>	 RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[07:16:50] <icinga-wm_>	 RECOVERY - puppet last run on sca2004 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[07:37:00] <wikibugs>	 (PS2) Smalyshev: Index deletes everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/363669 (https://phabricator.wikimedia.org/T163235)
[07:47:00] <icinga-wm_>	 PROBLEM - cassandra-a SSL 10.192.16.176:7001 on restbase2007 is CRITICAL: SSL CRITICAL - failed to connect or SSL handshake:Connection refused
[07:47:00] <icinga-wm_>	 PROBLEM - cassandra-a CQL 10.192.16.176:9042 on restbase2007 is CRITICAL: connect to address 10.192.16.176 and port 9042: Connection refused
[07:48:50] <icinga-wm_>	 PROBLEM - cassandra-a service on restbase2007 is CRITICAL: CRITICAL - Expecting active but unit cassandra-a is failed
[07:48:50] <icinga-wm_>	 PROBLEM - Check systemd state on restbase2007 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[08:01:50] <icinga-wm_>	 RECOVERY - Check systemd state on restbase2007 is OK: OK - running: The system is fully operational
[08:02:50] <icinga-wm_>	 RECOVERY - cassandra-a service on restbase2007 is OK: OK - cassandra-a is active
[08:04:10] <icinga-wm_>	 RECOVERY - cassandra-a SSL 10.192.16.176:7001 on restbase2007 is OK: SSL OK - Certificate restbase2007-a valid until 2017-09-12 15:35:50 +0000 (expires in 65 days)
[08:05:00] <icinga-wm_>	 RECOVERY - cassandra-a CQL 10.192.16.176:9042 on restbase2007 is OK: TCP OK - 0.036 second response time on 10.192.16.176 port 9042
[10:17:30] <icinga-wm_>	 PROBLEM - cassandra-c SSL 10.192.48.70:7001 on restbase2012 is CRITICAL: SSL CRITICAL - failed to connect or SSL handshake:Connection refused
[10:18:00] <icinga-wm_>	 PROBLEM - cassandra-c CQL 10.192.48.70:9042 on restbase2012 is CRITICAL: connect to address 10.192.48.70 and port 9042: Connection refused
[10:18:11] <icinga-wm_>	 PROBLEM - Check systemd state on restbase2012 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[10:18:20] <icinga-wm_>	 PROBLEM - cassandra-c service on restbase2012 is CRITICAL: CRITICAL - Expecting active but unit cassandra-c is failed
[10:24:20] <icinga-wm_>	 RECOVERY - Check systemd state on restbase2012 is OK: OK - running: The system is fully operational
[10:24:30] <icinga-wm_>	 RECOVERY - cassandra-c service on restbase2012 is OK: OK - cassandra-c is active
[10:24:50] <icinga-wm_>	 RECOVERY - cassandra-c SSL 10.192.48.70:7001 on restbase2012 is OK: SSL OK - Certificate restbase2012-c valid until 2017-11-17 00:54:34 +0000 (expires in 130 days)
[10:25:00] <icinga-wm_>	 RECOVERY - cassandra-c CQL 10.192.48.70:9042 on restbase2012 is OK: TCP OK - 0.036 second response time on 10.192.48.70 port 9042
[10:51:40] <wikibugs>	 (CR) Ladsgroup: "Daniel: It includes these rules in line 95 of https://gerrit.wikimedia.org/r/#/c/361801/2/modules/mediawiki/files/apache/sites/main.conf :" [puppet] - https://gerrit.wikimedia.org/r/357985 (https://phabricator.wikimedia.org/T119536) (owner: Ladsgroup)
[12:40:35] <wikibugs>	 (PS1) Elukey: Remove ladsgroup from production access [puppet] - https://gerrit.wikimedia.org/r/364102
[12:42:37] <wikibugs>	 (CR) Elukey: [C: 2] Remove ladsgroup from production access [puppet] - https://gerrit.wikimedia.org/r/364102 (owner: Elukey)
[13:25:37] <wikibugs>	 (PS1) Elukey: Set ladsgroup as absented user [puppet] - https://gerrit.wikimedia.org/r/364104
[13:27:31] <wikibugs>	 (CR) Elukey: [C: 2] Set ladsgroup as absented user [puppet] - https://gerrit.wikimedia.org/r/364104 (owner: Elukey)
[15:56:39] <wikibugs>	 (PS1) Framawiki: Set $wgUploadNavigationUrl for fr.wikt [mediawiki-config] - https://gerrit.wikimedia.org/r/364115 (https://phabricator.wikimedia.org/T170083)
[16:30:20] <icinga-wm_>	 PROBLEM - Apache HTTP on mw2235 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:31:10] <icinga-wm_>	 RECOVERY - Apache HTTP on mw2235 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 612 bytes in 0.110 second response time
[16:35:20] <wikibugs>	 Operations, vm-requests, Patch-For-Review: Site: 2 VM request for tendril (switch tendril from einsteinium to dbmonitor*) - https://phabricator.wikimedia.org/T149557#3418902 (Dzahn) can be closed as resolved now?
[16:44:11] <wikibugs>	 (Abandoned) Framawiki: Set $wgUploadNavigationUrl for fr.wikt [mediawiki-config] - https://gerrit.wikimedia.org/r/364115 (https://phabricator.wikimedia.org/T170083) (owner: Framawiki)
[17:24:30] <icinga-wm_>	 PROBLEM - HHVM rendering on mw2199 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:25:20] <icinga-wm_>	 RECOVERY - HHVM rendering on mw2199 is OK: HTTP OK: HTTP/1.1 200 OK - 74898 bytes in 0.341 second response time
[18:08:12] <wikibugs>	 (PS1) Framawiki: Set $wgUploadNavigationUrl for few wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/364121 (https://phabricator.wikimedia.org/T170083)
[20:42:14] <wikibugs>	 Operations, Wikimedia-General-or-Unknown: Icinga has httpauth on (not accessible for public) - https://phabricator.wikimedia.org/T62112#661810 (Luke081515) Did something changed here in over two years since icinga is login-only?
[21:32:24] <wikibugs>	 Operations, Wikimedia-Language-setup, Wikimedia-Site-requests, Patch-For-Review: Create Dinka Wikipedia - https://phabricator.wikimedia.org/T168518#3419139 (Urbanecm) Hello, I don't think so. There is no blocked AFAIK. @dereckson, do you know what is next step? I think it is reserving an window a...
[21:35:58] <wikibugs>	 Operations, Wikimedia-Language-setup, Wikimedia-Site-requests, Patch-For-Review: Create Dinka Wikipedia - https://phabricator.wikimedia.org/T168518#3419141 (Dereckson) Thanks @Urbanecm for the update, I haven't seen this ticket yet.  >>! In T168518#3413280, @Amire80 wrote: > Hi, >  > It seems to...
[21:44:58] <wikibugs>	 Operations, Wikimedia-Language-setup, Wikimedia-Site-requests, Patch-For-Review, User-Urbanecm: Create Dinka Wikipedia - https://phabricator.wikimedia.org/T168518#3419153 (Urbanecm)
[22:05:14] <wikibugs>	 Operations, Wikimedia-Language-setup, Wikimedia-Site-requests, Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3419162 (Dereckson) Open>stalled
[22:16:40] <wikibugs>	 Operations, Wikimedia-Language-setup, Wikimedia-Site-requests, Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3419180 (Dereckson) Discussions occurred [[ https://lists.wikimedia.org/pipermail/langcom/2017-June/thread.html | end June ]] and [[ https://lists.wikim...
[22:27:03] <wikibugs>	 (PS1) Urbanecm: Add import sources for specieswiki [mediawiki-config] - https://gerrit.wikimedia.org/r/364131 (https://phabricator.wikimedia.org/T170094)
[22:35:55] <wikibugs>	 Operations, Wikimedia-Language-setup, Wikimedia-Site-requests, Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3419191 (Koavf) How can I ensure that the Committee sees my concerns? Posting to Meta, here, the mailing list?
[22:52:09] <wikibugs>	 Operations, Wikimedia-Language-setup, Wikimedia-Site-requests, Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3376126 (Urbanecm) >>! In T168765#3419191, @Koavf wrote: > How can I ensure that the Committee sees my concerns? Posting to Meta, here, the mailing list...