[00:40:02] 06Operations, 10Phabricator: Set up Yubikey support in Phabricator - https://phabricator.wikimedia.org/T134672#2272805 (10mmodell)
[01:10:24] 06Operations, 10Phabricator: Set up Yubikey support in Phabricator - https://phabricator.wikimedia.org/T134672#2272828 (10Krenair) Source makes several references to YubiCloud...
[01:34:21] 06Operations, 10Traffic, 10Wikidata: Varnish seems to sometimes mangle uncompressed API results - https://phabricator.wikimedia.org/T133866#2272839 (10MZMcBride) Copying @Anomie's comment from T132159: > While investigating T132123, I discovered that responses with lengths near to multiples of 32768 bytes w...
[01:39:37] 10Ops-Access-Reviews, 05acl*operations-team: ops access request (T123158) - https://phabricator.wikimedia.org/T123159#2272843 (10ori)
[02:25:34] !log mwdeploy@tin sync-l10n completed (1.27.0-wmf.22) (duration: 09m 52s)
[02:25:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:43:00] !log mwdeploy@tin sync-l10n completed (1.27.0-wmf.23) (duration: 08m 19s)
[02:43:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:52:46] !log l10nupdate@tin ResourceLoader cache refresh completed at Sun May 8 02:52:46 UTC 2016 (duration 9m 47s)
[02:52:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[03:35:46] PROBLEM - puppet last run on db2051 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:37:26] PROBLEM - puppet last run on mw2054 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:38:56] PROBLEM - puppet last run on furud is CRITICAL: CRITICAL: Puppet has 1 failures
[04:02:36] RECOVERY - puppet last run on mw2054 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:02:46] RECOVERY - puppet last run on db2051 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:06:07] RECOVERY - puppet last run on furud is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:54:17] PROBLEM - puppet last run on cp3031 is CRITICAL: CRITICAL: puppet fail
[05:16:38] PROBLEM - puppet last run on mw1221 is CRITICAL: CRITICAL: Puppet has 1 failures
[05:21:18] RECOVERY - puppet last run on cp3031 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[05:31:58] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0]
[05:35:04] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0]
[05:38:44] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[05:41:54] RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[05:42:23] RECOVERY - puppet last run on mw1221 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:07:44] !log restarting elasticsearch server elastic1015.eqiad.wmnet (T110236)
[06:07:45] T110236: Use unicast instead of multicast for node communication - https://phabricator.wikimedia.org/T110236
[06:07:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[06:30:46] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 679 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5973297 keys - replication_delay is 679
[06:30:47] PROBLEM - puppet last run on mw2036 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:06] PROBLEM - puppet last run on mw2129 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:16] PROBLEM - puppet last run on mw2023 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:57] PROBLEM - puppet last run on mw2158 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:58] PROBLEM - puppet last run on mw1110 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:17] PROBLEM - puppet last run on mw2126 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:56] PROBLEM - puppet last run on mw2073 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:38:27] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5944375 keys - replication_delay is 0
[06:55:57] RECOVERY - puppet last run on mw2036 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[06:56:18] RECOVERY - puppet last run on mw2023 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[06:56:58] RECOVERY - puppet last run on mw2158 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[06:57:06] RECOVERY - puppet last run on mw1110 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[06:57:26] RECOVERY - puppet last run on mw2126 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[06:58:06] RECOVERY - puppet last run on mw2073 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:07] RECOVERY - puppet last run on mw2129 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:59:39] PROBLEM - tools-home on tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Not Available - 530 bytes in 0.034 second response time
[07:07:33] RECOVERY - tools-home on tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 825023 bytes in 3.707 second response time
[07:10:38] PROBLEM - puppet last run on mw2059 is CRITICAL: CRITICAL: puppet fail
[07:20:58] PROBLEM - puppet last run on db2047 is CRITICAL: CRITICAL: puppet fail
[07:37:55] RECOVERY - puppet last run on mw2059 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[07:44:16] !log restarting elasticsearch server elastic1016.eqiad.wmnet (T110236)
[07:44:17] T110236: Use unicast instead of multicast for node communication - https://phabricator.wikimedia.org/T110236
[07:44:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[07:47:55] RECOVERY - puppet last run on db2047 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[08:07:07] PROBLEM - Apache HTTP on mw1251 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 50404 bytes in 0.009 second response time
[08:08:58] RECOVERY - Apache HTTP on mw1251 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 626 bytes in 0.029 second response time
[08:30:08] PROBLEM - Disk space on pollux is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:30:27] PROBLEM - DPKG on pollux is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:30:38] PROBLEM - salt-minion processes on pollux is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:31:18] PROBLEM - configured eth on pollux is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:31:18] PROBLEM - RAID on pollux is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:31:28] PROBLEM - dhclient process on pollux is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:31:48] PROBLEM - Check size of conntrack table on pollux is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:31:48] PROBLEM - puppet last run on pollux is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:32:12] PROBLEM - Corp OIT LDAP Mirror on pollux is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:33:48] PROBLEM - wikidata.org dispatch lag is higher than 300s on wikidata is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1695 bytes in 0.222 second response time
[08:34:38] RECOVERY - salt-minion processes on pollux is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[08:35:17] RECOVERY - configured eth on pollux is OK: OK - interfaces up
[08:35:17] RECOVERY - RAID on pollux is OK: OK: no RAID installed
[08:35:27] RECOVERY - dhclient process on pollux is OK: PROCS OK: 0 processes with command name dhclient
[08:35:38] RECOVERY - Check size of conntrack table on pollux is OK: OK: nf_conntrack is 0 % full
[08:35:47] RECOVERY - puppet last run on pollux is OK: OK: Puppet is currently enabled, last run 28 minutes ago with 0 failures
[08:36:03] RECOVERY - Corp OIT LDAP Mirror on pollux is OK: LDAP OK - 0.117 seconds response time
[08:36:03] RECOVERY - Disk space on pollux is OK: DISK OK
[08:36:18] RECOVERY - DPKG on pollux is OK: All packages OK
[08:39:47] RECOVERY - wikidata.org dispatch lag is higher than 300s on wikidata is OK: HTTP OK: HTTP/1.1 200 OK - 1683 bytes in 0.200 second response time
[08:42:07] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 10.192.48.44 on port 6479
[08:43:58] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5950584 keys - replication_delay is 0
[08:46:03] 06Operations, 10Mail, 10MediaWiki-Email: Wiki-Mail sent but never delivered - https://phabricator.wikimedia.org/T134674#2272981 (10MarcoAurelio) @01tonythomas Thank you again. My account is not live or yahoo. I'll wait for instructions from #operations then.
[09:59:57] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0]
[10:09:47] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[12:52:03] PROBLEM - Disk space on elastic1016 is CRITICAL: DISK CRITICAL - free space: /var/lib/elasticsearch 80426 MB (15% inode=99%)
[13:00:04] PROBLEM - Apache HTTP on mw1017 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:04:18] RECOVERY - Apache HTTP on mw1017 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 628 bytes in 1.714 second response time
[13:04:28] PROBLEM - HHVM rendering on mw1017 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:08:17] PROBLEM - Apache HTTP on mw1017 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:10:07] RECOVERY - HHVM rendering on mw1017 is OK: HTTP OK: HTTP/1.1 200 OK - 67988 bytes in 0.483 second response time
[13:10:07] RECOVERY - Apache HTTP on mw1017 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 628 bytes in 2.497 second response time
[13:12:38] PROBLEM - puppet last run on mw2029 is CRITICAL: CRITICAL: puppet fail
[13:17:53] (03PS1) 10BBlack: debug_proxy: allow all WMF networks [puppet] - 10https://gerrit.wikimedia.org/r/287440
[13:18:05] (03CR) 10BBlack: [C: 032 V: 032] debug_proxy: allow all WMF networks [puppet] - 10https://gerrit.wikimedia.org/r/287440 (owner: 10BBlack)
[13:40:09] RECOVERY - puppet last run on mw2029 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[14:09:41] valhallasw`cloud: around?
[14:11:56] Danny_B: yes
[14:11:56] 06Operations, 06Commons, 10MediaWiki-File-management, 06Multimedia, 07Regression: image magick stripping colour profile of PNG files [probably regression] - https://phabricator.wikimedia.org/T113123#2273285 (10Danny_B)
[14:12:24] valhallasw`cloud: i was impatient... ;-) but see query anyway, pls...
[14:42:10] PROBLEM - puppet last run on elastic1022 is CRITICAL: CRITICAL: puppet fail
[15:07:11] RECOVERY - puppet last run on elastic1022 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[15:39:10] (03PS1) 10Yuvipanda: k8s: Verify pause container is correct on each build [puppet] - 10https://gerrit.wikimedia.org/r/287444
[15:39:37] (03PS2) 10Yuvipanda: k8s: Verify pause container is correct on each build [puppet] - 10https://gerrit.wikimedia.org/r/287444
[15:41:32] (03CR) 10Yuvipanda: [C: 032] k8s: Verify pause container is correct on each build [puppet] - 10https://gerrit.wikimedia.org/r/287444 (owner: 10Yuvipanda)
[15:44:22] aaaa fucking stupid 'y' bug in bash. may bash 'scripts' in general fucking rot in hell
[15:44:51] (different hell than the one I'm going to)
[15:46:07] maybe they're already rotting in hell, and I'm in hell, which is why I'm encountering them.
[15:54:15] (03PS1) 10Yuvipanda: k8s: Fixup to previous commit [puppet] - 10https://gerrit.wikimedia.org/r/287445
[15:56:04] (03PS2) 10Yuvipanda: k8s: Fixup to previous commit [puppet] - 10https://gerrit.wikimedia.org/r/287445
[16:01:58] (03CR) 10Yuvipanda: [C: 032] k8s: Fixup to previous commit [puppet] - 10https://gerrit.wikimedia.org/r/287445 (owner: 10Yuvipanda)
[17:48:54] https://en.wikipedia.org/w/index.php?title=MediaWiki:Wdsearch-autodesc.js&action=raw&ctype=text/javascript doesn't seem to load
[17:49:42] i can't access whole enwiki
[17:51:57] hm, weird browser hiccup
[18:05:15] PROBLEM - aqs endpoints health on aqs1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[18:09:54] PROBLEM - aqs endpoints health on aqs1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[18:10:55] RECOVERY - aqs endpoints health on aqs1001 is OK: All endpoints are healthy
[18:11:36] RECOVERY - aqs endpoints health on aqs1003 is OK: All endpoints are healthy
[18:58:30] (03PS1) 10Yuvipanda: k8s: Allow pod infra image to be overriden with hiera [puppet] - 10https://gerrit.wikimedia.org/r/287501 (https://phabricator.wikimedia.org/T133873)
[18:58:51] (03CR) 10jenkins-bot: [V: 04-1] k8s: Allow pod infra image to be overriden with hiera [puppet] - 10https://gerrit.wikimedia.org/r/287501 (https://phabricator.wikimedia.org/T133873) (owner: 10Yuvipanda)
[19:01:35] (03PS2) 10Yuvipanda: k8s: Allow pod infra image to be overriden with hiera [puppet] - 10https://gerrit.wikimedia.org/r/287501 (https://phabricator.wikimedia.org/T133873)
[19:09:44] (03CR) 10Yuvipanda: [C: 032] k8s: Allow pod infra image to be overriden with hiera [puppet] - 10https://gerrit.wikimedia.org/r/287501 (https://phabricator.wikimedia.org/T133873) (owner: 10Yuvipanda)
[19:10:32] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 711 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5997619 keys - replication_delay is 711
[19:20:01] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5971742 keys - replication_delay is 0
[19:39:23] Stupid 8 lagged databases making v.wp in read mode only once every 15 minutes... gah...
[19:39:27] sv*
[19:42:04] svwiki is on s2
[19:42:45] I don't see any significant lag
[19:43:34] AWB gets stuck once every 15 min. Sometimes for "slave database cathing up wit master" "Sv.wp is in read only mode" and 8 lagged databases.." something-something...
[19:43:37] a bit annoying...
[19:43:47] with*
[19:43:49] 8 lagged databases?
[19:43:57] That's what it said.
[19:43:59] next time it happens, give me the exact error message
[19:44:10] Sure thing
[20:09:42] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 643 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5978193 keys - replication_delay is 643
[20:16:11] PROBLEM - puppet last run on dbstore1001 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:34:45] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5975869 keys - replication_delay is 0
[20:42:24] RECOVERY - puppet last run on dbstore1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[21:59:39] PROBLEM - Host pc2006 is DOWN: PING CRITICAL - Packet loss = 100%
[21:59:50] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 626 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5987493 keys - replication_delay is 626
[22:17:20] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5981316 keys - replication_delay is 0
[22:26:48] Hi, it seems that the grrrit-wm bot has quit IRC.
[22:26:48] * grrrit-wm has quit (Remote host closed the connection)
[22:26:53] and hasn't rejoined.