[00:04:23] <icinga-wm>	 PROBLEM - HHVM rendering on mw2219 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:05:22] <icinga-wm>	 RECOVERY - HHVM rendering on mw2219 is OK: HTTP OK: HTTP/1.1 200 OK - 74752 bytes in 1.535 second response time
[00:17:48] <wikibugs>	 Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3943836 (Cameron11598) Sent!
[00:30:22] <icinga-wm>	 PROBLEM - HHVM rendering on mw2220 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:31:12] <icinga-wm>	 RECOVERY - HHVM rendering on mw2220 is OK: HTTP OK: HTTP/1.1 200 OK - 74750 bytes in 0.301 second response time
[00:48:07] <wikibugs>	 Operations, Cloud-Services, netops: Intermittent bandwidth issue to labs proxy (eqiad) from Comcast in Portland OR - https://phabricator.wikimedia.org/T136671#3943871 (brion) Resolved>Open I'm encountering this problem again; the routes seem to have changed but symptoms are similar -- I see a...
[00:54:48] * brion blames comcast, but it might be telia :D
[01:48:23] <icinga-wm>	 PROBLEM - HHVM rendering on mw2223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:49:22] <icinga-wm>	 RECOVERY - HHVM rendering on mw2223 is OK: HTTP OK: HTTP/1.1 200 OK - 74493 bytes in 0.298 second response time
[03:26:22] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 870.19 seconds
[03:58:23] <icinga-wm>	 RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 47.14 seconds
[04:25:43] <icinga-wm>	 PROBLEM - Check systemd state on conf2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[04:26:02] <icinga-wm>	 PROBLEM - etcdmirror-conftool-eqiad-wmnet service on conf2002 is CRITICAL: CRITICAL - Expecting active but unit etcdmirror-conftool-eqiad-wmnet is failed
[04:26:15] <icinga-wm>	 PROBLEM - Etcd replication lag on conf2002 is CRITICAL: connect to address 10.192.32.141 and port 8000: Connection refused
[04:28:21] <_joe_>	  here I am
[04:28:25] <_joe_>	 loads of fun
[04:29:12] <_joe_>	 is anyone else getting paged?
[04:33:12] <icinga-wm>	 RECOVERY - etcdmirror-conftool-eqiad-wmnet service on conf2002 is OK: OK - etcdmirror-conftool-eqiad-wmnet is active
[04:33:15] <icinga-wm>	 RECOVERY - Etcd replication lag on conf2002 is OK: HTTP OK: HTTP/1.1 200 OK - 148 bytes in 0.073 second response time
[04:33:31] <_joe_>	 !log restarted etcdmirror on conf2002, failure caused by raid resyncs in codfw
[04:33:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:33:52] <icinga-wm>	 RECOVERY - Check systemd state on conf2002 is OK: OK - running: The system is fully operational
[04:35:00] <apergos>	 I did get the page
[05:42:22] <icinga-wm>	 PROBLEM - MegaRAID on analytics1038 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough
[06:02:22] <icinga-wm>	 RECOVERY - MegaRAID on analytics1038 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy
[06:02:23] <icinga-wm>	 PROBLEM - Check systemd state on conf2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[06:02:36] <icinga-wm>	 PROBLEM - Etcd replication lag on conf2002 is CRITICAL: connect to address 10.192.32.141 and port 8000: Connection refused
[06:02:52] <icinga-wm>	 PROBLEM - etcdmirror-conftool-eqiad-wmnet service on conf2002 is CRITICAL: CRITICAL - Expecting active but unit etcdmirror-conftool-eqiad-wmnet is failed
[06:04:23] <icinga-wm>	 RECOVERY - Check systemd state on conf2002 is OK: OK - running: The system is fully operational
[06:04:45] <icinga-wm>	 RECOVERY - Etcd replication lag on conf2002 is OK: HTTP OK: HTTP/1.1 200 OK - 148 bytes in 0.076 second response time
[06:04:52] <icinga-wm>	 RECOVERY - etcdmirror-conftool-eqiad-wmnet service on conf2002 is OK: OK - etcdmirror-conftool-eqiad-wmnet is active
[06:18:55] <_joe_>	 !log reduced raid resync speed on conf2* to 5000 KB/s
[06:19:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:56:10] <wikibugs>	 Operations, Developer-Relations, Discourse: Bring discourse.mediawiki.org to production - https://phabricator.wikimedia.org/T180853#3944015 (Tgr) Probably should get Bitergia integration by the time of production deployment.
[07:12:22] <icinga-wm>	 PROBLEM - MegaRAID on analytics1038 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough
[08:14:32] <wikibugs>	 (PS3) Zoranzoki21: Change namespaces on urwiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/407901 (https://phabricator.wikimedia.org/T186393)
[08:21:58] <wikibugs>	 (PS4) Zoranzoki21: Change namespaces on urwiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/407901 (https://phabricator.wikimedia.org/T186393)
[09:12:22] <icinga-wm>	 RECOVERY - MegaRAID on analytics1038 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy
[09:32:12] <wikibugs>	 Operations, Developer-Relations, Discourse: Bring discourse.mediawiki.org to production - https://phabricator.wikimedia.org/T180853#3944108 (Tgr) Will need monitoring as well. There is an [[https://meta.discourse.org/t/prometheus-exporter-plugin-for-discourse/72666|official Prometheus exporter]] whic...
[09:58:33] <icinga-wm>	 PROBLEM - HHVM rendering on mw1262 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.007 second response time
[09:59:19] <Zoranzoki21>	 ahh how is gerrit now so fast
[09:59:32] <icinga-wm>	 RECOVERY - HHVM rendering on mw1262 is OK: HTTP OK: HTTP/1.1 200 OK - 74554 bytes in 0.092 second response time
[10:05:07] <elukey>	 added downtime for an1038, we'll try to swap the bbu this week
[10:10:07] <Zoranzoki21>	 I have a question
[10:10:19] <Zoranzoki21>	 Which tests does mw-testskin run?
[10:14:38] <wikibugs>	 (PS1) Amire80: Add sitename for sdwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/408032 (https://phabricator.wikimedia.org/T184521)
[10:15:07] <wikibugs>	 (CR) Zoranzoki21: [C: 1] Add sitename for sdwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/408032 (https://phabricator.wikimedia.org/T184521) (owner: Amire80)
[11:24:32] <icinga-wm>	 PROBLEM - Apache HTTP on mw2206 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:24:42] <icinga-wm>	 PROBLEM - etcdmirror-conftool-eqiad-wmnet service on conf2002 is CRITICAL: CRITICAL - Expecting active but unit etcdmirror-conftool-eqiad-wmnet is failed
[11:24:48] <icinga-wm>	 PROBLEM - Etcd replication lag on conf2002 is CRITICAL: connect to address 10.192.32.141 and port 8000: Connection refused
[11:25:13] <icinga-wm>	 PROBLEM - Check systemd state on conf2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[11:25:22] <icinga-wm>	 RECOVERY - Apache HTTP on mw2206 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.117 second response time
[11:27:02] <icinga-wm>	 PROBLEM - puppet last run on cp3036 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[11:27:42] <icinga-wm>	 RECOVERY - etcdmirror-conftool-eqiad-wmnet service on conf2002 is OK: OK - etcdmirror-conftool-eqiad-wmnet is active
[11:27:52] <paravoid>	 again?
[11:27:55] <marostegui>	 ^ I ran puppet on that host
[11:27:57] <icinga-wm>	 RECOVERY - Etcd replication lag on conf2002 is OK: HTTP OK: HTTP/1.1 200 OK - 148 bytes in 0.074 second response time
[11:28:01] <marostegui>	 and it was brought up
[11:28:22] <icinga-wm>	 RECOVERY - Check systemd state on conf2002 is OK: OK - running: The system is fully operational
[11:45:53] <icinga-wm>	 PROBLEM - HHVM rendering on mw1280 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.601 second response time
[11:46:02] <icinga-wm>	 PROBLEM - Nginx local proxy to apache on mw1280 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.006 second response time
[11:46:23] <icinga-wm>	 PROBLEM - Apache HTTP on mw1280 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.001 second response time
[11:46:53] <icinga-wm>	 RECOVERY - HHVM rendering on mw1280 is OK: HTTP OK: HTTP/1.1 200 OK - 74487 bytes in 0.215 second response time
[11:47:03] <icinga-wm>	 RECOVERY - Nginx local proxy to apache on mw1280 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.095 second response time
[11:47:23] <icinga-wm>	 RECOVERY - Apache HTTP on mw1280 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.062 second response time
[11:57:02] <icinga-wm>	 RECOVERY - puppet last run on cp3036 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[12:43:50] <wikibugs>	 (Draft2) محمد شعیب: Enable ArticlePlaceholder ext for urwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/408043 (https://phabricator.wikimedia.org/T186451)
[14:04:10] <wikibugs>	 (PS1) Urbanecm: Make alias from old NS_PROJECT to new NS_PROJECT at hiwikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/408045 (https://phabricator.wikimedia.org/T185347)
[14:17:04] <wikibugs>	 (PS1) Urbanecm: Change cswiki logo for celebration - 400k [mediawiki-config] - https://gerrit.wikimedia.org/r/408046 (https://phabricator.wikimedia.org/T186455)
[14:21:55] <wikibugs>	 (CR) Zoranzoki21: Enable ArticlePlaceholder ext for urwiki (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/408043 (https://phabricator.wikimedia.org/T186451) (owner: محمد شعیب)
[15:06:32] <Zoranzoki21>	 Will changes related to phabricator that are merged on gerrit be deployed within 2 days? I forgot..
[15:08:04] <wikibugs>	 (PS3) محمد شعیب: Enable ArticlePlaceholder ext for urwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/408043 (https://phabricator.wikimedia.org/T186451)
[15:18:01] <wikibugs>	 (PS4) Zoranzoki21: Enable ArticlePlaceholder ext for urwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/408043 (https://phabricator.wikimedia.org/T186451) (owner: محمد شعیب)
[15:18:16] <wikibugs>	 (CR) Zoranzoki21: "Now is ok" [mediawiki-config] - https://gerrit.wikimedia.org/r/408043 (https://phabricator.wikimedia.org/T186451) (owner: محمد شعیب)
[15:22:11] <wikibugs>	 (CR) Jayprakash12345: [C: 1] Make alias from old NS_PROJECT to new NS_PROJECT at hiwikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/408045 (https://phabricator.wikimedia.org/T185347) (owner: Urbanecm)
[15:33:04] <wikibugs>	 (CR) Zoranzoki21: [C: 1] Make alias from old NS_PROJECT to new NS_PROJECT at hiwikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/408045 (https://phabricator.wikimedia.org/T185347) (owner: Urbanecm)
[17:04:33] <wikibugs>	 (CR) Zoranzoki21: [C: -1] "Rebase patch and do tips which told user Framawiki." [mediawiki-config] - https://gerrit.wikimedia.org/r/403120 (owner: محمد شعیب)
[17:05:13] <wikibugs>	 (Abandoned) Framawiki: Allow euwiki bureaucrats to add/remove 'accountcreator' right [mediawiki-config] - https://gerrit.wikimedia.org/r/405771 (https://phabricator.wikimedia.org/T185531) (owner: Framawiki)
[17:17:04] <Zoranzoki21>	 Please abandon this patch: https://gerrit.wikimedia.org/r/#/c/137982/
[17:17:04] <wikibugs>	 (PS1) Framawiki: Remove old 'accountcreator' rules now handled by default [mediawiki-config] - https://gerrit.wikimedia.org/r/408071 (https://phabricator.wikimedia.org/T185417)
[17:20:30] <wikibugs>	 (PS2) Framawiki: Remove old 'accountcreator' rules now handled by default [mediawiki-config] - https://gerrit.wikimedia.org/r/408071 (https://phabricator.wikimedia.org/T185417)
[17:29:46] <wikibugs>	 (PS3) محمد شعیب: Changing namespaces on some Urdu language projects. [mediawiki-config] - https://gerrit.wikimedia.org/r/403120
[17:29:55] <wikibugs>	 (CR) jerkins-bot: [V: -1] Changing namespaces on some Urdu language projects. [mediawiki-config] - https://gerrit.wikimedia.org/r/403120 (owner: محمد شعیب)
[17:34:48] <wikibugs>	 (Abandoned) محمد شعیب: Changing namespaces on some Urdu language projects. [mediawiki-config] - https://gerrit.wikimedia.org/r/403120 (owner: محمد شعیب)
[18:09:30] <wikibugs>	 Operations: Backport firejail 0.9.52 for use on Wikimedia appservers - https://phabricator.wikimedia.org/T179022#3944791 (Legoktm) stalled>Open @MoritzMuehlenhoff firejail 0.9.52 has been released and is in unstable and stretch-backports.
[18:16:31] <wikibugs>	 (PS1) Zoranzoki21: Disable Flow extension on Commons [mediawiki-config] - https://gerrit.wikimedia.org/r/408073 (https://phabricator.wikimedia.org/T186463)
[19:26:55] <Zoranzoki21>	 Why is gerrit sending emails with content like (example): $1 would like $2 to review the patch $3?
[19:30:23] <paladox>	 It shouldn't be sending that. If it is, then that's a bug.
[19:33:06] <paladox>	 Oh is that an example?
[19:36:16] <Zoranzoki21>	 Example
[19:36:47] <Zoranzoki21>	 Real: Reviewer-bot would like Urbanecm to review this change.
[19:36:56] <Zoranzoki21>	 Why is it sending that?
[19:37:27] <Zoranzoki21>	 It shouldn't send it
[19:37:42] <paladox>	 a reviewer added you to a change
[19:37:46] <paladox>	 and wants you to review it
[19:38:35] <Zoranzoki21>	 But why does gerrit send ME "Reviewer-bot would like Urbanecm to review this change" WHICH IS NOT MY CHANGE
[19:39:34] <paladox>	 Because you're on the change. And that was a known bug in 2.13 which was fixed.
[19:39:58] <Zoranzoki21>	 It is on 2.14.6-7-g55dde9d68b, which you currently have
[19:40:00] <paladox>	 but with the introduction of notedb in 2.14+ (will be used in 2.15) it does not behave the same as reviewdb.
[19:40:30] <Zoranzoki21>	 ok
[19:40:36] <Zoranzoki21>	 thank you very much
[20:22:42] <icinga-wm>	 PROBLEM - Varnish HTTP text-backend - port 3128 on cp4029 is CRITICAL: connect to address 10.128.0.129 and port 3128: Connection refused
[20:23:42] <icinga-wm>	 RECOVERY - Varnish HTTP text-backend - port 3128 on cp4029 is OK: HTTP OK: HTTP/1.1 200 OK - 218 bytes in 0.157 second response time
[22:40:53] <elukey>	 !log restart aphlict.service on phab1001 to force it to pick up the new logfile (/var/log/aphlict/aphlict.log rather than the .log.1)
[22:41:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:46:10] <wikibugs>	 (PS1) Elukey: phabricator: add copytruncate to aphlict's logrotate [puppet] - https://gerrit.wikimedia.org/r/408222
[22:47:42] <wikibugs>	 (CR) Elukey: [C: 2] phabricator: add copytruncate to aphlict's logrotate [puppet] - https://gerrit.wikimedia.org/r/408222 (owner: Elukey)
[22:47:50] <elukey>	 mutante: --^
[22:53:33] <icinga-wm>	 PROBLEM - puppet last run on kafka1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[23:23:33] <icinga-wm>	 RECOVERY - puppet last run on kafka1003 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures