[00:19:55] RECOVERY - MegaRAID on analytics1038 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy
[00:56:25] PROBLEM - HHVM rendering on mw2146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:57:15] RECOVERY - HHVM rendering on mw2146 is OK: HTTP OK: HTTP/1.1 200 OK - 75078 bytes in 0.556 second response time
[00:58:12] Operations, Citoid, VisualEditor, Services (watching), User-mobrovac: Wiley requests for DOI and some other publishers don't work in production - https://phabricator.wikimedia.org/T165105#3915291 (Mvolz) Another reported on Twitter: https://twitter.com/SiobhanLeachman/status/954878073207365632
[00:59:55] PROBLEM - MegaRAID on analytics1038 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough
[02:35:53] !log bootstrapping restbase2012-b - T184100
[02:36:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:36:06] T184100: Reprovision legacy Cassandra nodes into new cluster - https://phabricator.wikimedia.org/T184100
[03:28:35] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 873.52 seconds
[03:36:24] PROBLEM - puppet last run on cp4028 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 5 minutes ago with 1 failures.
Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz]
[03:58:44] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 168.75 seconds
[04:01:24] RECOVERY - puppet last run on cp4028 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[05:59:55] RECOVERY - MegaRAID on analytics1038 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy
[06:27:35] PROBLEM - graphite.wikimedia.org on graphite1003 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 398 bytes in 0.004 second response time
[06:28:35] RECOVERY - graphite.wikimedia.org on graphite1003 is OK: HTTP OK: HTTP/1.1 200 OK - 1547 bytes in 0.016 second response time
[06:29:55] PROBLEM - MegaRAID on analytics1038 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough
[07:09:55] RECOVERY - MegaRAID on analytics1038 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy
[07:29:34] PROBLEM - Check Varnish expiry mailbox lag on cp4021 is CRITICAL: CRITICAL: expiry mailbox lag is 2103739
[07:39:55] PROBLEM - MegaRAID on analytics1038 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough
[07:49:55] RECOVERY - MegaRAID on analytics1038 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy
[08:14:07] Operations, ops-eqiad, Analytics-Kanban: BBU alarms flapping for analytics1038 - https://phabricator.wikimedia.org/T185409#3915460 (elukey)
[08:19:55] PROBLEM - MegaRAID on analytics1038 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough,
WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough
[08:21:18] downtimed an1038 --^
[09:39:34] RECOVERY - Check Varnish expiry mailbox lag on cp4021 is OK: OK: expiry mailbox lag is 5
[09:49:55] RECOVERY - MegaRAID on analytics1038 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy
[10:46:16] (CR) Hoo man: [C: 1] "Looks sensible, compared the lock manager settings to the "default" one in `wmf-config/filebackend.php`." [mediawiki-config] - https://gerrit.wikimedia.org/r/395967 (https://phabricator.wikimedia.org/T178652) (owner: Addshore)
[12:01:00] (CR) Framawiki: [C: -1] Add several domains of Ukraine government to wgCopyUploadsDomains (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/405550 (https://phabricator.wikimedia.org/T185399) (owner: Urbanecm)
[12:03:14] !log Defragment s2 on db1102 - T182450
[12:03:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:03:28] T182450: db1102 (sanitarium) filling up (WAS: Clean up old binlogs from db1102 (sanitarium multi-instance)) - https://phabricator.wikimedia.org/T182450
[12:09:59] (PS2) Zoranzoki21: Add several domains of Ukraine government to wgCopyUploadsDomains [mediawiki-config] - https://gerrit.wikimedia.org/r/405550 (https://phabricator.wikimedia.org/T185399) (owner: Urbanecm)
[12:13:00] (CR) Odder: "For most of these, I used the already existing SVG logos, and exported three PNG files: 16x16px, 32x32px and 48x48px.
I then optimised the" [mediawiki-config] - https://gerrit.wikimedia.org/r/402618 (https://phabricator.wikimedia.org/T177726) (owner: Odder)
[12:38:42] (PS5) EddieGP: Redirect techblog.wikimedia.org to blog.wikimedia.org/c/technology [puppet] - https://gerrit.wikimedia.org/r/394743 (https://phabricator.wikimedia.org/T181878) (owner: Framawiki)
[12:39:47] (CR) EddieGP: "I'm no expert at the syntax of this dat file either, but I'd say something like this: Leave the rewrite as-is, and add an override just fo" [puppet] - https://gerrit.wikimedia.org/r/394743 (https://phabricator.wikimedia.org/T181878) (owner: Framawiki)
[13:06:45] (CR) MarcoAurelio: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: MarcoAurelio)
[13:08:09] (CR) jerkins-bot: [V: -1] Remove upload rights on wikis where local uploads are disabled [mediawiki-config] - https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: MarcoAurelio)
[13:29:04] (PS5) Paladox: mediawiki_vagrant: Update role name used for if defined check [puppet] - https://gerrit.wikimedia.org/r/389295
[13:31:04] (Abandoned) Paladox: DO NOT MERGE [labs/private] - https://gerrit.wikimedia.org/r/363847 (owner: Paladox)
[13:31:41] (PS9) Paladox: mysql: Fix installing package on stretch [puppet] - https://gerrit.wikimedia.org/r/354131
[13:34:14] PROBLEM - puppet last run on maps1001 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[13:36:48] (PS7) Paladox: Gerrit: Fix performance issues with new login ui [puppet] - https://gerrit.wikimedia.org/r/405368
[13:46:04] (PS9) Paladox: lxc: Fix support for stretch [puppet] - https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377)
[13:46:06] (CR) Paladox: lxc: Fix support for stretch (3 comments) [puppet] - https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377) (owner: Paladox)
[13:48:07] (PS10) Paladox: lxc: Fix support for stretch [puppet] - https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377)
[13:59:14] RECOVERY - puppet last run on maps1001 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[15:28:39] (Draft1) Paladox: ircecho: Support ssl when connecting to irc [puppet] - https://gerrit.wikimedia.org/r/405591
[15:28:41] (PS2) Paladox: ircecho: Support ssl when connecting to irc [puppet] - https://gerrit.wikimedia.org/r/405591
[15:29:04] (CR) jerkins-bot: [V: -1] ircecho: Support ssl when connecting to irc [puppet] - https://gerrit.wikimedia.org/r/405591 (owner: Paladox)
[15:29:53] (PS3) Paladox: ircecho: Support ssl when connecting to irc [puppet] - https://gerrit.wikimedia.org/r/405591
[15:30:17] (CR) Paladox: "I've tested the ircecho file locally with /etc/default/ircecho and ssl works. This works with ssl off and ssl on."
[puppet] - https://gerrit.wikimedia.org/r/405591 (owner: Paladox)
[15:33:03] (Draft1) Paladox: ircecho: Enable ssl by default [puppet] - https://gerrit.wikimedia.org/r/405593
[15:33:05] (Draft2) Paladox: ircecho: Enable ssl by default [puppet] - https://gerrit.wikimedia.org/r/405593
[15:50:17] (Draft1) Paladox: ircecho: Support auth over irc [puppet] - https://gerrit.wikimedia.org/r/405594
[15:50:19] (Draft2) Paladox: ircecho: Support auth over irc [puppet] - https://gerrit.wikimedia.org/r/405594
[15:50:44] (CR) jerkins-bot: [V: -1] ircecho: Support auth over irc [puppet] - https://gerrit.wikimedia.org/r/405594 (owner: Paladox)
[15:51:44] (PS3) Paladox: ircecho: Support auth over irc [puppet] - https://gerrit.wikimedia.org/r/405594
[15:54:56] (PS4) Paladox: ircecho: Support auth over irc [puppet] - https://gerrit.wikimedia.org/r/405594
[15:57:52] (PS5) Paladox: ircecho: Support auth over irc [puppet] - https://gerrit.wikimedia.org/r/405594
[15:58:58] (CR) Paladox: "Tested the ircecho python file locally and works."
[puppet] - https://gerrit.wikimedia.org/r/405594 (owner: Paladox)
[16:05:11] (PS4) Jayprakash12345: Update the project namespace in Nepali Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/404148 (https://phabricator.wikimedia.org/T184865) (owner: Biplab Anand)
[16:05:20] (CR) jerkins-bot: [V: -1] Update the project namespace in Nepali Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/404148 (https://phabricator.wikimedia.org/T184865) (owner: Biplab Anand)
[16:07:02] (CR) Jayprakash12345: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/404148 (https://phabricator.wikimedia.org/T184865) (owner: Biplab Anand)
[16:07:11] (CR) jerkins-bot: [V: -1] Update the project namespace in Nepali Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/404148 (https://phabricator.wikimedia.org/T184865) (owner: Biplab Anand)
[16:14:08] (PS5) Biplab Anand: Update the project namespace in Nepali Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/404148 (https://phabricator.wikimedia.org/T184865)
[16:14:56] (CR) Jayprakash12345: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/404148 (https://phabricator.wikimedia.org/T184865) (owner: Biplab Anand)
[17:21:25] !log Compress frwiki and jawiki on db1102 - T182450
[17:21:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:21:38] T182450: db1102 (sanitarium) filling up (WAS: Clean up old binlogs from db1102 (sanitarium multi-instance)) - https://phabricator.wikimedia.org/T182450
[17:57:24] PROBLEM - HHVM rendering on mw2131 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:58:15] RECOVERY - HHVM rendering on mw2131 is OK: HTTP OK: HTTP/1.1 200 OK - 74904 bytes in 0.343 second response time
[18:53:52] (PS2) Groovier1: Adding config for WikimediaEvents module for logging behaviour data [mediawiki-config] - https://gerrit.wikimedia.org/r/404910
[18:54:01] (CR) jerkins-bot:
[V: -1] Adding config for WikimediaEvents module for logging behaviour data [mediawiki-config] - https://gerrit.wikimedia.org/r/404910 (owner: Groovier1)
[19:40:35] (PS3) Gergő Tisza: Adding config for WikimediaEvents module for logging behaviour data [mediawiki-config] - https://gerrit.wikimedia.org/r/404910 (owner: Groovier1)
[19:41:33] (CR) Gergő Tisza: "It should be enabled for Beta though (that is, 'default' => true in InitialiseSettings-labs.php)." [mediawiki-config] - https://gerrit.wikimedia.org/r/404910 (owner: Groovier1)
[20:33:09] (PS4) Groovier1: Adding config for WikimediaEvents module for logging behaviour data [mediawiki-config] - https://gerrit.wikimedia.org/r/404910
[21:47:08] (CR) Gergő Tisza: [C: 1] Adding config for WikimediaEvents module for logging behaviour data [mediawiki-config] - https://gerrit.wikimedia.org/r/404910 (owner: Groovier1)
[23:06:18] (CR) Thiemo Kreuz (WMDE): Gerrit: Fix performance issues with new login ui (1 comment) [puppet] - https://gerrit.wikimedia.org/r/405368 (owner: Paladox)
[23:07:01] (CR) Paladox: Gerrit: Fix performance issues with new login ui (1 comment) [puppet] - https://gerrit.wikimedia.org/r/405368 (owner: Paladox)
[23:10:11] (PS8) Paladox: Gerrit: Fix performance issues with new login ui [puppet] - https://gerrit.wikimedia.org/r/405368