[01:20:38] Operations, Cloud-Services, DC-Ops, hardware-requests, Patch-For-Review: decom californium - https://phabricator.wikimedia.org/T189921#4058930 (TerraCodes) @Dzahn why remove all the subscribers?
[01:32:06] Critical Alert for device cr2-codfw.wikimedia.org - Primary inbound port utilisation over 80%
[01:33:05] Critical Alert for device cr2-eqiad.wikimedia.org - Primary outbound port utilisation over 80%
[01:58:05] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-eqiad.wikimedia.org recovered from Primary outbound port utilisation over 80%
[01:58:08] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-codfw.wikimedia.org recovered from Primary inbound port utilisation over 80%
[03:26:50] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 678.04 seconds
[03:37:00] PROBLEM - puppet last run on mw2162 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz]
[03:37:00] PROBLEM - puppet last run on mw2172 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz]
[03:58:59] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 281.88 seconds
[04:01:59] RECOVERY - puppet last run on mw2162 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:01:59] RECOVERY - puppet last run on mw2172 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[05:55:06] (CR) Gergő Tisza: "Scheduled for Monday." [mediawiki-config] - https://gerrit.wikimedia.org/r/418843 (https://phabricator.wikimedia.org/T184000) (owner: Gergő Tisza)
[05:56:37] (PS1) Gergő Tisza: Enable Wikidata description override on enwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/420227 (https://phabricator.wikimedia.org/T184000)
[06:27:29] PROBLEM - HHVM jobrunner on mw1307 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:19:49] RECOVERY - HHVM jobrunner on mw1307 is OK: HTTP OK: HTTP/1.1 200 OK - 206 bytes in 3.764 second response time
[07:27:02] Operations, DBA, Goal: Generate consistent logical database backups in CODFW - https://phabricator.wikimedia.org/T184699#4059027 (Marostegui)
[09:01:46] (PS1) Nemo bis: Add reference for itwiki $wgAbuseFilterActions [mediawiki-config] - https://gerrit.wikimedia.org/r/420237
[09:44:49] (PS1) Nehajha: GSoC task for Neha Jha [software/tools-webservice] - https://gerrit.wikimedia.org/r/420238
[09:45:29] (CR) jerkins-bot: [V: -1] GSoC task for Neha Jha [software/tools-webservice] - https://gerrit.wikimedia.org/r/420238 (owner: Nehajha)
[09:51:34] (PS2) Nehajha: GSoC task for Neha Jha [software/tools-webservice] - https://gerrit.wikimedia.org/r/420238 (https://phabricator.wikimedia.org/T189974)
[09:52:10] (CR) jerkins-bot: [V: -1] GSoC task for Neha Jha [software/tools-webservice] - https://gerrit.wikimedia.org/r/420238 (https://phabricator.wikimedia.org/T189974) (owner: Nehajha)
[11:16:40] (PS8) MarcoAurelio: beta: disable abusefilter from collecting user IP addresses [mediawiki-config] - https://gerrit.wikimedia.org/r/416346 (https://phabricator.wikimedia.org/T188862)
[11:16:45] (PS9) MarcoAurelio: beta: disable abusefilter from collecting user IP addresses [mediawiki-config] - https://gerrit.wikimedia.org/r/416346 (https://phabricator.wikimedia.org/T188862)
[11:50:43] (PS1) Nemo bis: [Italian Planet] Add User:Sciking blog to it.planet.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/420243
[12:24:29] Operations, Cloud-Services: Create Wikitech account for Aryeh Gregor (preexisting SVN login "simetrical") - https://phabricator.wikimedia.org/T189981#4059265 (Simetrical)
[13:48:20] PROBLEM - HHVM rendering on mw2207 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:49:10] RECOVERY - HHVM rendering on mw2207 is OK: HTTP OK: HTTP/1.1 200 OK - 75129 bytes in 0.360 second response time
[17:18:35] (PS5) Ahmed123: Revert "Restrict FlaggedRevs to only operated on NS_MAIN on arwiki" [mediawiki-config] - https://gerrit.wikimedia.org/r/418700 (https://phabricator.wikimedia.org/T148603)
[20:13:06] Critical Alert for device cr2-eqiad.wikimedia.org - Primary outbound port utilisation over 80%
[20:17:05] Critical Alert for device cr2-codfw.wikimedia.org - Primary inbound port utilisation over 80%
[20:47:05] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-codfw.wikimedia.org recovered from Primary inbound port utilisation over 80%
[20:48:04] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-eqiad.wikimedia.org recovered from Primary outbound port utilisation over 80%
[21:02:06] Critical Alert for device cr2-codfw.wikimedia.org - Primary inbound port utilisation over 80%
[21:07:06] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-codfw.wikimedia.org recovered from Primary inbound port utilisation over 80%
[21:18:06] Critical Alert for device cr2-eqiad.wikimedia.org - Primary outbound port utilisation over 80%
[21:18:09] Critical Alert for device cr2-codfw.wikimedia.org - Primary inbound port utilisation over 80%
[21:21:58] Is Gerrit down?
[21:22:31] Never mind, back now.
[21:32:06] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-codfw.wikimedia.org recovered from Primary inbound port utilisation over 80%
[21:33:05] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-eqiad.wikimedia.org recovered from Primary outbound port utilisation over 80%
[21:42:05] Critical Alert for device cr2-codfw.wikimedia.org - Primary inbound port utilisation over 80%
[21:43:05] Critical Alert for device cr2-eqiad.wikimedia.org - Primary outbound port utilisation over 80%
[21:48:04] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-codfw.wikimedia.org recovered from Primary inbound port utilisation over 80%
[21:53:06] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-eqiad.wikimedia.org recovered from Primary outbound port utilisation over 80%
[22:17:06] Critical Alert for device cr2-codfw.wikimedia.org - Primary inbound port utilisation over 80%
[22:18:04] Critical Alert for device cr2-eqiad.wikimedia.org - Primary outbound port utilisation over 80%
[22:27:05] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-codfw.wikimedia.org recovered from Primary inbound port utilisation over 80%
[22:28:05] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-eqiad.wikimedia.org recovered from Primary outbound port utilisation over 80%
[23:02:06] Critical Alert for device cr2-codfw.wikimedia.org - Primary inbound port utilisation over 80%
[23:03:06] Critical Alert for device cr2-eqiad.wikimedia.org - Primary outbound port utilisation over 80%
[23:32:04] Critical Alert for device cr2-codfw.wikimedia.org - Primary inbound port utilisation over 80%
[23:37:06] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-codfw.wikimedia.org recovered from Primary inbound port utilisation over 80%
[23:48:05] C̶r̶i̶t̶i̶c̶a̶l̶ Device cr2-eqiad.wikimedia.org recovered from Primary outbound port utilisation over 80%
[23:57:06] Critical Alert for device cr2-codfw.wikimedia.org - Primary inbound port utilisation over 80%