[00:03:12] RECOVERY - recommendation_api endpoints health on scb2003 is OK: All endpoints are healthy
[00:03:51] RECOVERY - recommendation_api endpoints health on scb2004 is OK: All endpoints are healthy
[01:48:52] RECOVERY - Check systemd state on kubernetes2003 is OK: OK - running: The system is fully operational
[01:52:12] PROBLEM - Check systemd state on kubernetes2003 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[02:52:21] PROBLEM - proton endpoints health on proton1002 is CRITICAL: /{domain}/v1/pdf/{title}/{format}/{type} (Print the Foo page from en.wp.org in letter format) is CRITICAL: Test Print the Foo page from en.wp.org in letter format returned the unexpected status 503 (expecting: 200)
[02:53:22] RECOVERY - proton endpoints health on proton1002 is OK: All endpoints are healthy
[03:36:32] PROBLEM - puppet last run on mw2277 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz]
[03:36:32] PROBLEM - puppet last run on mw2234 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 5 minutes ago with 2 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIP2-City.mmdb.gz],File[/usr/share/GeoIP/GeoIP2-City.mmdb.test]
[03:36:41] PROBLEM - puppet last run on cp4028 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz]
[03:46:47] !log reindexing Serbo-Croatian wikis on elastic@codfw (T196658)
[03:46:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:46:49] T196658: Re-index Croatian, Serbo-Croatian, and Bosnian Wikis - https://phabricator.wikimedia.org/T196658
[04:02:01] RECOVERY - puppet last run on mw2277 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:02:01] RECOVERY - puppet last run on mw2234 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:02:11] RECOVERY - puppet last run on cp4028 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[05:00:01] RECOVERY - Memory correctable errors -EDAC- on cp1053 is OK: (C)4 ge (W)2 ge 0 https://grafana.wikimedia.org/dashboard/db/host-overview?orgId=1&var-server=cp1053&var-datasource=eqiad%2520prometheus%252Fops
[05:08:45] Operations, Wikimedia Australia, Wikimedia-Mailing-lists: Wikimedia Australia mailing list creation - https://phabricator.wikimedia.org/T55847#4295106 (Peachey88)
[07:56:17] Operations, DBA, MediaWiki-Configuration: Data model for dbconfig - https://phabricator.wikimedia.org/T197531#4295158 (Joe) p:Triage>Normal
[08:05:59] (PS1) Urbanecm: Enable TemplateStyles on ruwikiquote [mediawiki-config] - https://gerrit.wikimedia.org/r/440722 (https://phabricator.wikimedia.org/T197526)
[08:06:46] Operations, TemplateStyles, Traffic, Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#4295184 (Urbanecm)
[08:10:42] PROBLEM - proton endpoints health on proton2001 is CRITICAL: /{domain}/v1/pdf/{title}/{format}/{type} (Print the Foo page from en.wp.org in letter format) is CRITICAL: Test Print the Foo page from en.wp.org in letter format returned the unexpected status 503 (expecting: 200)
[08:14:02] RECOVERY - proton endpoints health on proton2001 is OK: All endpoints are healthy
[09:09:02] PROBLEM - proton endpoints health on proton1002 is CRITICAL: /{domain}/v1/pdf/{title}/{format}/{type} (Respond file not found for a nonexistent title) is CRITICAL: Test Respond file not found for a nonexistent title returned the unexpected status 503 (expecting: 404)
[09:10:12] RECOVERY - proton endpoints health on proton1002 is OK: All endpoints are healthy
[09:41:14] Operations, DBA, MediaWiki-Configuration: Data model for dbconfig - https://phabricator.wikimedia.org/T197531#4295263 (Joe)
[09:41:54] Operations, DBA, MediaWiki-Configuration: Data model for dbconfig - https://phabricator.wikimedia.org/T197531#4295158 (Joe)
[10:03:27] (PS2) Giuseppe Lavagetto: conftool: schema for database configuration on etcd [puppet] - https://gerrit.wikimedia.org/r/422373 (https://phabricator.wikimedia.org/T197531)
[10:29:41] PROBLEM - proton endpoints health on proton1001 is CRITICAL: /{domain}/v1/pdf/{title}/{format}/{type} (Print the Foo page from en.wp.org in letter format) is CRITICAL: Test Print the Foo page from en.wp.org in letter format returned the unexpected status 503 (expecting: 200): /{domain}/v1/pdf/{title}/{format}/{type} (Respond file not found for a nonexistent title) timed out before a response was received
[10:32:51] RECOVERY - proton endpoints health on proton1001 is OK: All endpoints are healthy
[11:43:32] RECOVERY - Router interfaces on cr2-eqiad is OK: OK: host 208.80.154.197, interfaces up: 226, down: 0, dormant: 0, excluded: 0, unused: 0
[11:44:11] RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0
[13:05:48] !log reindexing Serbo-Croatian wikis on elastic@eqiad (T196658)
[13:05:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:05:51] T196658: Re-index Croatian, Serbo-Croatian, and Bosnian Wikis - https://phabricator.wikimedia.org/T196658
[13:52:34] (PS1) Gergő Tisza: MWScript: do not wrangle absolute path [mediawiki-config] - https://gerrit.wikimedia.org/r/440743
[14:55:30] (PS1) MacFan4000: Update ExtDist versions [mediawiki-config] - https://gerrit.wikimedia.org/r/440745
[14:57:13] (PS2) MacFan4000: Remove MW 1.29 from ExtDist as it is now no longer supported [mediawiki-config] - https://gerrit.wikimedia.org/r/440745
[15:06:01] PROBLEM - Disk space on elastic1017 is CRITICAL: DISK CRITICAL - free space: /srv 61946 MB (12% inode=99%)
[15:10:51] PROBLEM - Disk space on elastic1023 is CRITICAL: DISK CRITICAL - free space: /srv 60874 MB (12% inode=99%)
[15:12:01] RECOVERY - Disk space on elastic1023 is OK: DISK OK
[15:13:32] RECOVERY - Disk space on elastic1017 is OK: DISK OK
[17:06:02] PROBLEM - Disk space on elastic1023 is CRITICAL: DISK CRITICAL - free space: /srv 61608 MB (12% inode=99%)
[17:19:11] RECOVERY - Check systemd state on kubernetes2003 is OK: OK - running: The system is fully operational
[17:22:22] PROBLEM - Check systemd state on kubernetes2003 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[17:44:02] RECOVERY - Disk space on elastic1023 is OK: DISK OK
[18:48:31] RECOVERY - Check systemd state on kubernetes2003 is OK: OK - running: The system is fully operational
[18:51:42] PROBLEM - Check systemd state on kubernetes2003 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[19:02:34] (CR) Jforrester: [C: 1] Enable zh-my variant on zhwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/440651 (https://phabricator.wikimedia.org/T193983) (owner: 星耀晨曦)
[19:08:23] James_F: got a minute to T197549 ?
[19:08:24] T197549: Please update the interwiki cache - https://phabricator.wikimedia.org/T197549
[19:09:55] Hauskatze: I don't think running that on a Sunday ahead of a no-deploy week is a good idea. :-(
[19:10:15] (CR) Jforrester: [C: 1] Remove MW 1.29 from ExtDist as it is now no longer supported [mediawiki-config] - https://gerrit.wikimedia.org/r/440745 (owner: MacFan4000)
[19:10:17] true that, next swat window probably
[19:11:22] Which is in 180 hours' time.
[19:11:47] jouncebot: next
[19:11:47] In 183 hour(s) and 48 minute(s): Wikimedia Portals Update (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180625T1100)
[19:11:55] 183 and 48' :P
[19:11:58] Close enough. :-)
[19:12:02] :D
[19:12:15] wait... that's more than a week
[19:12:38] ah, off-site stuff
[19:12:40] k, k
[19:18:52] (CR) Legoktm: [C: -1] "Did I miss something? I don't see any end of life notice to mediawiki-announce, and https://www.mediawiki.org/wiki/Version_lifecycle#Versi" [mediawiki-config] - https://gerrit.wikimedia.org/r/440745 (owner: MacFan4000)
[19:43:21] PROBLEM - Disk space on elastic1023 is CRITICAL: DISK CRITICAL - free space: /srv 61129 MB (12% inode=99%)
[19:44:22] RECOVERY - Disk space on elastic1023 is OK: DISK OK
[19:55:41] PROBLEM - proton endpoints health on proton2001 is CRITICAL: /{domain}/v1/pdf/{title}/{format}/{type} (Print the Foo page from en.wp.org in letter format) timed out before a response was received: /{domain}/v1/pdf/{title}/{format}/{type} (Print the Bar page from en.wp.org in A4 format using optimized for reading on mobile devices) timed out before a response was received
[19:56:41] RECOVERY - proton endpoints health on proton2001 is OK: All endpoints are healthy
[20:25:12] Operations: Update wikitech-static mediawiki version - https://phabricator.wikimedia.org/T197554#4295745 (Andrew)
[20:41:01] (CR) MacFan4000: [C: 1] "@Legoktm it was added in https://www.mediawiki.org/w/index.php?title=Version_lifecycle&diff=2273307&oldid=2259265 as June, and was changed" [mediawiki-config] - https://gerrit.wikimedia.org/r/440745 (owner: MacFan4000)
[20:43:35] (CR) MacFan4000: [C: 1] "Literally every other summer release has an EOL date in June the next year after it was released" [mediawiki-config] - https://gerrit.wikimedia.org/r/440745 (owner: MacFan4000)
[20:44:39] (CR) Legoktm: [C: -1] "> Patch Set 2: Code-Review+1" [mediawiki-config] - https://gerrit.wikimedia.org/r/440745 (owner: MacFan4000)
[21:43:21] PROBLEM - proton endpoints health on proton1002 is CRITICAL: /{domain}/v1/pdf/{title}/{format}/{type} (Print the Foo page from en.wp.org in letter format) timed out before a response was received
[21:44:21] RECOVERY - proton endpoints health on proton1002 is OK: All endpoints are healthy
[22:23:55] !log gerrit: stopping services for a few minutes
[22:23:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:24:29] !log gerrit: started back up, nvm
[22:24:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:44:12] PROBLEM - Disk space on elastic1018 is CRITICAL: DISK CRITICAL - free space: /srv 59779 MB (12% inode=99%)