[00:18:08] RECOVERY - puppet last run on mw2123 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:47:47] PROBLEM - DPKG on logstash1001 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[01:50:49] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 4 failures
[02:28:48] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 12m 24s)
[02:28:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:39:14] (CR) 20after4: "I think iridium makes sense for the rewrite rules. That's how we handled the bugzilla redirects - having them on iridium was a lot easier " [dns] - https://gerrit.wikimedia.org/r/293747 (https://phabricator.wikimedia.org/T123718) (owner: Dzahn)
[02:46:03] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[03:33:11] PROBLEM - puppet last run on cp4013 is CRITICAL: CRITICAL: puppet fail
[03:36:33] PROBLEM - puppet last run on analytics1037 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:36:42] PROBLEM - puppet last run on mw2214 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:38:41] PROBLEM - puppet last run on cp4020 is CRITICAL: CRITICAL: puppet fail
[03:59:12] RECOVERY - puppet last run on cp4013 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[04:02:55] RECOVERY - puppet last run on analytics1037 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:03:56] RECOVERY - puppet last run on cp4020 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[04:04:36] RECOVERY - puppet last run on mw2214 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[06:28:14] PROBLEM - Apache HTTP on mw2218 is CRITICAL: Connection refused
[06:30:02] PROBLEM - puppet last run on elastic2007 is CRITICAL: CRITICAL: puppet fail
[06:30:32] PROBLEM - puppet last run on wtp2015 is CRITICAL: CRITICAL: puppet fail
[06:31:12] PROBLEM - puppet last run on labnet1002 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:13] PROBLEM - puppet last run on mw2158 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:23] PROBLEM - puppet last run on nobelium is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:33] PROBLEM - puppet last run on pc1006 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:43] PROBLEM - puppet last run on cp2013 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:31:43] PROBLEM - puppet last run on mw2081 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:32] PROBLEM - puppet last run on mw2126 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:01] PROBLEM - puppet last run on mw1158 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:49:41] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:55:40] RECOVERY - puppet last run on labnet1002 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[06:55:51] PROBLEM - puppet last run on ms-be2013 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:56:00] RECOVERY - puppet last run on mw2081 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[06:56:41] RECOVERY - puppet last run on mw1158 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[06:57:01] RECOVERY - puppet last run on nobelium is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[06:57:30] RECOVERY - puppet last run on mw2158 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:31] RECOVERY - puppet last run on pc1006 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:41] RECOVERY - puppet last run on elastic2007 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:50] RECOVERY - puppet last run on wtp2015 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[06:58:00] RECOVERY - puppet last run on cp2013 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:21] RECOVERY - puppet last run on mw2126 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:59:30] PROBLEM - puppet last run on analytics1053 is CRITICAL: CRITICAL: Puppet has 1 failures
[07:11:25] RECOVERY - Disk space on lithium is OK: DISK OK
[07:22:16] RECOVERY - puppet last run on ms-be2013 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[07:23:25] RECOVERY - puppet last run on analytics1053 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[07:46:20] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures
[08:14:10] (PS2) Ladsgroup: Enable ORES on fawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/269478 (https://phabricator.wikimedia.org/T120923) (owner: Reedy)
[08:15:55] (CR) Elukey: Add a notification parameter of analytics to cassandra monitoring (1 comment) [puppet] - https://gerrit.wikimedia.org/r/293916 (https://phabricator.wikimedia.org/T137422) (owner: JanZerebecki)
[08:20:24] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures
[08:52:10] there is some high-load purges ongoing on some of the pc* hosts, hopefully that will not create any issues
[09:46:03] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[10:50:51] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures
[11:17:12] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[11:49:51] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: puppet fail
[12:03:46] Operations, Release-Engineering-Team, Gitblit-Deprecate, Patch-For-Review: write Apache rewrite rules for gitblit -> diffusion migration - https://phabricator.wikimedia.org/T137224#2374778 (Paladox)
[12:16:56] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[12:18:08] Operations, Release-Engineering-Team, Gitblit-Deprecate, Patch-For-Review: write Apache rewrite rules for gitblit -> diffusion migration - https://phabricator.wikimedia.org/T137224#2374779 (Paladox)
[12:51:06] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures
[12:55:09] (CR) BBlack: [C: -1] "I tend to agree with hashar here. I think it makes more sense to handle these redirects should happen in iridium's apache config. It's b" [dns] - https://gerrit.wikimedia.org/r/293747 (https://phabricator.wikimedia.org/T123718) (owner: Dzahn)
[13:01:48] PROBLEM - puppet last run on eventlog2001 is CRITICAL: CRITICAL: puppet fail
[13:15:28] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[13:28:07] RECOVERY - puppet last run on eventlog2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[14:21:11] Operations, Analytics, Traffic: Make upload.wikimedia.org cookieless - https://phabricator.wikimedia.org/T137609#2373208 (BBlack) WMF-Last-Access doesn't ever set `Domain=`, so tracking last-access has always been separate per-project/language, and upload would be entirely in its own bin. So at leas...
[14:50:47] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures
[15:17:27] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[15:57:26] (PS1) Urbanecm: Remove old throttle rules [mediawiki-config] - https://gerrit.wikimedia.org/r/293970
[16:32:13] PROBLEM - puppet last run on db2062 is CRITICAL: CRITICAL: puppet fail
[16:59:31] RECOVERY - puppet last run on db2062 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:30:30] Operations, Release-Engineering-Team, Gitblit-Deprecate, Patch-For-Review: write Apache rewrite rules for gitblit -> diffusion migration - https://phabricator.wikimedia.org/T137224#2374900 (Paladox)
[17:30:32] Operations, Blocked-on-RelEng, Gitblit-Deprecate, Patch-For-Review: Phase out antimony.wikimedia.org (git.wikimedia.org / gitblit) - https://phabricator.wikimedia.org/T123718#2374899 (Paladox)
[18:11:45] PROBLEM - Start and verify pages via webservices on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - 187 bytes in 10.843 second response time
[18:15:36] RECOVERY - Start and verify pages via webservices on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 166 bytes in 8.272 second response time
[19:18:26] PROBLEM - Start and verify pages via webservices on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - 187 bytes in 11.084 second response time
[19:21:05] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures
[19:22:35] RECOVERY - Start and verify pages via webservices on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 166 bytes in 14.926 second response time
[19:47:26] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[19:50:38] Operations, Release-Engineering-Team, Gitblit-Deprecate, Patch-For-Review: write Apache rewrite rules for gitblit -> diffusion migration - https://phabricator.wikimedia.org/T137224#2375034 (Danny_B) Where to redirect https://git.wikimedia.org/project/ https://git.wikimedia.org/project/main http...
[20:38:07] Operations, Release-Engineering-Team, Gitblit-Deprecate, Patch-For-Review: write Apache rewrite rules for gitblit -> diffusion migration - https://phabricator.wikimedia.org/T137224#2375052 (Paladox) >>! In T137224#2375034, @Danny_B wrote: > Where to redirect > > https://git.wikimedia.org/projec...
[20:52:41] (PS1) Nemo bis: [Planet Wikimedia] 4 additions to Italian and English planets [puppet] - https://gerrit.wikimedia.org/r/294015
[20:53:09] (PS2) Nemo bis: [Planet Wikimedia] 5 additions to Italian and English planets [puppet] - https://gerrit.wikimedia.org/r/294015
[21:12:50] PROBLEM - Disk space on lithium is CRITICAL: DISK CRITICAL - free space: /srv/syslog 11451 MB (3% inode=99%)
[21:29:40] PROBLEM - puppet last run on mw1132 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:51:50] RECOVERY - puppet last run on mw1132 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[21:51:54] Operations, Release-Engineering-Team, Gitblit-Deprecate, Patch-For-Review: write Apache rewrite rules for gitblit -> diffusion migration - https://phabricator.wikimedia.org/T137224#2375119 (Paladox)
[21:54:11] Operations, Release-Engineering-Team, Gitblit-Deprecate, Patch-For-Review: write Apache rewrite rules for gitblit -> diffusion migration - https://phabricator.wikimedia.org/T137224#2375120 (Paladox)
[22:11:57] PROBLEM - puppet last run on es2004 is CRITICAL: CRITICAL: puppet fail
[22:39:01] RECOVERY - puppet last run on es2004 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[23:01:39] Operations, Analytics, Traffic: Make upload.wikimedia.org cookieless - https://phabricator.wikimedia.org/T137609#2375157 (Nuria) Unique Devices are calculated per project as @BBlack mentioned, in the case of upload.wikimedia.org we neither report pageviews nor unique devices. thus WMF-Last-Access ca...
[23:36:36] Operations, Release-Engineering-Team, Gitblit-Deprecate, Patch-For-Review: write Apache rewrite rules for gitblit -> diffusion migration - https://phabricator.wikimedia.org/T137224#2375161 (Danny_B)
[23:41:22] Operations, Release-Engineering-Team, Gitblit-Deprecate, Patch-For-Review: write Apache rewrite rules for gitblit -> diffusion migration - https://phabricator.wikimedia.org/T137224#2375162 (Danny_B)