[00:10:26] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.0166666666667
[00:18:17] RECOVERY - Graphite Carbon on graphite2001 is OK: OK: All defined Carbon jobs are runnning.
[00:22:46] PROBLEM - Graphite Carbon on graphite2001 is CRITICAL: CRITICAL: Not all configured Carbon instances are running.
[01:10:17] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.00333333333333
[01:22:29] !log running `nodetool cleanup` on restbase1005
[01:22:36] Logged the message, Master
[02:15:27] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.00666666666667
[02:17:30] !log l10nupdate Synchronized php-1.25wmf21/cache/l10n: (no message) (duration: 04m 54s)
[02:17:42] Logged the message, Master
[02:20:56] !log LocalisationUpdate completed (1.25wmf21) at 2015-03-22 02:19:53+00:00
[02:21:01] Logged the message, Master
[02:21:43] !log l10nupdate Synchronized php-1.25wmf22/cache/l10n: (no message) (duration: 00m 03s)
[02:21:47] Logged the message, Master
[02:22:50] !log LocalisationUpdate completed (1.25wmf22) at 2015-03-22 02:21:46+00:00
[02:22:53] Logged the message, Master
[03:02:27] PROBLEM - puppet last run on tmh1002 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:19:46] RECOVERY - puppet last run on tmh1002 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[03:20:36] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.01
[03:34:57] PROBLEM - puppet last run on mw1087 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:36:47] PROBLEM - puppet last run on mw1241 is CRITICAL: CRITICAL: puppet fail
[03:42:17] (CR) MZMcBride: "As pointed out at , the issue here is that it's $wgDismissableSiteNoticeForAnons (trailing S)." [mediawiki-config] - https://gerrit.wikimedia.org/r/193090 (https://phabricator.wikimedia.org/T59732) (owner: Nemo bis)
[03:46:47] PROBLEM - puppet last run on wtp1014 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:53:06] RECOVERY - Slow CirrusSearch query rate on fluorine is OK: CirrusSearch-slow.log_line_rate OKAY: 0.0
[03:58:17] PROBLEM - puppet last run on amssq58 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:58:17] PROBLEM - puppet last run on netmon1001 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:58:17] PROBLEM - puppet last run on cp4020 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:00:06] PROBLEM - puppet last run on ms-be1007 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:00:27] PROBLEM - puppet last run on ms-be1008 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:00:37] PROBLEM - puppet last run on ms-be3002 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:00:47] RECOVERY - puppet last run on mw1087 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[04:00:47] PROBLEM - puppet last run on mw1137 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:00:47] PROBLEM - puppet last run on ms-be2012 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:00:47] PROBLEM - puppet last run on ms-be2005 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:00:56] PROBLEM - puppet last run on ms-be2001 is CRITICAL: CRITICAL: Puppet has 3 failures
[04:01:07] PROBLEM - puppet last run on ms-be1018 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:01:07] RECOVERY - puppet last run on wtp1014 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:01:07] PROBLEM - puppet last run on mc1007 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:01:16] PROBLEM - puppet last run on gallium is CRITICAL: CRITICAL: Puppet has 1 failures
[04:01:17] RECOVERY - puppet last run on mw1241 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[04:02:08] PROBLEM - puppet last run on mw1141 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:02:37] RECOVERY - puppet last run on netmon1001 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[04:02:46] PROBLEM - puppet last run on silver is CRITICAL: CRITICAL: Puppet has 1 failures
[04:02:47] PROBLEM - puppet last run on mw1026 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:03:07] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:03:37] PROBLEM - puppet last run on mw1189 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:03:37] RECOVERY - puppet last run on mw1141 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[04:04:17] RECOVERY - puppet last run on mw1026 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[04:05:08] PROBLEM - puppet last run on mw2157 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:05:08] PROBLEM - puppet last run on mw2166 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:05:58] RECOVERY - puppet last run on mw1042 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[04:06:27] RECOVERY - puppet last run on mw1189 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:06:57] RECOVERY - puppet last run on gallium is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[04:08:06] RECOVERY - puppet last run on mw2166 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[04:08:28] RECOVERY - puppet last run on silver is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures
[04:09:17] RECOVERY - puppet last run on ms-be3002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:09:27] RECOVERY - puppet last run on ms-be2012 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[04:10:36] RECOVERY - puppet last run on ms-be1008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:10:57] RECOVERY - puppet last run on ms-be2005 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[04:10:57] RECOVERY - puppet last run on ms-be2001 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[04:11:07] RECOVERY - puppet last run on mc1007 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[04:11:36] RECOVERY - puppet last run on ms-be1007 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[04:12:27] RECOVERY - puppet last run on ms-be1018 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[04:14:07] RECOVERY - puppet last run on amssq58 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:14:07] RECOVERY - puppet last run on cp4020 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[04:15:08] RECOVERY - puppet last run on mw2157 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[04:17:57] RECOVERY - puppet last run on mw1137 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[05:46:39] !log LocalisationUpdate ResourceLoader cache refresh completed at Sun Mar 22 05:45:32 UTC 2015 (duration 45m 31s)
[05:46:48] Logged the message, Master
[06:28:47] PROBLEM - puppet last run on lvs2001 is CRITICAL: CRITICAL: puppet fail
[06:29:16] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:27] PROBLEM - puppet last run on mw1118 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:27] PROBLEM - puppet last run on cp4003 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:37] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:27] PROBLEM - puppet last run on mw2093 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:27] PROBLEM - puppet last run on mw2104 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:37] PROBLEM - puppet last run on amssq54 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:37] PROBLEM - puppet last run on mw1061 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:37] PROBLEM - puppet last run on cp4019 is CRITICAL: CRITICAL: puppet fail
[06:30:56] PROBLEM - puppet last run on mw1052 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:33:16] PROBLEM - puppet last run on mw2017 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:33:17] PROBLEM - puppet last run on mw2097 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:33:47] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:18] PROBLEM - puppet last run on mw1092 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:34:46] PROBLEM - puppet last run on mw2079 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:47] PROBLEM - puppet last run on mw2003 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:47] PROBLEM - puppet last run on mw2030 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:46:18] RECOVERY - puppet last run on mw2104 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[06:46:26] RECOVERY - puppet last run on mw2003 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[06:46:27] RECOVERY - puppet last run on mw1061 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[06:46:28] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures
[06:46:28] RECOVERY - puppet last run on amssq54 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[06:46:37] RECOVERY - puppet last run on mw1118 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[06:46:46] RECOVERY - puppet last run on cp4003 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[06:46:46] RECOVERY - puppet last run on mw1052 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:46:47] RECOVERY - puppet last run on mw1144 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[06:46:57] RECOVERY - puppet last run on mw1042 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[06:47:26] RECOVERY - puppet last run on mw1092 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:37] RECOVERY - puppet last run on lvs2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:47] RECOVERY - puppet last run on mw2017 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[06:47:47] RECOVERY - puppet last run on mw2093 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:48] RECOVERY - puppet last run on mw2097 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[06:47:48] RECOVERY - puppet last run on mw2079 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:57] RECOVERY - puppet last run on mw2030 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[06:48:06] RECOVERY - puppet last run on cp4019 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[08:45:15] (PS1) Glaisher: Fix typo in DismissableSiteNotice configuration [mediawiki-config] - https://gerrit.wikimedia.org/r/198542 (https://phabricator.wikimedia.org/T59732)
[08:45:48] (CR) Glaisher: "I2082ad9ba5c173dc413cbac5b2652e138bd62430" [mediawiki-config] - https://gerrit.wikimedia.org/r/193090 (https://phabricator.wikimedia.org/T59732) (owner: Nemo bis)
[10:00:47] PROBLEM - puppetmaster https on virt1000 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:11:27] PROBLEM - Host mw2053 is DOWN: PING CRITICAL - Packet loss = 100%
[10:12:47] RECOVERY - Host mw2053 is UP: PING WARNING - Packet loss = 44%, RTA = 43.00 ms
[10:15:47] PROBLEM - DPKG on mw2053 is CRITICAL: Connection refused by host
[10:15:48] PROBLEM - puppet last run on mw2053 is CRITICAL: Connection refused by host
[10:16:17] PROBLEM - Disk space on mw2053 is CRITICAL: Connection refused by host
[10:16:17] PROBLEM - nutcracker port on mw2053 is CRITICAL: Connection refused by host
[10:16:27] PROBLEM - configured eth on mw2053 is CRITICAL: Connection refused by host
[10:16:37] <_joe_> that's me, I counted badly and I'm reimaging one server that didn't need it
[10:16:38] PROBLEM - dhclient process on mw2053 is CRITICAL: Connection refused by host
[10:16:38] PROBLEM - nutcracker process on mw2053 is CRITICAL: Connection refused by host
[10:16:38] PROBLEM - salt-minion processes on mw2053 is CRITICAL: Connection refused by host
[10:17:07] PROBLEM - RAID on mw2053 is CRITICAL: Connection refused by host
[10:26:57] RECOVERY - puppetmaster https on virt1000 is OK: HTTP OK: Status line output matched 400 - 335 bytes in 0.683 second response time
[10:28:37] PROBLEM - Host mw2053 is DOWN: PING CRITICAL - Packet loss = 100%
[10:29:16] RECOVERY - Host mw2053 is UP: PING OK - Packet loss = 0%, RTA = 44.02 ms
[10:56:06] PROBLEM - puppet last run on mw1167 is CRITICAL: CRITICAL: Puppet has 1 failures
[11:13:25] RECOVERY - puppet last run on mw1167 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[14:54:07] (CR) MZMcBride: "I think this just needs to get added to some schedule/queue? It can probably be fixed on Monday." [mediawiki-config] - https://gerrit.wikimedia.org/r/198542 (https://phabricator.wikimedia.org/T59732) (owner: Glaisher)
[15:58:03] (CR) JanZerebecki: [C: 1] integration - Enable HSTS max-age=7 days [puppet] - https://gerrit.wikimedia.org/r/198458 (https://phabricator.wikimedia.org/T40516) (owner: Chmarkine)
[16:03:20] (CR) JanZerebecki: [C: 1] ishmael - Enable HSTS max-age=7 days [puppet] - https://gerrit.wikimedia.org/r/198457 (https://phabricator.wikimedia.org/T40516) (owner: Chmarkine)
[16:11:35] !log restarted nova-api on labnet1001 because it was timing out
[16:11:40] Logged the message, Master
[16:15:14] (CR) JanZerebecki: [C: 1] RT - Enable HSTS max-age=7 days [puppet] - https://gerrit.wikimedia.org/r/198455 (https://phabricator.wikimedia.org/T40516) (owner: Chmarkine)
[16:32:50] legoktm: Which wikis have abusefilter entries enabled on irc.wikimedia.org?
[16:33:01] I mean not entries, but log hits
[16:34:14] Bsadowski1: all the wikis by default
[16:34:24] there are a few wikis where it's disabled
[16:52:29] (CR) JanZerebecki: [C: 1] gdash - Enable HSTS max-age=7 days [puppet] - https://gerrit.wikimedia.org/r/198469 (https://phabricator.wikimedia.org/T40516) (owner: Chmarkine)
[16:54:35] PROBLEM - HTTP 5xx req/min on graphite2001 is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[16:54:35] PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[16:55:06] PROBLEM - HTTP error ratio anomaly detection on graphite1001 is CRITICAL: CRITICAL: Anomaly detected: 11 data above and 5 below the confidence bounds
[16:55:06] PROBLEM - HTTP error ratio anomaly detection on graphite2001 is CRITICAL: CRITICAL: Anomaly detected: 11 data above and 5 below the confidence bounds
[16:57:46] PROBLEM - puppet last run on amssq57 is CRITICAL: CRITICAL: puppet fail
[16:59:44] (PS1) Andrew Bogott: Attempt to raise quotas so that we can have more than 500 records. [puppet] - https://gerrit.wikimedia.org/r/198558
[17:03:19] (CR) Andrew Bogott: [C: 2] Attempt to raise quotas so that we can have more than 500 records. [puppet] - https://gerrit.wikimedia.org/r/198558 (owner: Andrew Bogott)
[17:10:26] RECOVERY - HTTP 5xx req/min on graphite2001 is OK: OK: Less than 1.00% above the threshold [250.0]
[17:10:26] RECOVERY - HTTP 5xx req/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[17:17:47] RECOVERY - puppet last run on amssq57 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[18:03:16] (CR) Ori.livneh: "We can avoid having to maintain more inline C code by using the cookie vmod (). It has a forma" [puppet] - https://gerrit.wikimedia.org/r/196009 (https://phabricator.wikimedia.org/T88813) (owner: Nuria)
[18:12:16] PROBLEM - DPKG on labmon1001 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[18:13:37] RECOVERY - DPKG on labmon1001 is OK: All packages OK
[18:29:57] RECOVERY - HTTP error ratio anomaly detection on graphite1001 is OK: OK: No anomaly detected
[18:29:57] RECOVERY - HTTP error ratio anomaly detection on graphite2001 is OK: OK: No anomaly detected
[19:14:01] (PS1) Tim Landscheidt: Tools: Don't let user names mask system aliases [puppet] - https://gerrit.wikimedia.org/r/198563
[19:14:42] (PS1) John F. Lewis: graphite: use use HTTPS by default [puppet] - https://gerrit.wikimedia.org/r/198564
[19:18:14] (CR) Tim Landscheidt: "Tested live on tools-mail." [puppet] - https://gerrit.wikimedia.org/r/198563 (owner: Tim Landscheidt)
[19:39:27] (CR) Giuseppe Lavagetto: [C: -1] "We have internal tools interrogating graphite, which would get redirected to https when a) there is no need for that b) it may just make t" [puppet] - https://gerrit.wikimedia.org/r/198564 (owner: John F. Lewis)
[19:41:38] (PS1) Giuseppe Lavagetto: ganglia: DRY, use hiera [puppet] - https://gerrit.wikimedia.org/r/198566
[19:42:31] (PS1) John F. Lewis: scholorships: use HTTPS by default [puppet] - https://gerrit.wikimedia.org/r/198567
[19:48:23] (CR) John F. Lewis: "@Giuseppe https://gerrit.wikimedia.org/r/#/c/98003/ is the commit which put graphite behind misc." [puppet] - https://gerrit.wikimedia.org/r/198564 (owner: John F. Lewis)
[19:48:36] operations, Tool-Labs: Crontabs broken on toolslabs - https://phabricator.wikimedia.org/T93530#1139416 (Steinsplitter) NEW
[19:55:17] operations, Tool-Labs: Crontabs broken on toolslabs - https://phabricator.wikimedia.org/T93530#1139428 (Steinsplitter)
[19:59:14] YuviPanda: see -labs . looks like filesystem broken on tools.
[19:59:19] hi :-)
[19:59:23] yeah, possibly
[19:59:25] am looking
[20:01:03] (PS1) Tim Landscheidt: Tools: Make "admin" and "administrator" system aliases [puppet] - https://gerrit.wikimedia.org/r/198571
[20:06:22] (CR) Tim Landscheidt: "Tested on Toolsbeta." [puppet] - https://gerrit.wikimedia.org/r/198571 (owner: Tim Landscheidt)
[20:06:38] operations: Decommission svn.wikimedia.org server (import SVN into Phabricator) - https://phabricator.wikimedia.org/T86655#1139440 (valhallasw) >>! In T86655#1089545, @Dzahn wrote: > > https://phabricator.wikimedia.org/diffusion/SVN/ Cool! Did you/could you also import the other repositories (most important...
[20:08:08] operations, Tool-Labs: Crontabs broken on toolslabs - https://phabricator.wikimedia.org/T93530#1139442 (Steinsplitter) Open>Resolved a:Steinsplitter @yuvipanda has reeboted. Back now. Thanks.
[20:08:26] operations, Tool-Labs: Crontabs broken on toolslabs - https://phabricator.wikimedia.org/T93530#1139446 (Steinsplitter) a:Steinsplitter>yuvipanda
[20:23:29] operations, CA-team: secure.wikimedia.org entries still showing up in Google search results - https://phabricator.wikimedia.org/T93531#1139454 (Krenair) NEW
[20:52:05] operations, OTRS, Security: Make OTRS sessions IP-address-agnostic - https://phabricator.wikimedia.org/T87217#1139489 (Aschmidt) This is just to let you know that I can confirm that—for the time being, at least—I have been able to use OTRS with IPv6 disabled in Firefox, or in OS X Mavericks, for that m...
[22:21:26] PROBLEM - puppet last run on elastic1009 is CRITICAL: CRITICAL: Puppet has 1 failures
[22:30:52] (CR) Springle: [C: -2] "Only -2 until the tables are imported and in sync." [puppet] - https://gerrit.wikimedia.org/r/198292 (owner: Hoo man)
[22:38:16] RECOVERY - Graphite Carbon on graphite2001 is OK: OK: All defined Carbon jobs are runnning.
[22:38:26] RECOVERY - puppet last run on elastic1009 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[22:41:44] (PS1) BryanDavis: Monolog: simplify beta configurations [mediawiki-config] - https://gerrit.wikimedia.org/r/198662
[22:42:45] PROBLEM - Graphite Carbon on graphite2001 is CRITICAL: CRITICAL: Not all configured Carbon instances are running.
[22:48:20] !log Deployed patch for T93543
[22:48:29] Logged the message, Master
[23:00:40] (PS1) Tim Landscheidt: Tools: Allow proxy certificate to be manually managed [puppet] - https://gerrit.wikimedia.org/r/198665