[00:00:04] PROBLEM - Puppet freshness on search24 is CRITICAL: No successful Puppet run in the last 10 hours
[00:00:18] (PS5) Legoktm: Enable MassMessage on all wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91344
[00:00:51] that was weird.
[00:01:39] what?
[00:02:35] PS4 had a random test failure
[00:02:48] I edited the commit message to retrigger the +2 tests, which passed.
[00:03:54] it did not merge?
[00:04:04] jenkins seems to be having this kind of issue lately
[00:04:11] (no, i did not report it)
[00:04:33] no
[00:04:34] it was
[00:04:35] https://integration.wikimedia.org/ci/job/operations-mw-config-tests/5125/console : FAILURE in 10s
[00:15:51] (CR) Ori.livneh: [C: 2] Apply misc::monitoring::view::mobile on nickel [operations/puppet] - https://gerrit.wikimedia.org/r/92035 (owner: Ori.livneh)
[00:46:04] RECOVERY - Puppet freshness on search30 is OK: puppet ran at Sat Oct 26 00:46:00 UTC 2013
[00:46:34] PROBLEM - Puppet freshness on search30 is CRITICAL: No successful Puppet run in the last 10 hours
[00:53:04] RECOVERY - Puppet freshness on search32 is OK: puppet ran at Sat Oct 26 00:53:02 UTC 2013
[00:53:04] RECOVERY - Puppet freshness on search27 is OK: puppet ran at Sat Oct 26 00:53:02 UTC 2013
[00:53:24] PROBLEM - Puppet freshness on search27 is CRITICAL: No successful Puppet run in the last 10 hours
[00:53:44] PROBLEM - Puppet freshness on search32 is CRITICAL: No successful Puppet run in the last 10 hours
[00:55:54] RECOVERY - Puppet freshness on search23 is OK: puppet ran at Sat Oct 26 00:55:44 UTC 2013
[00:55:54] PROBLEM - Puppet freshness on search23 is CRITICAL: No successful Puppet run in the last 10 hours
[00:58:54] RECOVERY - Puppet freshness on search26 is OK: puppet ran at Sat Oct 26 00:58:46 UTC 2013
[00:59:14] PROBLEM - Puppet freshness on search26 is CRITICAL: No successful Puppet run in the last 10 hours
[00:59:54] RECOVERY - Puppet freshness on search24 is OK: puppet ran at Sat Oct 26 00:59:47 UTC 2013
[01:00:04] PROBLEM - Puppet freshness on search24 is CRITICAL: No successful Puppet run in the last 10 hours
[01:45:54] RECOVERY - Puppet freshness on search30 is OK: puppet ran at Sat Oct 26 01:45:50 UTC 2013
[01:46:34] PROBLEM - Puppet freshness on search30 is CRITICAL: No successful Puppet run in the last 10 hours
[01:53:04] RECOVERY - Puppet freshness on search32 is OK: puppet ran at Sat Oct 26 01:53:02 UTC 2013
[01:53:04] RECOVERY - Puppet freshness on search27 is OK: puppet ran at Sat Oct 26 01:53:02 UTC 2013
[01:53:24] PROBLEM - Puppet freshness on search27 is CRITICAL: No successful Puppet run in the last 10 hours
[01:53:44] PROBLEM - Puppet freshness on search32 is CRITICAL: No successful Puppet run in the last 10 hours
[01:55:54] RECOVERY - Puppet freshness on search23 is OK: puppet ran at Sat Oct 26 01:55:48 UTC 2013
[01:56:54] PROBLEM - Puppet freshness on search23 is CRITICAL: No successful Puppet run in the last 10 hours
[01:58:54] RECOVERY - Puppet freshness on search26 is OK: puppet ran at Sat Oct 26 01:58:49 UTC 2013
[01:59:14] PROBLEM - Puppet freshness on search26 is CRITICAL: No successful Puppet run in the last 10 hours
[01:59:55] RECOVERY - Puppet freshness on search24 is OK: puppet ran at Sat Oct 26 01:59:49 UTC 2013
[02:00:04] PROBLEM - Puppet freshness on search24 is CRITICAL: No successful Puppet run in the last 10 hours
[02:04:55] someone could turn off active checks on that service for those hosts... just sayin
[02:05:13] someone is not me at 5 am, I dunno why I am even typing in this channel...
[02:08:20] !log LocalisationUpdate completed (1.22wmf22) at Sat Oct 26 02:08:20 UTC 2013
[02:08:40] Logged the message, Master
[02:16:30] !log LocalisationUpdate completed (1.23wmf1) at Sat Oct 26 02:16:30 UTC 2013
[02:16:47] Logged the message, Master
[02:32:41] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Oct 26 02:32:40 UTC 2013
[02:32:55] Logged the message, Master
[02:45:55] RECOVERY - Puppet freshness on search30 is OK: puppet ran at Sat Oct 26 02:45:51 UTC 2013
[02:46:34] PROBLEM - Puppet freshness on search30 is CRITICAL: No successful Puppet run in the last 10 hours
[02:52:54] RECOVERY - Puppet freshness on search32 is OK: puppet ran at Sat Oct 26 02:52:44 UTC 2013
[02:53:04] RECOVERY - Puppet freshness on search27 is OK: puppet ran at Sat Oct 26 02:52:54 UTC 2013
[02:53:24] PROBLEM - Puppet freshness on search27 is CRITICAL: No successful Puppet run in the last 10 hours
[02:53:44] PROBLEM - Puppet freshness on search32 is CRITICAL: No successful Puppet run in the last 10 hours
[02:55:54] RECOVERY - Puppet freshness on search23 is OK: puppet ran at Sat Oct 26 02:55:45 UTC 2013
[02:55:54] PROBLEM - Puppet freshness on search23 is CRITICAL: No successful Puppet run in the last 10 hours
[02:59:04] RECOVERY - Puppet freshness on search26 is OK: puppet ran at Sat Oct 26 02:58:56 UTC 2013
[02:59:14] PROBLEM - Puppet freshness on search26 is CRITICAL: No successful Puppet run in the last 10 hours
[02:59:54] RECOVERY - Puppet freshness on search24 is OK: puppet ran at Sat Oct 26 02:59:46 UTC 2013
[03:00:04] PROBLEM - Puppet freshness on search24 is CRITICAL: No successful Puppet run in the last 10 hours
[03:09:04] PROBLEM - SSH on lvs4001 is CRITICAL: Server answer:
[03:13:04] RECOVERY - SSH on lvs4001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[03:46:04] RECOVERY - Puppet freshness on search30 is OK: puppet ran at Sat Oct 26 03:45:55 UTC 2013
[03:46:34] PROBLEM - Puppet freshness on search30 is CRITICAL: No successful Puppet run in the last 10 hours
[03:53:14] RECOVERY - Puppet freshness on search32 is OK: puppet ran at Sat Oct 26 03:53:08 UTC 2013
[03:53:14] RECOVERY - Puppet freshness on search27 is OK: puppet ran at Sat Oct 26 03:53:08 UTC 2013
[03:53:24] PROBLEM - Puppet freshness on search27 is CRITICAL: No successful Puppet run in the last 10 hours
[03:53:44] PROBLEM - Puppet freshness on search32 is CRITICAL: No successful Puppet run in the last 10 hours
[03:55:54] RECOVERY - Puppet freshness on search23 is OK: puppet ran at Sat Oct 26 03:55:48 UTC 2013
[03:56:54] PROBLEM - Puppet freshness on search23 is CRITICAL: No successful Puppet run in the last 10 hours
[03:58:54] RECOVERY - Puppet freshness on search26 is OK: puppet ran at Sat Oct 26 03:58:45 UTC 2013
[03:59:14] PROBLEM - Puppet freshness on search26 is CRITICAL: No successful Puppet run in the last 10 hours
[03:59:54] RECOVERY - Puppet freshness on search24 is OK: puppet ran at Sat Oct 26 03:59:45 UTC 2013
[04:00:04] PROBLEM - Puppet freshness on search24 is CRITICAL: No successful Puppet run in the last 10 hours
[04:06:38] fine, I just turned them off... going back to bed
[04:06:40] :-P
[04:45:55] RECOVERY - Puppet freshness on search30 is OK: puppet ran at Sat Oct 26 04:45:52 UTC 2013
[04:52:54] RECOVERY - Puppet freshness on search32 is OK: puppet ran at Sat Oct 26 04:52:45 UTC 2013
[04:53:04] RECOVERY - Puppet freshness on search27 is OK: puppet ran at Sat Oct 26 04:52:56 UTC 2013
[04:55:54] RECOVERY - Puppet freshness on search23 is OK: puppet ran at Sat Oct 26 04:55:47 UTC 2013
[04:58:54] RECOVERY - Puppet freshness on search26 is OK: puppet ran at Sat Oct 26 04:58:43 UTC 2013
[04:59:54] RECOVERY - Puppet freshness on search24 is OK: puppet ran at Sat Oct 26 04:59:44 UTC 2013
[07:06:34] (CR) Odder: [C: 1] Added filemover user group to bnwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91886 (owner: Vogone)
[07:33:06] (PS1) Odder: (bug 56198) Additional user right for sysops on eswikivoyage [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/92055
[07:39:14] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:40:04] RECOVERY - RAID on searchidx1001 is OK: OK: State is Optimal, checked 1 logical drive(s), 4 physical drive(s)
[12:13:14] PROBLEM - RAID on searchidx2 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:17:14] RECOVERY - RAID on searchidx2 is OK: OK: State is Optimal, checked 1 logical drive(s), 4 physical drive(s)
[12:30:24] PROBLEM - Host mw123 is DOWN: PING CRITICAL - Packet loss = 100%
[12:31:04] RECOVERY - Host mw123 is UP: PING OK - Packet loss = 0%, RTA = 26.58 ms
[14:25:50] (PS1) Vogone: Various changes to wikidatawiki's user rights configuration [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/92070
[14:30:38] (CR) John F. Lewis: [C: 1] Various changes to wikidatawiki's user rights configuration [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/92070 (owner: Vogone)
[16:26:44] PROBLEM - Host mw1085 is DOWN: PING CRITICAL - Packet loss = 100%
[16:28:04] RECOVERY - Host mw1085 is UP: PING OK - Packet loss = 0%, RTA = 0.28 ms
[17:31:34] PROBLEM - MySQL Idle Transactions on db1039 is CRITICAL: CRIT longest blocking idle transaction sleeps for 618 seconds
[17:32:04] PROBLEM - MySQL InnoDB on db1039 is CRITICAL: CRIT longest blocking idle transaction sleeps for 645 seconds
[17:32:34] RECOVERY - MySQL Idle Transactions on db1039 is OK: OK longest blocking idle transaction sleeps for 0 seconds
[17:33:04] RECOVERY - MySQL InnoDB on db1039 is OK: OK longest blocking idle transaction sleeps for 0 seconds
[17:37:14] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:39:04] RECOVERY - RAID on searchidx1001 is OK: OK: State is Optimal, checked 1 logical drive(s), 4 physical drive(s)
[19:01:25] (PS1) Matanya: site.pp: removed apache-utils [operations/puppet] - https://gerrit.wikimedia.org/r/92079
[19:19:10] paravoid: https://graphite.wikimedia.org/render/?title=Top%208%20FileBackend%20Methods%20by%20Max%2090th%20Percentile%20Time%20%28ms%29%20log%282%29%20-8hours&from=-8hours&width=1024&height=500&until=now&areaMode=none&hideLegend=false&logBase=2&lineWidth=1&lineMode=connected&target=cactiStyle%28substr%28highestMax%28FileBackendStore.*.tp90,8%29,0,2%29%29
[19:19:37] \o/
[19:26:14] gah, 1 day graph is better
[19:27:04] how so?
[19:27:06] https://graphite.wikimedia.org/render/?title=Top%208%20FileBackend%20Methods%20by%20Max%2090th%20Percentile%20Time%20%28ms%29%20log%282%29%20-8hours&from=-24hours&width=1024&height=500&until=now&areaMode=none&hideLegend=false&logBase=2&lineWidth=1&lineMode=connected&target=cactiStyle%28substr%28highestMax%28FileBackendStore.*.tp90,8%29,0,2%29%29 ?
[19:27:37] or, to fix the title too: https://graphite.wikimedia.org/render/?title=Top%208%20FileBackend%20Methods%20by%20Max%2090th%20Percentile%20Time%20%28ms%29%20log%282%29%20-24hours&from=-24hours&width=1024&height=500&until=now&areaMode=none&hideLegend=false&logBase=2&lineWidth=1&lineMode=connected&target=cactiStyle%28substr%28highestMax%28FileBackendStore.*.tp90,8%29,0,2%29%29
[19:28:36] Aaron|home: hrm? how?
[19:33:24] paravoid: https://graphite.wikimedia.org/render/?width=588&height=309&_salt=1382815889.707&target=LocalFile.purgeThumbnails.tavg&from=-3days and https://wikitech.wikimedia.org/wiki/Server_admin_log
[19:34:23] * Aaron|home synced at about 23:44 25th, almost 12AM 26th
[19:38:22] grumble, submodules
[19:38:27] https://gerrit.wikimedia.org/r/#/c/92033/ is completely useless :)
[19:39:18] congrats in any case
[19:39:22] very impressive
[19:57:41] paravoid: hi, is there any machine except mchenry hardy?
[19:59:00] i was thinking you could answer that with salt but... i wonder if such an old box runs salt?
[20:04:35] matanya: yes
[20:04:41] why?
[20:05:05] I'd prefer not to list all of our boxes running an EOL distro into one line :)
[20:05:23] which ones? I was looking in some puppet manifests, and they had weird stuff for hardy. I thought of removing those if the machine was upgraded
[20:05:40] of course, paravoid :)
[20:51:44] PROBLEM - Host mw1085 is DOWN: PING CRITICAL - Packet loss = 100%
[20:52:24] RECOVERY - Host mw1085 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms
[23:58:24] graphite is so awful to configure
[23:58:52] hah
[23:58:52] it has more config vars than mediawiki
[23:59:05] is this for vagrant?
[23:59:13] no, prod
[23:59:23] i'm developing it in vagrant, but the idea is to then apply it to prod
[23:59:29] right
[23:59:39] turn everything of
[23:59:41] off
[23:59:42] or no
[23:59:44] *on
[23:59:46] DAMN IT
[23:59:48] it'll be fine
[23:59:52] * ori-l comforts Reedy.