[00:09:41] PROBLEM - very high load average likely xfs on ms-be1007 is CRITICAL - load average: 200.54, 119.49, 58.64
[01:03:41] PROBLEM - puppet last run on ms-be1007 is CRITICAL puppet fail
[01:10:20] RECOVERY - puppet last run on ms-be1007 is OK Puppet is currently enabled, last run 16 seconds ago with 0 failures
[01:36:40] PROBLEM - puppet last run on lvs2003 is CRITICAL Puppet has 1 failures
[01:44:40] PROBLEM - swift-account-reaper on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:00] PROBLEM - swift-account-replicator on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:00] PROBLEM - swift-object-server on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:01] PROBLEM - swift-container-updater on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:10] PROBLEM - swift-object-replicator on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:11] PROBLEM - swift-account-server on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:20] PROBLEM - swift-account-auditor on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:20] PROBLEM - puppet last run on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:21] PROBLEM - configured eth on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:21] PROBLEM - swift-object-updater on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:30] PROBLEM - Disk space on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:31] PROBLEM - RAID on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:32] PROBLEM - dhclient process on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:40] PROBLEM - swift-container-auditor on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:41] PROBLEM - swift-container-server on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:45:51] PROBLEM - swift-container-replicator on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:46:01] PROBLEM - salt-minion processes on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:46:02] PROBLEM - DPKG on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:46:10] PROBLEM - swift-object-auditor on ms-be1007 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:51:40] RECOVERY - puppet last run on lvs2003 is OK Puppet is currently enabled, last run 1 second ago with 0 failures
[02:30:20] PROBLEM - puppet last run on mw2011 is CRITICAL puppet fail
[02:30:26] !log l10nupdate Synchronized php-1.26wmf2/cache/l10n: (no message) (duration: 07m 39s)
[02:30:37] Logged the message, Master
[02:31:30] PROBLEM - RAID on snapshot1004 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:33:00] RECOVERY - RAID on snapshot1004 is OK no RAID installed
[02:35:02] !log LocalisationUpdate completed (1.26wmf2) at 2015-04-26 02:33:59+00:00
[02:35:07] Logged the message, Master
[02:40:20] PROBLEM - puppet last run on cp3036 is CRITICAL puppet fail
[02:48:40] RECOVERY - puppet last run on mw2011 is OK Puppet is currently enabled, last run 41 seconds ago with 0 failures
[02:52:00] !log l10nupdate Synchronized php-1.26wmf3/cache/l10n: (no message) (duration: 06m 38s)
[02:52:06] Logged the message, Master
[02:56:03] !log LocalisationUpdate completed (1.26wmf3) at 2015-04-26 02:55:00+00:00
[02:56:08] Logged the message, Master
[02:58:01] PROBLEM - Host mw2027 is DOWN: PING CRITICAL - Packet loss = 100%
[02:58:31] RECOVERY - Host mw2027 is UP: PING OK - Packet loss = 0%, RTA = 44.26 ms
[02:58:32] RECOVERY - puppet last run on cp3036 is OK Puppet is currently enabled, last run 52 seconds ago with 0 failures
[02:59:30] hmm mw2027 did actually reboot. is someone working on it?
[03:05:56] !log mw2027 rebooted unexpectedly, no clues in syslog. afterward i dist-upgraded, including new kernel.
[03:06:01] Logged the message, Master
[03:35:07] (PS1) Ori.livneh: Set max_execution_time in CommonSettings.php [mediawiki-config] - https://gerrit.wikimedia.org/r/206626
[03:35:20] PROBLEM - puppet last run on cp3031 is CRITICAL Puppet has 1 failures
[03:35:21] (CR) Ori.livneh: [C: 2] Set max_execution_time in CommonSettings.php [mediawiki-config] - https://gerrit.wikimedia.org/r/206626 (owner: Ori.livneh)
[03:35:26] (Merged) jenkins-bot: Set max_execution_time in CommonSettings.php [mediawiki-config] - https://gerrit.wikimedia.org/r/206626 (owner: Ori.livneh)
[03:35:53] (this is the same value we set it to in fcgi.ini on friday, but apparently since it's a legacy config var it only works via ini_set()).
[03:36:39] !log ori Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 14s)
[03:36:43] Logged the message, Master
[03:37:13] !log Previous sync-file was for: If296f3d3c: Set max_execution_time in CommonSettings.php
[03:37:16] Logged the message, Master
[03:38:31] PROBLEM - puppet last run on mw2031 is CRITICAL Puppet has 1 failures
[03:44:49] (CR) Ori.livneh: "It actually doesn't apply at all; HHVM just ignores it unless you set it via ini_set(). Follow-up: https://gerrit.wikimedia.org/r/#/c/2066" [puppet] - https://gerrit.wikimedia.org/r/206440 (owner: Ori.livneh)
[03:52:01] RECOVERY - puppet last run on cp3031 is OK Puppet is currently enabled, last run 49 seconds ago with 0 failures
[03:53:31] RECOVERY - puppet last run on mw2031 is OK Puppet is currently enabled, last run 32 seconds ago with 0 failures
[05:29:15] !log LocalisationUpdate ResourceLoader cache refresh completed at Sun Apr 26 05:28:11 UTC 2015 (duration 28m 10s)
[05:29:24] Logged the message, Master
[06:29:01] PROBLEM - puppet last run on cp1056 is CRITICAL Puppet has 1 failures
[06:30:20] PROBLEM - puppet last run on db1018 is CRITICAL Puppet has 1 failures
[06:30:40] PROBLEM - puppet last run on ruthenium is CRITICAL Puppet has 1 failures
[06:30:41] PROBLEM - puppet last run on holmium is CRITICAL Puppet has 3 failures
[06:31:11] PROBLEM - puppet last run on cp3042 is CRITICAL Puppet has 1 failures
[06:32:12] PROBLEM - puppet last run on wtp2015 is CRITICAL Puppet has 1 failures
[06:32:41] PROBLEM - puppet last run on mw2104 is CRITICAL Puppet has 1 failures
[06:34:21] PROBLEM - puppet last run on mw2173 is CRITICAL Puppet has 1 failures
[06:36:01] PROBLEM - puppet last run on mw2045 is CRITICAL Puppet has 2 failures
[06:36:20] PROBLEM - puppet last run on mw2143 is CRITICAL Puppet has 1 failures
[06:36:21] PROBLEM - puppet last run on mw2022 is CRITICAL Puppet has 1 failures
[06:42:29] (CR) 20after4: "?" [puppet] - https://gerrit.wikimedia.org/r/205797 (https://phabricator.wikimedia.org/T548) (owner: 20after4)
[06:45:20] RECOVERY - puppet last run on db1018 is OK Puppet is currently enabled, last run 31 seconds ago with 0 failures
[06:45:31] RECOVERY - puppet last run on ruthenium is OK Puppet is currently enabled, last run 33 seconds ago with 0 failures
[06:46:01] RECOVERY - puppet last run on mw2104 is OK Puppet is currently enabled, last run 22 seconds ago with 0 failures
[06:46:11] RECOVERY - puppet last run on cp3042 is OK Puppet is currently enabled, last run 54 seconds ago with 0 failures
[06:46:21] RECOVERY - puppet last run on mw2143 is OK Puppet is currently enabled, last run 22 seconds ago with 0 failures
[06:46:30] RECOVERY - puppet last run on mw2022 is OK Puppet is currently enabled, last run 9 seconds ago with 0 failures
[06:47:20] RECOVERY - puppet last run on cp1056 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:20] RECOVERY - puppet last run on wtp2015 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:21] RECOVERY - puppet last run on holmium is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:50] RECOVERY - puppet last run on mw2173 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:51] RECOVERY - puppet last run on mw2045 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:00:50] PROBLEM - SSH on ms-be1007 is CRITICAL - Socket timeout after 10 seconds
[07:29:40] PROBLEM - NTP on ms-be1007 is CRITICAL: NTP CRITICAL: No response from NTP server
[07:52:21] PROBLEM - dhclient process on analytics1016 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:52:21] PROBLEM - salt-minion processes on analytics1016 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:52:21] PROBLEM - Hadoop NodeManager on analytics1016 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:52:21] PROBLEM - Hadoop DataNode on analytics1016 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:00] PROBLEM - RAID on analytics1016 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:00] PROBLEM - puppet last run on analytics1016 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:01] PROBLEM - Disk space on analytics1016 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:20] PROBLEM - configured eth on analytics1016 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:20] PROBLEM - DPKG on analytics1016 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:20] PROBLEM - SSH on analytics1016 is CRITICAL - Socket timeout after 10 seconds
[08:02:20] RECOVERY - Hadoop NodeManager on analytics1016 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[08:02:20] RECOVERY - dhclient process on analytics1016 is OK: PROCS OK: 0 processes with command name dhclient
[08:02:21] RECOVERY - Hadoop DataNode on analytics1016 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.hdfs.server.datanode.DataNode
[08:02:21] RECOVERY - salt-minion processes on analytics1016 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/salt-minion
[08:03:00] RECOVERY - RAID on analytics1016 is OK no disks configured for RAID
[08:03:00] RECOVERY - puppet last run on analytics1016 is OK Puppet is currently enabled, last run 28 minutes ago with 0 failures
[08:03:00] RECOVERY - Disk space on analytics1016 is OK: DISK OK
[08:03:11] RECOVERY - DPKG on analytics1016 is OK: All packages OK
[08:03:11] RECOVERY - configured eth on analytics1016 is OK - interfaces up
[08:03:21] RECOVERY - SSH on analytics1016 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2 (protocol 2.0)
[08:07:21] ms-be1007 is in a bad state. it closes my connection when i try to ssh in, and console says this:
[08:07:24] [5423442.451365] BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:2:8093]
[08:07:27] [5423442.458912] Stack:
[08:07:30] [5423442.461360] Call Trace:
[08:07:32] [5423442.464376] Code: 90 90 90 90 90 90 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 74 13 66 0f 1f 84 00 00 00 00 00 f3 90 <0f> b7 07 66 39 d0 75 f6 5d c3 0f 1f 40 00 8b 17 55 31 c0 48 89
[08:07:51] repeatedly for different CPU#s
[08:09:13] i'm gonna reboot it
[08:13:17] <_joe_> yeah
[08:13:20] <_joe_> jgage: hey
[08:13:24] <_joe_> good night :)
[08:13:43] <_joe_> I just got to the computer for some FLOSS work, I can take care of it if you want
[08:14:21] it's doing a fsck right now
[08:14:55] i haven't messed with ms boxes at all, so i might need some assistance if it's unhealthy after it finishes booting
[08:15:08] ah there it goes
[08:15:11] RECOVERY - swift-account-server on ms-be1007 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[08:15:11] RECOVERY - Disk space on ms-be1007 is OK: DISK OK
[08:15:11] RECOVERY - swift-object-updater on ms-be1007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater
[08:15:11] RECOVERY - swift-object-replicator on ms-be1007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator
[08:15:11] RECOVERY - swift-account-auditor on ms-be1007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor
[08:15:11] RECOVERY - configured eth on ms-be1007 is OK - interfaces up
[08:15:20] RECOVERY - SSH on ms-be1007 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)
[08:15:21] RECOVERY - dhclient process on ms-be1007 is OK: PROCS OK: 0 processes with command name dhclient
[08:15:21] RECOVERY - RAID on ms-be1007 is OK optimal, 14 logical, 14 physical
[08:15:32] RECOVERY - swift-container-auditor on ms-be1007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[08:15:41] RECOVERY - swift-container-server on ms-be1007 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[08:15:41] RECOVERY - swift-container-replicator on ms-be1007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator
[08:15:50] RECOVERY - very high load average likely xfs on ms-be1007 is OK - load average: 8.33, 2.45, 0.87
[08:15:50] RECOVERY - swift-object-auditor on ms-be1007 is OK: PROCS OK: 3 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor
[08:15:51] RECOVERY - salt-minion processes on ms-be1007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[08:15:51] RECOVERY - DPKG on ms-be1007 is OK: All packages OK
[08:16:12] RECOVERY - NTP on ms-be1007 is OK: NTP OK: Offset -0.1062968969 secs
[08:16:12] RECOVERY - swift-account-replicator on ms-be1007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator
[08:16:20] RECOVERY - swift-object-server on ms-be1007 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server
[08:16:20] RECOVERY - swift-container-updater on ms-be1007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater
[08:16:20] RECOVERY - swift-account-reaper on ms-be1007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper
[08:16:24] !log ms-be1007 was unresponsive for ~6 hours, "soft lockup" output on console. rebooted.
[08:16:27] Logged the message, Master
[08:16:50] RECOVERY - puppet last run on ms-be1007 is OK Puppet is currently enabled, last run 5 seconds ago with 0 failures
[08:21:44] ok i've dist-upgraded that box, but i didn't reboot for new kernel. now i'm going to bed. l8r _joe_ :)
[08:29:51] PROBLEM - puppet last run on mw2192 is CRITICAL puppet fail
[08:49:51] RECOVERY - puppet last run on mw2192 is OK Puppet is currently enabled, last run 48 seconds ago with 0 failures
[10:07:53] robh: you there?
[10:08:24] <_joe_> Steinsplitter: I don't think so, it's the weekend and robh is on PDT :)
[10:08:33] <_joe_> Steinsplitter: any urgent issue with the sites?
[10:09:23] _joe_: some users said on irc that commons is sometimes slow today from eu.
[10:10:39] <_joe_> Steinsplitter: I'm in Europe and it's definitely not slow for me when I read
[10:11:02] <_joe_> where are the people that report this? I may try to extract more info
[10:11:30] <_joe_> I mean in which channel
[10:11:55] _joe_: https://commons.wikimedia.org/w/index.php?title=Special%3AListFiles&user=Lantus&ilshowall=1 it loads 4 ever
[10:12:29] <_joe_> is this specific to the EU?
[10:12:36] <_joe_> it took 4 seconds to load for me
[10:12:43] <_joe_> which is definitely a lot
[10:12:47] krd said it to me in pm... and i can confirm it.
[10:12:54] 23 seconds
[10:12:57] <_joe_> but not something I'll wake people up upon
[10:13:18] <_joe_> 2 seconds for me
[10:14:01] <_joe_> Can you look at the timeline in your browser console and see what takes so long to load?
[10:15:49] mainly the GET
[10:15:49] https://commons.wikimedia.org *
[10:16:30] <_joe_> can I ask you to do a traceroute to 91.198.174.192 (text-lb.esams.wikimedia.org) and paste the result somewhere?
[10:16:57] <_joe_> also, is this just for commons or for dewiki or enwiki too?
[10:21:32] _joe_: http://pastebin.com/gycFu5HC , commons
[10:23:27] <_joe_> uhm, quite some latency, but nothing that explains what you see
[10:23:57] <_joe_> Steinsplitter: can you try upload-lb.esams.wikimedia.org?
[10:26:47] <_joe_> Steinsplitter: also, are the other EU users having issues all in the same country and/or ISP?
[10:27:12] <_joe_> sorry to ask so many questions, but I have to fetch as much as possible as I can't reproduce the issue
[10:28:11] _joe_: http://pastebin.com/sZCSAqU4 - maybe it is just one mw* which is slow?
[10:28:32] <_joe_> Steinsplitter: definitely not
[10:29:06] <_joe_> Steinsplitter: is dewiki fast for you? if you load a large page I mean
[10:29:35] yes
[10:30:52] <_joe_> which doesn't make sense to me, as I am trying very hard to find a page that is slow on commons for me, but you access dewiki and commons from the same network connection to the same set of IPs
[10:33:16] okay, thanks for looking _joe_
[10:34:29] <_joe_> Steinsplitter: I am trying to find any plausible cause, but if more users report the problem, please let me know and I'll page someone to look at network/peering issues.
[10:34:50] <_joe_> my guess is there is some issue along the route from your ISPs to esams
[10:38:51] PROBLEM - puppet last run on db2060 is CRITICAL puppet fail
[10:57:02] RECOVERY - puppet last run on db2060 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[12:07:12] PROBLEM - Debian mirror in sync with upstream on carbon is CRITICAL: /srv/mirrors/debian is over 12 hours old.
[12:11:50] operations, Deployment-Systems: [Trebuchet] Salt times out on parsoid restarts - https://phabricator.wikimedia.org/T63882#1236601 (Aklapper) a:RyanLane>None
[12:15:24] RECOVERY - Debian mirror in sync with upstream on carbon is OK: /srv/mirrors/debian is over 0 hours old.
[13:00:31] PROBLEM - puppet last run on cp3036 is CRITICAL puppet fail
[13:03:09] operations, Architecture, MediaWiki-RfCs, RESTBase, and 5 others: RFC: Don't retry 503 unless allowed by Retry-After in Varnish - https://phabricator.wikimedia.org/T97206#1236770 (BBlack)
[13:18:41] RECOVERY - puppet last run on cp3036 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[14:41:51] operations, Architecture, MediaWiki-RfCs, RESTBase, and 4 others: RFC: Request timeouts and retries - https://phabricator.wikimedia.org/T97204#1236818 (BBlack) I don't actually think `Retry-After` has the right semantics here. I think it applies globally to the whole service, and thus if a server...
[14:46:52] operations, Architecture, MediaWiki-RfCs, RESTBase, and 5 others: RFC: Don't retry 503 unless allowed by Retry-After in Varnish - https://phabricator.wikimedia.org/T97206#1236822 (BBlack) a:BBlack Re: `Retry-After`: I don't think we can (or want to) do anything with a backend's timeout value he...
[14:48:52] operations, Architecture, MediaWiki-RfCs, RESTBase, and 5 others: RFC: Re-evaluate varnish-level request-restart behavior on 5xx - https://phabricator.wikimedia.org/T97206#1236828 (BBlack)
[14:55:49] operations: Fix all .erb variable warnings - https://phabricator.wikimedia.org/T97251#1236838 (Andrew) NEW a:Andrew
[17:24:58] (CR) Dereckson: [C: 1] Add draft namespace for fawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/204467 (https://phabricator.wikimedia.org/T92760) (owner: Mjbmr)
[17:31:19] I wonder why people are so eager to copy a failed experiment. https://meta.wikimedia.org/wiki/Research:Wikipedia_article_creation
[18:07:59] (PS1) Dereckson: Removed legacy groups on fr.wikinews [mediawiki-config] - https://gerrit.wikimedia.org/r/206647 (https://phabricator.wikimedia.org/T90979)
[18:17:17] (PS1) Dereckson: Set $wgEnotifMinorEdits to true on wikimania2016.wikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/206648
[18:17:50] (PS2) Dereckson: Set $wgEnotifMinorEdits to true on wikimania2016.wikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/206648 (https://phabricator.wikimedia.org/T96564)
[18:26:55] (CR) Dereckson: [C: -1] "Still pending discussion." [mediawiki-config] - https://gerrit.wikimedia.org/r/200216 (https://phabricator.wikimedia.org/T94214) (owner: Cenarium)
[18:59:11] (PS1) Dereckson: Fixed whitespace issues in flaggedrevs.php [mediawiki-config] - https://gerrit.wikimedia.org/r/206650
[19:01:40] (PS4) Dereckson: Give patrol to reviewers for testwiki/enwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/199321 (https://phabricator.wikimedia.org/T93798) (owner: Cenarium)
[19:02:20] (PS5) Dereckson: Give patrol to reviewers for testwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/199321 (https://phabricator.wikimedia.org/T93798) (owner: Cenarium)
[19:03:40] (CR) Dereckson: "PS4: rebased against change I09efd1c8920583" [mediawiki-config] - https://gerrit.wikimedia.org/r/199321 (https://phabricator.wikimedia.org/T93798) (owner: Cenarium)
[19:09:26] operations, Beta-Cluster: [OPS] exim config points to mchenry.wmflabs.org - https://phabricator.wikimedia.org/T38996#1237019 (hashar)
[19:30:02] (Abandoned) Cenarium: Add abusefilter group for testwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/200216 (https://phabricator.wikimedia.org/T94214) (owner: Cenarium)
[19:42:51] PROBLEM - puppet last run on cp3043 is CRITICAL puppet fail
[19:47:01] PROBLEM - puppet last run on dbstore2001 is CRITICAL puppet fail
[19:56:01] PROBLEM - puppet last run on cp3019 is CRITICAL puppet fail
[19:59:22] RECOVERY - puppet last run on cp3043 is OK Puppet is currently enabled, last run 44 seconds ago with 0 failures
[20:03:41] RECOVERY - puppet last run on dbstore2001 is OK Puppet is currently enabled, last run 1 second ago with 0 failures
[20:14:21] RECOVERY - puppet last run on cp3019 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[20:22:43] (PS3) Dereckson: Set up Babel categories for hu.wikiquote [mediawiki-config] - https://gerrit.wikimedia.org/r/203783 (https://phabricator.wikimedia.org/T94842)
[20:27:52] (CR) Alex Monk: Set up Babel categories for hu.wikiquote [mediawiki-config] - https://gerrit.wikimedia.org/r/203783 (https://phabricator.wikimedia.org/T94842) (owner: Dereckson)
[20:36:06] (CR) Dereckson: [C: -1] Modify AbuseFilter block configuration on eswikibooks (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/206510 (https://phabricator.wikimedia.org/T96669) (owner: Glaisher)
[20:45:45] (PS1) Dereckson: Content namespaces on fr.wiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/206719 (https://phabricator.wikimedia.org/T97228)
[20:56:26] (PS1) Andrew Bogott: Use @resolver instead of resolver. [puppet] - https://gerrit.wikimedia.org/r/206720
[20:56:28] (PS1) Andrew Bogott: Use @certname instead of certname in .erb [puppet] - https://gerrit.wikimedia.org/r/206721
[20:56:30] (PS1) Andrew Bogott: @qualify a few variables. [puppet] - https://gerrit.wikimedia.org/r/206722
[20:56:32] (PS1) Andrew Bogott: @qualify some .erb variable references. [puppet] - https://gerrit.wikimedia.org/r/206723
[21:08:46] (PS1) Faidon Liambotis: Depool esams, planned upsteam network maintenance [dns] - https://gerrit.wikimedia.org/r/206724
[22:12:01] (PS1) Dereckson: Added unitedarchives.noip.me to $wgCopyUploadsDomains [mediawiki-config] - https://gerrit.wikimedia.org/r/206727 (https://phabricator.wikimedia.org/T96664)
[22:27:10] (PS6) Merlijn van Deen: Extend Exim diamond collector for Tool Labs [puppet] - https://gerrit.wikimedia.org/r/206118
[22:27:50] (CR) jenkins-bot: [V: -1] Extend Exim diamond collector for Tool Labs [puppet] - https://gerrit.wikimedia.org/r/206118 (owner: Merlijn van Deen)
[22:29:51] (PS7) Merlijn van Deen: Extend Exim diamond collector for Tool Labs [puppet] - https://gerrit.wikimedia.org/r/206118
[22:33:50] (CR) Dereckson: [C: 1] Add abusefilter-modify-restricted right to sysop user group for idwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/206080 (https://phabricator.wikimedia.org/T96542) (owner: Mjbmr)
[22:34:50] PROBLEM - puppet last run on mw2155 is CRITICAL puppet fail
[22:40:04] (CR) Merlijn van Deen: "Okay, it took me all evening, but this is now working correctly on toolsbeta-mail. Of course, there's a distinct lack of queued mails ther" [puppet] - https://gerrit.wikimedia.org/r/206118 (owner: Merlijn van Deen)
[22:49:12] (PS1) Dereckson: Import sources configuration on mr.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/206731 (https://phabricator.wikimedia.org/T96807)
[22:53:12] RECOVERY - puppet last run on mw2155 is OK Puppet is currently enabled, last run 34 seconds ago with 0 failures
[23:01:42] (CR) Dereckson: "Must be rebased." [mediawiki-config] - https://gerrit.wikimedia.org/r/203627 (https://phabricator.wikimedia.org/T15712) (owner: devunt)
[23:05:31] operations, RESTBase, Traffic, VisualEditor, Performance: Set up an API base path for REST and action APIs - https://phabricator.wikimedia.org/T95229#1237227 (BBlack)
[23:05:41] (PS7) Dereckson: Add Josa extension to ko.wikipedia.beta.wmflabs.org [mediawiki-config] - https://gerrit.wikimedia.org/r/203627 (https://phabricator.wikimedia.org/T15712) (owner: devunt)
[23:05:46] (CR) jenkins-bot: [V: -1] Add Josa extension to ko.wikipedia.beta.wmflabs.org [mediawiki-config] - https://gerrit.wikimedia.org/r/203627 (https://phabricator.wikimedia.org/T15712) (owner: devunt)
[23:06:48] operations, Traffic, HTTPS, HTTPS-by-default: Switch to ECDSA hybrid certificates - https://phabricator.wikimedia.org/T86654#1237228 (BBlack) Open>stalled
[23:06:55] (PS8) Dereckson: Add Josa extension to ko.wikipedia.beta.wmflabs.org [mediawiki-config] - https://gerrit.wikimedia.org/r/203627 (https://phabricator.wikimedia.org/T15712) (owner: devunt)
[23:07:10] operations, Traffic: Support ALPN + HTTP/2 - https://phabricator.wikimedia.org/T96848#1237229 (BBlack) Open>stalled
[23:07:31] operations, Traffic: Support ALPN + HTTP/2 - https://phabricator.wikimedia.org/T96848#1227458 (BBlack)
[23:07:33] operations, Traffic: Evaluate limited caching inside nginx - https://phabricator.wikimedia.org/T96851#1237233 (BBlack)
[23:07:35] operations, Traffic: Package/backport openssl 1.0.2 + nginx 1.7.x or higher - https://phabricator.wikimedia.org/T96850#1237231 (BBlack) Open>stalled
[23:07:43] (CR) Dereckson: [C: 1] "Ready to deploy." [mediawiki-config] - https://gerrit.wikimedia.org/r/203627 (https://phabricator.wikimedia.org/T15712) (owner: devunt)
[23:16:16] operations, Traffic: Evaluate limited caching inside nginx - https://phabricator.wikimedia.org/T96851#1237251 (BBlack) Open>declined a:BBlack Actually this isn't worth thinking about at present, because (a) we'd lose analytics for the cache hits within nginx unless we did a bunch of work to get t...
[23:30:20] (CR) Faidon Liambotis: [C: 2] Depool esams, planned upsteam network maintenance [dns] - https://gerrit.wikimedia.org/r/206724 (owner: Faidon Liambotis)
[23:31:08] !log draining esams for planned upstream network maintenance (00:00-04:00 UTC)
[23:31:15] Logged the message, Master
[23:40:18] (PS5) Dereckson: Enable Extension:Shorturl on sa. projects [mediawiki-config] - https://gerrit.wikimedia.org/r/201216 (https://phabricator.wikimedia.org/T94660) (owner: Shanmugamp7)
[23:40:29] (CR) Dereckson: [C: 1] Enable Extension:Shorturl on sa. projects [mediawiki-config] - https://gerrit.wikimedia.org/r/201216 (https://phabricator.wikimedia.org/T94660) (owner: Shanmugamp7)
[23:53:22] (PS1) Dereckson: Enable NewUserMessage on ne.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/206733 (https://phabricator.wikimedia.org/T96823)
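Editor's note: the PROBLEM/RECOVERY pairs in this log can be matched up to estimate how long each check stayed critical (for instance, the SSH check on ms-be1007). The following is only a rough sketch, not anything used in the channel: the regular expression is an assumption about the alert-line shape seen above, the names `ALERT` and `outage_durations` are hypothetical, and the code assumes all timestamps fall within the same day.

```python
import re
from datetime import datetime, timedelta

# Assumed shape of the monitoring-bot lines in this log:
# "[HH:MM:SS] PROBLEM|RECOVERY - <check> on <host> is CRITICAL|OK ..."
ALERT = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\] (PROBLEM|RECOVERY) - (.+?) on (\S+) is (CRITICAL|OK)"
)


def outage_durations(lines):
    """Pair each PROBLEM with the next RECOVERY for the same (check, host)."""
    open_alerts = {}   # (check, host) -> time the first PROBLEM was seen
    durations = []     # (check, host, timedelta) in recovery order
    for line in lines:
        match = ALERT.search(line)
        if not match:
            continue  # chat messages, !log entries, Gerrit/Phabricator lines
        stamp, kind, check, host, _state = match.groups()
        when = datetime.strptime(stamp, "%H:%M:%S")
        key = (check, host)
        if kind == "PROBLEM":
            # Keep the earliest PROBLEM if the check alerts more than once.
            open_alerts.setdefault(key, when)
        elif key in open_alerts:
            durations.append((check, host, when - open_alerts.pop(key)))
    return durations


sample = [
    "[07:00:50] PROBLEM - SSH on ms-be1007 is CRITICAL - Socket timeout after 10 seconds",
    "[08:15:20] RECOVERY - SSH on ms-be1007 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)",
]
for check, host, delta in outage_durations(sample):
    print(check, host, delta)
```

Host-level alerts such as "Host mw2027 is DOWN" do not match this pattern and are skipped; a log that crosses midnight would also need the date added to each timestamp.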