[00:35:23] hmmm [00:37:19] PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0] [00:37:39] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] [00:39:09] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [1000.0] [00:39:18] Getting a lot of 503's and broken stylesheet loading [00:40:15] Doesn't look good https://grafana.wikimedia.org/dashboard/file/varnish-http-errors.json?refresh=5m&orgId=1&from=now-1h&to=now [00:41:19] PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [1000.0] [00:42:19] PROBLEM - Codfw HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] [00:42:39] PROBLEM - Eqiad HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [1000.0] [00:43:58] moritzm, are you here? [00:49:37] seems steady again [00:52:16] Not for me. dewiki down for me for at least half an hour. [00:54:19] RECOVERY - Ulsfo HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [00:54:22] most likely that's unrelated [00:54:39] RECOVERY - Eqiad HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [00:54:54] I've looked at these recent 5xx spikes, they're fairly isolated and overall they're low-rate vs all traffic. it's not a "site down" sort of thing [00:55:20] but I can't find a good explanation yet, either [00:55:39] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [00:56:09] RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [00:56:15] they peak around 1.6% of requests returning 5xx [00:56:34] the spike duration has gotten a little wider each time, too [00:57:19] RECOVERY - Codfw HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [00:57:27] the first one that looks like the current pattern was about 3 minutes wide, then there was a ~6 minute one, and the latest is closer to ~11 minutes [00:57:43] https://grafana.wikimedia.org/dashboard/db/varnish-aggregate-client-status-codes?panelId=3&fullscreen&orgId=1&var-site=All&var-cache_type=text&var-status_type=5&from=1494798744351&to=1494809747081 [00:57:48] ^ the 3x spikes there [00:57:49] PROBLEM - puppet last run on ms-be1027 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [01:00:45] Anyway, working again here since about two minutes ago. [01:01:43] during the spikes, varnish stats show a dip in backend request rate (to e.g. MW), and a rise in total backend connections [01:01:58] that's the sort of pattern we'd expect if requests to MW (or another applayer backend) are stalling out and not answering quickly [01:05:09] mw fatalmonitor counts actually drop lower during the event [01:07:28] but one of the few fatals that does occur in that window is: Fatal error: request has exceeded memory limit in /srv/mediawiki/php-1.30.0-wmf.1/extensions/Echo/includes/DiscussionParser.php on line 641 [01:09:22] no such ooms in the earlier spike time ranges, though, so maybe that one's just a fluke [01:09:58] (I'm back to dewiki down, btw, but if you say it's unrelated, I'll be silent from now ;).) 
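The stall pattern bblack describes (backend request rate dipping while total backend connections rise) can be watched directly on a cache host; a minimal sketch assuming standard Varnish 4 tooling, with counter names that may vary by version:

    # Backend connection counters on a cache host (Varnish 4 names)
    varnishstat -1 -f MAIN.backend_conn -f MAIN.backend_busy -f MAIN.backend_fail
    # Show only transactions that produced a 5xx, grouped per request
    varnishlog -g request -q 'RespStatus >= 500'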
[01:12:16] pajz: you might try anonymous browsing (e.g. using chrome incognito), in case it's related to session/login-specific things [01:12:43] dewiki is definitely up for the bulk of traffic (which is anonymous) [01:13:15] (or it could be some local network condition, of course) [01:15:44] fwiw there doesn't seem to be a very specific pattern to the failing URLs. it hits API reqs, /wiki/Foo reqs, etc. It does seem to be MW reqs (as opposed to e.g. restbase or cxserver) [01:17:22] it's also not specific to certain varnishes afaics, spreads all around [01:17:51] well, frontends anyways... [01:18:51] now I'm getting somewhere though... it does seem to focus through cp1053.eqiad.wmnet as the backend-most varnish in the 5xx's [01:19:08] (which could mean that cache has a problem, but could also mean it's just the chash destination of some problematic traffic too) [01:19:10] bblack, thanks. Possible. Incognito/Switching browser doesn't help, but using a VPN does. So surely could be a local issue; curious timing, though (haven't run into issues for ages, and it's affecting only Wikipedia as far as I can tell). [01:22:50] [Mon May 15 00:36:48 2017] mce_notify_irq: 4 callbacks suppressed [01:22:50] [Mon May 15 00:36:48 2017] mce: [Hardware Error]: Machine check events logged [01:22:53] [Mon May 15 00:36:48 2017] CPU2: Core temperature/speed normal [01:22:56] [Mon May 15 00:36:48 2017] mce: [Hardware Error]: Machine check events logged [01:22:59] ^ cp1053 syslogs, probably a failing machine :( [01:24:10] the MCEs have been going on for at least a week, though maybe things have gotten worse [01:25:17] !log depooled cp1053 from all services (possible hardware issues) [01:25:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:25:49] RECOVERY - puppet last run on ms-be1027 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [01:28:40] 06Operations, 10ops-eqiad, 10Traffic: cp1053 possible hardware issues - https://phabricator.wikimedia.org/T165252#3261314 (10BBlack) [02:21:03] !log l10nupdate@tin scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 42s) [02:21:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:27:02] !log l10nupdate@tin ResourceLoader cache refresh completed at Mon May 15 02:27:02 UTC 2017 (duration 5m 59s) [02:27:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:13:30] PROBLEM - mailman I/O stats on fermium is CRITICAL: CRITICAL - I/O stats: Transfers/Sec=1326.40 Read Requests/Sec=6045.20 Write Requests/Sec=342.20 KBytes Read/Sec=27591.20 KBytes_Written/Sec=7091.20 [04:23:29] RECOVERY - mailman I/O stats on fermium is OK: OK - I/O stats: Transfers/Sec=11.60 Read Requests/Sec=5.00 Write Requests/Sec=4.00 KBytes Read/Sec=67.60 KBytes_Written/Sec=98.40 [05:23:29] PROBLEM - nova instance creation test on labnet1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack [06:23:37] 06Operations: ms-be2023 freeze - https://phabricator.wikimedia.org/T162854#3261604 (10MoritzMuehlenhoff) p:05Triage>03Normal [06:52:31] 06Operations, 07HHVM: HHVM 3.18 crash on job runner / luasandbox - https://phabricator.wikimedia.org/T165043#3261623 (10tstarling) This sort of thing is much easier if there is a reproducible test case. Maybe we could parse a few different articles using benchmarkParse.php to try to trigger a crash. Failing th...
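A sketch of the hardware triage above, assuming standard Linux tools; the depool line is illustrative conftool syntax rather than necessarily the exact command bblack ran:

    # Look for Machine Check Exceptions in the kernel log
    dmesg -T | grep -iE 'mce|machine check'
    # How many MCE events have been logged since boot?
    journalctl -k | grep -c 'Machine check events logged'
    # Depool the suspect host from all services (illustrative confctl invocation)
    sudo confctl select 'name=cp1053.eqiad.wmnet' set/pooled=no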
[07:20:20] (03PS1) 10Giuseppe Lavagetto: build-alpine: do not error out if branch not present [puppet] - 10https://gerrit.wikimedia.org/r/353834 [07:21:05] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] build-alpine: do not error out if branch not present [puppet] - 10https://gerrit.wikimedia.org/r/353834 (owner: 10Giuseppe Lavagetto) [07:36:56] (03CR) 10Muehlenhoff: "This needs more context, what in particular is needed from the JDK packages?" [debs/gerrit] - 10https://gerrit.wikimedia.org/r/353765 (owner: 10Paladox) [07:39:42] (03CR) 10Muehlenhoff: [C: 04-1] Fix debian-rules-missing-recommended-target (031 comment) [debs/gerrit] - 10https://gerrit.wikimedia.org/r/353766 (owner: 10Paladox) [07:56:50] (03CR) 10Paladox: Fix debian-rules-missing-recommended-target (031 comment) [debs/gerrit] - 10https://gerrit.wikimedia.org/r/353766 (owner: 10Paladox) [07:58:35] 06Operations, 10ops-eqiad: rack/setup/install ores1001-1009 - https://phabricator.wikimedia.org/T165171#3261853 (10akosiaris) Racking distribution sounds fine as well as naming. [08:00:05] Amir1: Dear anthropoid, the time has come. Please deploy ores_classification clean up party (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T0800). [08:05:24] 06Operations, 10ops-codfw: Decomission mw2098 - https://phabricator.wikimedia.org/T164959#3261883 (10MoritzMuehlenhoff) a:05MoritzMuehlenhoff>03None [08:06:07] I start cleaning up now [08:14:36] 06Operations, 10OTRS: Upgrade OTRS to 5.0.19 - https://phabricator.wikimedia.org/T165284#3261913 (10akosiaris) [08:14:40] 06Operations, 10OTRS: Upgrade OTRS to 5.0.19 - https://phabricator.wikimedia.org/T165284#3261929 (10akosiaris) p:05Triage>03Normal [08:16:54] (03CR) 10Alexandros Kosiaris: "I vaguely remember the same and it makes sense. I 'll rebase for branch 1.13" [debs/pybal] - 10https://gerrit.wikimedia.org/r/353525 (owner: 10Alexandros Kosiaris) [08:17:50] !log start of cleaning up ores_classification table [08:17:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:18:20] (03PS1) 10Alexandros Kosiaris: Change the default LVS BGP behavior per service [debs/pybal] (1.13) - 10https://gerrit.wikimedia.org/r/353836 [08:18:25] 06Operations: ms-be2023 freeze - https://phabricator.wikimedia.org/T162854#3261930 (10fgiunchedi) 05Open>03Resolved a:03fgiunchedi I don't think we've seen a reoccurence of this, though it is odd for sure. Tentatively closing and we can reopen if it happens again. [08:26:10] !log installing rtmpdump security updates on jessie [08:26:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:29:46] !log swift eqiad-prod: ms-be1028/ms-be1039 object weight 3000 - T160640 [08:29:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:29:53] T160640: Rack and Setup ms-be1028-ms-1039 - https://phabricator.wikimedia.org/T160640 [08:47:05] 06Operations, 10netops: Report of esams unreachable from fastweb - https://phabricator.wikimedia.org/T165288#3262009 (10fgiunchedi) [08:51:45] 06Operations, 10netops: Report of esams unreachable from fastweb - https://phabricator.wikimedia.org/T165288#3262036 (10fgiunchedi) I tried a traceroute from our side and it takes a different path ``` filippo@cr1-esams> traceroute 2.235.74.121 traceroute to 2.235.74.121 (2.235.74.121), 30 hops max, 40 byte p... 
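The weight change godog logged for ms-be1028/ms-be1039 is done with swift-ring-builder; a minimal sketch (the builder file name and the device search value are assumptions):

    # Raise the new backends' device weights, then rebalance the ring
    swift-ring-builder object.builder set_weight <device-search-value> 3000
    swift-ring-builder object.builder rebalance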
[09:01:25] 06Operations, 10vm-requests, 05Goal, 07kubernetes: Set up kubernetes masters for codfw cluster - https://phabricator.wikimedia.org/T165291#3262063 (10akosiaris) [09:02:04] I can't connect to the wikimedia cluster – it seems like my shared ip is banned [09:03:40] freddy2k1: https://en.wikipedia.org/wiki/Wikipedia:IP_block_exemption [09:03:48] 06Operations, 13Patch-For-Review: Reduce rpcbind use - https://phabricator.wikimedia.org/T106477#3262088 (10MoritzMuehlenhoff) 05Open>03Resolved rpcbind and nfs-common have been removed from all jessie hosts except those which actually use NFS. In addition our base d-i jessie installation strips nfs-common... [09:03:48] no, that's the error https://pastebin.com/DadUt36A [09:04:08] I already requested an IP block exemption, TheDragonFire [09:05:29] freddy2k1: It looks like you're behind some sort of proxy. [09:05:44] freddy2k1: can you reach https://phabricator.wikimedia.org ? [09:05:51] no, i can't [09:05:58] but i can access wmflabs and wikitech [09:06:36] yeah that makes sense, those are not in esams [09:06:59] freddy2k1: can you share a traceroute 91.198.174.192 and your source ip? [09:08:19] my source ip is 137.226.39.166 [09:09:43] that's the traceroute (it can't work fully because the squid proxy doesn't operate on the same layer as traceroute): https://pastebin.com/HVHA7XnY [09:10:12] traceroute to the squid proxy is fine [09:10:56] !log swift codfw-prod: more ms-be2001/ms-be2012 decom - T162785 [09:11:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:11:05] T162785: Decomission ms-be2001 - ms-be2012 - https://phabricator.wikimedia.org/T162785 [09:11:23] I can access text-lb.eqiad.wikimedia.org via curl, but not text-lb.esams.wikimedia.org via curl [09:12:49] freddy2k1: yeah, thanks we've been receiving reports of similar issues for esams [09:13:20] okay, is there a way to access wikipedia now? [09:14:14] I'm supposed to teach a class how to edit and research on wikipedia, but with these connection issues I obviously can't [09:14:41] (03CR) 10Chad: [C: 04-2] "Nothing, this is not needed. Could possibly put it under suggests, but we only *require* the JRE. We already install the JDK via puppet." [debs/gerrit] - 10https://gerrit.wikimedia.org/r/353765 (owner: 10Paladox) [09:15:16] freddy2k1: we're investigating if we can resolve the issue soon yeah [09:15:53] okay fine [09:16:37] would it work if I use an IP address located near the eqiad cluster rather than a European one? [09:17:16] freddy2k1: yeah that would work, the issue is that one of our peers in Europe is having trouble [09:17:55] (03CR) 10Chad: "Or, just stop using this package. Cf T157414" [debs/gerrit] - 10https://gerrit.wikimedia.org/r/353766 (owner: 10Paladox) [09:18:52] then i will try to find an open proxy to access eqiad. thanks a lot godog [09:19:53] (03Abandoned) 10Filippo Giunchedi: lvs: add logstash [puppet] - 10https://gerrit.wikimedia.org/r/324371 (https://phabricator.wikimedia.org/T151971) (owner: 10Filippo Giunchedi) [09:21:13] <_joe_> freddy2k1: try now? [09:22:15] yes, now it works perfectly _joe_ [09:22:21] thank you very much! [09:23:22] 06Operations, 10netops: Report of esams unreachable from fastweb - https://phabricator.wikimedia.org/T165288#3262153 (10ayounsi) Traffic is indeed not smooth as usual on the interface toward Init7. I called Init7 and disabled the v4 and v6 BGP sessions. The person I had on the phone mentioned that the engineer...
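The curl test above (eqiad reachable, esams not) generalizes into a quick client-side diagnosis; a sketch assuming mtr is installed, using the esams text-lb address already quoted in this thread:

    # Path/loss toward the esams edge: report mode, wide output, show AS numbers, 20 probes
    mtr -rwzc 20 91.198.174.192
    # Force a request through a specific edge regardless of GeoDNS
    curl -sv -o /dev/null --resolve en.wikipedia.org:443:91.198.174.192 https://en.wikipedia.org/wiki/Main_Page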
[09:26:50] 06Operations, 10netops: Report of esams unreachable from fastweb - https://phabricator.wikimedia.org/T165288#3262155 (10ayounsi) Got confirmation on IRC that the issue can't be reproduced. [09:31:47] 06Operations, 10netops: Report of esams unreachable from fastweb - https://phabricator.wikimedia.org/T165288#3262169 (10ayounsi) a:03ayounsi [09:34:44] !log installing bind security updates (we're using client-side libs/tools only) [09:34:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:43:05] 06Operations, 10netops: Report of esams unreachable from fastweb - https://phabricator.wikimedia.org/T165288#3262251 (10ayounsi) From Init7: >We are experiencing some BGP issues in our backbone. Troubleshooting is under way and I'll contact you once we fixed the issue. [09:46:05] (03PS1) 10Alexandros Kosiaris: Introduce acrux, acrab as codfw kubernetes masters [dns] - 10https://gerrit.wikimedia.org/r/353844 (https://phabricator.wikimedia.org/T165291) [09:51:54] 06Operations, 05Goal, 13Patch-For-Review, 15User-Joe, 07kubernetes: Upgrade calico to 2.2, document build process. - https://phabricator.wikimedia.org/T165024#3262286 (10Joe) [09:53:14] It seems it's cleaning up the table so fast, probably the load is super low or the table is now small enough that it can do faster lookups [09:54:02] (03PS1) 10Giuseppe Lavagetto: profile::calico::builder: use calico release info [puppet] - 10https://gerrit.wikimedia.org/r/353845 (https://phabricator.wikimedia.org/T165024) [09:54:52] (03CR) 10Giuseppe Lavagetto: [C: 031] Introduce acrux, acrab as codfw kubernetes masters [dns] - 10https://gerrit.wikimedia.org/r/353844 (https://phabricator.wikimedia.org/T165291) (owner: 10Alexandros Kosiaris) [10:03:38] 06Operations, 10Traffic, 13Patch-For-Review: varnish frontend transient memory usage keeps growing - https://phabricator.wikimedia.org/T165063#3262301 (10ema) 05Open>03Resolved a:03ema Crazy transient memory usage [[https://grafana.wikimedia.org/dashboard/db/varnish-transient-storage-usage?orgId=1&from... [10:06:24] 06Operations, 10netops: Report of esams unreachable from Fastweb/Init7 - https://phabricator.wikimedia.org/T165288#3262317 (10Nemo_bis) p:05Triage>03High [10:11:30] (03CR) 10Giuseppe Lavagetto: [C: 032] profile::calico::builder: use calico release info [puppet] - 10https://gerrit.wikimedia.org/r/353845 (https://phabricator.wikimedia.org/T165024) (owner: 10Giuseppe Lavagetto) [10:14:24] !log installing fop security updates on trusty [10:14:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:18:46] !log installing batik security updates on trusty [10:18:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:28:50] 06Operations, 10netops: Report of esams unreachable from Fastweb/Init7 - https://phabricator.wikimedia.org/T165288#3262418 (10Pyb) My connection is chaotic since this morning. Other customers from the french ISP Bouygues report the same problem. This is my traceroute results: |--------------------------------...
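For the ores_classification cleanup Amir1 is running in parallel here, the usual pattern is deleting in bounded batches so replication lag stays small; a purely illustrative SQL sketch (the column name, cutoff and batch size are assumptions, not the actual maintenance script):

    -- Delete in batches; rerun until zero rows are affected,
    -- sleeping between batches so replicas can catch up
    DELETE FROM ores_classification
    WHERE oresc_id < 12345678  -- illustrative cutoff
    LIMIT 10000;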
[10:33:17] (03CR) 10Alexandros Kosiaris: [C: 032] Introduce acrux, acrab as codfw kubernetes masters [dns] - 10https://gerrit.wikimedia.org/r/353844 (https://phabricator.wikimedia.org/T165291) (owner: 10Alexandros Kosiaris) [10:36:53] !log rebooting mw2224-mw2242 for update to Linux 4.9 [10:37:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:49:56] (03PS1) 10Filippo Giunchedi: logstash: move 'hostname' to 'host' for webrequest [puppet] - 10https://gerrit.wikimedia.org/r/353853 (https://phabricator.wikimedia.org/T149451) [10:51:08] 06Operations, 10Wikimedia-Logstash, 13Patch-For-Review, 15User-Elukey, 15User-fgiunchedi: Get 5xx logs into kibana/logstash - https://phabricator.wikimedia.org/T149451#3262486 (10fgiunchedi) [10:55:59] PROBLEM - HP RAID on ms-be1028 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [10:56:49] PROBLEM - HP RAID on ms-be1030 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [10:59:29] RECOVERY - nova instance creation test on labnet1001 is OK: PROCS OK: 1 process with command name python, args nova-fullstack [11:00:19] PROBLEM - HP RAID on ms-be1031 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [11:00:29] PROBLEM - HP RAID on ms-be1039 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [11:00:39] PROBLEM - HP RAID on ms-be1037 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [11:00:51] <_joe_> swift troubles? [11:00:59] PROBLEM - HP RAID on ms-be1029 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [11:01:08] _joe_, normally not [11:01:31] there is an issue with HP RAID under disk load (it times out) [11:01:43] <_joe_> ok [11:01:46] I have the same issue with some dbs [11:01:49] PROBLEM - HP RAID on ms-be1038 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [11:02:00] <_joe_> yeah the load is not higher than 1 hour ago [11:02:05] as long as the host responds it's not normal, but it is a known issue [11:02:07] disk load [11:02:09] PROBLEM - HP RAID on ms-be1035 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [11:02:13] not necessarily cpu or other load [11:02:17] <_joe_> ack [11:02:20] or you know [11:02:29] PROBLEM - nova instance creation test on labnet1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack [11:02:31] controller load of reporting stuff [11:02:32] I need five more minutes to finish cleaning up the table [11:02:44] cool to me [11:02:49] PROBLEM - HP RAID on ms-be1032 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [11:03:07] I will do a quick check that everything else is working ok [11:03:19] sigh, rebalance in progress, the alarms are expected but I forgot to silence [11:03:22] doing now [11:03:42] ah, so that is it - but everything I said is right? [11:04:30] T141252 ?
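T141252 is the known false alarm where the HP RAID NRPE check exceeds its timeout under heavy disk I/O even though the controller is healthy; reproducing it from the monitoring side looks roughly like this (the remote command name is an assumption):

    # Run the NRPE check by hand with the same 50s timeout seen in the alerts;
    # under heavy I/O the controller query can take longer than that
    /usr/lib/nagios/plugins/check_nrpe -H ms-be1028.eqiad.wmnet -c check_hpssacli -t 50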
[11:04:31] T141252: icinga hp raid check timeout on busy ms-be and db machines - https://phabricator.wikimedia.org/T141252 [11:05:05] yeah :( [11:05:11] _joe_, FYI ^ [11:05:29] I'll stop the cleaning, it might help [11:05:49] Amir1_, what you do impacts databases [11:05:56] those are "image servers" [11:05:59] nothing to do [11:06:21] oh, okay [11:06:29] Overall it was almost done [11:06:38] I had already noticed your work here: :-) https://grafana.wikimedia.org/dashboard/db/mysql-aggregated?panelId=7&fullscreen&orgId=1 [11:06:56] (03PS1) 10Alexandros Kosiaris: Put acrab, acrux in the correct block [dns] - 10https://gerrit.wikimedia.org/r/353856 [11:07:10] :D [11:07:11] (that is an exaggeration, because row writes are amplified once per server [11:07:19] (03CR) 10Alexandros Kosiaris: [C: 032] Put acrab, acrux in the correct block [dns] - 10https://gerrit.wikimedia.org/r/353856 (owner: 10Alexandros Kosiaris) [11:07:42] yeah, so replicas will add to that number [11:09:08] !log cleaning up ores_classification has finished 18M rows deleted, current number of rows 38,937,217 (T159753) [11:09:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:09:16] T159753: Concerns about ores_classification table size on enwiki - https://phabricator.wikimedia.org/T159753 [11:09:47] We need to shrink that once the job gets deployed in WMF [11:10:02] in production I mean [11:10:25] yes, I know [11:10:33] it has to finish first :-) [11:11:24] Another 3-hour-window would be enough I think. I'll do it tomorrow. we'll see [11:12:02] don't worry, take your time [11:12:15] and again, thanks for doing this [11:13:00] :) hope that'd be useful [11:18:09] PROBLEM - MediaWiki memcached error rate on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [5000.0] [11:19:09] RECOVERY - MediaWiki memcached error rate on graphite1001 is OK: OK: Less than 40.00% above the threshold [1000.0] [11:20:45] (03PS1) 10Giuseppe Lavagetto: Add debian/repack to ease the upgrade process [calico-cni] - 10https://gerrit.wikimedia.org/r/353857 (https://phabricator.wikimedia.org/T165024) [11:21:21] 06Operations, 05Prometheus-metrics-monitoring, 15User-fgiunchedi: Upgrade mysqld_exporter to 0.10.0 - https://phabricator.wikimedia.org/T161296#3262587 (10fgiunchedi) Diff in variables on db2048 (i.e. `connection_name` is added, no other changes) ``` -mysql_slave_status_connect_retry{channel_name="",master_... [11:40:28] jouncebot: refresh [11:40:30] I refreshed my knowledge about deployments. [11:40:31] jouncebot: next [11:40:32] In 0 hour(s) and 19 minute(s): Wikidata (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T1200) [11:41:08] addshore: gilles: Zeljko and I are attending the releng offsite so we can not take care of SWAT [11:41:28] looks like there's a single config change so it should not be too hard to handle :-} [11:44:55] (03PS2) 10Giuseppe Lavagetto: Add debian/repack to ease the upgrade process [calico-cni] - 10https://gerrit.wikimedia.org/r/353857 (https://phabricator.wikimedia.org/T165024) [11:47:27] where is your offsite hashar? [11:47:39] * hashar escapes to meeting [11:47:44] haha [11:49:04] I can swat :) [11:57:24] (03CR) 10Ottomata: [C: 031] "+1, but requires https://gerrit.wikimedia.org/r/#/c/352579 deployed first."
[puppet] - 10https://gerrit.wikimedia.org/r/352582 (https://phabricator.wikimedia.org/T67508) (owner: 10Fdans) [12:00:04] aude: Respected human, time to deploy Wikidata (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T1200). Please do the needful. [12:04:09] (03PS1) 10Aude: Revert "Don't enable tabular-data data type yet on Wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353862 (https://phabricator.wikimedia.org/T164207) [12:06:42] (03PS1) 10Alexandros Kosiaris: Add acrux, acrab to the infrastructure [puppet] - 10https://gerrit.wikimedia.org/r/353864 (https://phabricator.wikimedia.org/T165291) [12:09:31] (03PS1) 10Alexandros Kosiaris: Add kubemaster LVS service in codfw [puppet] - 10https://gerrit.wikimedia.org/r/353865 [12:11:02] addshore: thanks! [12:15:02] (03CR) 10Alexandros Kosiaris: [C: 032] Add acrux, acrab to the infrastructure [puppet] - 10https://gerrit.wikimedia.org/r/353864 (https://phabricator.wikimedia.org/T165291) (owner: 10Alexandros Kosiaris) [12:16:04] (03PS3) 10Giuseppe Lavagetto: Add debian/repack to ease the upgrade process [calico-cni] - 10https://gerrit.wikimedia.org/r/353857 (https://phabricator.wikimedia.org/T165024) [12:16:39] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] Add debian/repack to ease the upgrade process [calico-cni] - 10https://gerrit.wikimedia.org/r/353857 (https://phabricator.wikimedia.org/T165024) (owner: 10Giuseppe Lavagetto) [12:17:29] (03PS1) 10Giuseppe Lavagetto: New upstream version 1.8.3 [calico-cni] - 10https://gerrit.wikimedia.org/r/353867 [12:17:31] (03PS1) 10Giuseppe Lavagetto: Updating debian version [calico-cni] - 10https://gerrit.wikimedia.org/r/353868 [12:17:33] (03PS1) 10Giuseppe Lavagetto: package name change [calico-cni] - 10https://gerrit.wikimedia.org/r/353869 [12:26:57] (03CR) 10Aude: [C: 032] Revert "Don't enable tabular-data data type yet on Wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353862 (https://phabricator.wikimedia.org/T164207) (owner: 10Aude) [12:29:33] (03Merged) 10jenkins-bot: Revert "Don't enable tabular-data data type yet on Wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353862 (https://phabricator.wikimedia.org/T164207) (owner: 10Aude) [12:29:43] PROBLEM - swift-container-auditor on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:29:44] (03CR) 10jenkins-bot: Revert "Don't enable tabular-data data type yet on Wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353862 (https://phabricator.wikimedia.org/T164207) (owner: 10Aude) [12:30:03] PROBLEM - swift-account-auditor on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:03] PROBLEM - swift-object-server on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:03] PROBLEM - swift-container-server on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:03] PROBLEM - swift-account-reaper on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:04] PROBLEM - swift-container-updater on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:04] PROBLEM - swift-object-replicator on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:04] PROBLEM - dhclient process on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:04] PROBLEM - swift-object-auditor on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[12:30:04] PROBLEM - swift-account-server on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:05] PROBLEM - swift-account-replicator on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:05] PROBLEM - swift-container-replicator on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:06] PROBLEM - salt-minion processes on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:13] PROBLEM - swift-object-updater on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:30:33] RECOVERY - swift-container-auditor on ms-be1019 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [12:30:53] RECOVERY - swift-object-server on ms-be1019 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [12:30:53] RECOVERY - swift-account-auditor on ms-be1019 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [12:30:53] RECOVERY - swift-account-replicator on ms-be1019 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [12:30:53] RECOVERY - swift-object-auditor on ms-be1019 is OK: PROCS OK: 3 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [12:30:54] RECOVERY - swift-account-server on ms-be1019 is OK: PROCS OK: 41 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [12:30:54] RECOVERY - swift-container-updater on ms-be1019 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [12:30:54] RECOVERY - swift-object-replicator on ms-be1019 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [12:30:54] RECOVERY - swift-container-replicator on ms-be1019 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [12:30:54] RECOVERY - dhclient process on ms-be1019 is OK: PROCS OK: 0 processes with command name dhclient [12:30:55] RECOVERY - salt-minion processes on ms-be1019 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion [12:30:55] RECOVERY - swift-account-reaper on ms-be1019 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [12:30:56] RECOVERY - swift-container-server on ms-be1019 is OK: PROCS OK: 41 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [12:31:03] RECOVERY - swift-object-updater on ms-be1019 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater [12:33:00] !log aude@tin Synchronized wmf-config/Wikibase-production.php: Enable data type for tabular data (duration: 00m 41s) [12:33:02] (03CR) 10Ema: [C: 031] Change the default LVS BGP behavior per service [debs/pybal] (1.13) - 10https://gerrit.wikimedia.org/r/353836 (owner: 10Alexandros Kosiaris) [12:33:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:33:37] 06Operations, 10Continuous-Integration-Infrastructure, 10Wikidata, 07HHVM, and 2 others: CI tests failing with segfault - https://phabricator.wikimedia.org/T165074#3262796 (10MoritzMuehlenhoff) I've built new HHVM packages with a patch as proposed by upstream in https://github.com/facebook/hhvm/issues/7779... 
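The PROBLEM/RECOVERY flood above comes from per-process NRPE checks, and the RECOVERY lines spell out exactly what they assert; a hand-run equivalent, assuming the stock monitoring-plugins check_procs:

    # Exactly one container-auditor process, matched on the full argument list
    # (the same regex shown in the RECOVERY output)
    /usr/lib/nagios/plugins/check_procs -c 1:1 \
        --ereg-argument-array='^/usr/bin/python /usr/bin/swift-container-auditor'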
[12:33:59] (03CR) 10Alexandros Kosiaris: [C: 032] Change the default LVS BGP behavior per service [debs/pybal] (1.13) - 10https://gerrit.wikimedia.org/r/353836 (owner: 10Alexandros Kosiaris) [12:34:22] 06Operations, 10Continuous-Integration-Infrastructure, 10Wikidata, 07HHVM, and 2 others: CI tests failing with segfault - https://phabricator.wikimedia.org/T165074#3262799 (10MoritzMuehlenhoff) (Tested in mediawiki-vagrant) [12:39:25] (03PS1) 10Ayounsi: LibreNMS: Use default OSM tiles provider + simplify syslog filtering [puppet] - 10https://gerrit.wikimedia.org/r/353871 (https://phabricator.wikimedia.org/T164911) [12:45:17] (03CR) 10Ayounsi: [C: 032] LibreNMS: Use default OSM tiles provider + simplify syslog filtering [puppet] - 10https://gerrit.wikimedia.org/r/353871 (https://phabricator.wikimedia.org/T164911) (owner: 10Ayounsi) [12:53:17] (03CR) 10Filippo Giunchedi: [C: 031] Setup apache vhost on scap proxies as well (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/344221 (owner: 10Chad) [12:53:18] 06Operations, 10netops, 13Patch-For-Review: LibreNMS improvements - https://phabricator.wikimedia.org/T164911#3262843 (10ayounsi) [12:53:22] (03CR) 10Filippo Giunchedi: Setup apache vhost on scap proxies as well [puppet] - 10https://gerrit.wikimedia.org/r/344221 (owner: 10Chad) [12:54:14] !log upload pybal 1.13.6 to apt.wikimedia.org/jessie-wikimedia/main [12:54:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:55:23] thanks akosiaris :) [12:58:58] 06Operations, 10netops: Report of esams unreachable from Fastweb/Init7 - https://phabricator.wikimedia.org/T165288#3262862 (10Nemo_bis) [12:59:09] (03PS5) 10Addshore: Add QuickSurvey for reader segmentation research [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353053 (https://phabricator.wikimedia.org/T131949) (owner: 10Nschaaf) [12:59:28] aude, im guessing you are all done with your slot? :) [12:59:40] jouncebot: refresh [12:59:42] I refreshed my knowledge about deployments. [13:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, and thcipriani: Respected human, time to deploy European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T1300). Please do the needful. [13:00:04] schana, gilles, addshore, and James_F: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be available during the process. [13:00:12] o/ [13:00:15] here [13:00:21] \o [13:00:30] that timing on that refresh cmd from me was perfect xD [13:00:39] (03CR) 10Addshore: [C: 032] Add QuickSurvey for reader segmentation research [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353053 (https://phabricator.wikimedia.org/T131949) (owner: 10Nschaaf) [13:01:45] (03Merged) 10jenkins-bot: Add QuickSurvey for reader segmentation research [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353053 (https://phabricator.wikimedia.org/T131949) (owner: 10Nschaaf) [13:01:57] (03CR) 10jenkins-bot: Add QuickSurvey for reader segmentation research [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353053 (https://phabricator.wikimedia.org/T131949) (owner: 10Nschaaf) [13:02:08] schana: is it testable before i sync it? [13:02:18] testable how? [13:02:27] I'm waiting to try it on wiki [13:02:31] on the mwdebug servers! [13:02:43] do you have the browser extension installed? [13:02:48] no [13:03:07] schana: chome or firefox?
chrome* [13:03:11] chrome [13:03:15] schana: https://chrome.google.com/webstore/detail/wikimediadebug/binmakecefompkjggiklgjenddjoifbb [13:03:28] your code is in mwdebug1002 right now [13:04:45] if I'm using the extension right, it doesn't look like the quick survey is live [13:04:56] (looking at wgEnabledQuickSurveys in console) [13:05:08] or trying page with ?quicksurvey=true [13:05:36] schana: did you set the server to mwdebug1002? [13:05:41] yes [13:05:42] and turn it on? [13:05:44] yes [13:05:51] what URL are you checking on? [13:05:57] https://de.wikipedia.org/wiki/Apple?quicksurvey=true [13:07:09] make sure read-only is off [13:07:23] it's off [13:07:52] try shift+f5 [13:08:24] I see the ext.quicksurveys.init module loaded when viewing the page from mwdebug1002 [13:08:50] addshore: gerrit change, ill see if i cannot see any changes when i try [13:08:54] link* [13:09:02] looks like the variable is now present for de [13:09:12] still using the debug extension [13:09:18] schana: so it works now? [13:09:33] I'm not able to trigger it with the url parameter [13:09:42] but that may be a QuickSurveys thing [13:09:48] I'm not familiar with that codebase [13:10:07] neither am I [13:10:21] https://www.mediawiki.org/wiki/Extension:QuickSurveys#How_to_load_a_specific_survey [13:10:24] for reference [13:10:51] schana: In your config, you've a 'coverage' option, perhaps that forbids the display? [13:10:55] !log addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:353053|Add QuickSurvey for reader segmentation research]] T131949 T164769 T164894 T164960 T164963 (duration: 00m 40s) [13:11:01] To load a random survey append ?quicksurvey=true to the URL; [13:11:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:11:05] T164894: Test reader survey in multiple languages - Japanese - https://phabricator.wikimedia.org/T164894 [13:11:06] T164963: Test reader survey in multiple languages - Hebrew - https://phabricator.wikimedia.org/T164963 [13:11:06] T131949: Repeat the big English reader survey in one or two more languages - https://phabricator.wikimedia.org/T131949 [13:11:06] T164769: Test reader survey in multiple languages - Romanian - https://phabricator.wikimedia.org/T164769 [13:11:06] T164960: Test reader survey in multiple languages - German - https://phabricator.wikimedia.org/T164960 [13:11:07] setting the url parameter should force it on [13:11:08] it seems to only enable a survey [13:11:15] To load an external survey whose name is 'external example survey' append ?quicksurvey=external-survey-external example survey to the URL. [13:11:18] this is the one you should use [13:11:23] https://de.wikipedia.org/wiki/Apple?quicksurvey=Reader-segmentation-3-de-test [13:11:26] still doesn't work [13:11:36] when the browser's "Do Not Track" feature is turned on; [13:11:36] on skin Minerva when the beta optin panel is shown; [13:11:36] if a survey is an external one and points to non-https location when the config variable `wgQuickSurveysRequireHttps` is set to `true`. [13:11:43] it won't show [13:12:36] (03PS1) 10Ema: VCL: lower grace for transient n-hit-wonder objects [puppet] - 10https://gerrit.wikimedia.org/r/353874 (https://phabricator.wikimedia.org/T165063) [13:13:19] works in firefox with the debug extension [13:13:26] must be some chrome setting [13:13:39] I'm still here, sorry.
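Besides the browser extension, a change staged on mwdebug1002 can be exercised from the command line by setting the debug header; a sketch (the header value format follows the wikitech docs of this era and should be treated as an assumption):

    # Route a request to mwdebug1002 instead of a pooled appserver
    curl -s -D - -o /dev/null \
        -H 'X-Wikimedia-Debug: backend=mwdebug1002.eqiad.wmnet' \
        'https://de.wikipedia.org/wiki/Apple?quicksurvey=true'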
[13:13:50] https://de.wikipedia.org/wiki/Apple?quicksurvey=internal%20example%20survey doesn't work either [13:13:54] 06Operations, 13Patch-For-Review, 15User-fgiunchedi: Delete non-used and/or non-requested thumbnail sizes periodically - https://phabricator.wikimedia.org/T162796#3262945 (10fgiunchedi) List of candidates for deletion: (note some criteria might overlap) | Criteria | Count | Bytes (GB) | | -- | -- | -- | | W... [13:14:11] (03CR) 10BBlack: [C: 031] VCL: lower grace for transient n-hit-wonder objects [puppet] - 10https://gerrit.wikimedia.org/r/353874 (https://phabricator.wikimedia.org/T165063) (owner: 10Ema) [13:14:15] schana: do not track must be on [13:14:16] "enabled": true, [13:14:29] I just checked ja he ro and de [13:14:33] This one could be the issue [13:14:33] they all work for me in firefox [13:14:48] but not on Chrome? [13:15:01] gilles: are you around and do your changes have to go out together? [13:15:01] let me check Dereckson [13:15:07] (03CR) 10Ema: [C: 032] VCL: lower grace for transient n-hit-wonder objects [puppet] - 10https://gerrit.wikimedia.org/r/353874 (https://phabricator.wikimedia.org/T165063) (owner: 10Ema) [13:15:08] addshore: mwdebug1002 right? [13:15:08] not on chrome [13:15:13] but I might have do not track on [13:15:48] Zppix: yes, well, everywhere, the sync has been done [13:15:55] turning do not track off makes the survey load in chrome [13:16:13] PROBLEM - Check correctness of the icinga configuration on tegmen is CRITICAL: Icinga configuration contains errors [13:16:20] schana: my issue is this: [13:16:24] var_dump($wgQuickSurveysConfig[1]['enabled']) [13:16:24] bool(false) [13:16:26] schana: Dereckson it works on my end [13:16:40] i use chrome [13:16:40] James_F: are you happy for both of your changes to be deployed at once? [13:17:03] Yes. [13:18:22] Dereckson: I'm not sure what you're referring to [13:19:24] !log addshore@tin Synchronized php-1.30.0-wmf.1/extensions/Cognate/src/CognateStore.php: SWAT: [[gerrit:353860|Add a clear-first option to populatePages script]] T164407 PT 1/2 (duration: 00m 40s) [13:19:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:19:32] T164407: Cognate has been disabled from WMF because it caused an outage on x1 by overtaking 10000 concurrent connections - https://phabricator.wikimedia.org/T164407 [13:19:33] addshore: I'm here and yes they can go out together [13:20:11] gilles: can, or should? :) [13:20:18] they don't have to, whichever way saves you time [13:20:23] Great! 
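Ema's change merged above lowers grace for transient "n-hit-wonder" objects; in VCL that kind of tweak is a conditional set on beresp, along these lines (purely illustrative, with an assumed condition; the real logic is in the linked Gerrit patch):

    sub vcl_backend_response {
        # Short-lived one-hit-wonder objects don't need a long grace window
        if (beresp.ttl < 60s) {
            set beresp.grace = 10s;
        }
    }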
[13:20:29] !log addshore@tin Synchronized php-1.30.0-wmf.1/extensions/Cognate/maintenance/populateCognatePages.php: SWAT: [[gerrit:353860|Add a clear-first option to populatePages script]] T164407 PT 2/2 (duration: 00m 39s) [13:20:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:20:52] schana: to the fact sample surveys are disabled (the first two, 0 and 1), only yours is enabled (the third, 2) / ack'ed it works [13:23:30] (03PS1) 10Muehlenhoff: package_builder: Install patchutils [puppet] - 10https://gerrit.wikimedia.org/r/353875 [13:25:23] (03CR) 10Alexandros Kosiaris: [C: 031] package_builder: Install patchutils [puppet] - 10https://gerrit.wikimedia.org/r/353875 (owner: 10Muehlenhoff) [13:26:04] (03PS1) 10Nschaaf: Disable test reader QuickSurveys [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353876 (https://phabricator.wikimedia.org/T131949) [13:27:45] !log uploaded HHVM 3.18.2+dfsg-1+wmf3 to apt.wikimedia.org (addresses segfault in XML reader (T162586, T165074) [13:27:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:27:53] T165074: CI tests failing with segfault - https://phabricator.wikimedia.org/T165074 [13:27:54] T162586: HHVM segfault in memory cleanup - https://phabricator.wikimedia.org/T162586 [13:28:26] *twiddles thumbs waiting for jenkins* [13:29:33] RECOVERY - nova instance creation test on labnet1001 is OK: PROCS OK: 1 process with command name python, args nova-fullstack [13:29:44] (03PS13) 10Giuseppe Lavagetto: restbase: migration to role/profile for the dev cluster [puppet] - 10https://gerrit.wikimedia.org/r/352851 [13:29:58] 06Operations, 07HHVM, 07Upstream: HHVM segfault in memory cleanup - https://phabricator.wikimedia.org/T162586#3262991 (10MoritzMuehlenhoff) This is fixed in 3.18.2+dfsg-1+wmf3. So far this has only been reproduced with the test case from the test suite, I'll keep this bug open until it's fully rolled out to... [13:30:34] addshore: do you have a timestamp of when the surveys went live? [13:30:52] 13:10 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add QuickSurvey for reader segmentation research T131949 T164769 T164894 T164960 T164963 (duration: 00m 40s) [13:30:52] T164894: Test reader survey in multiple languages - Japanese - https://phabricator.wikimedia.org/T164894 [13:30:53] T164963: Test reader survey in multiple languages - Hebrew - https://phabricator.wikimedia.org/T164963 [13:30:53] T131949: Repeat the big English reader survey in one or two more languages - https://phabricator.wikimedia.org/T131949 [13:30:53] T164769: Test reader survey in multiple languages - Romanian - https://phabricator.wikimedia.org/T164769 [13:30:53] T164960: Test reader survey in multiple languages - German - https://phabricator.wikimedia.org/T164960 [13:31:02] thanks [13:32:33] PROBLEM - nova instance creation test on labnet1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack [13:33:40] James_F: looks like jenkins finally merged them [13:33:55] Yay. [13:34:00] Pulling to mw1002? [13:34:03] will do [13:34:11] should be there now James_F [13:34:27] strange to have SF people awake at this hour :) [13:35:03] addshore: LGTM. [13:35:09] ack! [13:35:16] aude: I'm currently in London. 
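The var_dump confusion earlier in this exchange resolves once you see that $wgQuickSurveysConfig is a plain list: the two bundled example surveys sit at indexes 0 and 1 with enabled set to false, and the newly deployed survey is index 2. Schematically (survey names are taken from this thread; any other fields are assumptions about the extension's config):

    // $wgQuickSurveysConfig is a list; the sample surveys ship disabled
    $wgQuickSurveysConfig = [
        [ 'name' => 'internal example survey', 'enabled' => false /* , ... */ ],
        [ 'name' => 'external example survey', 'enabled' => false /* , ... */ ],
        [ 'name' => 'Reader-segmentation-3-de-test', 'enabled' => true /* , ... */ ],
    ];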
[13:35:30] yeah, figured :) [13:36:22] (03PS1) 10Filippo Giunchedi: swift: introduce storage policies [puppet] - 10https://gerrit.wikimedia.org/r/353878 (https://phabricator.wikimedia.org/T151648) [13:36:30] (03CR) 10Giuseppe Lavagetto: [C: 032] restbase: migration to role/profile for the dev cluster [puppet] - 10https://gerrit.wikimedia.org/r/352851 (owner: 10Giuseppe Lavagetto) [13:36:36] <_joe_> mobrovac: ^^ [13:36:48] kk [13:36:48] <_joe_> going to apply it one cluster at a time, starting with aqs [13:36:49] James_F: syncing [13:37:05] Ta. [13:37:24] !log addshore@tin Synchronized php-1.30.0-wmf.1/extensions/VisualEditor: SWAT: [[gerrit:353861|#1]] [[gerrit:353863|#2]] T165238 T165238 VisualEditor (duration: 00m 41s) [13:37:27] James_F: ^^ [13:37:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:37:34] T165238: Source editor fails to load on direct non-view page loads where the wiki doesn't have Single Edit Tab enabled - https://phabricator.wikimedia.org/T165238 [13:37:44] gilles: looks like the mediawiki tests are still running for yours! nearly there! [13:37:54] PROBLEM - puppet last run on aqs1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [13:37:58] addshore: Thanks! [13:38:53] PROBLEM - HHVM jobrunner on mw1165 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:38:54] RECOVERY - puppet last run on aqs1004 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:39:53] gilles: looks like they have been merged! [13:39:57] indeed [13:40:57] gilles: they are on mwdebug1002 [13:41:04] testing [13:41:25] <_joe_> mobrovac: no changes on any cluster for now, next I'm gonna reenable puppet everywhere but the dev cluster [13:41:56] _joe_: already applied to RB prod? [13:42:04] <_joe_> to one machine [13:42:06] <_joe_> noop [13:42:09] kk [13:42:22] <_joe_> I usually do one machine per role, basically [13:43:42] <_joe_> I'm going to apply it to restbase-dev1001 now [13:44:31] <_joe_> heh I forgot one commit to the private repo [13:46:21] addshore: seems to work for djvu, I can't find a video small enough to pass through the debug header bug where we can't upload large files. will test once it's deployed [13:46:32] synicng [13:46:37] urm... syncing... :P [13:46:58] James_F: when are you heading to vienna? [13:47:03] PROBLEM - puppet last run on restbase-dev1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [13:47:10] !log addshore@tin Synchronized php-1.30.0-wmf.1/extensions/TimedMediaHandler/handlers: SWAT: [[gerrit:353505|Fix X-Content-Dimensions support]] T150741 (duration: 00m 40s) [13:47:14] addshore: Thursday. [13:47:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:47:17] T150741: Thumbor should reject thumbnail requests that are the same size as the original or bigger - https://phabricator.wikimedia.org/T150741 [13:47:37] addshore: You?
[13:47:52] 06Operations, 05Prometheus-metrics-monitoring: Add Prometheus machine metric to track core dumps - https://phabricator.wikimedia.org/T165323#3263065 (10MoritzMuehlenhoff) [13:48:41] !log addshore@tin Synchronized php-1.30.0-wmf.1/includes/media/DjVu.php: SWAT: [[gerrit:353504|Add X-Content-Dimensions support to DjVu]] T150741 (duration: 00m 39s) [13:48:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:48:55] (03PS2) 10Filippo Giunchedi: swift: introduce storage policies [puppet] - 10https://gerrit.wikimedia.org/r/353878 (https://phabricator.wikimedia.org/T151648) [13:49:00] <_joe_> mobrovac: puppet is applying correctly; you might want to restart restbase once I'm done [13:49:03] RECOVERY - puppet last run on restbase-dev1001 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [13:49:21] yup _joe_, let me know once it's applied on the whole dev cluster [13:49:28] gilles: all done! [13:49:43] James_F: I'm in prague, but also heading to vienna on thursday [13:49:49] Aha. See you there. :-) [13:49:59] You might be on the same flight as Tom and a couple of others :P [13:50:08] 06Operations, 10netops, 13Patch-For-Review: analytics hosts frequently tripping 'port utilization threshold' librenms alerts - https://phabricator.wikimedia.org/T133852#3263083 (10ayounsi) @fgiunchedi indeed, it's happening again. During those jobs, ports are completely saturated. Because of the nature of t... [13:50:10] <_joe_> mobrovac: btw restbase was configured to contact eventbus in eqiad [13:50:11] addshore: second patch works fine, thank you very much [13:50:17] <_joe_> from every DC [13:50:25] <_joe_> was that by design or by omission? [13:50:26] lovely, and thus SWAT is done! [13:50:54] !log upgrading mwdebug servers to 3.18.2+wmf3 [13:51:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:51:59] _joe_: as it should, in the dev cluster as it exists only in eqiad [13:52:02] 06Operations, 06Release-Engineering-Team, 10Traffic: Can't upload large files with X-Wikimedia-Debug turned on - https://phabricator.wikimedia.org/T165324#3263087 (10Gilles) [13:52:32] <_joe_> mobrovac: no, every server points to eqiad [13:52:55] _joe_: ok, let's step back, which rb cluster are we talking about? [13:53:16] <_joe_> mobrovac: all of them had eventlogging_service_uri: "http://eventbus.svc.eqiad.wmnet:8085/v1/events" [13:53:32] <_joe_> production, test, dev, both DCs [13:53:47] <_joe_> I just changed it to the discovery URI for the dev cluster [13:54:14] <_joe_> but I wanted to check if this is eqiad-only by design [13:55:40] lemme see the cp config and will answer it _joe_ :P [13:57:56] _joe_: omission, it was done that way as we were using only one DC, but now that we have two in operation it can be local [13:58:15] <_joe_> ok thanks [13:58:17] _joe_: this is only relevant for purges as rb only sends purge events to eventbus [13:58:23] <_joe_> ok [13:58:43] <_joe_> so if we send purge events to EB in codfw, would it be seen by clients in eqiad? [13:59:33] RECOVERY - nova instance creation test on labnet1001 is OK: PROCS OK: 1 process with command name python, args nova-fullstack [14:00:04] _joe_: CP clients? 
yes [14:00:32] <_joe_> uhm ok [14:02:33] PROBLEM - nova instance creation test on labnet1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack [14:05:30] 06Operations, 05Goal, 07kubernetes: Expand the infrastructure to codfw - https://phabricator.wikimedia.org/T162041#3263125 (10akosiaris) [14:13:56] (03CR) 10Alexandros Kosiaris: [C: 032] Add kubemaster LVS service in codfw [puppet] - 10https://gerrit.wikimedia.org/r/353865 (owner: 10Alexandros Kosiaris) [14:14:00] (03PS2) 10Alexandros Kosiaris: Add kubemaster LVS service in codfw [puppet] - 10https://gerrit.wikimedia.org/r/353865 [14:14:03] (03CR) 10Alexandros Kosiaris: [V: 032 C: 032] Add kubemaster LVS service in codfw [puppet] - 10https://gerrit.wikimedia.org/r/353865 (owner: 10Alexandros Kosiaris) [14:20:31] 06Operations, 05Goal, 07kubernetes: Expand the infrastructure to codfw - https://phabricator.wikimedia.org/T162041#3263163 (10akosiaris) [14:20:33] 06Operations, 10vm-requests, 05Goal, 13Patch-For-Review, 07kubernetes: Set up kubernetes masters for codfw cluster - https://phabricator.wikimedia.org/T165291#3263160 (10akosiaris) 05Open>03Resolved a:03akosiaris kubernetes master `acrab` and `acrux` are up and running and LVS service IP `10.2.1.8`... [14:21:23] 06Operations, 05Goal, 07kubernetes: Expand the infrastructure to codfw - https://phabricator.wikimedia.org/T162041#3150620 (10akosiaris) [14:22:03] 06Operations, 10Continuous-Integration-Infrastructure, 10Wikidata, 07HHVM, and 2 others: CI tests failing with segfault - https://phabricator.wikimedia.org/T165074#3263168 (10Ladsgroup) Just saying this also happens in travis instances causing Wikibase travis tests to fail. https://travis-ci.org/wikimedia/... [14:22:15] _joe_: still applying? [14:22:27] <_joe_> mobrovac: sorry, no, done [14:22:56] (03CR) 10Giuseppe Lavagetto: [C: 031] "https://puppet-compiler.wmflabs.org/6415" [puppet] - 10https://gerrit.wikimedia.org/r/353047 (owner: 10Giuseppe Lavagetto) [14:23:51] ok thnx _joe_, will restart now then [14:24:52] !log mobrovac@tin Started restart [restbase/deploy@c70a1e1] (dev-cluster): Restart after applying https://gerrit.wikimedia.org/r/#/c/352851/ [14:24:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:29:35] _joe_: dev cluster looking good! [14:29:50] <_joe_> cool [14:30:02] <_joe_> I'll move on with the other changes then [14:30:13] PROBLEM - carbon-cache too many creates on graphite1001 is CRITICAL: CRITICAL: 1.69% of data above the critical threshold [1000.0] [14:30:33] RECOVERY - nova instance creation test on labnet1001 is OK: PROCS OK: 1 process with command name python, args nova-fullstack [14:30:34] thnx [14:31:06] 06Operations, 10Continuous-Integration-Infrastructure, 10Wikidata, 07HHVM, and 2 others: CI tests failing with segfault - https://phabricator.wikimedia.org/T165074#3263202 (10MoritzMuehlenhoff) @Ladsgroup I'm not sure how that Travis setup is configured, but if you make it install HHVM 3.18.2+dfsg-1+wmf3,... [14:32:11] (03Abandoned) 10Paladox: Install openjdk jdk version instead of jre [debs/gerrit] - 10https://gerrit.wikimedia.org/r/353765 (owner: 10Paladox) [14:33:08] (03CR) 10Paladox: "> Or, just stop using this package.
Cf T157414" [debs/gerrit] - 10https://gerrit.wikimedia.org/r/353766 (owner: 10Paladox) [14:33:33] PROBLEM - nova instance creation test on labnet1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack [14:39:47] 06Operations, 10ops-codfw, 13Patch-For-Review: codfw: kubernetes200[1-4] racking and onsite setup task - https://phabricator.wikimedia.org/T164851#3263242 (10Papaul) @akosiaris i am getting this while trying to install the systems ┌────────────────────┤ [!!] Partition disks ├──────────────────┐... [14:43:25] 06Operations, 10Traffic: Can't upload large files with X-Wikimedia-Debug turned on - https://phabricator.wikimedia.org/T165324#3263258 (10greg) (not really a RelEng task, we care about the debug servers and use them, but Ops manages them and the nginx config) [14:46:25] (03PS3) 10Giuseppe Lavagetto: cassandra::instance: allow use of default values [puppet] - 10https://gerrit.wikimedia.org/r/353047 [14:47:42] (03CR) 10Giuseppe Lavagetto: [C: 032] cassandra::instance: allow use of default values [puppet] - 10https://gerrit.wikimedia.org/r/353047 (owner: 10Giuseppe Lavagetto) [14:49:24] (03PS3) 10Giuseppe Lavagetto: restbase: convert test cluster to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/353048 [15:06:00] 06Operations, 10ops-codfw, 13Patch-For-Review: codfw: kubernetes200[1-4] racking and onsite setup task - https://phabricator.wikimedia.org/T164851#3263423 (10RobH) [15:07:52] !log mobrovac@tin Started deploy [citoid/deploy@3ed34ef]: Better publishing date extraction support - T132308 [15:07:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:08:00] T132308: Figure out how to deal with incomplete dates, i.e. year only or year and month only - https://phabricator.wikimedia.org/T132308 [15:10:42] !log mobrovac@tin Finished deploy [citoid/deploy@3ed34ef]: Better publishing date extraction support - T132308 (duration: 02m 49s) [15:10:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:12:06] 06Operations, 10ops-eqiad, 06Analytics-Kanban, 06DC-Ops: analytics1030 stuck in console while booting - https://phabricator.wikimedia.org/T162046#3263525 (10Cmjohnson) I've been contacted by Dell regarding the support task. The part is back ordered and may be a few more days. [15:20:10] 06Operations, 10ops-codfw: Decomission mw2098 - https://phabricator.wikimedia.org/T164959#3263577 (10RobH) a:03RobH Ok, I'll steal this task for the decom, because it has to have a few things. 1) All decom tasks should be flagged with #hardware-requests 2) All decom tasks should have the decom checklist cop... [15:22:38] (03PS4) 10Giuseppe Lavagetto: restbase: convert test cluster to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/353048 [15:24:13] PROBLEM - carbon-cache too many creates on graphite1001 is CRITICAL: CRITICAL: 1.69% of data above the critical threshold [1000.0] [15:26:13] RECOVERY - carbon-cache too many creates on graphite1001 is OK: OK: Less than 1.00% above the threshold [500.0] [15:27:33] 06Operations, 10ops-codfw, 10hardware-requests: Decomission mw2098 - https://phabricator.wikimedia.org/T164959#3263657 (10RobH) a:05RobH>03faidon [15:29:16] 06Operations, 10ops-codfw, 10hardware-requests: Decomission mw2098 - https://phabricator.wikimedia.org/T164959#3263664 (10faidon) a:05faidon>03RobH Sounds fine, approved. 
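Service deploys like the citoid one above go through scap3 from the deployment server; a minimal sketch (the deploy-repo path is an assumption):

    # On tin, from the service's deploy repository
    cd /srv/deployment/citoid/deploy
    scap deploy 'Better publishing date extraction support - T132308'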
[15:30:15] 06Operations, 10ops-codfw, 10hardware-requests: Decomission mw2098 - https://phabricator.wikimedia.org/T164959#3263666 (10RobH) [15:30:18] (03PS5) 10Giuseppe Lavagetto: restbase: convert test cluster to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/353048 [15:37:31] 06Operations, 10ops-eqiad: decommission indium - https://phabricator.wikimedia.org/T165345#3263693 (10Jgreen) [15:38:03] 06Operations, 10ops-eqiad: Analytics1040 system board repair needed - https://phabricator.wikimedia.org/T164942#3263708 (10Cmjohnson) The new system board has been ordered through Dell but is back ordered....Should hopefully be in this week. [15:39:26] !log upgrade pybal to 1.13.6 across the LVS fleet [15:39:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:50:12] (03PS6) 10Giuseppe Lavagetto: restbase: convert test cluster to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/353048 [15:52:20] 06Operations, 10media-storage, 13Patch-For-Review, 15User-fgiunchedi: Implement storage policies for swift - https://phabricator.wikimedia.org/T151648#3263739 (10fgiunchedi) [15:55:21] (03PS7) 10Giuseppe Lavagetto: restbase: convert test cluster to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/353048 [16:00:37] (03PS3) 10Paladox: Gerrit: Remove "" around T\\d+ in gerrit.config [puppet] - 10https://gerrit.wikimedia.org/r/352710 [16:01:11] (03PS5) 10Paladox: Jenkins: Add noncanon to jenkins proxy site [puppet] - 10https://gerrit.wikimedia.org/r/351391 [16:21:43] PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received [16:21:53] PROBLEM - Check Varnish expiry mailbox lag on cp1099 is CRITICAL: CRITICAL: expiry mailbox lag is 2025896 [16:22:33] RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy [16:24:31] 06Operations, 10media-storage, 15User-fgiunchedi: Running swiftrepl is not puppetized - https://phabricator.wikimedia.org/T162123#3263834 (10fgiunchedi) [16:24:33] 06Operations, 15User-fgiunchedi: Reduce Swift technical debt - https://phabricator.wikimedia.org/T162792#3263833 (10fgiunchedi) [16:29:33] RECOVERY - nova instance creation test on labnet1001 is OK: PROCS OK: 1 process with command name python, args nova-fullstack [16:32:23] PROBLEM - Disk space on elastic1025 is CRITICAL: DISK CRITICAL - free space: /srv 61501 MB (12% inode=99%) [16:32:33] PROBLEM - nova instance creation test on labnet1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack [16:44:23] RECOVERY - Disk space on elastic1025 is OK: DISK OK [16:44:55] (03CR) 10Giuseppe Lavagetto: [C: 032] restbase: convert test cluster to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/353048 (owner: 10Giuseppe Lavagetto) [16:45:06] 06Operations, 10DBA: Adapt wmf-mariadb10 package for jessie or puppetize differently its service to adapt it to systemd - https://phabricator.wikimedia.org/T116903#3263938 (10jcrespo) we will repurpose this for stretch, we'll keep probably 10.0 on jessie using inet.d. [16:46:15] 06Operations, 10DBA: Adapt wmf-mariadb101 package for stretch and adapt its service to systemd - https://phabricator.wikimedia.org/T116903#3263939 (10jcrespo) [16:46:51] What CPUs does Wikipedia use? [16:48:36] setup_: it's all Intel except one system [16:49:32] interesting [16:49:46] Which specs are they? [16:50:03] PROBLEM - puppet last run on cerium is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues [16:52:03] RECOVERY - puppet last run on cerium is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [16:55:23] <_joe_> urandom: cerium is done, I restarted restbase and cassandra there with no issues [16:56:41] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3264019 (10RobH) So part of the issue on this system is it is a lease, not WMF owned. We cannot just use shelf spares, since we have to use ap... [16:57:21] _joe_: k, i'll have a look-see [16:59:15] _joe_: LGTM [16:59:24] <_joe_> urandom: great! [16:59:31] <_joe_> I'll do the main cluster tomorrow then [16:59:53] _joe_: you planning on a restart there as well? [16:59:58] or was that a precaution here? [16:59:59] <_joe_> no [17:00:00] k [17:00:04] gehel: Dear anthropoid, the time has come. Please deploy Weekly Wikidata query service deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T1700). [17:00:08] <_joe_> it was here since the seeds list is wrong [17:00:14] right [17:00:15] <_joe_> I hope that's not the case for the main cluster [17:00:23] heh [17:01:52] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3264027 (10Papaul) Wed 5/10/2017 10:45 AM Thank you Papaul, I have put in a request to Intel Support. They will reply with a form that we will... [17:03:33] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3264036 (10Papaul) Thu 5/11/2017 10:29 AM from Please see below. Bo Rivera Please see below. Please see below. Hello, An update was made to s... [17:05:33] _joe_: so i did this https://wikitech.wikimedia.org/w/index.php?title=Switch_Datacenter/DeploymentServer&diff=prev&oldid=1759120 but in the actual operations/switchdc i dont see it yet [17:07:08] <_joe_> mutante: sorry, it wasn't that, it was maintenance_server [17:07:25] <_joe_> (but I wanted to underline we need to check there too) [17:07:36] _joe_: ah, ok, yea makes sense [17:08:13] i see the mediawiki.py for that, yep [17:11:53] RECOVERY - Check Varnish expiry mailbox lag on cp1099 is OK: OK: expiry mailbox lag is 19864 [17:15:32] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3264134 (10RobH) Ok, I'm going to attempt to summarize what I know to be the current issue(s) with elastic2020. * System has issues starting b... 
[17:22:16] !log mobrovac@tin Started deploy [restbase/deploy@c70a1e1] (dev-cluster): Bring RESTBase up to date in the Dev Cluster [17:22:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:22:38] 06Operations: Check long-running screen/tmux sessions - https://phabricator.wikimedia.org/T165348#3264136 (10Reedy) [17:24:08] !log mobrovac@tin Finished deploy [restbase/deploy@c70a1e1] (dev-cluster): Bring RESTBase up to date in the Dev Cluster (duration: 01m 51s) [17:24:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:25:21] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3264141 (10Papaul) Email Dasher about the failed SSD may 1 Hello Brynden, I received the main board and was in the processing of installing and... [17:26:43] 06Operations, 10ops-eqiad, 10netops: Interface errors on asw-c-eqiad:xe-8/0/38 - https://phabricator.wikimedia.org/T165008#3264152 (10Cmjohnson) [17:27:15] 06Operations, 10ops-eqiad, 06Analytics-Kanban: analytics1030 stuck in console while booting - https://phabricator.wikimedia.org/T162046#3264155 (10Cmjohnson) [17:33:56] 06Operations, 10ops-eqiad, 15User-fgiunchedi: HP RAID icinga alert on ms-be1021 - https://phabricator.wikimedia.org/T163777#3264163 (10Cmjohnson) A case has been opened for this server. Let's work this one and them move on to the others...ms-be1016, 1019 and 1020 should be included in the list. Your case... [17:37:40] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3264166 (10RobH) Ok, I've emailed Dasher to inquire about this with the following: > Dasher Folks, > > So it seems some of this conversatio... [17:48:25] addshore: during the SWAT earlier, did you deploy only to group0? group1 and group2 are also running 1.30.0-wmf.1 [17:52:24] I deployed them to everything running the branch! [17:53:27] gilles: why? [17:53:50] ah, thanks, just checking [17:57:29] 06Operations, 10Analytics, 10Analytics-Cluster: rack/setup/install replacement to stat1003 (stat1004 or misc name?) - https://phabricator.wikimedia.org/T165366#3264224 (10RobH) [17:57:58] 06Operations, 10procurement: rack/setup/install replacement to stat1002 (stat1004 or misc name?) - https://phabricator.wikimedia.org/T165368#3264256 (10RobH) [17:58:03] 06Operations, 10Analytics, 10Analytics-Cluster: rack/setup/install replacement to stat1003 (stat1005 or misc name?) - https://phabricator.wikimedia.org/T165366#3264272 (10RobH) [17:58:45] 06Operations, 10Analytics-Cluster, 06Analytics-Kanban: Reinstall Analytics Hadoop Cluster with Debian Jessie - https://phabricator.wikimedia.org/T157807#3264282 (10RobH) [17:59:47] 06Operations, 10Analytics-Cluster, 06Analytics-Kanban: Reinstall Analytics Hadoop Cluster with Debian Jessie - https://phabricator.wikimedia.org/T157807#3017036 (10RobH) [18:00:05] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, and thcipriani: Respected human, time to deploy Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T1800). Please do the needful. [18:00:05] schana, Jdlrobson, and raynor: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be available during the process. 
[18:00:18] here \o [18:00:19] hello [18:00:26] 06Operations, 10Analytics, 10Analytics-Cluster, 10Research-management: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#3264307 (10RobH) [18:00:28] 06Operations, 10Analytics, 10Analytics-Cluster, 10hardware-requests: EQIAD: stat1002 replacement - https://phabricator.wikimedia.org/T159838#3264301 (10RobH) 05Open>03Resolved This is ordered and being received in on linked #procurement task, as well as setup on task T165368. As such, this #hw-request... [18:00:36] 06Operations, 10Analytics, 10Analytics-Cluster, 10hardware-requests: EQIAD: stat1003 replacement - https://phabricator.wikimedia.org/T159839#3264308 (10RobH) 05Open>03Resolved This is ordered and being received in on linked #procurement task, as well as setup on task T165366. As such, this #hw-request... [18:01:31] hello o/ [18:09:46] who's doing swat today? all of releng are out [18:09:57] aude: RainbowSprinkles RoanKattouw Dereckson ? [18:10:13] I can do it but lemme find a charger first [18:10:19] I would like my laptop to not die halfway :) [18:10:23] thanks RoanKattouw [18:11:17] RoanKattouw: I was on the Dover-Calais ferry yesterday [18:11:20] I so wanted to deploy something [18:11:41] Uh, no I wasn't [18:11:43] Dover-Dunkirk [18:12:19] haha [18:12:28] I tried to investigate today's VE UBNs from a train [18:12:48] Found a charger? [18:12:49] But the train wifi was broken and my 3G was pretty slow, so I didn't get anywhere before my train arrived [18:13:08] Yup I'm plugged in [18:14:22] (03CR) 10Catrope: [C: 032] Disable test reader QuickSurveys [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353876 (https://phabricator.wikimedia.org/T131949) (owner: 10Nschaaf) [18:15:06] that's what is cool about the Amtrak trains over here, they are slow enough for 3/4G roaming to still work [18:15:25] (03Merged) 10jenkins-bot: Disable test reader QuickSurveys [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353876 (https://phabricator.wikimedia.org/T131949) (owner: 10Nschaaf) [18:15:31] (and electric outlet) merged from train, successfully [18:15:51] (03CR) 10jenkins-bot: Disable test reader QuickSurveys [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353876 (https://phabricator.wikimedia.org/T131949) (owner: 10Nschaaf) [18:16:23] schana: Your patch is on mwdebug1002, please test [18:16:28] ack [18:17:24] looks good, thanks [18:20:39] !log catrope@tin Synchronized wmf-config/InitialiseSettings.php: Disable test reader QuickSurveys (T131949, T164769, T164894, T164960, T164943) (duration: 00m 40s) [18:20:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:20:50] T164894: Test reader survey in multiple languages - Japanese - https://phabricator.wikimedia.org/T164894 [18:20:50] T164943: Outline needed changes to github-webhook - https://phabricator.wikimedia.org/T164943 [18:20:50] T131949: Repeat the big English reader survey in one or two more languages - https://phabricator.wikimedia.org/T131949 [18:20:50] T164769: Test reader survey in multiple languages - Romanian - https://phabricator.wikimedia.org/T164769 [18:20:51] T164960: Test reader survey in multiple languages - German - https://phabricator.wikimedia.org/T164960 [18:31:11] @seen hashar [18:31:11] mutante: Last time I saw hashar they were quitting the network with reason: Quit: Textual IRC Client: www.textualapp.com N/A at 5/15/2017 11:47:41 AM (6h43m29s ago) [18:31:54] RoanKattouw: ready when you are [18:32:04] Oh it finally merged [18:32:15] Sorry it was taking so
long that I had gotten distracted catching up on other backlog [18:32:28] Thanks for the ping [18:33:07] jdlrobson: Ready for you on mwdebug1002 [18:33:18] \o/ [18:33:28] so raynor you know how to test this? [18:34:06] yes [18:34:25] I think yes [18:37:46] RoanKattouw: it works properly on debug1002 - good to go [18:37:57] RoanKattouw: yup same here [18:38:03] sync away [18:38:40] Cool, syncing [18:42:24] Hmm [18:42:26] 18:39:36 Check 'Logstash Error rate for mw1279.eqiad.wmnet' failed: ERROR: 18% OVER_THRESHOLD (Avg. Error rate: Before: 0.16, After: 2.00, Threshold: 1.63) [18:42:35] Let's see if I can find out what that was [18:42:49] It was only one of the canaries so I'm a bit skeptical [18:44:16] Oh it's because it's an API host, and there's an unrelated error coming from the API [18:44:24] Which I will fix later [18:44:32] Now trying to sync again, let's see if it'll let me get away with it this time [18:45:04] !log Canary failing on mw1279 due to Wikimedia\Rdbms\Database::makeList: empty input for field rev_id from ApiQueryRevisions [18:45:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:45:14] OK, it passed this time [18:45:16] !log catrope@tin Synchronized php-1.30.0-wmf.1/extensions/MobileFrontend/: Revert "Use csrf token for watching" (T165209) (duration: 00m 41s) [18:45:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:45:24] T165209: Watchstar feature broken: Tapping the watchlist star while logged in results in "mobile-frontend-watchlist-error" popup message - https://phabricator.wikimedia.org/T165209 [18:50:25] RoanKattouw: i've got to rush off but raynor will double check on production. Thanks for your help today! :) [18:50:55] Aha, the API error is fixed in master already thanks to andre__ [18:50:57] *anomie [18:51:01] yup - I'm here, just let me know when to check it [18:52:10] raynor: It's in production already, so please test it there now [18:52:34] You already tested it in debug so it's probably fine, but you can never have too much testing :) [18:52:35] on it [18:54:08] tested on two wikis, works [18:54:51] RoanKattouw: thanks for deployment, everything works properly \o/ [18:57:00] 06Operations, 10MediaWiki-ResourceLoader, 10MediaWiki-extensions-CentralNotice, 06Performance-Team, and 2 others: Provide location, logged-in status and device information in ResourceLoaderContext - https://phabricator.wikimedia.org/T103695#1396785 (10AndyRussG) @Krinkle Thanks so much for the explanation!... [19:01:52] 06Operations, 10Ops-Access-Requests: add Arzhel Younsi to datacenter access lists - https://phabricator.wikimedia.org/T165054#3264711 (10RobH) [19:03:14] 06Operations, 10Ops-Access-Requests: add Arzhel Younsi to datacenter access lists - https://phabricator.wikimedia.org/T165054#3255501 (10RobH) @ayounsi: You can now login to https://wikimedia.gocyrusone.com/ via your email address, and use the password reset option to get your codfw login details. You are no... [19:18:51] 06Operations, 10ops-eqiad: rack/setup/install replacement to stat1002 (stat1004 or misc name?) - https://phabricator.wikimedia.org/T165368#3264840 (10RobH) [19:19:39] 06Operations, 10ops-eqiad, 10Analytics, 10Analytics-Cluster: rack/setup/install replacement to stat1002 (stat1004 or misc name?) - https://phabricator.wikimedia.org/T165368#3264256 (10RobH) [19:19:44] 06Operations, 10ops-eqiad, 10Analytics, 10Analytics-Cluster: rack/setup/install replacement to stat1003 (stat1005 or misc name?) 
- https://phabricator.wikimedia.org/T165366#3264842 (10RobH) [19:32:32] (03CR) 10Dzahn: "the jdk package is already installed on contint1001, but not on contint2001. (manually installed?). going ahead." [puppet] - 10https://gerrit.wikimedia.org/r/348961 (owner: 10Chad) [19:32:44] (03PS4) 10Dzahn: Jenkins: install jdk, not just jre [puppet] - 10https://gerrit.wikimedia.org/r/348961 (owner: 10Chad) [19:34:20] (03CR) 10Dzahn: [C: 032] Jenkins: install jdk, not just jre [puppet] - 10https://gerrit.wikimedia.org/r/348961 (owner: 10Chad) [19:36:24] (03CR) 10Dzahn: "contint1001: no-op contint2001: Notice: /Stage[main]/Jenkins/Package[openjdk-7-jdk]/ensure: ensure changed 'purged' to 'present'" [puppet] - 10https://gerrit.wikimedia.org/r/348961 (owner: 10Chad) [19:37:52] (03PS4) 10Dzahn: Labs contint: Install php5-gmp and php7.0-gmp [puppet] - 10https://gerrit.wikimedia.org/r/353194 (https://phabricator.wikimedia.org/T164977) (owner: 10Paladox) [19:42:29] !log catrope@tin Synchronized php-1.30.0-wmf.1/includes/api/ApiQueryRevisions.php: T165100 (duration: 00m 40s) [19:42:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:42:38] T165100: Wikimedia\Rdbms\Database::makeList: empty input for field rev_id - https://phabricator.wikimedia.org/T165100 [19:55:53] PROBLEM - HP RAID on ms-be1020 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [20:00:05] gwicke, cscott, arlolra, subbu, bearND, halfak, and Amir1: Respected human, time to deploy Services – Parsoid / OCG / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T2000). Please do the needful. [20:00:16] Nothing for ORES [20:02:36] deploying parsoing in a little bit [20:02:38] parsoid [20:04:32] !log ssastry@tin Started deploy [parsoid/deploy@132d0e5]: Updating Parsoid to a182c227 [20:04:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:05:01] no MCS deploy today [20:10:11] 06Operations, 10DBA, 10Gerrit, 06Release-Engineering-Team, 13Patch-For-Review: Gerrit shows HTTP 500 error when pasting extended unicode characters - https://phabricator.wikimedia.org/T145885#3265015 (10Paladox) Not sure if we should bother doing this as I found problems when upgrading a gerrit install f... [20:11:54] !log ssastry@tin Finished deploy [parsoid/deploy@132d0e5]: Updating Parsoid to a182c227 (duration: 07m 21s) [20:12:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:20:02] !log Updated Parsoid to a182c227 (T141226, T164792, T37247, T153107, T163091, T164006, T161151, T162920, T163549) [20:20:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:20:20] T162920: In multi-content/template-block scenarios, Linter displays "--" in the "Through a template"? column - https://phabricator.wikimedia.org/T162920 [20:20:20] T161151: Parsoid should resolve template paths before providing them to Linter - https://phabricator.wikimedia.org/T161151 [20:20:20] T164792: Add class mw-parser-output to Parsoid's output - https://phabricator.wikimedia.org/T164792 [20:20:20] T164006: Suggestion: API for fetching lint errors for a specific revision - https://phabricator.wikimedia.org/T164006 [20:20:20] T163091: Parsoid: Add API endpoint to get lint errors for arbitrary wikitext - https://phabricator.wikimedia.org/T163091 [20:20:20] T153107: Parsoid is generating [[Foo|Foo]] instead of [[Foo]] for some VE edits - https://phabricator.wikimedia.org/T153107 [20:20:21] T37247: content-holding
should only contain the page text - https://phabricator.wikimedia.org/T37247 [20:20:21] T141226: Missing data-mw content in wikitext leads to html2wt exceptions - https://phabricator.wikimedia.org/T141226 [20:20:22] T163549: Only lint pages that have wikitext contentmodel - https://phabricator.wikimedia.org/T163549 [20:20:44] PROBLEM - Nginx local proxy to apache on mw1263 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.156 second response time [20:21:43] RECOVERY - Nginx local proxy to apache on mw1263 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 613 bytes in 0.232 second response time [20:25:22] Hmm. That polluted the 'mention' lists on all those tasks with tasks that are completely unrelated except that they happen to have been deployed in Parsoid at the same time. [20:26:32] anomie: that's probably intentional, to note that the fixes for those have been deployed, since i guess parsoid doesn't have a strict release schedule like mediawiki. [20:26:50] oh, you mean, each task refers to each other. hmm. [20:27:05] that's silly but harmless! [20:29:50] anomie, i suppose the alternative is to have n log statements .. which can be painful. [20:30:32] or maybe a new stashbot feature to edit out unrelated mentions. [20:30:33] RECOVERY - nova instance creation test on labnet1001 is OK: PROCS OK: 1 process with command name python, args nova-fullstack [20:30:54] but what is the standard practice for doing this? [20:32:12] I don't know, most things ride the train and so probably could use the tags like "MW-1.30-release-notes (WMF-deploy-2017-05-23_(1.30.0-wmf.2))" that get bot-added on merge. What's the problem being solved by mentioning all the task numbers? [20:32:32] we resolve tasks once they are gerrit-merged. [20:32:50] the stashbot mention is a notification that the code is now actually live in production. [20:33:33] PROBLEM - nova instance creation test on labnet1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack [20:38:59] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3265142 (10Papaul) @Robh yes we do; but there are 300GB [20:42:40] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3265143 (10RobH) @Papaul: The spares tracking shows that we have 3 of the Intel S3610 800GB ssds on the spare shelf? We recently ordered thes... [20:47:09] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3265144 (10Papaul) @Robh yes we do have some 800GB SSDs for spare but the one we are trying to replace is DC S3500 series. [20:48:16] !log run refreshImageMetadata --force for group1 + group2 wikis except commons on terbium T150741 [20:48:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:48:23] T150741: Thumbor should reject thumbnail requests that are the same size as the original or bigger - https://phabricator.wikimedia.org/T150741 [20:53:45] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3265154 (10RobH) Ahh, sorry for the miscommunication then. So, here is where we stand on this system * It is a lease, if a shelf spare is use... 
[20:54:42] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3265155 (10RobH) @Gehel: Can you advise if this can remain offline for another week or two for the SSD replacement. See my comment above for f... [21:00:05] dapatrick, bawolff, and Reedy: Dear anthropoid, the time has come. Please deploy Weekly Security deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T2100). [21:08:03] PROBLEM - puppet last run on contint2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [21:13:08] !log mobrovac@tin Started deploy [restbase/deploy@c52add0]: Expose the new /transform/wikitext/to/lint end point to the public - T163091 [21:13:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:13:16] T163091: Parsoid: Add API endpoint to get lint errors for arbitrary wikitext - https://phabricator.wikimedia.org/T163091 [21:15:36] (03PS1) 10RobH: decommission mw2098 [puppet] - 10https://gerrit.wikimedia.org/r/353918 [21:17:22] (03PS1) 10RobH: decommission mw2098 (production dns) [dns] - 10https://gerrit.wikimedia.org/r/353920 [21:18:48] 06Operations, 10ops-codfw, 10hardware-requests, 13Patch-For-Review: Decomission mw2098 - https://phabricator.wikimedia.org/T164959#3265240 (10RobH) a:05RobH>03Papaul @Papaul: Before I move through the checklist and disable everything, I'll need to know what the switch port is for this server? The mw... [21:19:40] !log mobrovac@tin Finished deploy [restbase/deploy@c52add0]: Expose the new /transform/wikitext/to/lint end point to the public - T163091 (duration: 06m 32s) [21:19:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:19:48] T163091: Parsoid: Add API endpoint to get lint errors for arbitrary wikitext - https://phabricator.wikimedia.org/T163091 [21:20:21] 06Operations, 10ops-codfw, 10hardware-requests, 13Patch-For-Review: Decomission mw2098 - https://phabricator.wikimedia.org/T164959#3265245 (10RobH) [21:20:35] 06Operations, 10ops-codfw, 10hardware-requests, 13Patch-For-Review: Decomission mw2098 - https://phabricator.wikimedia.org/T164959#3252410 (10RobH) [21:23:22] 06Operations, 06Performance-Team, 10Thumbor, 05MW-1.30-release-notes (WMF-deploy-2017-05-09_(1.30.0-wmf.1)), 13Patch-For-Review: Thumbor should reject thumbnail requests that are the same size as the original or bigger - https://phabricator.wikimedia.org/T150741#3265252 (10Gilles) [21:23:26] (03PS4) 10XXN: Fixing "Book_talk" namespace definition for ro.wikipedia: [mediawiki-config] - 10https://gerrit.wikimedia.org/r/352728 [21:25:29] (03Draft2) 10Zppix: Raise the account creation limit for www.enwp.org/WP:Meetup/Eugene/WikiAPA [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353921 [21:30:01] jouncebot: next [21:30:01] In 1 hour(s) and 29 minute(s): Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T2300) [21:31:23] PROBLEM - Nginx local proxy to apache on mw1181 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.152 second response time [21:31:23] PROBLEM - Apache HTTP on mw1181 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.074 second response time [21:31:46] jouncebot: refresh [21:31:48] I refreshed my knowledge about deployments. 
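For context on XXN's change above (gerrit 352728, adjusting the "Book_talk" namespace alias for ro.wikipedia): per-wiki namespace aliases at WMF live in wmf-config's InitialiseSettings.php. Below is a hypothetical sketch of the shape of such an entry — the alias string and the namespace ID 103 are placeholders, not the actual contents of the patch:

```php
// Fragment of the big settings array in wmf-config/InitialiseSettings.php.
// Placeholder values only; see Gerrit change 352728 for the real patch.
'wgNamespaceAliases' => [
	// The '+' prefix merges these aliases into the defaults for rowiki
	// instead of replacing them.
	'+rowiki' => [
		'Discuție_Carte' => 103, // hypothetical alias -> Book_talk namespace ID
	],
],
```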
[21:32:23] RECOVERY - Nginx local proxy to apache on mw1181 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 613 bytes in 0.174 second response time [21:32:23] RECOVERY - Apache HTTP on mw1181 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 612 bytes in 0.100 second response time [21:36:03] RECOVERY - puppet last run on contint2001 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [21:49:22] hey does throttle.php support ipv6? [21:53:39] (03CR) 10Milimetric: "I have a couple of questions. First, does any other config need to change for the Collection extension to recognize the new namespace nam" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/352728 (owner: 10XXN) [21:56:11] (03CR) 10Hashar: "Danke!!!" [puppet] - 10https://gerrit.wikimedia.org/r/348961 (owner: 10Chad) [21:59:19] (03PS3) 10Zppix: Raise the account creation limit for www.enwp.org/WP:Meetup/Eugene/WikiAPA [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353921 [22:01:11] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3265416 (10Gehel) Yes, elastic2020 can stay offline for one more week. [22:05:43] to evening swat swatter I may be a bit late fyi [22:16:50] !log mobrovac@tin Started deploy [restbase/deploy@d98af6f]: Wt2lint bug fix - T163091 [22:16:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:16:57] T163091: Parsoid: Add API endpoint to get lint errors for arbitrary wikitext - https://phabricator.wikimedia.org/T163091 [22:23:34] !log mobrovac@tin Finished deploy [restbase/deploy@d98af6f]: Wt2lint bug fix - T163091 (duration: 06m 44s) [22:23:40] (03PS5) 10XXN: Fixing "Book_talk" namespace alias for ro.wikipedia: [mediawiki-config] - 10https://gerrit.wikimedia.org/r/352728 [22:23:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:23:43] T163091: Parsoid: Add API endpoint to get lint errors for arbitrary wikitext - https://phabricator.wikimedia.org/T163091 [22:24:22] 06Operations, 10ops-codfw, 06DC-Ops, 06Discovery, and 3 others: elastic2020 is powered off and does not want to restart - https://phabricator.wikimedia.org/T149006#3265838 (10RobH) cool, we'll avoid using a shelf spare then and i'll be following up with dasher on a daily basis until resolution. [22:25:34] (03CR) 10XXN: [C: 031] "1. AFAIK - no; 2. The default Namespace definitions were already set in /r/#/c/139766/ In fact this is a namespace alias (for accessibili" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/352728 (owner: 10XXN) [22:37:33] PROBLEM - HP RAID on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. 
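On the throttle.php IPv6 question above: the account-creation throttle exceptions in wmf-config are plain PHP array entries, and the ranges are matched with MediaWiki's IP utilities, which understand both IPv4 and IPv6 CIDR notation. A minimal sketch of the usual shape of an entry — every value below is a placeholder, not the content of Zppix's change 353921:

```php
// Hypothetical wmf-config/throttle.php entry; all values are placeholders.
$wmgThrottlingExceptions[] = [
	'from'   => '2017-05-26T00:00 -7:00',            // window start
	'to'     => '2017-05-27T00:00 -7:00',            // window end
	'range'  => [ '192.0.2.0/24', '2001:db8::/32' ], // IPv4 and IPv6 ranges both work
	'dbname' => [ 'enwiki' ],                        // wikis the exception applies to
	'value'  => 50,                                  // raised per-IP account creation limit
];
```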
[22:37:51] (03PS1) 10Dzahn: wikistats: add support for Debian stretch [puppet] - 10https://gerrit.wikimedia.org/r/353926 [22:40:29] (03PS2) 10Dzahn: wikistats: add support for Debian stretch [puppet] - 10https://gerrit.wikimedia.org/r/353926 [22:41:00] (03PS3) 10Dzahn: wikistats: add support for Debian stretch [puppet] - 10https://gerrit.wikimedia.org/r/353926 [22:42:13] RECOVERY - HP RAID on ms-be1035 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, Controller, Battery/Capacitor [22:42:38] (03CR) 10Dzahn: [C: 032] wikistats: add support for Debian stretch [puppet] - 10https://gerrit.wikimedia.org/r/353926 (owner: 10Dzahn) [22:51:32] (03PS4) 10Zppix: Raise the account creation limit for www.enwp.org/WP:Meetup/Eugene/WikiAPA [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353921 (https://phabricator.wikimedia.org/T165421) [22:55:50] jouncebot: next [22:55:51] In 0 hour(s) and 4 minute(s): Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T2300) [22:57:33] PROBLEM - HP RAID on ms-be1019 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. [22:59:13] (03PS1) 10Dzahn: wikistats: more stretch support, php-cli package [puppet] - 10https://gerrit.wikimedia.org/r/353928 [22:59:58] (03PS2) 10Dzahn: wikistats: more stretch support, php-cli package [puppet] - 10https://gerrit.wikimedia.org/r/353928 [23:00:05] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, and thcipriani: Respected human, time to deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170515T2300). Please do the needful. [23:00:05] mooeypoo and Zppix: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be available during the process. [23:00:10] o/ [23:00:56] \o [23:01:26] (03CR) 10Dzahn: [C: 032] wikistats: more stretch support, php-cli package [puppet] - 10https://gerrit.wikimedia.org/r/353928 (owner: 10Dzahn) [23:02:18] !deploy_roulette [23:02:20] feel free to do mooeypoo's patch first as mine takes little time [23:02:30] * Zppix spins the bottle for mutante [23:02:35] * Zppix lands on mutante [23:02:44] xD [23:02:44] evades the bottle [23:03:05] mutante: the bottle cannot be evaded [23:03:05] * mooeypoo dances towards the bottle [23:03:34] Zppix: jouncebot says "user not found" in deployer list [23:04:08] jouncebot: told me that it needed a second to re-query the db and that it queried again and it chose yours lol [23:04:20] mutante: [23:07:24] Zppix: nice try, can't re-use db connection. already open. seriously, i'm not deploying and jouncebot already pings the people [23:08:33] if you have something for puppet swat that would be different [23:09:24] mutante: :( [23:09:45] who's gonna deploy [23:10:38] if no one can deploy, you just reschedule for another day unless it's critically urgent [23:12:25] maybe they will deploy but at "Vienna"-window. [23:12:25] i thought this week was a code freeze week due to offsites? [23:12:51] ahh, there is a note that no train [23:13:01] and any swats involving release engineering should be delayed [23:13:06] or others maybe avail [23:13:23] heh. thanks rob [23:13:34] https://wikitech.wikimedia.org/wiki/Deployments#Week_of_May_15th [23:14:09] .... but are there no swatters... [23:14:59] Zppix: your patch isn't critical for this week.
The event it raises the limit for is 2 weeks out [23:15:21] Zppix: people are travelling today/tomorrow, i'm sure people will be willing to help later on [23:15:31] Wait, no swatters? [23:15:43] mooeypoo: you have the powers right? [23:15:47] I do not [23:15:54] I also am not sure how to use said powers [23:16:08] but as others have pointed out, non critical things should probably be delayed a little bit [23:16:16] This is fairly critical? [23:16:22] mooeypoo: looking [23:17:16] mooeypoo's patch is pretty safe looking and it fixes a user facing bug [23:17:19] I acn deploy it [23:17:21] *can [23:17:28] bd808: mine is basic doesn't even require testing [23:17:54] Yeah it's fixing a bad bug in RCFilters, which people are excitedly using after the blog post [23:18:26] * bd808 waits for jerkins [23:18:47] * mooeypoo awaits to test/verify [23:20:16] mooeypoo: you should learn how to deploy too. It's both fun and useful. :) [23:20:36] you have the shirt already too! [23:20:49] * mooeypoo nods [23:20:56] I am worried I'd need a jacket [23:21:02] But yes, I should [23:21:09] What's the upgrade over a shirt? A hat? [23:21:30] I think a facial tattoo ;) [23:21:35] mooeypoo: if you are going to the hackathon, Reedy will probably teach you [23:21:39] rofl [23:21:43] * mooeypoo will ask [23:21:49] Roan can probably do that too [23:21:59] yes, also a good choice [23:22:05] I'll have a deploy-party. There should be cookies somewhere there. [23:22:44] Sam loves to do deploys from the hackathon. Bonus is that it always makes Greg nervous. [23:23:10] * bd808 watches little progress meters crawl on the zuul status page [23:23:17] air deploy from 10,000ft [23:23:48] I don't know if he's done that one yet. There was the English channel crossing train deploy though [23:24:05] * bd808 does not recommend [23:24:47] he lost wifi mid-scap and didn't finish until he had driven into Berlin [23:24:56] :o [23:25:00] domas whitepaged the site from the plane (and fixed it) [23:25:41] sam has deployed from everything I think plane/train/boat/meetups [23:26:04] Does that count for a face tattoo? [23:26:26] I'm just looking for the requirements [23:27:01] I was thinking something along the lines of the "poor impulse control" tattoo from Snow Crash [23:27:16] ha [23:27:23] that trusty test is not speedy... [23:27:42] mooeypoo: to get the face tattoo it requires deployment from the sun XD [23:27:52] bd808: i hope you mean jessie [23:28:37] Zppix: nope. we still run trusty on gate-and-submit. wikitech is still running on php 5.whatever [23:28:52] i thought releng got rid of trusty... [23:29:00] Zppix, what, like this ? https://upload.wikimedia.org/wikipedia/commons/thumb/3/32/SPARCstation_1.jpg/220px-SPARCstation_1.jpg [23:29:15] no i mean ON the sun mooeypoo [23:29:24] E_TOOHOT [23:29:33] RECOVERY - nova instance creation test on labnet1001 is OK: PROCS OK: 1 process with command name python, args nova-fullstack [23:29:37] Zppix, it won't be very comfortable sitting on that one, though doable. [23:29:56] bd808: you mean WMF doesn't have standard-issue sun suits? [23:30:01] I had an E450 that made a good coffee table [23:30:16] Zppix, maybe I should've shared this one to be more explicit in my joke https://en.wikipedia.org/wiki/Sun_Microsystems#/media/File:SPARCstation_1.jpg [23:30:36] mooeypoo: that's when you file for unsafe working conditions :P [23:30:48] As opposed to deploying from The Sun [23:31:08] I am slowly building the image of what your requirements for this look like, Zppix [23:31:27] mooeypoo: and?
[23:31:29] * bd808 grumbles that 12 minutes have passed and the tests are still running [23:31:34] https://wikitech.wikimedia.org/wiki/Obsolete:Sun_storage [23:31:39] bd808: try turning it off and on again :P [23:31:40] ^ yes, WMF once had a sun [23:32:00] toolserver was mostly Sun hardware too [23:32:18] no i mean this sun https://en.wikipedia.org/wiki/File:The_Sun_by_the_Atmospheric_Imaging_Assembly_of_NASA%27s_Solar_Dynamics_Observatory_-_20100819.jpg [23:32:33] PROBLEM - nova instance creation test on labnet1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack [23:32:39] ^ talk about long link [23:33:02] Zppix, interesting, I don't recognize this version [23:33:07] Is it running Oracle? [23:33:24] mooeypoo: no, it's running Hot_AF v0.1 [23:33:47] Zppix, you haven't seen long links until you worked with RCFilters (incidentally, I'm working on a fix for that now...) [23:33:54] ok, jerkins finally finished [23:33:58] yay [23:34:02] where do I test [23:34:22] mooeypoo: no i have [23:35:16] for example mooeypoo this used to be a website www.thisisaverylongurlidontknowwhyiregisteredthis.com/youthoughtiwasdoneyouwerewrong/stillnotdone/hi [23:35:19] mooeypoo: It's on mwdebug1001 now [23:36:11] * mooeypoo goes to test [23:39:00] uhm.. I'm testing on enwiki with the chrome extension on 1001 and I don't see the fix running, am I doing it wrong? [23:39:19] * mooeypoo does a hard refresh [23:39:20] hmmm... maybe. It seems to be working for me. [23:39:22] hang on [23:39:29] https://www.mediawiki.org/wiki/Special:RecentChanges?hideliu=0&hideanons=0&userExpLevel=&hidemyself=0&hidebyothers=0&hidebots=1&hidehumans=0&hidepatrolled=0&hideunpatrolled=0&hideminor=0&hidemajor=0&hidelastrevision=0&hidepreviousrevisions=0&hidepageedits=0&hidenewpages=0&hidecategorization=1&hidelog=0&watchlist=&highlight=1&registration__hideanons_color=c5&changeType__hidenewpages_color=c1&userExpLevel__newcomer_color=c3 [23:39:46] YES! works now [23:39:52] sweet. [23:39:57] ok stupid chrome and its immovable cache [23:41:50] !log bd808@tin Synchronized php-1.30.0-wmf.1/resources/src/mediawiki.rcfilters/mw.rcfilters.Controller.js: RCFilters: Actually read/write highlight parameter (T165107) (duration: 00m 40s) [23:41:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:41:59] T165107: Highlight settings contained in RC Page URLs fail to load - https://phabricator.wikimedia.org/T165107 [23:42:03] \o/ [23:42:31] it may take a while for that to propagate through varnish [23:42:43] thanks bd808 [23:43:11] I just hard-refreshed without the testing extension on, on enwiki and it works [23:43:13] thanks!
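The "chrome extension" used in the testing above is WikimediaDebug, which routes a request to a chosen debug backend by setting an X-Wikimedia-Debug header. The same thing can be done by hand; the sketch below assumes the backend=<host> attribute form of the header, so treat the exact header value as an assumption:

```php
<?php
// Pin a request to a specific debug backend, as the WikimediaDebug
// browser extension does. Header attribute/value assumed, based on the
// hosts named in the log (mwdebug1001/mwdebug1002).
$context = stream_context_create( [
	'http' => [
		'method' => 'GET',
		'header' => "X-Wikimedia-Debug: backend=mwdebug1001.eqiad.wmnet\r\n",
	],
] );
$html = file_get_contents( 'https://www.mediawiki.org/wiki/Special:RecentChanges', false, $context );
echo $html === false ? "request failed\n" : substr( $html, 0, 200 ) . "\n";
```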
[23:48:07] (03PS1) 10Dzahn: wikistats: puppetize deploy script [puppet] - 10https://gerrit.wikimedia.org/r/353932 [23:49:16] (03CR) 10jerkins-bot: [V: 04-1] wikistats: puppetize deploy script [puppet] - 10https://gerrit.wikimedia.org/r/353932 (owner: 10Dzahn) [23:51:21] (03PS2) 10Dzahn: wikistats: puppetize deploy script [puppet] - 10https://gerrit.wikimedia.org/r/353932 [23:52:22] (03CR) 10jerkins-bot: [V: 04-1] wikistats: puppetize deploy script [puppet] - 10https://gerrit.wikimedia.org/r/353932 (owner: 10Dzahn) [23:54:15] (03PS3) 10Dzahn: wikistats: puppetize deploy script [puppet] - 10https://gerrit.wikimedia.org/r/353932 [23:55:11] (03CR) 10jerkins-bot: [V: 04-1] wikistats: puppetize deploy script [puppet] - 10https://gerrit.wikimedia.org/r/353932 (owner: 10Dzahn) [23:56:35] (03PS4) 10Dzahn: wikistats: puppetize deploy script [puppet] - 10https://gerrit.wikimedia.org/r/353932