[00:22:05] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [00:24:55] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical [01:33:36] bblack, around? [01:48:15] (03PS1) 10Ori.livneh: Be multithreaded. [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 [02:08:18] (03PS1) 10Faidon Liambotis: auto-install: move private1-ulsfo to module [operations/puppet] - 10https://gerrit.wikimedia.org/r/101799 [02:08:19] (03PS1) 10Faidon Liambotis: auto-install: disable swap on appservers (mw.cfg) [operations/puppet] - 10https://gerrit.wikimedia.org/r/101800 [02:11:13] !log LocalisationUpdate completed (1.23wmf6) at Mon Dec 16 02:11:13 UTC 2013 [02:11:31] Logged the message, Master [02:13:11] !log salt swapoff -a; sed -i "/swap/d" /etc/fstab on all srv*, mw* [02:13:26] Logged the message, Master [02:13:33] (03CR) 10Faidon Liambotis: [C: 032] auto-install: move private1-ulsfo to module [operations/puppet] - 10https://gerrit.wikimedia.org/r/101799 (owner: 10Faidon Liambotis) [02:14:07] (03CR) 10Faidon Liambotis: [C: 032] auto-install: disable swap on appservers (mw.cfg) [operations/puppet] - 10https://gerrit.wikimedia.org/r/101800 (owner: 10Faidon Liambotis) [02:19:59] !log LocalisationUpdate completed (1.23wmf7) at Mon Dec 16 02:19:59 UTC 2013 [02:20:16] Logged the message, Master [02:34:36] !log LocalisationUpdate ResourceLoader cache refresh completed at Mon Dec 16 02:34:36 UTC 2013 [02:34:51] Logged the message, Master [02:42:04] hey uhm [02:42:07] made a machine on labs [02:42:18] connected once to it [02:42:23] then couldn't connect to it anymore [02:42:30] tried to install a package on it from a deb [02:42:34] dpkg just stalled [02:43:17] and the link was quite slow, not sure why, maybe cause I'm on the other side of the ocean ? [02:43:51] well, yeah, anyway. 
I'll probably circle around tomorrow again about this, probably not a good time right now [02:45:12] it's not the right time nor the right channel :) [02:45:47] paravoid: true [02:53:17] (03CR) 10Faidon Liambotis: [C: 04-1] "Good stuff! (very cursory look)" (033 comments) [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [03:37:43] (03CR) 10MZMcBride: "Hashar: jenkins-bot seems to be complaining about RewriteEngine, but I'm not sure why." [operations/apache-config] - 10https://gerrit.wikimedia.org/r/101787 (owner: 10John F. Lewis) [04:09:31] has gerrit.wikimedia.org key changed? [04:09:36] RSA key fingerprint is 83:fe:34:4b:16:2c:9e:95:1d:f6:d7:7d:ee:28:03:02. [04:11:57] hmm, actually i can't upload anything to gerrit, :( [04:15:22] yurik-road: you exceeded your patch quota [04:15:32] you have to relax until january [04:15:40] ori-l, funny :) [04:15:55] although, ori-l, who should be talking! :-P [04:16:03] how's your tooth doing? [04:16:18] git pull fails :( [04:17:35] it's back! [04:17:51] my tooth is awful :/ [04:17:59] :( [04:24:23] ori-l: https://github.com/trebuchet-deploy/trigger#extending-trigger [04:24:50] specifically: https://github.com/trebuchet-deploy/trigger#extending-trigger [04:24:55] PROBLEM - MySQL Slave Running on db1026 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: Error Deadlock found when trying to get lock: try restarting transac [04:25:34] ugh. stupid markdown [04:25:54] Ryan_Lane: that is abusing decorators a little, I think -- composing classes via inheritance is a better model when you need this much configurability [04:26:05] like Django views, or python's threading library for that matter [04:26:18] you subclass thread and override run [04:26:30] I'm following an openstack model here [04:26:45] I think the decorators way of handling this is rather nice. [04:26:45] do you fabric?
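The two extension styles being contrasted here, openstack/novaclient-style per-argument decorators versus ori-l's subclass-and-override suggestion, can be sketched roughly like this. All names (`arg`, `Action`, `do_deploy`) are hypothetical illustrations for the sake of comparison, not trigger's or novaclient's actual API:

```python
import argparse

# Hypothetical sketch of both extension styles; neither is trigger's or
# novaclient's real API.

# Style 1: decorators that attach argparse specs to the handler function.
def arg(*args, **kwargs):
    def decorator(func):
        # Decorators apply bottom-up, so insert at the front to preserve
        # the top-to-bottom declaration order.
        func.__dict__.setdefault('arguments', []).insert(0, (args, kwargs))
        return func
    return decorator

@arg('--tag', help='deployment tag')
@arg('repo', help='repository to deploy')
def do_deploy(args):
    return 'deploying %s at %s' % (args.repo, args.tag)

def build_parser_from_function(func):
    # The framework later walks the attached specs to build a parser.
    parser = argparse.ArgumentParser()
    for a, kw in getattr(func, 'arguments', []):
        parser.add_argument(*a, **kw)
    return parser

# Style 2: subclass-and-override, as with threading.Thread or Django
# class-based views.
class Action:
    def add_arguments(self, parser):
        pass
    def run(self, args):
        raise NotImplementedError

class DeployAction(Action):
    def add_arguments(self, parser):
        parser.add_argument('repo')
        parser.add_argument('--tag')
    def run(self, args):
        return 'deploying %s at %s' % (args.repo, args.tag)
```

The decorator style keeps the argparse spec physically next to the handler; the subclass style hands the extension a real parser object, which is what ori-l's later point about mutually exclusive groups hinges on.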
[04:26:55] RECOVERY - MySQL Slave Running on db1026 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [04:27:12] do you know fabric, even [04:27:41] I know of its existence [04:28:02] I decided against it pretty early on [04:28:10] due to its reliance on ssh [04:29:25] oh, yeah, i wasn't suggesting using it [04:29:40] anyway, abusing decorators like this lets you configure each function without the overhead of a class [04:29:42] it has some nice patterns for building up a library of snippets of code for remote execution [04:29:56] ah. right. this is not for remote execution [04:30:00] i was just going to suggest robbing it for ideas [04:30:11] this is just for extending argparse [04:30:44] meh [04:30:50] all of the remote stuff occurs via salt [04:31:02] (03PS1) 10Springle: depool db1026 during wikidata.wb_terms schema changes (slave sql thread deadlocks if attempted while online) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101807 [04:31:08] a while back i wrote this thing that used the inspect module to get the function signature and generate an argparser based on that [04:31:24] (03CR) 10Springle: [C: 032] depool db1026 during wikidata.wb_terms schema changes (slave sql thread deadlocks if attempted while online) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101807 (owner: 10Springle) [04:31:45] it's cute but ultimately annoying and inflexible [04:31:55] what is? the decorators? [04:32:19] a little, yeah [04:32:27] in which ways is it inflexible?
[04:32:42] !log springle synchronized wmf-config/db-eqiad.php 'depool db1026 during schema changes' [04:33:00] Logged the message, Master [04:33:08] well, it's not always easy to anticipate, but here's one off the top of my head [04:33:15] implementing argument mutual exclusion [04:33:21] or argument groups [04:33:26] both supported by argparse [04:33:33] if your decorator was just [04:33:47] @util.args(argument_parser_instance) you could do it [04:37:41] https://github.com/openstack/python-novaclient/blob/master/novaclient/v3/shell.py#L207 [04:38:10] ugh [04:38:24] that is pretty horrible, come on [04:38:33] it's two screenfuls of decorators [04:38:56] you'd have two screenfuls of argparse extension there no matter what [04:39:34] I'm not opposed to another method of extension, but this one is relatively straightforward [04:40:03] and the code is easily adapted, since it's the same license [04:40:11] re: two screenfuls no matter what [04:40:37] yes, but they nevertheless deviated from the standard argparse pattern, presumably because they think this syntax is clearer or more convenient [04:41:02] and i just don't think it's true, since the meaning a decorator conveys most eloquently is: "this function you're about to see, it's a <...>" [04:41:34] examples: flask's @route (it's the handler for /index.html) django's @signal_handler, etc [04:41:55] when you see the '@' you think: i'm about to see a function [04:42:04] but then you have to put that in a buffer while you're reading unrelated things [04:42:42] this thing that i'll show you in a moment, once i show it to you, which will be shortly, like no more than another line or two, then you'll see, that is is, a thing that, ... [04:44:14] anyways, code aesthetics are subjective, and if you find that it's a good API, then don't let me and my toothache get you down [04:44:19] heh [04:44:41] no worries. 
I understand your dislike of the code [04:45:08] one of the reasons I liked this model was that it kept the argparse code in the same place as the action [04:45:18] and the extension model is specific to actions [04:45:31] an alternative would be to limit each action to an extension [04:46:29] and handle the subparser via a function in the extension [04:47:20] anyway, it's not a major change for either myself or extension authors down the line if I switch up the model [04:47:50] I was mostly point this out to let you know it's now possible to add a 'traps' action, or something like that [04:48:01] I was thinking that git notes could be a good way of handling that [04:48:11] *pointing [04:55:31] i've never used git notes [04:55:37] been meaning to try them [04:58:15] I believe if you add notes to a repo they stick all the way through [04:59:11] yep [05:01:23] (03PS1) 10Legoktm: Add MassMessage to $wgDebugLogGroups [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101809 [05:02:05] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [05:02:36] ori-l: ^ too. I'm not sure if anything else needs to be done to make that work... [05:02:49] and thanks :D [05:03:05] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical [05:05:08] legoktm: you can create a remote branch using gerrit's UI, called 1.23wmf6, and specify 8077269c2120bc39aa43bfb62b4ee267847f34f3 as its starting point, because that's the commit wmf6 is currently on [05:05:14] and you can cherry-pick the debug logging patch into that branch [05:05:27] if you do that, I could sync it [05:05:44] sure [05:08:52] ori-l: done [05:10:32] legoktm: you should also bump the submodule commit in the wmf6 branch [05:10:55] ok [05:13:38] (03CR) 10Ori.livneh: Be multithreaded.
(031 comment) [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [05:16:12] ori-l: is there an interface in gerrit to do that, or do I need to do it manually? [05:17:25] manually. so, assuming you have a 1.22wmf6 branch tracking origin/1.22wmf6, and assuming it's checked out: [05:17:45] 1.23, i mean [05:18:01] I found http://stackoverflow.com/questions/8191299/update-a-submodule-to-the-latest-commit [05:18:23] git submodule update --init extensions/MassMessage ; cd extensions/MassMessage ; git fetch ; git checkout origin/1.23wmf6 ; cd ../.. ; git add extensions/MassMessage ; git commit -m 'Updating MassMessage to tip of 1.22wmf6 branch' ; git review [05:19:30] ok [05:29:56] (03CR) 10Ori.livneh: [C: 032] Add MassMessage to $wgDebugLogGroups [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101809 (owner: 10Legoktm) [05:30:27] (03Merged) 10jenkins-bot: Add MassMessage to $wgDebugLogGroups [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101809 (owner: 10Legoktm) [05:34:29] !log ori synchronized php-1.23wmf6/extensions/MassMessage/MassMessageJob.php 'Iec240623a: Add debug logging for bug 57464' [05:34:40] !log ori updated /a/common to {{Gerrit|If79a9443a}}: Add MassMessage to $wgDebugLogGroups [05:34:45] Logged the message, Master [05:35:00] Logged the message, Master [05:35:52] !log ori synchronized wmf-config/InitialiseSettings.php 'If79a9443a: Add MassMessage to ' [05:36:02] * ori-l headdesks. [05:36:08] :/ [05:36:09] Logged the message, Master [05:36:19] I always forget to escape $ in sync messages [05:36:31] do you know how long it will take the code to propagate to the job queue runners? [05:36:43] negative one minute? 
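Circling back to the argparse thread above: ori-l's suggestion that a decorator accepting a whole preconfigured parser, i.e. `@util.args(argument_parser_instance)`, would keep features like mutually exclusive groups and argument groups available can be sketched minimally as follows. The names `args` and `do_deploy` are hypothetical, not trigger's real API:

```python
import argparse

def args(parser):
    """Hypothetical @util.args-style decorator: attach a fully configured
    ArgumentParser to the handler function, so any argparse feature
    (argument groups, mutual exclusion) remains usable."""
    def decorator(func):
        func.parser = parser
        return func
    return decorator

# Building the parser directly makes mutual exclusion trivial, which is
# awkward to express through per-argument decorators.
deploy_parser = argparse.ArgumentParser(prog='deploy')
target = deploy_parser.add_mutually_exclusive_group()
target.add_argument('--all', action='store_true')
target.add_argument('--host')

@args(deploy_parser)
def do_deploy(ns):
    # Report which deployment target was selected.
    return 'all hosts' if ns.all else ns.host
```

Passing `--all` and `--host` together now fails at parse time for free, because the real `ArgumentParser` enforces the exclusion.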
[05:36:50] :D [05:36:57] let me send a test message [05:39:16] [[Special:Log/massmessage]] skipbadns * MediaWiki message delivery * Delivery of "Testing [[bugzilla:57464|bug 57464]]" to [[Legoktm]] was skipped because target was in a namespace that cannot be posted in [05:39:28] ori-l: did anything show up in MassMessage.log? [05:39:47] yes [05:39:58] i'll tell you on monday [05:40:23] what do you think this is? you just push a patch and get logs? psh. [05:40:26] er, alright.. [05:40:31] :| [05:40:31] i'm just trolling [05:40:46] https://dpaste.de/MAbS/raw [05:40:55] I clicked. [05:41:35] thanks [05:41:49] I think this falls under the "something else is terribly wrong" [05:41:58] yay, i was hoping for that [05:42:10] the interwiki prefix just vanished [05:45:03] forgive me, but it sounded so ominous, i got excited [05:45:03] "what do you mean, the interwiki prefix just vanished?!" [05:45:04] "i'm telling you, chief, it's just gone!" [05:45:04] "well go out there and find it!" [05:45:04] haha [05:45:04] I was assuming that the title object that goes in would be deserialized exactly the same, which doesn't seem to be the case. [05:51:09] ohhhhhhh [05:51:53] I blame core. [05:51:58] +1 [05:52:09] JobQueueRedis::getNewJobFields [05:52:22] and JobQueueRedis::getJobFromFields [05:52:28] $title = Title::makeTitleSafe( $fields['namespace'], $fields['title'] ); [05:53:12] we should make logmsgbot echo to -dev [05:53:32] this conversation isn't opsy but we keep gravitating here because of the sync notices [06:07:21] (03PS13) 10Yurik: Handle proxies for Wikipedia Zero [operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 (owner: 10Dr0ptp4kt) [06:17:04] (03CR) 10Tim Starling: [C: 04-1] "Also -1 due to the relicensing." 
(032 comments) [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [06:23:45] (03CR) 10Tim Starling: "Yes, delivering a 404 is the responsibility of the target domain, but some URLs under secure.wikimedia.org are not redirected." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99024 (owner: 10Tim Starling) [06:24:41] (03PS2) 10Tim Starling: Re-add the docroot/secure directory [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99024 [06:25:00] (03CR) 10Tim Starling: [C: 032] Re-add the docroot/secure directory [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99024 (owner: 10Tim Starling) [06:27:13] !log tstarling synchronized docroot/secure/404.html [06:27:14] (03PS2) 10Tim Starling: secure.wikimedia.org ErrorDocument [operations/apache-config] - 10https://gerrit.wikimedia.org/r/99026 [06:27:17] (03CR) 10jenkins-bot: [V: 04-1] secure.wikimedia.org ErrorDocument [operations/apache-config] - 10https://gerrit.wikimedia.org/r/99026 (owner: 10Tim Starling) [06:27:31] Logged the message, Master [06:29:26] (03CR) 10Tim Starling: [C: 032] secure.wikimedia.org ErrorDocument [operations/apache-config] - 10https://gerrit.wikimedia.org/r/99026 (owner: 10Tim Starling) [06:29:28] (03CR) 10jenkins-bot: [V: 04-1] secure.wikimedia.org ErrorDocument [operations/apache-config] - 10https://gerrit.wikimedia.org/r/99026 (owner: 10Tim Starling) [06:29:44] (03CR) 10Tim Starling: [V: 032] secure.wikimedia.org ErrorDocument [operations/apache-config] - 10https://gerrit.wikimedia.org/r/99026 (owner: 10Tim Starling) [06:36:07] (03PS1) 10Ori.livneh: Make it possible for logmsgbot to report to more than one channel [operations/puppet] - 10https://gerrit.wikimedia.org/r/101816 [06:39:32] (03PS1) 10Springle: repool db1026 after schema changes, LB lowered for warm up [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101817 [06:40:03] (03PS2) 10Springle: repool db1026 after schema changes, LB 
lowered for warm up [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101817 [06:40:55] (03CR) 10Springle: [C: 032] repool db1026 after schema changes, LB lowered for warm up [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101817 (owner: 10Springle) [06:42:04] !log springle synchronized wmf-config/db-eqiad.php 'repool db1026 after schema changes, LB lowered during warm up' [06:42:18] Logged the message, Master [06:55:01] (03PS14) 10Yurik: Handle proxies for Wikipedia Zero [operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 (owner: 10Dr0ptp4kt) [06:59:08] (03PS2) 10Ori.livneh: Be multithreaded. [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 [07:09:25] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000 [07:37:10] (03Abandoned) 10Arav93: Renamed $wmf* to $wmg* [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/94598 (owner: 10Arav93) [07:50:41] (03PS3) 10Ori.livneh: Be multithreaded. [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 [08:12:25] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000 [08:32:25] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000 [08:57:57] (03PS1) 10Arav93: Renamed $wmf* to $wmg* Bug:43956 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101820 [09:01:55] RECOVERY - DPKG on mw1017 is OK: All packages OK [09:03:06] (03PS2) 10Peachey88: Renamed $wmf* to $wmg* Bug:43956 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101820 (owner: 10Arav93) [09:06:28] paravoid, around? [09:06:40] (03CR) 10Peachey88: "When doing commit messages, the "Bug:" line should have a blank line in between the message and the bug line, Also a space after the colon " [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101820 (owner: 10Arav93) [09:08:38] (03CR) 10Arav93: "Sorry, Did you change it, or should I?"
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101820 (owner: 10Arav93) [09:10:52] (03PS4) 10Ori.livneh: Be multithreaded. [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 [09:11:11] (03CR) 10Peachey88: "I have already done it (It's patchset two on this commit)" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101820 (owner: 10Arav93) [09:22:18] (03PS1) 10Stefan.petrea: Json schema, output and test [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101821 [09:22:57] damnit, my use of gerrit is suboptimal [09:23:43] ori-l: so I made use of json-glib and it gave no warnings, now in the meantime you pushed PS3 and PS4 and I did some merges with your code. I probably messed something up [09:23:58] but there is some JSON support now [09:24:25] it's all right, we'll figure it out [09:24:42] i'll probably need to go through another patchset :/ [09:24:42] I also made a test that starts mwprof, throws stuff at it on UDP, then connects through TCP to it, gets the stats, kills it. And now the unfinished part, testing the output stats against a JSON schema (which is also unfinished). [09:24:56] oh cool! [09:25:12] ori-l: will you have time to review my patch too? :) [09:25:33] yes, but not tonight, it's late [09:25:35] ok [09:25:39] thank you [09:25:51] thank you, good night [09:26:06] good night [09:26:40] * average goes for cigarettes and for another attempt to acquire more powerful hardware [09:35:42] (03PS2) 10Stefan.petrea: Json schema, output and test [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101821 [09:48:40] (03CR) 10Ori.livneh: "Tim: I incorporated the original permission statement from collector.c verbatim. I'm not sure what I got wrong.
My sole motivation was to " [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [09:52:44] (03PS5) 10Ori.livneh: Rewrite for multithreading [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 [10:03:25] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000 [10:35:00] (03CR) 10Tim Starling: [C: 04-1] "You added GPL licensing and a "copyright WMF" statement to files which were previously public domain and mostly contributed by a volunteer" [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [10:38:43] paravoid, I added proxy handling to VCL - https://gerrit.wikimedia.org/r/#/c/88261/ , please let me know if you see any issues with this approach [10:39:32] thx, and off to bed i go :) [11:18:27] (03PS1) 10Springle: decom all pmtpa s[1-7] db nodes except the temporary masters [operations/puppet] - 10https://gerrit.wikimedia.org/r/101825 [11:19:47] (03CR) 10Springle: [C: 032] decom all pmtpa s[1-7] db nodes except the temporary masters [operations/puppet] - 10https://gerrit.wikimedia.org/r/101825 (owner: 10Springle) [11:27:21] (03PS1) 10Springle: final s[1-7] pmtpa dbs state before decom [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101826 [11:27:40] (03CR) 10Springle: [C: 032] final s[1-7] pmtpa dbs state before decom [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101826 (owner: 10Springle) [11:28:34] !log springle synchronized wmf-config/db-pmtpa.php [11:28:50] Logged the message, Master [11:45:40] yay! [12:18:22] mutante: did you eventually get any off-ticket response to RT 6264? (db29 pgehres) [12:40:43] (03PS15) 10Dr0ptp4kt: WIP: Show W0 (set X-CS) for Opera Mini where applicable. 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 [12:56:04] (03Abandoned) 10Hashar: parsoid: startup script now has cleared out FDs [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 (owner: 10Hashar) [12:56:21] hey hashar [13:15:30] (03CR) 10Mark Bergsma: [C: 04-1] "I think PS 14, though elaborate, is more clear than PS 15." (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 (owner: 10Dr0ptp4kt) [13:16:59] (03Restored) 10Hashar: parsoid: startup script now has cleared out FDs [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 (owner: 10Hashar) [13:17:12] (03PS2) 10Hashar: beta: manage parsoid using upstart [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 [13:17:43] (03CR) 10Dr0ptp4kt: "Agreed. Let's use PS14 instead." [operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 (owner: 10Dr0ptp4kt) [13:18:57] (mark, don't abandon the change, just use PS14 instead. i couldn't get stuff to gerrit on friday night for some reason, but yurik cleaned up what i had emailed him for manual review.)
[13:21:52] (03PS1) 10Springle: pull pmtpa db boxes from m1, m2, x1, es1, es2, es3 for decom and/or shipping [operations/puppet] - 10https://gerrit.wikimedia.org/r/101835 [13:22:58] (03CR) 10Springle: [C: 032] pull pmtpa db boxes from m1, m2, x1, es1, es2, es3 for decom and/or shipping [operations/puppet] - 10https://gerrit.wikimedia.org/r/101835 (owner: 10Springle) [13:25:38] (03PS1) 10Springle: depool pmtpa es[234] for decom and/or shipping [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101836 [13:26:05] (03CR) 10Springle: [C: 032] depool pmtpa es[234] for decom and/or shipping [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101836 (owner: 10Springle) [13:26:30] (03PS6) 10Stefan.petrea: Rewrite for multithreading [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [13:26:51] !log springle synchronized wmf-config/db-pmtpa.php [13:27:08] Logged the message, Master [13:27:59] (03PS3) 10Hashar: beta: manage parsoid using upstart [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 [13:32:06] (03CR) 10Faidon Liambotis: [C: 04-1] "See discussion on Bugzilla." [operations/apache-config] - 10https://gerrit.wikimedia.org/r/101787 (owner: 10John F. Lewis) [13:33:11] (03Abandoned) 10Stefan.petrea: Json schema, output and test [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101821 (owner: 10Stefan.petrea) [13:33:18] (03PS7) 10Dan-nl: Production configuration for GWToolset [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101061 [13:33:54] (03PS1) 10ArielGlenn: add missing analytics row a network info [operations/puppet] - 10https://gerrit.wikimedia.org/r/101837 [13:35:07] (03CR) 10Dan-nl: "adding add and remove group privileges on group ‘gwtoolset’ for sysops." 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101061 (owner: 10Dan-nl) [13:36:14] except that I can't pick PS14 [13:37:52] (03CR) 10ArielGlenn: [C: 032] add missing analytics row a network info [operations/puppet] - 10https://gerrit.wikimedia.org/r/101837 (owner: 10ArielGlenn) [13:39:40] (03PS1) 10Springle: depool pmtpa db boxes from es2, es3, x1 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101838 [13:41:25] (03CR) 10Springle: [C: 032] depool pmtpa db boxes from es2, es3, x1 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101838 (owner: 10Springle) [13:42:17] !log springle synchronized wmf-config/db-pmtpa.php [13:42:33] Logged the message, Master [13:44:48] (03PS4) 10Hashar: beta: manage parsoid using upstart [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 [14:01:06] (03PS1) 10Springle: keep one pmtpa es[123] host each on 12th floor [operations/puppet] - 10https://gerrit.wikimedia.org/r/101843 [14:02:16] (03CR) 10Springle: [C: 032] keep one pmtpa es[123] host each on 12th floor [operations/puppet] - 10https://gerrit.wikimedia.org/r/101843 (owner: 10Springle) [14:05:23] (03PS1) 10Springle: keep one pmtpa es[123] host each on 12th floor [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101844 [14:05:48] (03CR) 10Springle: [C: 032] keep one pmtpa es[123] host each on 12th floor [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101844 (owner: 10Springle) [14:06:26] !log ganglia-monitor restart on srv*/mw*; gmond bug with swapoff [14:06:34] !log springle synchronized wmf-config/db-pmtpa.php [14:06:40] !g 101844,1 [14:06:41] https://gerrit.wikimedia.org/r/#q,101844,1,n,z [14:06:42] Logged the message, Master [14:06:58] Logged the message, Master [14:07:58] (03PS2) 10Hashar: beta: properly connect to parsoid instance [operations/puppet] - 10https://gerrit.wikimedia.org/r/99659 [14:36:44] (03PS1) 10Mark Bergsma: Remove all node definitions for Squids [operations/puppet] - 
10https://gerrit.wikimedia.org/r/101856 [14:36:45] (03PS1) 10Mark Bergsma: Move all existing Squids to the decommission lists [operations/puppet] - 10https://gerrit.wikimedia.org/r/101857 [14:36:45] (03PS1) 10Mark Bergsma: Remove pmtpa Squid LVS monitoring [operations/puppet] - 10https://gerrit.wikimedia.org/r/101858 [14:36:47] (03PS1) 10Mark Bergsma: Update Icinga cache groups [operations/puppet] - 10https://gerrit.wikimedia.org/r/101859 [14:36:48] (03PS1) 10Mark Bergsma: Remove role::cache::squid [operations/puppet] - 10https://gerrit.wikimedia.org/r/101860 [14:40:13] wow [14:40:20] that's excellent :-) [14:40:26] bye bye squid [14:43:52] apergos: if you feel brave, I got an upstart script for Parsoid on https://gerrit.wikimedia.org/r/#/c/99656/ [14:44:10] apergos: made it to only apply on beta/labs, production remaining unchanged with the old shell wrapper + init.d [14:44:18] let's have a look [14:44:57] (03CR) 10Hashar: "Forgot to say I have tested it out on deployment-parsoid2.pmtpa.wmflabs and managed to restart the server via ssh without it hanging on op" [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 (owner: 10Hashar) [14:45:27] gotta look at your icinga / ferm rules as well :D [14:46:00] pretty sure they will need a bunch of fixups (that's one reason it's a draft) [14:46:40] right now it purges and rewrites the rules each time, with a large diff in the puppet logs, which I hate [14:48:37] !log zuul made gate-and-submit pipeline a dependent pipeline. Changes would thus be triggered in parallel whenever a repo has several +2 attempting to land in. That should speed up gating process. See also {{bug|48419}} and {{gerrit|101839}} [14:48:54] Logged the message, Master [14:51:03] !log jenkins enabled linting jobs to be runnable in parallel. Whenever several changes are made on the same repo, Jenkins will trigger a linting job per change.
That will dramatically speed up the processing of changes since some jobs are now parallelized instead of serialized. [14:51:12] hashar: I see you don't keep logs over there, maybe you want to? + logrotate [14:51:19] Logged the message, Master [14:51:21] or at least the script looks that way [14:51:28] apergos: ah yeah forgot about the log damn [14:51:47] I believe you are writing to /dev/null which was my easy solution :-D [14:52:04] maybe I should just >> /var/log/parsoid/parsoid.log [14:52:14] I am not sure how logrotate would work [14:52:37] aka might need to restart parsoid to let upstart point to the new file [14:56:32] (03PS1) 10Mark Bergsma: Remove -squid host lists [operations/puppet] - 10https://gerrit.wikimedia.org/r/101863 [14:56:33] (03PS1) 10Mark Bergsma: Remove Squid manifests and files [operations/puppet] - 10https://gerrit.wikimedia.org/r/101864 [14:58:24] https://github.com/wikimedia/mediawiki-vagrant/blob/master/puppet/modules/mediawiki/manifests/parsoid.pp [14:58:25] hmm [15:02:02] hm [15:04:02] https://git.wikimedia.org/blob/mediawiki%2Fvagrant/c79a03b12cd6835f05a2296fe769ffd56da5c220/puppet%2Fmodules%2Fmediawiki%2Ftemplates%2Fparsoid.conf.erb he runs it without redirection, I wonder what that does [15:07:38] I guess by default it is sent to the console [15:07:41] or maybe /dev/null :( [15:08:51] !log Depooled all pmtpa Squids in PyBal [15:09:07] Logged the message, Master [15:10:32] PROBLEM - LVS HTTP IPv4 on wikiversity-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:10:32] PROBLEM - LVS HTTP IPv6 on wikisource-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.072 second response time [15:10:32] PROBLEM - LVS HTTP IPv6 on wikipedia-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.072 second response time [15:10:35] PROBLEM - LVS HTTP IPv6 on wikimedia-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325
bytes in 0.073 second response time [15:10:36] PROBLEM - LVS HTTPS IPv6 on wikisource-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.193 second response time [15:10:36] PROBLEM - LVS HTTPS IPv4 on wikisource-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.196 second response time [15:10:36] PROBLEM - LVS HTTPS IPv4 on wikinews-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.186 second response time [15:10:36] PROBLEM - LVS HTTPS IPv4 on foundation-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.190 second response time [15:10:36] PROBLEM - LVS HTTPS IPv4 on wikidata-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 2.198 second response time [15:10:39] PROBLEM - LVS HTTPS IPv4 on wikipedia-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 2.201 second response time [15:10:43] PROBLEM - LVS HTTPS IPv6 on wikiquote-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 4.193 second response time [15:10:43] PROBLEM - LVS HTTPS IPv6 on wiktionary-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 7.205 second response time [15:10:53] PROBLEM - LVS HTTP IPv4 on mediawiki-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:10:53] PROBLEM - LVS HTTP IPv4 on wikipedia-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:10:56] uhh [15:10:57] PROBLEM - LVS HTTP IPv6 on wikinews-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.074 second response time [15:10:57] PROBLEM - LVS HTTPS IPv6 on wikipedia-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.197 second response time [15:11:01] PROBLEM - LVS HTTPS IPv4 on 
mediawiki-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.195 second response time [15:11:01] PROBLEM - LVS HTTPS IPv6 on wikidata-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.196 second response time [15:11:02] oh yeah :) [15:11:05] PROBLEM - LVS HTTPS IPv6 on foundation-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 2.204 second response time [15:11:10] :-D [15:11:15] I suppose paging is broken for me [15:11:15] PROBLEM - LVS HTTP IPv6 on wikidata-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.082 second response time [15:11:19] PROBLEM - LVS HTTP IPv4 on wikivoyage-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:11:22] PROBLEM - LVS HTTP IPv4 on wiktionary-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:11:23] PROBLEM - LVS HTTP IPv4 on foundation-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:11:23] PROBLEM - LVS HTTP IPv6 on wiktionary-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.073 second response time [15:11:23] PROBLEM - LVS HTTP IPv6 on wikivoyage-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.073 second response time [15:11:23] well it's working great for me :-D [15:11:27] PROBLEM - LVS HTTPS IPv4 on wikimedia-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.258 second response time [15:11:31] PROBLEM - LVS HTTP IPv4 on wikibooks-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:11:31] PROBLEM - LVS HTTP IPv4 on wikimedia-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:11:34] PROBLEM - LVS HTTP IPv6 on mediawiki-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 1.086 second response time [15:11:34] PROBLEM - LVS HTTP 
IPv4 on wikiquote-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:11:34] PROBLEM - LVS HTTPS IPv4 on wikivoyage-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.190 second response time [15:11:38] PROBLEM - LVS HTTPS IPv4 on wikiquote-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.189 second response time [15:11:38] PROBLEM - LVS HTTPS IPv6 on wikiversity-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 1.185 second response time [15:11:38] PROBLEM - LVS HTTPS IPv6 on wikibooks-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 3.208 second response time [15:11:38] PROBLEM - LVS HTTPS IPv6 on mediawiki-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 7.218 second response time [15:11:39] PROBLEM - LVS HTTP IPv6 on foundation-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 7.095 second response time [15:11:39] PROBLEM - LVS HTTP IPv6 on wikibooks-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 7.094 second response time [15:11:39] PROBLEM - LVS HTTPS IPv4 on wiktionary-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 7.192 second response time [15:11:42] \o/ [15:11:44] I didn't get paged either but I suppose that's because my paging timezone is still PST [15:11:48] good thing [15:11:48] PROBLEM - LVS HTTP IPv4 on wikinews-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:11:48] PROBLEM - LVS HTTP IPv6 on wikiversity-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.077 second response time [15:11:48] PROBLEM - LVS HTTP IPv6 on wikiquote-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.074 
second response time [15:11:48] PROBLEM - LVS HTTPS IPv6 on wikivoyage-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.191 second response time [15:11:50] yeah [15:11:52] PROBLEM - LVS HTTPS IPv4 on wikiversity-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.191 second response time [15:11:52] PROBLEM - LVS HTTPS IPv4 on wikibooks-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.195 second response time [15:11:52] PROBLEM - LVS HTTP IPv4 on wikidata-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:11:55] PROBLEM - LVS HTTP IPv4 on wikisource-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:11:56] PROBLEM - LVS HTTPS IPv6 on wikinews-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 2.205 second response time [15:11:56] PROBLEM - LVS HTTPS IPv6 on wikimedia-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 4.201 second response time [15:11:56] PROBLEM - LVS HTTPS IPv6 on upload-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.193 second response time [15:12:20] paging works great for me [15:12:22] PROBLEM - LVS HTTPS IPv4 on upload-lb.pmtpa.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.185 second response time [15:12:25] :-D [15:12:27] Speaking of, could someone change that to CET for me? I think I might be the only person who is on the paging list but doesn't have the ability to change their paging timezone [15:12:27] PROBLEM - LVS HTTP IPv4 on upload-lb.pmtpa.wikimedia.org is CRITICAL: Connection refused [15:12:28] my phone is having a seizure [15:12:29] same here ... 
[15:12:40] sorry ;D [15:12:43] oh there it goes [15:12:44] PROBLEM - LVS HTTP IPv6 on upload-lb.pmtpa.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.072 second response time [15:13:05] RoanKattouw: sure, I 'll do it [15:13:22] what's the puppet master db now? [15:13:28] db1001 [15:13:31] tnx [15:13:50] hey, I got pages! [15:13:53] cool [15:13:55] it works again [15:13:56] :) [15:14:02] how very reliable [15:14:38] RoanKattouw: you know it's in puppet, right? :) [15:14:46] for some reason i don't have pages when I am in USA. So yeah.. reliable :-) [15:15:36] that was a lot of notifications [15:15:48] nothing to see here, move along [15:17:38] (03PS2) 10Mark Bergsma: Remove Squid manifests and files [operations/puppet] - 10https://gerrit.wikimedia.org/r/101864 [15:17:39] (03PS2) 10Mark Bergsma: Remove role::cache::squid [operations/puppet] - 10https://gerrit.wikimedia.org/r/101860 [15:17:40] (03PS2) 10Mark Bergsma: Remove -squid host lists [operations/puppet] - 10https://gerrit.wikimedia.org/r/101863 [15:17:41] (03PS2) 10Mark Bergsma: Remove all node definitions for Squids [operations/puppet] - 10https://gerrit.wikimedia.org/r/101856 [15:17:42] (03PS2) 10Mark Bergsma: Move all existing Squids to the decommission lists [operations/puppet] - 10https://gerrit.wikimedia.org/r/101857 [15:17:43] (03PS2) 10Mark Bergsma: Remove pmtpa Squid LVS monitoring [operations/puppet] - 10https://gerrit.wikimedia.org/r/101858 [15:17:44] (03PS2) 10Mark Bergsma: Update Icinga cache groups [operations/puppet] - 10https://gerrit.wikimedia.org/r/101859 [15:18:34] test [15:18:35] paravoid: AFAIK the list of which timezones are available is in puppet, but the manifest of which individual is in which timezone is in some private git repo somewhere [15:18:42] (03CR) 10Hashar: "The $wgAddGroups and $wgRemoveGroups will be shipped by the extension as of https://gerrit.wikimedia.org/r/#/c/101861/" [operations/mediawiki-config] - 
10https://gerrit.wikimedia.org/r/101488 (owner: 10Hashar) [15:18:48] RoanKattouw: looks like it, fixing [15:19:13] roan is in CET now? [15:19:40] mark: Temporarily for two weeks (holidays) [15:19:44] ah [15:19:47] RoanKattouw: paravoid: done [15:19:50] Thanks man [15:19:52] test [15:20:11] ahh Jenkins jobs doing 'puppet validate' are now running in parallel :D [15:20:15] should speed up things a bit for ops [15:20:23] So, nothing permanent but long enough to not want my phone to explode in the middle of the night [15:21:02] hashar: :-) [15:21:14] and it took me ~2 years to figure it out :-D [15:21:49] hashar: hey btw... I want to upgrade the PHP version on the beta cluster. Where should i start looking on how to do that ? [15:22:31] akosiaris: I think last time I have uploaded the files in the /data/project shared directory of deployment-prep labs project [15:22:52] akosiaris: then manually upgraded the packages on the two apaches boxes: deployment-apache32.pmtpa.wmflabs and apache33.pmtpa.wmflabs [15:22:58] then crossed fingers and restarted apache [15:23:47] other boxes are using php as well such as deployment-bastion (that is the main working machine for humans and jenkins) and we have two boxes for async jobs: deployment-jobrunner08 and deployment-video06 [15:24:13] hashar: ok thanx. I 'll start with that and then work my way to ruining production again by uploading a bad PHP version :P [15:24:33] akosiaris: feel free to ruin the jenkins slaves first: gallium.wikimedia.org and lanthanum.eqiad.wmnet [15:24:47] they run so much different php code paths that there is a good chance they can raise weird bugs for you [15:24:56] hmmm that is a neat idea [15:25:03] let's start from those then :-) [15:25:18] and we will get tons of devs complaining [15:25:33] akosiaris: you know greg announced it on the deployment highlights, right? 
:) [15:25:36] so I would say: upgrade both beta and jenkins slaves [15:25:43] hashar: copytruncate option to logrotate might be ok, since we don't care if absolutely every log entry gets there intact [15:25:48] it would need testing [15:25:51] hashar: yes I noticed [15:26:00] apergos: I am not sure we want parsoid restarted though :/ [15:26:09] that is without restarting [15:26:25] apergos: ah it copies the file then 'echo -n > ' ? [15:26:31] more or less [15:26:43] truncates the existing file in place [15:26:46] so that should more or less prevent weird restarts of parsoid :D [15:26:54] exactly [15:27:21] hashar, does that mean the current version isn't installed via apt? [15:27:21] and/or puppet? [15:27:21] hashar, akosiaris, I ask because… it is possible to have a special project-local apt repository. That plus a project-specific puppet master would go a long way towards allowing beta to actually test things :) [15:27:32] * hashar out of file descriptors [15:27:37] :-D [15:27:42] andrewbogott: when is labs moving, exactly? [15:27:49] andrewbogott: it is using puppet / apt [15:28:06] andrewbogott: but sometimes we want to test out a new PHP version beforehand, so we end up installing the packages manually [15:29:12] coffee break [15:29:26] (03PS4) 10Manybubbles: Cirrus config updates [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101219 [15:30:09] (03CR) 10Manybubbles: "I'm glad I did another review this morning - the last patch would have removed the replica count again...." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101219 (owner: 10Manybubbles) [15:31:03] hashar: before it's available as a deb? [15:31:43] aude: We haven't picked a date yet -- your best bet is to watch the labs list for announcements. [15:31:45] * aude fears our labs stuff gets shut / deleted before i clean up [15:31:50] ok [15:32:30] aude: that worries me… have I said anything on the mailing list that implied I would delete anything?
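The copy-then-truncate scheme apergos and hashar sketch above is exactly what logrotate's copytruncate option does, so the eventual config could look roughly like this (path and retention are hypothetical, not the real parsoid setup):

```
# /etc/logrotate.d/parsoid -- illustrative sketch only
/var/log/parsoid/parsoid.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    # Copy the log aside, then truncate the original in place, so the
    # daemon keeps writing to its existing file descriptor and never
    # needs a restart. This is slightly racy: entries written between
    # the copy and the truncate can be lost.
    copytruncate
}
```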
[15:32:36] Other than deleting things that don't actually exist? [15:32:58] anything not in puppet or data/projects, i'm not sure about [15:33:07] * aude has to read mails again [15:33:37] man, gallium's IO is abysmal... [15:34:02] aude: Anything not puppetized or in shared storage is in /perpetual/ danger -- not so much endangered by the migration as just generally a bad idea. [15:34:08] of course :) [15:34:48] So, the migration represents a slightly higher danger to those things… e.g. any instance which won't survive a reboot won't survive, on account of everything being rebooted. [15:34:48] * aude wonders about stuff like databases [15:34:55] But I'm not actually going to murder anything intentionally :) [15:35:06] they can have backups in data/project [15:35:14] ok [15:35:30] Yeah, I try to run cron backups of my databases to data/project [15:35:39] k [15:35:52] nothing is important for wikidata.... we reset stuff anyway [15:36:04] just to know generally [15:36:33] aude: The puppet thing is a special case -- instances will need to cope with the fact that they're in a different place, possibly with a different IP. Puppet can take care of that, but only if puppet can actually run. [15:36:46] not a bit issue [15:36:48] big8 [15:36:51] big* :) [15:36:56] So, that doesn't mean the instance has to be puppetized, just that puppet has to not be stopped or super broken. [15:37:09] ok [15:37:37] * aude goes back to coding then and shall spend time on our labs stuff soon [15:37:43] andrewbogott: I believe akosiaris is going to build the new PHP deb packages and put them on beta then dpkg -i install them [15:38:04] andrewbogott: then after a few days we can upload the packages on apt and have fun debugging weird php issues [15:38:36] packages are already built and have been tested on test.wikipedia.org [15:38:45] OK. I guess that's not a whole lot worse than having a beta apt repo and using apt to install them.
[15:39:08] Maybe an identical amount of work, now that I think of it :) Since puppet won't upgrade automatically anyway. [15:39:24] kind of... it will for php packages [15:39:30] it has ensure => latest :-) [15:40:45] (03PS1) 10ArielGlenn: add pt domains [operations/dns] - 10https://gerrit.wikimedia.org/r/101873 [15:42:03] apergos: mediawiki.gr amongst them? [15:42:14] didn't look [15:42:17] akosiaris: It does? In production puppet? [15:42:33] I was asked about the pt domains specifically [15:42:33] yes [15:42:35] I thought ensure => latest was banned… I change it back to present anytime I see it [15:42:45] andrewbogott: hi! [15:42:49] andrewbogott: any comments regarding https://gerrit.wikimedia.org/r/#/c/98307/ ? [15:43:15] andrewbogott: I'd like someone from labs to merge it so they can watch for the fallout :) [15:43:16] andrewbogott: well if it is banned it is not obeying [15:43:19] git grep latest |wc -l [15:43:19] 212 [15:43:38] huh [15:43:43] which is a crappy grep btw [15:43:50] it is not that much [15:44:07] git grep 'ensure => latest' |wc -l [15:44:07] 131 [15:44:15] paravoid: I'm in the land of slow internet, will comment once gerrit actually loads :/ [15:44:19] and this is still not correct but you get the idea [15:45:21] akosiaris: Hm. Seems dicey [15:45:39] (03CR) 10Faidon Liambotis: openstack: convert iptables to ferm (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/98307 (owner: 10Faidon Liambotis) [15:45:49] I don't think there's just an official policy, just that automatic upgrades always lead to outages and mysterious sudden changes [15:47:05] (03PS2) 10Faidon Liambotis: openstack: convert iptables to ferm [operations/puppet] - 10https://gerrit.wikimedia.org/r/98307 [15:47:31] paravoid: Oh, /that/ patch :) I don't understand it well enough to sign off, but I'm happy to babysit the merge once you're feeling confident. 
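For reference, the pattern being counted and debated above looks like this in a Puppet manifest (package name chosen purely for illustration; the two resources are alternatives, not meant to coexist):

```puppet
# ensure => latest: the agent upgrades the package on any run where apt
# has a newer version, so upgrades land unannounced across the fleet.
package { 'php5-cli':
    ensure => latest,
}

# ensure => present: install once and leave the version alone; upgrading
# becomes an explicit, logged operator action rather than a side effect.
package { 'php5-cli':
    ensure => present,
}
```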
[15:49:27] (03PS3) 10Mark Bergsma: Remove Squid manifests and files [operations/puppet] - 10https://gerrit.wikimedia.org/r/101864 [15:49:28] (03PS3) 10Mark Bergsma: Remove role::cache::squid [operations/puppet] - 10https://gerrit.wikimedia.org/r/101860 [15:49:29] (03PS3) 10Mark Bergsma: Remove -squid host lists [operations/puppet] - 10https://gerrit.wikimedia.org/r/101863 [15:49:35] (03PS3) 10Mark Bergsma: Remove all node definitions for Squids [operations/puppet] - 10https://gerrit.wikimedia.org/r/101856 [15:49:35] (03PS3) 10Mark Bergsma: Move all existing Squids to the decommission lists [operations/puppet] - 10https://gerrit.wikimedia.org/r/101857 [15:49:35] (03PS3) 10Mark Bergsma: Update Icinga cache groups [operations/puppet] - 10https://gerrit.wikimedia.org/r/101859 [15:55:50] !log upgraded PHP5 to 5.3.10-1ubuntu3.9+wmf1 on gallium, lanthanum for testing purposes [15:58:02] (03PS1) 10Mark Bergsma: Remove LVS monitoring for now unused esams project LB LVS services [operations/puppet] - 10https://gerrit.wikimedia.org/r/101880 [15:59:15] (03PS1) 10BryanDavis: Revert "beta: let sysops add/remove gwtoolset group" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101881 [15:59:45] (03PS2) 10BryanDavis: Revert "beta: let sysops add/remove gwtoolset group" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101881 [15:59:46] (03CR) 10Mark Bergsma: [C: 032] Remove LVS monitoring for now unused esams project LB LVS services [operations/puppet] - 10https://gerrit.wikimedia.org/r/101880 (owner: 10Mark Bergsma) [16:01:55] (03CR) 10BryanDavis: [C: 04-1] Production configuration for GWToolset (031 comment) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101061 (owner: 10Dan-nl) [16:04:15] (03CR) 10Dan-nl: [C: 031] Revert "beta: let sysops add/remove gwtoolset group" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101881 (owner: 10BryanDavis) [16:06:25] (03PS8) 10Dan-nl: Production configuration for GWToolset 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101061 [16:07:29] (03PS1) 10Mark Bergsma: Resolve esams IPv6 IP conflict [operations/puppet] - 10https://gerrit.wikimedia.org/r/101883 [16:08:12] (03CR) 10Dan-nl: "permissions moved into extension in I6bfc539." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101061 (owner: 10Dan-nl) [16:16:55] (03PS5) 10Hashar: beta: manage parsoid using upstart [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 [16:17:43] (03PS1) 10Mark Bergsma: Temporarily use the old service IP name [operations/puppet] - 10https://gerrit.wikimedia.org/r/101884 [16:18:42] (03CR) 10Hashar: "Seems to be working on labs, might have to amend later on though :(" [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 (owner: 10Hashar) [16:18:55] (03CR) 10Mark Bergsma: [C: 032] Temporarily use the old service IP name [operations/puppet] - 10https://gerrit.wikimedia.org/r/101884 (owner: 10Mark Bergsma) [16:21:11] (03CR) 10Hashar: "… which is not going to work well due to how the permissions are handled on labs. So keep the log on the local instance for now." [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 (owner: 10Hashar) [16:21:41] apergos: I have added to parsoid the logrotate stanza and a way to specify the log file being used [16:21:47] apergos: can't follow up this evening though, will have to get out in a few minutes [16:21:49] nice! [16:21:52] sure [16:22:01] guess where I'll be tomorrow... [16:22:03] right here :-P [16:22:07] yeah :-D [16:22:15] we can then apply in and try out what it does on beta [16:22:20] nice! [16:22:25] should be safe for production since classes are separted [16:22:28] separated [16:22:33] yep [16:32:00] and I am off :-D [16:59:56] (03PS1) 10Odder: Add Malayalam aliases for NS_MODULE, NS_MODULE_TALK [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101889 [17:00:55] (03CR) 10Yurik: "Thank you mark, there is a reason for my maddness." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 (owner: 10Dr0ptp4kt) [17:02:54] (03PS16) 10Yurik: Handle proxies for Wikipedia Zero [operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 (owner: 10Dr0ptp4kt) [17:04:40] mark, reverted to PS14. Not exactly sure why adam checked in his version - he privately emailed it to me, and i reworked it [17:05:20] greg-g, i will need a lightning depl today for zero, any earlier time? [17:10:15] (03PS1) 10Odder: Create a Portal namespace on the Odia Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101890 [17:23:58] ^d and greg-g: starting to deploy cirrus updates to wmf7 [17:24:41] (03CR) 10Ori.livneh: "@Stefan: Please submit your work as a separate, dependent commit :(" [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [17:25:55] well, whenever jenkins merges my change. it just verified it and I got excited [17:26:20] <^d> Probably from the first push. [17:26:28] <^d> I +2'd before it responded the first time. [17:27:12] (03PS1) 10Ottomata: Make sure mobile_hosts_regex is the same every time. [operations/puppet] - 10https://gerrit.wikimedia.org/r/101894 [17:27:38] (03CR) 10Ottomata: [C: 032 V: 032] Make sure mobile_hosts_regex is the same every time. [operations/puppet] - 10https://gerrit.wikimedia.org/r/101894 (owner: 10Ottomata) [17:28:45] !log manybubbles synchronized php-1.23wmf7/extensions/CirrusSearch/ 'update cirrus to master' [17:29:00] Logged the message, Master [17:30:08] test2wiki looks fine so I'll do wmf6 now [17:30:49] (03PS2) 10Anomie: l10nupdate-1: Log start and end times of rsync [operations/puppet] - 10https://gerrit.wikimedia.org/r/100913 [17:31:38] !log manybubbles synchronized php-1.23wmf6/extensions/CirrusSearch/ 'update cirrus to master' [17:31:55] Logged the message, Master [17:32:04] that caused some freaking out for a second [17:33:20] ^d: it freaked out for a second because we moved the location of the callbacks.
[17:33:47] <^d> Non-atomic deploys are fun, eh? [17:33:49] it stabilized, but some updates probably blew up [17:33:57] I suppose [17:34:05] anyway, we're back to working again [17:34:13] so I'm going to call that successful [17:34:20] <^d> :) [17:35:47] (03PS7) 10Ori.livneh: Rewrite for multithreading [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 [17:37:04] ^d: did you get a chance to look at the config change? [17:37:12] (03PS2) 10Ottomata: Using custom ganglia module instead of Logster. [operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/101431 [17:37:56] (03PS3) 10Ottomata: Using custom ganglia module instead of Logster. [operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/101431 [17:38:04] (03CR) 10Chad: [C: 032] Cirrus config updates [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101219 (owner: 10Manybubbles) [17:38:05] (03CR) 10Ottomata: Using custom ganglia module instead of Logster. (031 comment) [operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/101431 (owner: 10Ottomata) [17:38:11] thanks [17:38:18] <^d> manybubbles: Merged. I had the review open in another tab but hadn't pressed save. [17:38:19] (03Merged) 10jenkins-bot: Cirrus config updates [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101219 (owner: 10Manybubbles) [17:38:30] <^d> And since we updated code to master, my one comment I was going to leave wasn't right anymore :) [17:39:23] (03CR) 10Ori.livneh: "@Tim: OK. I restored the original authorship / permission statement and added myself to the list of authors." [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [17:39:57] (03CR) 10Ori.livneh: [C: 032] l10nupdate-1: Log start and end times of rsync [operations/puppet] - 10https://gerrit.wikimedia.org/r/100913 (owner: 10Anomie) [17:39:58] cool [17:40:14] ^d: question - can I just sync-dir wmf/config to get all my files synced?
or does sync-file accept multiple files somehow [17:40:45] <^d> Yep [17:40:49] <^d> The former. [17:40:51] cool [17:40:52] <^d> Just sync-dir all of wmf-config [17:40:54] will do [17:41:27] !log manybubbles synchronized wmf-config/ 'update cirrus configuration' [17:41:43] Logged the message, Master [17:42:51] ^d and greg-g: done syncing files. starting the index build process. [17:44:51] PROBLEM - ElasticSearch health check on elastic1011 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 12: number_of_data_nodes: 12: active_primary_shards: 2669: active_shards: 7999: relocating_shards: 0: initializing_shards: 1: unassigned_shards: 2 [17:44:51] PROBLEM - ElasticSearch health check on elastic1007 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 12: number_of_data_nodes: 12: active_primary_shards: 2669: active_shards: 7999: relocating_shards: 0: initializing_shards: 1: unassigned_shards: 2 [17:45:14] please ignore that [17:45:31] I wonder if I can just turn those off during a deployment [17:45:51] RECOVERY - ElasticSearch health check on elastic1011 is OK: OK - elasticsearch (production-search-eqiad) is running. status: green: timed_out: false: number_of_nodes: 12: number_of_data_nodes: 12: active_primary_shards: 2676: active_shards: 8020: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 0 [17:45:51] RECOVERY - ElasticSearch health check on elastic1007 is OK: OK - elasticsearch (production-search-eqiad) is running. status: green: timed_out: false: number_of_nodes: 12: number_of_data_nodes: 12: active_primary_shards: 2676: active_shards: 8020: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 0 [17:45:51] PROBLEM - ElasticSearch health check on elastic1012 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running.
status: red: timed_out: false: number_of_nodes: 12: number_of_data_nodes: 12: active_primary_shards: 2676: active_shards: 8020: relocating_shards: 0: initializing_shards: 2: unassigned_shards: 4 [17:46:51] RECOVERY - ElasticSearch health check on elastic1012 is OK: OK - elasticsearch (production-search-eqiad) is running. status: green: timed_out: false: number_of_nodes: 12: number_of_data_nodes: 12: active_primary_shards: 2684: active_shards: 8044: relocating_shards: 1: initializing_shards: 0: unassigned_shards: 0 [17:50:30] (03CR) 10GWicke: "We are also moving to upstart in general, so lets figure out one upstart config that works well and put it into debian/parsoid.upstart in " (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 (owner: 10Hashar) [17:57:46] (03PS17) 10Yurik: Handle proxies for Wikipedia Zero [operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 (owner: 10Dr0ptp4kt) [17:59:24] mark, paravoid, seems like we are done with this patch, go for it :) I will deploy backend changes shortly today ( greg-g permitting ), to make proxy name "Opera". You don't have to wait for it to deploy VCL [18:00:31] gwicke: the plan was indeed to eventually use it in production (but not instantly, this allows safe testing) [18:01:40] greg-g, so is anyone deploying anything ? I would like to push some minor zero changes pls [18:05:43] !log all search indexes built for newly cirrused wikis (wikisources other than enwikisource and frwikisource and commons). populating them now. commons will take the longest. it'll take 13 hours at the current rate. past experience tells me we won't be able to keep that rate up but that this is the slow time of day to be doing it anyway. 
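An upstart job of the kind GWicke's review comment refers to might look like the sketch below (paths, user, and start conditions are guesses for illustration, not the contents of the actual change):

```
# /etc/init/parsoid.conf -- hypothetical sketch
description "Parsoid HTTP service"

start on (local-filesystems and net-device-up IFACE!=lo)
stop on runlevel [!2345]

setuid parsoid
respawn

# "console log" captures stdout/stderr in /var/log/upstart/parsoid.log,
# which the system's logrotate job rotates; the daemon itself never has
# to reopen a log file.
console log

exec /usr/bin/nodejs /usr/lib/parsoid/api/server.js
```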
[18:06:00] Logged the message, Master [18:06:08] apergos: that's good [18:06:45] apergos: https://gerrit.wikimedia.org/r/#/c/101900/ [18:07:14] ah, I thought there was another changeset around but I was not able to find it [18:07:23] (03PS1) 10Ottomata: Installing mysql client on stat* machines. [operations/puppet] - 10https://gerrit.wikimedia.org/r/101904 [18:07:24] that's basically what I have been using for Rashomon, which is pretty much the same service [18:07:26] I found ori's after some searching only [18:07:47] (03CR) 10Ottomata: [C: 032 V: 032] Installing mysql client on stat* machines. [operations/puppet] - 10https://gerrit.wikimedia.org/r/101904 (owner: 10Ottomata) [18:07:47] you wanna add a link to that in the comments? [18:07:59] will do [18:08:04] sweet! [18:08:54] (03CR) 10GWicke: "Alternative upstart config modeled on what I have been using for Rashomon:" [operations/puppet] - 10https://gerrit.wikimedia.org/r/99656 (owner: 10Hashar) [18:09:03] gwicke, are you guys deploying? [18:09:26] yurik: right now? [18:09:29] so that's going to wind up being replaced, because we need log rotation happening [18:09:33] yes [18:09:36] yurik, no [18:10:03] hash ar started out from about that place in the upstart work I think [18:10:04] apergos: with the default upstart behavior the logs end up in /var/log/upstart and are logrotated [18:10:15] they are? [18:10:24] upstart also avoids the need to restart the daemon to reopen the logfile [18:10:30] wait in upstart? uuuhhh [18:10:39] can't we have them go to /var/log/parsoid? [18:10:56] that should be possible too, didn't try yet [18:11:14] in the absense of greg-g, anyone else minds if i push some changes to prod? 
[18:11:25] would be nice if we can preserve the simplicity of the default behavior [18:12:06] we're going to wind up with a logrotate conf [18:12:16] so that we can specify things like a cutoff size [18:12:22] but that doesn't mean it has to be complicated [18:12:46] the complication is mostly reopening the logfile [18:13:10] ah well I was thinking that copytruncate would get us past that hurdle [18:13:28] we are reworking our logging currently, see https://bugzilla.wikimedia.org/show_bug.cgi?id=49762 [18:13:36] aiming for GELF [18:13:57] ori-l, you are not deploying anything yet, right? [18:13:59] and graylog2 [18:14:26] haven't looked into any of that [18:15:06] Reedy, any deployment blockers right now? [18:15:08] yeah, copytruncate should work too [18:15:12] guess I should look at the rfc [18:15:20] although it might be racy [18:15:34] it won't be perfect, i.e. we might lose an entry or two [18:15:40] ok, noones' here, noone would notice, deploying zero... [18:15:47] but it's not mission-critical to have every entry [18:15:56] * apergos is not noone [18:15:57] *nod*, no big deal [18:16:14] anyways for a short to mid term solution it ought to do well [18:17:25] apergos, we were thinking about a good way to have our debianization in our deploy repo, while ops has 'blessed' copies in puppet or some other repo [18:17:55] we are currently switching to https://git.wikimedia.org/summary/?r=mediawiki/services/parsoid/deploy [18:18:06] background at https://www.mediawiki.org/wiki/Parsoid/Packaging [18:18:12] is each individual log message gzipped? 
* apergos is skeptical about how much that would save, vs gzipping upon rotation [18:18:28] (gelf) [18:18:38] apergos, noone to reply :) [18:18:39] in gelf it is just transfer-encoding afaik [18:18:47] ah [18:18:53] normally one message = one UDP packet [18:19:12] there is support for larger messages too [18:20:03] parsoid seems like a perfect ue case for ryan-git-sartoris-trebucheet-deploy [18:20:14] as opposed to building debs [18:20:17] *use case [18:20:47] apergos, we are trying to support both [18:21:04] ideally without duplicating effort everywhere [18:21:13] uhhh [18:21:20] hats off to you folks if you can pull that one off [18:21:27] we don't need fast/atomic deploys, so debs should work just as well [18:22:34] as long as every deployment can be rolled back by installing the previous debs, I guess [18:22:39] yup [18:22:53] would have to switch from reprepro to some other solution [18:22:54] just thinking about how many times mw fixes are 'I live hacked X, pushing it out now' [18:23:03] mini-dinstall for example [18:23:09] that model sure doesn't work with packages [18:23:37] sketched something about debs for deploy at https://www.mediawiki.org/wiki/Parsoid/Packaging#Building_debs_for_deployment [18:23:52] I would like to see our repos able to hold previous versions of packages, which as you say we can't do now [18:24:20] !log yurik synchronized php-1.23wmf6/extensions/ZeroRatedMobileAccess/ [18:24:36] Logged the message, Master [18:24:54] I don't think that this would work well for the PHP codebase, as it might be simply too large [18:25:27] and relies more on breaking stuff only for a short time with 'atomic' upgrades [18:25:37] rather than versioning interfaces [18:26:17] but for small services that need to be restarted in a rolling fashion debs could work well [18:26:33] well with rgst-deploy (typing out all the names takes too long :-P) versioning and rollback is solved for you [18:26:55] and you can batch the deploy iirc [18:27:44] was it the
rolling restarts that was the big draw for debs? [18:27:52] !log yurik synchronized php-1.23wmf7/extensions/ZeroRatedMobileAccess/ [18:28:01] ok, done [18:28:08] Logged the message, Master [18:28:10] if it all crashes, i wasn't here [18:28:27] yeah but the problem is, neither am I (it's well into my evening) [18:31:04] !queue [18:31:05] http://burrow.openstack.org/ [18:31:09] heh [18:31:20] It works! [18:31:27] Ryan_Lane: [18:33:45] apergos: my main motivation for debs is that we need packaging for third parties anyway, and using the same code for our own deploys would avoid duplication [18:33:58] ahh [18:34:10] also, we'd get handling of system dependencies with debs too [18:35:23] without packaging MW will become harder and harder to set up while we are moving things into services [18:35:58] what other things do you have in mind? [18:36:47] that's about it I think [18:37:26] paravoid, mark - backend stuff deployed, so the VCL patch should work from the start [18:38:05] apergos, I hope that we'll get to a point where the MW setup instructions are basically apt-get install mediawiki [18:38:48] well I mean 'moving things into services', what other things? [18:38:58] parsoid is one [18:38:59] storage for example [18:39:07] external storage? [18:39:10] media storage? [18:39:18] both potentially [18:39:39] https://www.mediawiki.org/wiki/Requests_for_comment/Services_and_narrow_interfaces and https://www.mediawiki.org/wiki/Requests_for_comment/Storage_service_and_content_API [18:39:43] both WIP [18:40:29] ok that has just put me over the limit of my off-th clock work related reading but I'll leave the tabs open :-D [18:40:49] ;) [18:43:07] (03PS1) 10Jforrester: Make officewiki's Report: namespace VE-enabled [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101908 [18:44:00] mutante: https://launchpad.net/burrow this [18:44:50] obsoleted apparently [18:44:58] apergos: thanks! 
[18:45:02] !queue del [18:45:03] Successfully removed queue [18:45:55] !queue is when people talk to you about RT and queues to follow and you don't know what they mean: http://requesttracker.wikia.com/wiki/Queue , https://wikitech.wikimedia.org/wiki/RT#Which_queues_do_we_have_and_what_are_they_used_for.3F [18:45:56] Key was added [18:49:03] !rt [18:49:03] http://rt.wikimedia.org/Ticket/Display.html?id=$1 [19:05:10] !log Reloading Zuul to deploy I01d349bf21b20ce94 [19:05:27] Logged the message, Master [19:27:50] !log stopping puppet on cp1048 to test varnishkafka ganglia module [19:28:07] Logged the message, Master [19:34:41] (03CR) 10Domas: "I have a question, um, is it faster if multithreaded?" [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [19:34:41] PROBLEM - Host elastic1007 is DOWN: PING CRITICAL - Packet loss = 100% [19:35:45] why elastic1007 go down [19:38:11] RECOVERY - Host elastic1007 is UP: PING OK - Packet loss = 0%, RTA = 0.29 ms [19:38:40] Nap time [19:42:40] (03PS3) 10MaxSem: A short script for viewing exceptions [operations/puppet] - 10https://gerrit.wikimedia.org/r/38252 [19:43:16] (03PS18) 10Dr0ptp4kt: Handle proxies for Wikipedia Zero [operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 [19:47:23] (03PS4) 10MaxSem: A short script for viewing exceptions [operations/puppet] - 10https://gerrit.wikimedia.org/r/38252 [19:51:57] yooo o ori-l [19:52:02] got time for a brainbounce w me today? [20:12:17] hello [20:13:36] hashar :-] [20:13:51] paravoid, you there? [20:14:01] I am [20:14:23] this ganglia module thing is becoming a pain too…i don't know much about our statsd setup, do you? [20:15:25] ottomata: hey [20:15:32] ori-l: good morning :-D [20:15:40] what do you want to know about it? [20:15:42] ahh the man I realllly want to talk to [20:15:43] well [20:15:44] and what's the current plan?
:) [20:15:51] ori-l: do you have any clue what is the ip / machine to send statsd UDP packets to please ? Zuul graphs are broken now :-D [20:15:56] the problem i'm solving is getting the proper rates in ganglia, txbytes per second, or whatever [20:15:58] so [20:16:11] hashar, ottomata: just came out of a dentist appt, so i'm not entirely present [20:16:13] 1. can statsd do that for me if I just send it the counter values [20:16:13] 2. does our statsd send to ganglia? [20:16:16] oh [20:16:19] ok ok :-) [20:16:19] woozy ori-l! [20:16:20] !ident is 5 geek karma points for (still) knowing what identd and RFC 1413 are https://en.wikipedia.org/wiki/Ident_protocol and how they relate to IRC https://www.freenode.net/faq.shtml#userequals 20 points for giving a crap and actually fixing it, hint: if there is a ~ in your /whois it's actually broken, but nobody cares :) [20:16:21] Key was added [20:16:24] ottomata: 1) yes, that's what it's for [20:16:30] 2) yes, it can [20:16:42] 2) does it already? [20:16:49] that's 3! [20:16:52] * hashar reclaims geek points from mutante [20:16:55] yes [20:16:56] haha [20:16:59] backends => [ 'graphite' ], [20:17:00] ? [20:17:06] in role::statsd [20:17:09] there's another instance iirc [20:17:11] oh [20:17:13] grep for backends [20:17:32] hashar: they're going to the right place, but it's the new graphite instance, which is not mapped to graphite.wikimedia.org just yet [20:18:03] i needed to fix the profiler, but that's done now, so i should be ready to flip the switch (that is, make the DNS change) sometime in the next 24h [20:18:09] can you live with broken graphs until then? [20:18:10] ori-l: so whenever graphite.wikimedia.org is made to point to the eqiad graphite instance the graphs will resurrect. That works for me :-D [20:18:18] I can survive with no graphs for a few days / weeks [20:18:25] yes, exactly.
thanks, and sorry for the disappearing graphs [20:18:29] yeah yeah, focus on completing the migration [20:18:29] hi all [20:18:32] it is not important [20:18:49] ori-l, not finding it [20:19:01] ah wait [20:19:04] it is in graphite.pp? [20:19:05] ori-l: as long as gdash works, I guess you are fine. https://gdash.wikimedia.org/ . Thank you! [20:19:19] misc::graphite::navtiming [20:19:29] ottomata: how are you going to push to statsd? [20:19:34] ottomata: yes [20:19:58] paravoid, was going to try to go back to logster, i think i should be able to push to statsd with that by only adding --output statsd flag(s) [20:20:17] that was the advantage of logster in the first place [20:20:49] it's just that counter values don't work with ganglia if you can't collect at least every 15 seconds [20:20:55] I think that the whole 'librdkafka emits json, varnishkafka takes the json and writes it to a file, then we run a program to tail the file, parse the json and emit it to $metrics' is kinda pointless, tbh [20:21:04] i know you do [20:21:08] and I am opinionless [20:21:10] on it [20:21:14] too many abstractions and moving parts for something that could be very easy [20:21:15] hashar: yea, 5 granted. 20 refused :) [20:21:15] whatever works is fine with me at this point [20:21:46] ottomata: what other ways are there to get stats out of librdkafka? [20:21:47] also, it looks to me like you're spending more time for this than you're spending actually deploying varnishkafka :P [20:21:49] i think Snaps has a good point, he wants to keep varnishkafka decoupled, especially since he'd have to parse the big json object from rdkafka in order to transform it into statsd format [20:21:50] does it have some api? [20:21:58] no [20:21:59] it doesn't, but we can write it :) [20:22:01] just a json file [20:22:04] haha [20:22:07] paravoid, i think you are right!
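The statsd push being discussed above is just a fire-and-forget UDP datagram in the `metric:value|type` line format, which statsd aggregates into a per-flush-interval rate before handing it to the graphite backend. A minimal sketch; the host, port, and metric names are placeholders, not the production instance:

```python
import socket

def statsd_packet(name, value, kind="c"):
    # statsd line protocol: "<metric>:<value>|<type>";
    # "c" is a counter, "ms" a timer, "g" a gauge
    return ("%s:%s|%s" % (name, value, kind)).encode()

def send_metric(addr, name, value):
    # fire-and-forget UDP; no response is expected, so a dropped
    # packet just means one lost sample
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(statsd_packet(name, value), addr)
    finally:
        sock.close()

send_metric(("127.0.0.1", 8125), "varnishkafka.txbytes", 512)
```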
[20:22:14] either statsd support in itself, or some callbacks for stats [20:22:16] i just want to deploy, but i don't want to deploy til I have more graphs! [20:22:50] ottomata: if you want a somewhat kludgy but very reliable solution, see asset-check.py in modules/webperf/files [20:22:59] Snaps suggested to make varnishkafka write to a piped process, or maybe even just a udp socket, and then have a separate listener that parsed the json and sent to whatever [20:23:07] it runs a program which emits json and invokes gmetric to send it to ganglia [20:23:10] why serialize in the first place... [20:23:19] it's not what i would call an elegant architecture but it's very reliable [20:23:33] ori-l, how do you run that? [20:23:34] cron? [20:23:47] ori-l^? [20:23:48] it's a daemon [20:23:54] haha [20:23:55] i could do that [20:23:58] paravoid didn't want me to [20:23:59] i asked :p [20:24:05] do what? [20:24:10] hashar: can I bug you for a moment? [20:24:15] well, paravoid is right to crinkle his nose at that, it's pretty hacky, but it works [20:24:17] i could easily just write an upstart script that ran logster in a while loop [20:24:19] that would work just fine [20:24:27] gwicke: sure, wanna do it in parsoid channel ? [20:24:30] to do what? I'm confused. [20:24:34] hashar, yup [20:24:48] paravoid: see modules/webperf/files/asset-check.py [20:25:05] paravoid, the problem we are solving is positively sloped metrics don't work with ganglia unless you send them to ganglia more often than gmetad collects them [20:25:05] so [20:25:08] every 15 seconds [20:25:14] logster is meant to be run by cron [20:25:18] i can't run a cron job more than once a minute [20:25:31] logster uses gmetric to send to ganglia [20:25:39] ori's asset-check does too [20:25:44] the difference is that his runs as a daemon [20:25:51] are you planning to consume more CPU time than varnishkafka itself?
:P [20:25:55] haha [20:26:00] no, mostly spent sleeping [20:26:19] paravoid: having librdkafka output json has two reasons: 1) quick integration with whatever stats framework people use. 2) the stats can be extended without ABI breakage. I don't want to conceive some generic TLV format for outputting the stats when there is JSON which is fantastic for generic stuff like this. [20:26:46] thus, the simplest way is to do popen("my-json-to-ganglia-pusher.tcl", "w") from inside varnishkafka [20:26:50] the json keys are also all unknown ahead of time [20:27:06] which is the problem i'm running into with the python ganglia module i just wrote now [20:27:27] because apparently (I DID NOT KNOW THIS BECAUSE THE GANGLIA DOCUMENTATION IS HORRIBLE), i need to have the .pyconf file list all of the metrics [20:27:29] hashar: Is it an intended behaviour change in zuul/jenkins that it now doesn't attempt to do anything about a merge when it knows it can't (because a parent dependency is not yet merged)? [20:27:38] I honestly think we're spending waaay too much engineering resources for this [20:27:47] (not to mention that I haven't actually gotten anything into ganglia from this yet, yarrgh) [20:27:50] INDEED [20:27:51] so [20:27:51] James_F: do you have an example ? [20:27:58] ottomata: see how we're doing it in varnish; we're generating .pyconf from the plugin itself [20:27:59] paravoid, can I just write an upstart while loop and run logster? [20:27:59] James_F: got to write an email about it tonight [20:28:01] pretty please? [20:28:07] hashar: https://gerrit.wikimedia.org/r/#/c/101845/ etc. [20:28:15] oh cool, paravoid [20:28:17] i could do that [20:28:18] a while loop sounds pretty horrible to me [20:28:20] hashar: Oh, so yes? Cool. Was worried gerrit was broken. [20:28:28] but i still am having trouble with other ganglia things [20:28:32] even if I put the metrics in pyconf [20:28:40] so. many. things. not. working. [20:28:42] this should be so easy!
[20:29:16] so you want to busyloop a CPU running logster? [20:29:17] James_F: seems those changes are blocked because the parent https://gerrit.wikimedia.org/r/#/c/101126/6 is not +2 ed yet. [20:29:20] come on :) [20:29:32] busyloop? [20:29:41] what's the problem with running logster every minute or so? [20:29:45] hashar: Yes, but previously Zuul would run the gate-and-submit pipelines immediately, then ping us every 24 hours with COULD NOT BE MERGED until it could. [20:29:47] while true; do logster --blabla; sleep 15; done [20:29:56] hashar: Whereas now it's just sitting there with a +2 and no activity. [20:30:02] paravoid, because gmetad collects metrics every 15 seconds [20:30:11] and if the metric has not been updated in gmond at least that often [20:30:14] it will write 0s [20:30:20] for positive slope counter values [20:30:21] it does [20:30:25] James_F: hoo that is interesting. I guess the change was submitted by Zuul and then Gerrit would whine about it [20:30:26] curr val - prev val [20:30:28] if curr == prev [20:30:29] that's a 0 [20:30:38] hashar: Yeah. [20:30:38] it does the same for python ganglia modules too [20:30:42] James_F: I made a change today to let Zuul be a bit smarter and thus look at parent changes. [20:30:59] hashar: We have long stacks in VE all the time, so we're quite tuned to existing behaviour. :-) [20:31:03] paravoid: http://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Upload%20caches%20eqiad&h=cp1048.eqiad.wmnet&r=hour&z=default&jr=&js=&st=1387225847&event=show&ts=0&v=2054088278&m=vhtcpd_inpkts_enqueued&vl=pkts&ti=Packets%20enqueued&z=large [20:31:12] hashar: https://gerrit.wikimedia.org/r/#/c/101839/ ? [20:31:15] note how the hourly graph is jumpy [20:31:21] which causes the averaged graphs to have wrong values [20:31:26] why aren't you calculating the rate yourself? [20:31:27] James_F: yeah that is an interesting use case. I haven't tested it during the week-end unfortunately (i.e.
long dependency chain blocked by a grandparent) [20:31:39] note also that this is a vhtcpd graph [20:31:40] James_F: yeah that is the change [20:31:41] i didn't write that one [20:31:54] paravoid, that's what the ganglia module I just wrote does [20:31:54] hashar: Kk. Will not worry any more. :-) [20:31:56] James_F: basic explanation at http://ci.openstack.org/zuul/gating.html [20:32:07] I just read that page, looks like a really cool feature [20:32:19] Yeah. [20:32:24] and? [20:32:25] If the example looks weird, remember that A through E are meant to be independent changes in the same repo [20:32:33] haha, and it isn't working! [20:32:34] i mean [20:32:36] it works great on the cli [20:32:39] just not in gmond yet, not sure why [20:32:42] i'm trying to figure out why [20:32:50] but now I just learned that I need to generate .pyconf from the module :p [20:32:50] okay...? :) [20:32:54] Zuul makes them dependent internally because it intends to merge them in a certain order [20:33:22] ottomata: because the counter type is now in the RRD [20:33:27] yes [20:33:29] ganglia won't change the counter type, even if you report a different type [20:33:35] because it can't [20:33:35] so you need to purge the old ones [20:33:35] (03PS1) 10Odder: Add Grants namespace to $wgNamespacesToBeSearchedDefault [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101995 [20:33:39] ori-l, i did [20:33:40] hmm [20:33:43] you have to either remove the old one or rrdtool dump/restore [20:33:44] but i only restarted gmetad [20:33:45] hm [20:33:48] dump/modify the XML/restore [20:33:50] i removed all the old ones [20:33:52] the rrds [20:33:54] and restarted gmetad [20:33:56] let me try some more [20:33:58] just rm won't do it [20:34:02] ? [20:34:07] RoanKattouw: exactly.
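Calculating the rate yourself, as paravoid suggests above, sidesteps ganglia's habit of writing 0s when a positive-slope counter isn't refreshed between gmetad collections. A minimal sketch of turning counter samples into a rate; counter resets/wraps are simply skipped rather than reported as negative:

```python
import time

class CounterRate:
    """Turn monotonically increasing counter samples into a rate."""
    def __init__(self):
        self.prev_value = None
        self.prev_time = None

    def update(self, value, now=None):
        now = time.time() if now is None else now
        rate = 0.0
        if self.prev_value is not None and now > self.prev_time:
            delta = value - self.prev_value
            if delta >= 0:  # skip counter resets instead of going negative
                rate = delta / (now - self.prev_time)
        self.prev_value, self.prev_time = value, now
        return rate

r = CounterRate()
r.update(1000, now=0)
print(r.update(1600, now=15))  # 600 bytes over 15s → 40.0
```

Reporting this pre-computed rate lets the gmond metric declare `slope: both` instead of `positive`, so collection timing no longer matters.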
[20:34:18] James_F: RoanKattouw: I should really have sent the announce earlier :D [20:34:38] hashar: Yeah, well… ;-) [20:36:01] ottomata: 1) stop your gmond script, 2) run: gmetric --spoof="myhostname:myhostname" -n your.metric.name -v expiring -t string -d 10, 3) wait 30 secs [20:36:27] it's hilariously awful but it's how you get metrics out of ganglia [20:36:45] but, if I restarted gmetad and removed the rrd files…where is ganglia keeping state? [20:36:49] basically you report it as a metric with a constant string value that expires after 10 secs of no activity [20:37:17] ottomata: i'm not exactly sure, it's difficult to reason about, but it does [20:37:22] haha [20:37:53] oof, and now I have to go and generate .pyconf [20:37:54] sigh [20:37:55] :) [20:38:03] but i think this is working! [20:38:19] you might also just want to use graphite [20:38:27] and create a dashboard for gdash [20:39:46] (03PS2) 10Jforrester: Make officewiki's Report: namespace VE-enabled [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101908 [20:39:55] (03PS3) 10Catrope: Make officewiki's Report: namespace VE-enabled [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101908 (owner: 10Jforrester) [20:42:35] ori-l, graphite is still pw protected, ja? [20:42:40] ottomata: sadly, yes [20:47:11] hashar: Job Stats are broken? [20:47:24] https://integration.wikimedia.org/zuul/ [20:47:44] Krinkle: yeah I have pinged ori about Zuul graphs using graphite. [20:47:57] Krinkle: basically graphite.wikimedia.org is broken / pointing to a wrong data store. [20:48:06] will be fixed up later on, that is not that much of a high priority [20:48:24] well that is what I told to ori, i.e. we can live with no graphs for a few days / couple weeks [20:51:10] hashar: it's all of graphite or just these? [20:51:27] afaik people do use them to keep track of regressions etc. 
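The metric-removal trick ori-l described above (re-report the metric as a short-lived string so gmond expires and forgets it) can be wrapped in a small helper. A sketch assuming `gmetric` is on the PATH; the function names are illustrative:

```python
import subprocess

def expire_args(name, host, lifetime=10):
    # Re-declare the metric as a string with a short lifetime ("-d");
    # --spoof keeps it attached to the original reporting host, and
    # after `lifetime` seconds of silence gmond drops the metric.
    return ["gmetric", "--spoof", "%s:%s" % (host, host),
            "-n", name, "-v", "expiring", "-t", "string",
            "-d", str(lifetime)]

def expire_ganglia_metric(name, host, lifetime=10):
    subprocess.check_call(expire_args(name, host, lifetime))

# e.g. expire_ganglia_metric("vhtcpd_inpkts_enqueued", "cp1048.eqiad.wmnet")
```

As noted in the log, the reporting gmond script must be stopped first, or it will immediately re-create the metric with the old type.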
[20:51:28] graphite apparently [20:51:37] not sure [20:51:39] gdash.wikimedia.org works though [20:51:50] yeah, but last I checked that unusable. [20:52:06] no, gdash is the pretty one [20:52:10] but doesn't it use graphite backend? [20:52:16] (03PS1) 10Andrew Bogott: Include a timestamp for last puppet run [operations/puppet] - 10https://gerrit.wikimedia.org/r/101997 [20:52:20] it does [20:52:25] makes requests directly to graphite from js [20:52:33] https://graphite.wikimedia.org/render/?title=Wiki%20Pageviews/min%20Holt%20Winters%20Forecast%20-1hours&from=-1hours&width=1024&height=500&until=now&areaMode=none&hideLegend=false&lineWidth=1&lineMode=connected&target=alias(color(holtWintersForecast(reqstats.pageviews),%22green%22),%22pageview/min%20Forecast%22)&target=alias(dashed(color(holtWintersConfidenceBands(reqstats.pageviews),%22white%22 [20:52:33] )),%22pageview/min%20Confidence%22)&target=alias(color(reqstats.pageviews,%22blue%22),%22pageview/min%22) [20:52:37] and that one works? [20:52:55] So... it works, but... it doesn't for zuul? Should we update the url or something? [20:53:53] Krinkle: just be patient for a couple of more days, please [20:54:06] things are in a complicated transitory state and hard to debug for that reason [20:54:11] but i'm really very nearly done [20:54:21] Sure, just trying to understand, no rush. [20:54:43] i'm also just out of a tooth extraction appt so not entirely sane / cogent [21:02:28] (03CR) 10Ori.livneh: "@Domas: Hi! I'm not sure if it's faster. Before I started working on it, it was maxing out a CPU core and dropping data. Switching from BD" [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [21:03:30] ori-l, they pulled it? :( [21:03:54] yurik: yeah, nothing else they cld do [21:04:09] sorry [21:07:21] (03CR) 10Ori.livneh: "@Domas: also, any preferences regarding licensing? 
The reason I specified the Wikimedia Foundation as the licensor in earlier patch-sets w" [operations/software/mwprof] - 10https://gerrit.wikimedia.org/r/101793 (owner: 10Ori.livneh) [21:07:28] i wonder how we get hits for host == "wikipedia"... some fatal errors there for a while [21:10:14] yurik, some(one|thing) requests http://wikipedia/ [21:10:15] someone probably donated a root domain to us... and we haven't been told... [21:10:32] MaxSem, yes, i saw them in the logs [21:10:38] but no idea who that might be [21:10:56] i wouldn't have thought it would make it through the varnish and apache filters... [21:16:35] (03PS1) 10Ryan Lane: Add firstboot script and ubuntu-standard package [operations/puppet] - 10https://gerrit.wikimedia.org/r/102000 [21:19:12] !log deployed parsoid 7684df12 [21:19:28] Logged the message, Master [21:19:34] Ryan_Lane: https://gist.github.com/subbuss/9bf840d538d0539ffe83 [21:20:00] /cc paravoid: is wtp1002 still depooled? [21:20:03] seems 1002 is missing from the list? [21:20:25] it's possible it just didn't answer the salt call [21:20:38] ok, let me check 1012 and 1001 [21:20:43] are the no status replies a reason for worry? [21:21:14] 1002 returned a ping [21:21:27] no status replies are an issue with salt [21:21:28] according to https://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=mem_free&s=by+name&c=Parsoid+eqiad&h=&host_regex=&max_graphs=0&tab=m&vn=&hide-hf=false&sh=1&z=small&hc=4 all hosts but 1002 restarted [21:21:57] there's like a bug with the way the modules and pillars are sync'd to it [21:22:21] I just restarted 1002 [21:22:30] k, thanks [21:22:33] let me check what's up with the others [21:22:40] could 1002 be missing in redis? [21:22:54] it isn't getting status back from redis [21:23:00] the conf repo also still seems to have tampa hosts in its list [21:23:19] yeah, that's because of what's in redis [21:23:28] how can we update that [21:23:29] ? [21:23:30] I need to add a command to purge data from redis [21:23:44] hm. 
they all respond properly with the direct commands [21:23:54] I wonder if the reply is just timing out [21:24:15] maybe 30 seconds isn't long enough of a timeout with the batch [21:24:31] I should change that to 60 seconds [21:24:37] ideally this would report back through redis [21:24:52] but salt has a bug with batch mode and using returners (which is what writes to redis) [21:26:17] ok. I've restarted the salt minion on the minions that didn't respond properly [21:26:30] gwicke: would it be a major inconvenience for me to do a restart now to test this? [21:26:52] Ryan_Lane, as long as it is rolling it should be fine [21:26:57] yep [21:26:59] ok [21:28:15] Ryan_Lane, we also have a config change coming up [21:28:22] not returning at all is generally going to be an issue, but not returning a status probably just means the timeout was reached [21:28:24] which will require another restart [21:29:20] bleh. getting the status back from salt without a returner is a pain. I hope they fix that next release [21:31:15] oh. wait. [21:31:28] are you using the upstart yet? [21:31:30] gwicke: ? [21:31:37] Ryan_Lane: no, not yet [21:31:48] this command is going to be relatively unreliable, then [21:32:01] we are prepping that, might be ready for Wednesday [21:32:09] along with the new repos [21:33:27] seems I should increase the timeout for the command. [21:35:19] RoanKattouw: any objections against removing all the wikipedias from localsettings? [21:36:28] gwicke: If Parsoid has a default config with identical URLs then it's fine [21:36:38] yup [21:37:00] OK, sweet [21:38:10] RoanKattouw: will http access to the backend work for office etc? [21:38:20] No [21:38:24] Well [21:38:25] Depends [21:38:28] (03PS1) 10Ryan Lane: Make restart runner and info util more dependable [operations/puppet] - 10https://gerrit.wikimedia.org/r/102003 [21:38:34] can I merge this change in before you guys restart? 
[21:38:47] Ryan_Lane, go ahead [21:38:57] When you say "http access to the backend", you mean connecting to 10.2.n.n and sending Host: office.wikimedia.org ? [21:39:14] yes [21:39:29] or, well, sending proxy requests [21:39:34] stupid slow jenkins [21:39:47] GET http://office.wikimedia.org/... [21:39:52] That will probably work but only if you send X-Forwarded-Proto: https [21:40:02] and Host: office.wikimedia.org too [21:40:07] If you don't then the response will be 302 Location: https://office.wikimedia.org/.... [21:40:15] even in the backend? [21:40:24] hmm [21:40:29] I think so [21:40:30] (03CR) 10Ryan Lane: [C: 032] Make restart runner and info util more dependable [operations/puppet] - 10https://gerrit.wikimedia.org/r/102003 (owner: 10Ryan Lane) [21:40:36] AFAIK the HTTPS redirects are done in MW [21:40:45] ok, I'll disable direct proxying for those wikis then [21:40:51] they are low volume anyway [21:41:14] I thought the backend only sees HTTP [21:41:26] it might take the proto redirect into account for redirects [21:41:30] I don't know about the wiki-wide HTTPS redirect, but since you'll be forwarding a cookie for a logged-in user whose preference is probably to be fixed to HTTPS, MW should send a redirect [21:41:55] stupid slow puppet :D [21:42:16] office etc will go through the front-end proxies for now [21:42:26] can experiment with those later [21:42:48] Yeah [21:43:11] what is the internal api cluster name again? [21:43:20] api.svc.eqiad.wmnet [21:43:30] thx [21:43:36] I recommend using the IP [21:43:41] ok, so restart will take longer now, but should return more accurate info on the restarts. 
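Talking to the API cluster directly, as discussed above, means supplying the `Host` and `X-Forwarded-Proto` headers yourself, since MediaWiki otherwise answers with a 302 to the https:// URL. A sketch; the backend address is a placeholder for whatever `host api.svc.eqiad.wmnet` resolves to:

```python
import http.client

def backend_headers(wiki_host):
    # Host selects the wiki on the shared backend; X-Forwarded-Proto: https
    # tells MediaWiki the original request was TLS, suppressing the
    # protocol-upgrade redirect
    return {"Host": wiki_host, "X-Forwarded-Proto": "https"}

def backend_get(backend_ip, wiki_host, path):
    # plain HTTP straight to the backend LVS IP, bypassing the
    # front-end proxies
    conn = http.client.HTTPConnection(backend_ip, 80, timeout=5)
    conn.request("GET", path, headers=backend_headers(wiki_host))
    return conn.getresponse()

# e.g. backend_get("<api.svc IP>", "office.wikimedia.org", "/w/api.php")
```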
[21:44:24] Which will be something like 10.2.n.n, you can find it by running "host api.svc.eqiad.wmnet" on a cluster machine [21:44:49] of course without using the upstart it will probably continue to be unreliable [21:49:09] pushing out the config change [21:49:34] gwicke: when you want to do the restart, let me do it, so I can see if it's now working properly [21:49:47] Ryan_Lane, go for it [21:49:51] the code is pushed out [21:50:08] restarting [21:50:25] it'll take one minute to report status back now [21:50:31] !log updated Parsoid config to use the API cluster directly for most wikis [21:50:45] Logged the message, Master [21:50:53] bleh [21:50:57] it still isn't working right [21:51:19] very likely due to the init script [21:51:26] hey paravoid [21:51:39] i think the upload cache's ganglia aggregators are not set properly [21:51:43] node /^cp10(4[89]|5[01]|6[1-4])\.eqiad\.wmnet$/ { [21:51:43] if $::hostname =~ /^cp104[89]$/ { [21:51:43] $ganglia_aggregator = true [21:51:43] } [21:51:46] cp1048, cp1049 [21:51:46] but [21:51:48] ganglia.pp [21:51:54] Ryan_Lane: Will you be online in an hour or so? I was going to try using Trebuchet for the first time this afternoon. This would be the first deploy of Scholarships using it. [21:52:01] "Upload caches eqiad" => "cp1048.eqiad.wmnet cp1061.eqiad.wmnet", [21:52:02] gwicke: they are all restarted [21:52:33] for some reason every run one single minion doesn't return [21:52:38] Ryan_Lane: thx [21:52:50] and two return with no status, even though they restarted [21:52:52] i think cp1061 is correct, as it is in a different row than cp1048 [21:52:53] maybe we should use dsh instead ;P [21:52:56] shoudl I fix? [21:53:08] * gwicke hides [21:53:19] gwicke: as I mentioned, I'm pretty sure this is related to the init script [21:53:33] yeah, likely [21:54:12] Ryan_Lane: is there documentation about how to update the list of hosts to update in redis? [21:54:30] nope. 
redis is just reporting, though [21:54:39] (03PS1) 10Ottomata: Ganglia aggregators for upload caches eqiad are cp1048, cp1061. [operations/puppet] - 10https://gerrit.wikimedia.org/r/102005 [21:55:02] a batched ping of the parsoid hosts always returns the same number of hosts [21:55:08] RobH, does that look correct to you? ^^? [21:55:21] so I'd say salt is likely working right [21:55:26] just want a second opsie confirm before I merge it [21:55:26] Ryan_Lane: I was mainly wondering about the tampa hosts that show up in config deploys [21:55:28] ganglia.pp has "Upload caches eqiad" => "cp1048.eqiad.wmnet cp1061.eqiad.wmnet", [21:55:41] yeah. that's because they were added into the minion lists and there's no simple way to pull them out [21:55:47] let me delete them from the keys [21:55:49] oh, RobH is probably off data-centering [21:56:07] Ryan_Lane: since you are clearly not busy at all :p... [21:56:11] does that look right? [21:56:16] https://gerrit.wikimedia.org/r/#/c/102005/1/manifests/site.pp [21:56:19] Ryan_Lane: what happens when we add more boxes to the parsoid cluster? [21:56:48] they'll automatically show up in redis when a deployment is triggers on them [21:57:11] when puppet runs on the box, it'll add a 'deployment_target:parsoid' grain:value [21:57:22] aha [21:57:31] so that is then stored in redis? [21:57:33] which is what salt uses for targeting deployments [21:57:36] nope. [21:57:45] k, only reporting is [21:57:50] when a command is run, the minion checks into redis [21:58:03] and adds itself to the minion list [21:58:57] I hope you stay around for a while until somebody knows all the moving parts well enough ;) [21:59:11] (03CR) 10Ottomata: "Merging this, feel free to revert if it isn't correct." [operations/puppet] - 10https://gerrit.wikimedia.org/r/102005 (owner: 10Ottomata) [21:59:21] (03CR) 10Ottomata: [C: 032 V: 032] Ganglia aggregators for upload caches eqiad are cp1048, cp1061. 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/102005 (owner: 10Ottomata) [22:00:31] PROBLEM - Puppet freshness on cp1048 is CRITICAL: Last successful Puppet run was Mon 16 Dec 2013 06:59:39 PM UTC [22:01:48] gwicke: I removed the pmtpa ones [22:02:02] Ryan_Lane: thanks! [22:02:07] gwicke: I plan on being an active upstream for trebuchet [22:02:30] btw, this is what I ran: srem "deploy:parsoid/config:minions" mexia.pmtpa.wmnet [22:02:35] via redis-cli [22:03:25] I'll add a command to purge minions to trigger (https://github.com/trebuchet-deploy/trigger/) [22:04:31] PROBLEM - Puppet freshness on elastic1007 is CRITICAL: Last successful Puppet run was Mon 16 Dec 2013 07:03:22 PM UTC [22:05:14] Ryan_Lane: added this config in https://wikitech.wikimedia.org/wiki/Trebuchet#Removing_minions_from_redis [22:05:19] might need tweaking [22:05:39] s/config/info/ [22:13:34] Can anyone please guide me on how do I tell which MW version a patch will be included in? [22:13:48] I think there was a discussion about this somewhere on wikitech-l, but I can't remember when it was [22:13:53] https://gerrit.wikimedia.org/r/#/c/100825/ [22:14:02] my wild guess is 1.23wmf7? [22:14:23] (03PS1) 10Dzahn: create shell account for msyed [operations/puppet] - 10https://gerrit.wikimedia.org/r/102010 [22:14:55] (03CR) 10jenkins-bot: [V: 04-1] create shell account for msyed [operations/puppet] - 10https://gerrit.wikimedia.org/r/102010 (owner: 10Dzahn) [22:17:11] (03CR) 10Dzahn: "AndrewBogott, 2 minute review? can you run your script and determine the correct UID, because i would have usually used another one, but w" [operations/puppet] - 10https://gerrit.wikimedia.org/r/102010 (owner: 10Dzahn) [22:20:31] mutante: does msyed have a labs account? If so what is their username there? [22:21:12] gwicke: cool, thanks [22:22:48] andrewbogott: i dont have those answers yet, i would then find out by saying they should first have a wikitech made themselves before asking for shell. 
that is, if the plan is that i should now always use those high UIDs to match with labs [22:23:26] Yes, the best procedure is for them to create a wikitech account first. [22:23:32] before i just counted up from our regular prod ones [22:23:37] Otherwise there's no way to control what suid they'll get after the fact. [22:23:39] that were unrelated , and still are [22:24:02] andrewbogott: ok, fair, then we shall say having wikitech is a requirement for shell [22:24:15] Yeah, I think that's best. [22:24:16] and i will ask them do go there first and come back [22:24:22] ok, fine with me [22:24:31] I updated the docs to be clearer [22:24:32] eh, last question, where is the script [22:24:44] or do you just look manually [22:24:51] in formey [22:25:05] it sounded like you had something scripted [22:25:14] Ryan_Lane: cool,thx [22:25:25] mutante: so far the script lives only here: https://gerrit.wikimedia.org/r/#/c/98700/ [22:25:28] and it doesn't work that great [22:25:56] mutante: oh, sorry, that comment was meant to gwicke :) [22:26:01] aha, gotcha. well, something to keep in mind for next time then [22:26:05] mutante: It's only useful if you start from scratch… if you have an existing patch then you just need to look up the user in ldap with, um… ldaplist -password on virt0 [22:26:19] Ryan_Lane: heh, that doesn't matter, that comment works on almost anything, haha [22:26:38] yeah, I think having a wikitech account should be a requirement for shell [22:26:46] it makes a lot of sense [22:27:03] andrewbogott: k, thank you, i shall continue with the request accordingly [22:27:21] nods [22:30:29] (03CR) 10Ryan Lane: [C: 032] Add firstboot script and ubuntu-standard package [operations/puppet] - 10https://gerrit.wikimedia.org/r/102000 (owner: 10Ryan Lane) [22:33:39] Ryan_Lane: thanks re docs! [22:33:45] yw [22:34:49] !log dist-upgrading virt1000 [22:35:04] Logged the message, Master [22:35:11] (03PS4) 10Ottomata: Using custom ganglia module instead of Logster. 
[operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/101431 [22:35:25] (03PS2) 10Dzahn: create shell account for msyed [operations/puppet] - 10https://gerrit.wikimedia.org/r/102010 [22:35:47] (03PS2) 10Andrew Bogott: Include a timestamp for last puppet run [operations/puppet] - 10https://gerrit.wikimedia.org/r/101997 [22:36:05] (03CR) 10jenkins-bot: [V: 04-1] create shell account for msyed [operations/puppet] - 10https://gerrit.wikimedia.org/r/102010 (owner: 10Dzahn) [22:37:21] (03CR) 10Dzahn: [C: 04-1] create shell account for msyed [operations/puppet] - 10https://gerrit.wikimedia.org/r/102010 (owner: 10Dzahn) [22:38:38] (03PS3) 10Andrew Bogott: Include a timestamp for last puppet run [operations/puppet] - 10https://gerrit.wikimedia.org/r/101997 [22:38:55] paravoid, would appreciate a review on the latest varnishkafka ganglia commit [22:39:07] it does like varnish does, and generates the .pyconf file on the fly [22:40:21] (03CR) 10Andrew Bogott: [C: 032] "tested" [operations/puppet] - 10https://gerrit.wikimedia.org/r/101997 (owner: 10Andrew Bogott) [22:40:44] (03PS1) 10Ryan Lane: Up vmbuilder version to 3 [operations/puppet] - 10https://gerrit.wikimedia.org/r/102016 [22:43:09] (03CR) 10Ryan Lane: [C: 032] Up vmbuilder version to 3 [operations/puppet] - 10https://gerrit.wikimedia.org/r/102016 (owner: 10Ryan Lane) [22:44:41] !log rebooting virt1000 [22:44:56] Logged the message, Master [22:45:44] ugh.
pdns is such a piece of shit [22:47:11] PROBLEM - Host virt1000 is DOWN: CRITICAL - Host Unreachable (208.80.154.18) [22:47:11] PROBLEM - Host labs-ns1.wikimedia.org is DOWN: CRITICAL - Host Unreachable (208.80.154.19) [22:49:51] RECOVERY - Host labs-ns1.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 1.00 ms [22:49:51] RECOVERY - Host virt1000 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms [22:50:25] (03CR) 10MSyed: "My Wikitech username is MSyed" [operations/puppet] - 10https://gerrit.wikimedia.org/r/102010 (owner: 10Dzahn) [22:50:34] ottomata: oh. missed your message earlier. yeah. I'm around [22:51:03] !log dist-upgrading virt0 [22:51:21] Logged the message, Master [22:52:41] (03CR) 10Andrew Bogott: "UID 4206" [operations/puppet] - 10https://gerrit.wikimedia.org/r/102010 (owner: 10Dzahn) [22:55:13] !log rebooting virt0 [22:57:41] PROBLEM - Host virt0 is DOWN: PING CRITICAL - Packet loss = 100% [23:00:42] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [23:02:01] RECOVERY - Host virt0 is UP: PING OK - Packet loss = 0%, RTA = 35.46 ms [23:03:31] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.044 seconds response time. 
nagiostest.beta.wmflabs.org returns 208.80.153.219 [23:06:31] (03PS1) 10Ryan Lane: Make virt1000 a secondary salt master for labs [operations/puppet] - 10https://gerrit.wikimedia.org/r/102026 [23:08:09] (03CR) 10jenkins-bot: [V: 04-1] Make virt1000 a secondary salt master for labs [operations/puppet] - 10https://gerrit.wikimedia.org/r/102026 (owner: 10Ryan Lane) [23:10:02] (03PS1) 10Ryan Lane: Add secondary salt master into labs minion config [operations/puppet] - 10https://gerrit.wikimedia.org/r/102029 [23:10:35] (03CR) 10jenkins-bot: [V: 04-1] Add secondary salt master into labs minion config [operations/puppet] - 10https://gerrit.wikimedia.org/r/102029 (owner: 10Ryan Lane) [23:10:42] (03PS2) 10Ryan Lane: Make virt1000 a secondary salt master for labs [operations/puppet] - 10https://gerrit.wikimedia.org/r/102026 [23:11:44] (03CR) 10jenkins-bot: [V: 04-1] Make virt1000 a secondary salt master for labs [operations/puppet] - 10https://gerrit.wikimedia.org/r/102026 (owner: 10Ryan Lane) [23:12:25] (03PS3) 10Ryan Lane: Make virt1000 a secondary salt master for labs [operations/puppet] - 10https://gerrit.wikimedia.org/r/102026 [23:16:02] (03PS4) 10Ryan Lane: Make virt1000 a secondary salt master for labs [operations/puppet] - 10https://gerrit.wikimedia.org/r/102026 [23:16:09] (03PS2) 10Ryan Lane: Add secondary salt master into labs minion config [operations/puppet] - 10https://gerrit.wikimedia.org/r/102029 [23:16:49] (03CR) 10jenkins-bot: [V: 04-1] Add secondary salt master into labs minion config [operations/puppet] - 10https://gerrit.wikimedia.org/r/102029 (owner: 10Ryan Lane) [23:17:54] (03PS1) 10Ryan Lane: Remove run once logic from firstboot.sh [operations/puppet] - 10https://gerrit.wikimedia.org/r/102034 [23:18:06] (03CR) 10Ryan Lane: [C: 032] Make virt1000 a secondary salt master for labs [operations/puppet] - 10https://gerrit.wikimedia.org/r/102026 (owner: 10Ryan Lane) [23:18:19] (03CR) 10Ryan Lane: [C: 032] Add secondary salt master into labs minion config 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/102029 (owner: 10Ryan Lane) [23:19:47] (03PS3) 10Ryan Lane: Add secondary salt master into labs minion config [operations/puppet] - 10https://gerrit.wikimedia.org/r/102029 [23:25:57] (03CR) 10Ryan Lane: [C: 032] Add secondary salt master into labs minion config [operations/puppet] - 10https://gerrit.wikimedia.org/r/102029 (owner: 10Ryan Lane) [23:30:38] (03PS1) 10Ryan Lane: Add -y condition to salt-key for puppetsigner script [operations/puppet] - 10https://gerrit.wikimedia.org/r/102038 [23:31:46] (03CR) 10Ryan Lane: [C: 032] Remove run once logic from firstboot.sh [operations/puppet] - 10https://gerrit.wikimedia.org/r/102034 (owner: 10Ryan Lane) [23:32:07] (03CR) 10Ryan Lane: [C: 032] Add -y condition to salt-key for puppetsigner script [operations/puppet] - 10https://gerrit.wikimedia.org/r/102038 (owner: 10Ryan Lane) [23:45:51] PROBLEM - Host elastic1007 is DOWN: PING CRITICAL - Packet loss = 100% [23:47:31] PROBLEM - ElasticSearch health check on elastic1010 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3938: relocating_shards: 0: initializing_shards: 1: unassigned_shards: 1 [23:47:41] PROBLEM - ElasticSearch health check on elastic1001 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:47:41] PROBLEM - ElasticSearch health check on elastic1008 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. 
status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:47:41] PROBLEM - ElasticSearch health check on elastic1003 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:47:51] PROBLEM - ElasticSearch health check on elastic1012 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:47:51] PROBLEM - ElasticSearch health check on elastic1011 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:48:01] PROBLEM - ElasticSearch health check on elastic1002 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:48:01] PROBLEM - ElasticSearch health check on elastic1005 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. 
status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:48:01] PROBLEM - ElasticSearch health check on elastic1009 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:48:01] PROBLEM - ElasticSearch health check on elastic1004 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:48:02] PROBLEM - ElasticSearch health check on elastic1006 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 11: number_of_data_nodes: 11: active_primary_shards: 1319: active_shards: 3939: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 1 [23:48:11] RECOVERY - Host elastic1007 is UP: PING OK - Packet loss = 0%, RTA = 1.49 ms [23:51:01] PROBLEM - Host elastic1007 is DOWN: PING CRITICAL - Packet loss = 100% [23:57:43] should I be concerned?
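[Editor's note] The ElasticSearch alerts above all report the same cluster state: status red with one unassigned shard, coinciding with elastic1007 dropping off the network; a red status means at least one primary shard is unallocated. As a minimal sketch of reading these check messages programmatically (a hypothetical helper, not part of any tool mentioned in this log), the colon-separated tail of the Icinga message can be split into key/value fields:

```python
def parse_es_health(message):
    """Parse the colon-separated key/value tail of an Icinga
    ElasticSearch health-check message into a dict."""
    # Everything after "is running." looks like:
    #   "status: red: timed_out: false: number_of_nodes: 11: ..."
    tail = message.split("is running.", 1)[1]
    tokens = [t.strip() for t in tail.split(":")]
    fields = dict(zip(tokens[0::2], tokens[1::2]))
    # Coerce numeric values so callers can compare shard counts directly
    for key, value in fields.items():
        if value.isdigit():
            fields[key] = int(value)
    return fields

# One of the alert messages from the log above
alert = ("CRITICAL - elasticsearch (production-search-eqiad) is running. "
         "status: red: timed_out: false: number_of_nodes: 11: "
         "number_of_data_nodes: 11: active_primary_shards: 1319: "
         "active_shards: 3939: relocating_shards: 0: "
         "initializing_shards: 0: unassigned_shards: 1")

health = parse_es_health(alert)
print(health["status"], health["unassigned_shards"])  # red 1
```

With the fields structured, "should I be concerned?" becomes checkable: red here stems from a single unassigned shard on a cluster with 3939 active ones, so the likely answer is that one primary lost its home when elastic1007 went down and will reallocate once the node returns.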