[00:00:33] New patchset: MaxSem; "Postgres module for OSM" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36155 [00:00:41] New patchset: MaxSem; "WIP: OSM module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36222 [00:03:46] that's not really it [00:05:20] PROBLEM - Host constable is DOWN: PING CRITICAL - Packet loss = 100% [00:05:27] New patchset: Reedy; "Expose CDB files for user download" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52315 [00:05:55] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52315 [00:07:01] New patchset: MaxSem; "WIP: OSM module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36222 [00:08:03] Hey greg-g and Reedy, is there a time this week when we can schedule deploying https://gerrit.wikimedia.org/r/#/c/21322 ? [00:08:10] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 00:08:01 UTC 2013 [00:08:40] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [00:08:50] PROBLEM - Puppet freshness on virt11 is CRITICAL: Puppet has not run in the last 10 hours [00:09:04] New patchset: Reedy; "Kill asterix" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52316 [00:09:15] !log hashar synchronized wmf-config/InitialiseSettings-labs.php [00:09:20] Logged the message, Master [00:09:26] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52316 [00:09:28] !log snapshot1002: rsync: change_dir#3 "/apache/common-local" failed: No such file or directory (2) [00:09:33] Logged the message, Master [00:09:33] apergos: ^^^^ snapshot1002: rsync: change_dir#3 "/apache/common-local" failed: No such file or directory (2) [00:09:46] hashar: It got reinstalled recently ish [00:09:51] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 00:09:40 UTC 2013 [00:09:51] csteipp: Any reason we can't just do it 
now? ;) [00:10:20] Reedy: if Ryan_Lane is ok with that, I'm all for it. [00:10:30] fine with me [00:10:40] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [00:10:40] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 00:10:38 UTC 2013 [00:10:45] I'm going to be around for at least another couple of hours [00:10:54] Reedy: should still be out of dsh groups I guess [00:11:01] I'll notify ops list, so they know it changed [00:11:13] hashar: just need someone to run sync-common as root on it [00:11:40] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [00:11:48] csteipp: Do you want to do it? [00:12:03] Reedy: I can [00:12:14] Just merge, pull, and sync file, right? [00:12:16] New patchset: Reedy; "(bug 39380) Enabling secure login (HTTPS)." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/21322 [00:12:21] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 00:12:12 UTC 2013 [00:12:27] Yup [00:12:40] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [00:12:50] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 00:12:48 UTC 2013 [00:13:40] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [00:14:30] !log hashar synchronized wmf-config/InitialiseSettings-labs.php [00:14:35] Logged the message, Master [00:15:47] New patchset: Ryan Lane; "Cleanup backups older than 7 days on wikitech" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52321 [00:16:23] notpeter: nagios is red! :-]  Regarding Lucene, the search boxes are part of the dsh mediawiki-installation group. They do receive copies of the InitialiseSettings.php file under /usr/local/apache/common-local. I guess we can get rid of the cronjob. 
[00:17:08] Change merged: CSteipp; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/21322 [00:18:47] csteipp: (typing what I just voice) yeah, whenever you wanna do it today :) [00:19:04] * greg-g likes transparency, or something [00:19:55] hashar: that's just the search indexers [00:21:00] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 00:20:51 UTC 2013 [00:21:40] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [00:21:49] !log csteipp synchronized wmf-config/CommonSettings.php 'Enabling HTTPS login' [00:21:55] Logged the message, Master [00:24:15] notpeter: :( [00:25:01] New patchset: MaxSem; "WIP: OSM module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36222 [00:25:42] Change abandoned: MaxSem; "Merged with https://gerrit.wikimedia.org/r/#/c/36222/" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36155 [00:26:27] hashar: I mean, we totally could just make every search node a mediawiki installation [00:26:35] but this seems like a very heavy-handed approach.... [00:26:40] hm, are there supposed to be lock icons next to the create account and log in links on the user login page with this, csteipp? [00:27:03] notpeter: I am not sure we are willing to add 30 or so boxes to the main sync scripts [00:27:12] notpeter: so that was probably not a good idea :-] [00:27:21] Those are there now... it would be nice if they were styled a bit more [00:27:38] so: https://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&s=by+name&c=SSL%2520cluster%2520eqiad&tab=m&vn= [00:27:43] and: https://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&s=by+name&c=SSL%2520cluster%2520esams&tab=m&vn= [00:27:58] that's how we'll know if we need to revert it :) [00:29:08] binasher: how does the naming in gdash-dashboards/ work? 
[00:31:26] AaronSchulz: the directory names map to URI path names while the dash.yaml naming is how it appears in the menu [00:31:41] csteipp: <3 [00:31:55] this makes using the sites so much less annoying [00:31:58] AaronSchulz: graph order is just by lexical sort of the .graph files under a dash directory [00:32:11] * AaronSchulz just wants to split lockmanager and streamfile [00:35:27] AaronSchulz: oh yeah :) i'll review if you'd like [00:36:25] mutante-away: i reviewed the SPF thing and commented on the ticket [00:37:32] !log added new deb of twemproxy 0.2.3 built with --enable-debug=log to precise-wikimedia [00:37:37] Logged the message, Master [00:38:49] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 460.14417 (gt 400) [00:41:55] New patchset: Pyoungmeister; "site.pp changes for pmtpa x1 shard" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52328 [00:42:42] solr on vanadium thinks it's lsearchd [00:42:59] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52328 [00:43:45] binasher: lol [00:44:09] New review: Hashar; "Works the same as the other helpers and Chris confirmed that is pretty much the authoritative db." 
[operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/52253 [00:45:29] hm, having a production service used by mediawiki running on vanadium probably isn't a good idea [00:45:48] New patchset: Ryan Lane; "Cleanup backups older than 7 days on wikitech" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52321 [00:46:09] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 00:46:04 UTC 2013 [00:46:39] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [00:46:59] PROBLEM - MySQL Replication Heartbeat on db38 is CRITICAL: NRPE: Unable to read output [00:47:49] PROBLEM - mysqld processes on db38 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [00:49:09] PROBLEM - MySQL Replication Heartbeat on db36 is CRITICAL: NRPE: Unable to read output [00:50:20] RoanKattouw: constable was stuck on formatting :-/ may finish this up tomorrow [00:50:23] stupid body [00:51:03] OK take your time [00:51:49] RECOVERY - mysqld processes on db38 is OK: PROCS OK: 1 process with command name mysqld [00:52:05] New patchset: Asher; "adding twemproxy to mediawiki::packages" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52333 [00:52:17] notpeter: does ^^ look ok to you? [00:53:12] binasher: do you specifically want that on every mediawiki installation? [00:53:15] or just every apache? [00:54:08] every mediawiki installation that might talk to memcached [00:54:51] New patchset: Aaron Schulz; "Split out LockManager/StreamFile graphs." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52335 [00:54:56] * AaronSchulz really wants the arbitrary name hashing [00:55:14] binasher: I just hacked that up ^ [00:56:49] these will be more interesting when a real LockManager is in use :) [00:58:14] Can I get someone else to help test the ssl login, by unchecking "Stay connected to HTTPS after login", and seeing if they go back to http? [00:58:49] I'm getting kept in ssl.. 
[00:59:04] trying to confirm if it's just me.. [00:59:24] AaronSchulz looks ok, i'm going to merge [00:59:25] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52335 [01:00:09] RECOVERY - mysqld processes on db36 is OK: PROCS OK: 1 process with command name mysqld [01:00:35] csteipp: Yeah, seems to stay on ssl [01:01:06] binasher: I need to add some stats to jobq too [01:01:11] * AaronSchulz is done for today [01:01:47] i think the jobq stats from mediawiki are still a lil janky, at least for deduplication [01:02:08] csteipp: Almost sounds like it's following $wgSecureLoginDefaultHTTPS, not the checkbox value [01:02:30] if ( $wgSecureLoginDefaultHTTPS && $this->mAction != 'submitlogin' && !$this->mLoginattempt ) { [01:02:30] $this->mStickHTTPS = true; [01:02:30] } [01:02:30] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [01:04:28] Yeah, the if ( $wgSecureLogin && !$this->mStickHTTPS ) { should account for that on redirect though [01:04:44] Lets flip it and see if behavior still follows [01:04:44] moment [01:05:45] !log reedy synchronized wmf-config/CommonSettings.php 'Set wgSecureLoginDefaultHTTPS to false for testing' [01:06:13] The checkbox is still checked :/ [01:06:30] hmm, memory [01:07:12] yeah, it's off for me on enwiki [01:07:30] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [01:07:40] but... logging in I still get ssl [01:07:44] Same behaviour though [01:07:46] indeed [01:08:27] !log reedy synchronized wmf-config/CommonSettings.php 'Revert wgSecureLoginDefaultHTTPS' [01:08:30] At which point should it take you back to http? [01:09:04] When you do the login post, you should get a redirect back to http [01:09:05] The return to [last page]? 
[01:09:12] ah, straight away then [01:09:55] Yeah [01:09:56] New patchset: Ryan Lane; "Fix nova config for network bonds" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52338 [01:10:14] Guess we need to check if ( $wgSecureLogin && !$this->mStickHTTPS ) { in executeReturnTo [01:10:45] We do on line 992, right? [01:11:00] PROBLEM - Varnish traffic logger on cp1028 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [01:11:12] New patchset: Hashar; "adapt Lucene configuration file to support beta" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [01:11:19] Yeah. I mean dump the values on that line to confirm it's what it's supposed to be [01:11:35] ah, gatcha [01:12:12] The weird issue is that locally, it works fine [01:12:30] (my local dev instances) [01:12:46] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52321 [01:14:05] New patchset: Ryan Lane; "Fix nova config for network bonds" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52338 [01:15:00] RECOVERY - Varnish traffic logger on cp1028 is OK: PROCS OK: 3 processes with command name varnishncsa [01:17:06] New patchset: Ryan Lane; "Fix nova config for network bonds" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52338 [01:17:20] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 01:17:10 UTC 2013 [01:17:40] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [01:17:56] Ugh [01:18:33] Better we make them stay on https than always back to http [01:19:05] I agree... 
but I feel bad for screwing people who really really want http [01:19:46] New review: Asher; "(1 comment)" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/52340 [01:20:26] We should make a special site for them to browse wikipedia on ie6 [01:20:28] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52338 [01:20:51] I support that :) [01:21:11] So, I need to leave the office soon... Is this bad enough that we should revert? [01:21:20] this only affects ie6? [01:21:26] No, everyone [01:21:28] ah [01:22:00] RECOVERY - Puppet freshness on virt2 is OK: puppet ran at Wed Mar 6 01:21:59 UTC 2013 [01:22:39] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52333 [01:25:10] PROBLEM - Varnish traffic logger on cp1033 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [01:25:15] csteipp: well, it's broken.... [01:25:27] I'd say let's revert [01:25:36] Cool. I've got a draft for it.. [01:25:37] at minimum it may confuse users [01:25:46] Yep [01:25:56] https://gerrit.wikimedia.org/r/#/c/52344/ anyone? [01:25:57] but it may also affect some users who don't have https, though I have a feeling it's a very, very small set of users [01:28:06] Reedy: mind if I reset to HEAD in common, and then push a revert patch? [01:28:10] RECOVERY - Puppet freshness on virt7 is OK: puppet ran at Wed Mar 6 01:28:05 UTC 2013 [01:28:28] I thought I'd reverted that already, haha [01:28:30] Yeah, feel free [01:29:43] Why won't my notifications go away on wikitech? :/ [01:31:20] RECOVERY - Puppet freshness on virt9 is OK: puppet ran at Wed Mar 6 01:31:12 UTC 2013 [01:31:23] Reedy: can you +2 https://gerrit.wikimedia.org/r/#/c/52344/ ? 
[01:32:39] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52344 [01:33:50] !log csteipp synchronized wmf-config/CommonSettings.php 'Revert https patch' [01:34:44] * csteipp cries a little on the inside [01:35:01] PROBLEM - Varnish traffic logger on cp1028 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [01:35:56] New patchset: Ryan Lane; "Ok. I give up trying to do this a sane way." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52348 [01:37:42] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52348 [01:40:12] Ryan_Lane: Can I get some more rights (sysop?) on new wikitech please? Was going to setup a couple of gadgets but I can't edit-interface :( [01:40:17] RECOVERY - Varnish traffic logger on cp1033 is OK: PROCS OK: 3 processes with command name varnishncsa [01:43:35] Reedy: hm. [01:44:47] so… there's an issue with doing so [01:45:14] that would allow you to hijack others' accounts [01:45:23] having sysop? [01:45:26] yes [01:45:39] Can we add a group with editinterface then? [01:45:44] same same [01:45:48] Ohh [01:46:36] Reedy: want me to install it? [01:46:50] all ops folks can do this [01:46:59] I can likely give you additional permissions [01:47:03] I wonder if there's a nice way we can export the relevant pages [01:47:14] Cause from memory it's a little fiddly getting all the messages set [01:47:24] but, if I do that, you'll need to have two factor auth enabled [01:47:57] RECOVERY - Varnish traffic logger on cp1028 is OK: PROCS OK: 3 processes with command name varnishncsa [01:47:58] hm. there's no real way to enforce, that for editing interface, though [01:48:07] s/,// [01:55:00] Ryan_Lane: Duh [01:55:03] I forgot MaxSem added it [01:55:04] https://www.mediawiki.org/wiki/Special:Gadgets/export/Navigation_popups [01:55:07] ^ Export feature! 
:D [01:55:11] heh [01:55:27] Import that, and then on MediaWiki:Gadgets-definition add Navigation_popups|popups.js|navpop.css [01:55:30] Should be somewhere near.. [01:55:57] Lots of JS.. [01:56:03] oh [01:56:07] transwiki import works too [01:57:02] Even better [01:57:05] is the import source for that mw? [01:57:36] Presumably.. [01:57:50] Never tried transwiki importing gadgets though [01:58:00] I think it doesn't work [01:58:40] https://www.mediawiki.org/w/index.php?title=Special%3AExport&pages=MediaWiki%3Agadget-Navigation_popups%0D%0AMediaWiki%3AGadget-popups.js%0D%0AMediaWiki%3AGadget-navpop.css%0D%0A&wpDownload=1&templates=1 [01:58:52] Wonder if putting that in open file will work.. [01:59:24] hm [01:59:30] I don't see gadgets listed in the preferences [01:59:56] i'm guessing you've got memcached set up? [02:00:02] yeah [02:02:28] bad formatting [02:02:37] it's there now [02:02:40] aha [02:03:12] Yup, and it works. Yay [02:03:20] Thanks [02:03:25] yw [02:03:26] i think we can live with [02:03:29] I need to add a message for it [02:03:37] I'm going to export it from mediawiki [02:04:05] It should almost be in WikimediaMessages or something [02:04:31] done :) [02:04:54] https://wikitech.wikimedia.org/w/index.php?title=W:Wikipedia:Tools/Navigation_popups&action=edit&redlink=1 [02:04:58] Awesome link :D [02:05:21] bleh [02:05:46] MediaWiki:gadget-Navigation_popups [02:06:09] hm [02:06:14] why does w: not work? [02:06:41] https://wikitech.wikimedia.org/w/api.php?action=query&meta=siteinfo&siprop=interwikimap&sifilteriw=local [02:06:45] https://noc.wikimedia.org/conf/interwiki.cdb [02:06:51] * Ryan_Lane groans [02:06:56] $wgInterwikiCache = "$IP/cache/interwiki.cdb"; [02:07:08] should I just download that cdb? 
[02:07:42] saves you having to mess around with the dumpInterwiki maintenance script [02:08:02] At worst it might need updating from time to time [02:08:12] seems that didn't help [02:08:52] New patchset: Aude; "Settings for deploying wikidata to more wikipedias [WIP]" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [02:09:13] Reedy: don't merge it yet :) [02:09:18] heh [02:11:27] PROBLEM - Puppet freshness on es3 is CRITICAL: Puppet has not run in the last 10 hours [02:13:09] New review: Aude; "still need to add settings for sort order of interwiki links, prepending, etc." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/52351 [02:13:26] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [02:14:26] PROBLEM - Puppet freshness on cp1003 is CRITICAL: Puppet has not run in the last 10 hours [02:26:56] PROBLEM - Varnish traffic logger on cp1028 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [02:28:21] !log LocalisationUpdate completed (1.21wmf10) at Wed Mar 6 02:28:21 UTC 2013 [02:28:43] Krenair: [02:28:44] 22 req] Exception from line 202 of /usr/local/apache/common-local/php-1.21wmf10/includes/Autopromote.php: Unrecognized condition APCOND_FR_EDITSUMMARYCOUNT for a [02:28:44] utopromotion! [02:28:44] 12 Exception from line 202 of /usr/local/apache/common-local/php-1.21wmf10/includes/Autopromote.php: Unrecognized condition APCOND_FR_EDITSUMMARYCOUNT for autopr [02:28:44] omotion! 
[02:29:01] Oh, I wonder if they're used too early [02:29:14] maybe not [02:33:06] PROBLEM - Varnish traffic logger on cp1023 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [02:33:06] PROBLEM - Varnish traffic logger on cp1033 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [02:35:46] New patchset: Reedy; "Disable trwiki entries for wmgAutopromoteOnceonEdit" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52353 [02:36:02] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52353 [02:36:35] !log reedy synchronized wmf-config/InitialiseSettings.php [02:36:52] New review: Reedy; "Looks like it's slightly buggy" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/38252 [02:44:06] RECOVERY - Varnish traffic logger on cp1033 is OK: PROCS OK: 3 processes with command name varnishncsa [02:47:06] PROBLEM - LVS HTTP IPv4 on parsoidcache.svc.pmtpa.wmnet is CRITICAL: Connection timed out [02:49:37] RECOVERY - Puppet freshness on virt5 is OK: puppet ran at Wed Mar 6 02:49:29 UTC 2013 [02:49:37] RECOVERY - Puppet freshness on virt10 is OK: puppet ran at Wed Mar 6 02:49:29 UTC 2013 [02:49:37] RECOVERY - Puppet freshness on virt8 is OK: puppet ran at Wed Mar 6 02:49:30 UTC 2013 [02:49:37] RECOVERY - Puppet freshness on virt11 is OK: puppet ran at Wed Mar 6 02:49:30 UTC 2013 [02:49:47] RECOVERY - Puppet freshness on virt6 is OK: puppet ran at Wed Mar 6 02:49:37 UTC 2013 [02:49:47] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 02:49:39 UTC 2013 [02:49:47] RECOVERY - Puppet freshness on virt1007 is OK: puppet ran at Wed Mar 6 02:49:41 UTC 2013 [02:50:32] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [02:50:58] RECOVERY - Varnish traffic logger on cp1028 is OK: PROCS OK: 3 processes with command name varnishncsa [02:50:58] RECOVERY - Varnish traffic logger on cp1023 is OK: PROCS OK: 3 processes 
with command name varnishncsa [02:51:09] ACKNOWLEDGEMENT - LVS HTTP IPv4 on parsoidcache.svc.pmtpa.wmnet is CRITICAL: Connection timed out LeslieCarr yep [02:52:27] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 02:52:23 UTC 2013 [02:52:46] !log LocalisationUpdate completed (1.21wmf11) at Wed Mar 6 02:52:46 UTC 2013 [02:53:27] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [03:08:25] who wants to boot morebots? [03:09:19] * jeremyb_ nominates LeslieCarr [03:23:57] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 03:23:47 UTC 2013 [03:24:27] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [03:28:07] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:30:07] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 8.630 second response time [03:41:47] PROBLEM - Puppet freshness on virt1005 is CRITICAL: Puppet has not run in the last 10 hours [04:03:07] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:11:07] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 6.826 second response time [04:11:47] PROBLEM - Puppet freshness on amslvs1 is CRITICAL: Puppet has not run in the last 10 hours [04:11:47] PROBLEM - Puppet freshness on amssq46 is CRITICAL: Puppet has not run in the last 10 hours [04:11:47] PROBLEM - Puppet freshness on ms6 is CRITICAL: Puppet has not run in the last 10 hours [04:11:47] PROBLEM - Puppet freshness on ssl3003 is CRITICAL: Puppet has not run in the last 10 hours [04:13:01] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: Puppet has not run in the last 10 hours [04:13:01] PROBLEM - Puppet freshness on amslvs3 is CRITICAL: Puppet has not run in the last 10 hours [04:13:01] PROBLEM - Puppet freshness on amssq32 is CRITICAL: 
Puppet has not run in the last 10 hours [04:13:01] PROBLEM - Puppet freshness on amssq38 is CRITICAL: Puppet has not run in the last 10 hours [04:13:01] PROBLEM - Puppet freshness on amssq43 is CRITICAL: Puppet has not run in the last 10 hours [04:14:27] PROBLEM - Puppet freshness on amssq31 is CRITICAL: Puppet has not run in the last 10 hours [04:14:27] PROBLEM - Puppet freshness on amssq34 is CRITICAL: Puppet has not run in the last 10 hours [04:14:27] PROBLEM - Puppet freshness on amssq33 is CRITICAL: Puppet has not run in the last 10 hours [04:14:27] PROBLEM - Puppet freshness on amssq37 is CRITICAL: Puppet has not run in the last 10 hours [04:14:27] PROBLEM - Puppet freshness on amssq40 is CRITICAL: Puppet has not run in the last 10 hours [04:14:28] PROBLEM - Puppet freshness on amssq41 is CRITICAL: Puppet has not run in the last 10 hours [04:14:28] PROBLEM - Puppet freshness on amssq35 is CRITICAL: Puppet has not run in the last 10 hours [04:14:29] PROBLEM - Puppet freshness on amssq52 is CRITICAL: Puppet has not run in the last 10 hours [04:14:29] PROBLEM - Puppet freshness on amssq53 is CRITICAL: Puppet has not run in the last 10 hours [04:14:30] PROBLEM - Puppet freshness on amssq59 is CRITICAL: Puppet has not run in the last 10 hours [04:14:30] PROBLEM - Puppet freshness on amssq61 is CRITICAL: Puppet has not run in the last 10 hours [04:14:31] PROBLEM - Puppet freshness on cp3021 is CRITICAL: Puppet has not run in the last 10 hours [04:14:31] PROBLEM - Puppet freshness on knsq16 is CRITICAL: Puppet has not run in the last 10 hours [04:14:32] PROBLEM - Puppet freshness on cp3009 is CRITICAL: Puppet has not run in the last 10 hours [04:14:32] PROBLEM - Puppet freshness on amssq56 is CRITICAL: Puppet has not run in the last 10 hours [04:14:33] PROBLEM - Puppet freshness on knsq18 is CRITICAL: Puppet has not run in the last 10 hours [04:14:33] PROBLEM - Puppet freshness on knsq19 is CRITICAL: Puppet has not run in the last 10 hours [04:14:34] PROBLEM - 
Puppet freshness on amssq42 is CRITICAL: Puppet has not run in the last 10 hours [04:14:34] PROBLEM - Puppet freshness on amssq49 is CRITICAL: Puppet has not run in the last 10 hours [04:14:35] PROBLEM - Puppet freshness on knsq24 is CRITICAL: Puppet has not run in the last 10 hours [04:14:35] PROBLEM - Puppet freshness on knsq21 is CRITICAL: Puppet has not run in the last 10 hours [04:14:36] PROBLEM - Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours [04:14:36] PROBLEM - Puppet freshness on ssl3002 is CRITICAL: Puppet has not run in the last 10 hours [04:14:37] PROBLEM - Puppet freshness on amssq62 is CRITICAL: Puppet has not run in the last 10 hours [04:14:37] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [04:15:27] PROBLEM - Puppet freshness on knsq26 is CRITICAL: Puppet has not run in the last 10 hours [06:12:43] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [06:12:43] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [06:29:42] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 06:29:36 UTC 2013 [06:30:22] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [06:30:43] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 06:30:34 UTC 2013 [06:31:23] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [06:31:52] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 06:31:44 UTC 2013 [06:32:22] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [07:01:08] New review: Hashar; "(1 comment)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [07:01:15] New patchset: Hashar; "adapt Lucene configuration file to support beta" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [07:05:33] morebots 
still needs boot [07:05:38] maybe Ryan_Lane or paravoid ? [07:06:31] done [07:06:55] !log test [07:06:59] hm [07:07:12] !log restarted morebots [07:07:18] Logged the message, Master [07:07:48] !log 01:08:27 <+logmsgbot> !log reedy synchronized wmf-config/CommonSettings.php 'Revert wgSecureLoginDefaultHTTPS' [07:07:51] !log 01:33:50 <+logmsgbot> !log csteipp synchronized wmf-config/CommonSettings.php 'Revert https patch' [07:07:54] Logged the message, Master [07:07:55] !log 02:28:21 <+logmsgbot> !log LocalisationUpdate completed (1.21wmf10) at Wed Mar 6 02:28:21 UTC 2013 [07:07:59] Logged the message, Master [07:08:04] Logged the message, Master [07:08:06] !log 02:36:35 <+logmsgbot> !log reedy synchronized wmf-config/InitialiseSettings.php [07:08:09] !log 02:52:46 <+logmsgbot> !log LocalisationUpdate completed (1.21wmf11) at Wed Mar 6 02:52:46 UTC 2013 [07:08:11] Logged the message, Master [07:08:16] Logged the message, Master [07:09:07] !log 01:05:45 <+logmsgbot_> !log reedy synchronized wmf-config/CommonSettings.php 'Set wgSecureLoginDefaultHTTPS to false for testing' [07:09:12] Logged the message, Master [07:09:21] * jeremyb_ reorders onwiki [07:09:29] Ryan_Lane: danke [07:11:13] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [07:11:52] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [07:13:58] huh, the bot logged that last one i did to the wiki but never ACK'd in the channel [07:15:11] [01:09:12 AM] Logged the message, Master [07:16:21] boxofjuice: what timezone? 
[07:16:27] CST [07:16:31] oh [07:16:35] i meant the first time around [07:16:47] i count [07:16:54] 06 01:05:45 <+logmsgbot_> !log reedy synchronized wmf-config/CommonSettings.php 'Set wgSecureLoginDefaultHTTPS to false for testing' [07:16:57] you making 6 [07:16:57] 06 01:08:20 -!- morebots [~morebots@wikitech-static.wikimedia.org] has quit [Ping timeout: 252 seconds] [07:17:11] and 6 responses [07:17:17] it never ACK'd but it was already on the wiki [07:17:22] oh [07:17:28] idk then [07:17:35] (TZ=UTC there) [07:45:19] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 3 processes with args ircecho [07:45:39] RECOVERY - MySQL disk space on neon is OK: DISK OK [08:07:59] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 08:07:55 UTC 2013 [08:08:29] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [08:08:39] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 08:08:30 UTC 2013 [08:09:29] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [08:44:53] where is the http 301 response to "http://en.wikipedia.org" requests configured? [08:47:45] apache-config repo? [08:47:51] ori-l: 301 for what? 
[08:48:14] to Main_page [08:48:42] i read a good article on mobile web performance http://queue.acm.org/detail.cfm?id=2441756 [08:48:48] which prompted me to poke around a little [08:49:06] and i noticed that if a user types 'en.wikipedia.org' on a mobile device we redirect them twice [08:49:34] see http://i.imgur.com/J4CAueo.png [08:49:58] yeah, not surprising [08:50:24] Alias /wiki /usr/local/apache/common/docroot/mediawiki/w/index.php [08:50:26] we should just push mobile user-agent detection to the initial request [08:50:27] RewriteRule ^/$ /w/index.php [08:51:23] there's various optimizations we could do [08:51:36] * ori-l is tempted to set that as the title :P [08:51:56] the fact that text is still squid isn't helping much, but even so there's room for improvements [08:52:07] feel free :) [08:52:33] I'd love to see them (and hence gladly review/possibly merge) [08:52:50] i think mobile text is varnish, no? [08:52:56] it is [08:53:03] which means the ua detection is vcl, which means having to convert it to apache syntax [08:53:34] nah, I don't think this can fly [08:53:36] i was hoping to make a splash with the discovery and leave the implementation work to you lot :P [08:53:59] why not? [08:54:12] ua detection isn't a cheap operation and produces multiple variants (21 iirc) of e.g. / [08:54:31] this might be expensive [08:54:44] it'll need some careful benchmarking [08:54:44] * ori-l is tempted to set *that* as the title [08:55:06] title? [08:55:24] topic [08:55:27] it's late [08:55:28] ah [08:55:29] Subject: Broken stuffs [08:55:42] not broken, just suboptimal [08:55:49] and a nice distraction from my broken stuffs [08:55:53] mobile redirects are weird in general [08:56:01] we also have the zero redirects [08:56:17] we have m.wikipedia.org [08:57:05] I don't want to dishearten you, but are you aware that because of the low cache hit ratio of mobile we don't have caches in esams? 
[08:57:34] lots of optimizations to be done, that doesn't mean we shouldn't start from somewhere though [08:58:06] cleaning up redirects in general (mobile or not) is a nice project [09:00:35] I don't want to dishearten you, but are you aware that because of the low cache hit ratio of mobile we don't have caches in esams? [09:00:40] can you explain? [09:00:45] no text caching you mean? [09:02:23] there's no mobile in esams [09:02:26] en.m.wikipedia.org is an alias for m.wikimedia.org. [09:02:26] m.wikimedia.org has address 208.80.154.236 [09:02:26] m.wikimedia.org has IPv6 address 2620:0:861:ed1a::c [09:02:32] this is from europe [09:02:37] going all the way to eqiad [09:03:38] it's http traceroute [09:03:48] ? [09:03:52] just joking [09:04:13] it visually resembles the output of traceout, i mean [09:04:16] *route [09:05:20] anyway, this wasn't an argument against redirect cleanups [09:08:42] no, i know. i just wrote an email to the mobile team about it [09:32:40] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [09:32:50] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [10:07:42] RECOVERY - MySQL disk space on neon is OK: DISK OK [10:07:51] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 3 processes with args ircecho [10:08:11] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [11:02:54] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [11:07:54] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [11:16:56] New patchset: Aklapper; "[bug 45770] Drop NEW status from SQL query of getBugsPerProduct() and clarify description" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52390 [11:19:20] New patchset: Aklapper; "[bug 45770] Drop NEW status from SQL query of getBugsPerComponent() and clarify description" [operations/puppet] (production) - 
https://gerrit.wikimedia.org/r/52392 [11:20:44] New patchset: Aklapper; "[bug 45770] Clarify description of getBugsResolvedPerUser()" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52393 [11:21:07] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [11:21:08] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [11:22:46] New patchset: Aklapper; "[bug 45770] Add TODO to check query for getBugResolutions()" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52394 [11:24:13] New patchset: Aklapper; "[bug 45770] Make getTotalOpenBugs() also include bugs in UNCONFIRMED status" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52395 [11:29:04] New patchset: Aklapper; "[bug 45770] Update used parameters in and arrays" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52396 [11:30:18] New patchset: Aklapper; "[bug 45770] Add TODO about deprecated mysql_connect()" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52397 [11:32:00] New patchset: Aklapper; "[bug 45770] Add 'Reports created this week' item" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52399 [11:35:29] New patchset: Aklapper; "[bug 45770] Minor description and layout improvements" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52400 [11:36:25] New patchset: Aklapper; "[bug 45770] Add a getHighestPrioTickets() SQL query, not used yet" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52401 [11:37:13] New review: Silke Meyer; "(Again, this is for the Wikidata demo servers only, not for production.)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52026 [11:41:38] New review: Nikerabbit; "Looks like someone is using git commit -m ;)" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/52396 [11:48:33] ou [11:49:19] andre__: what is that commit supposed to do? 
[11:49:25] * Nemo_bis confused :< [11:50:18] Nemo_bis, which one? [11:53:50] Nemo_bis, my patches? See https://bugzilla.wikimedia.org/show_bug.cgi?id=45770 [11:53:53] reviews welcome. [11:55:11] RECOVERY - MySQL disk space on neon is OK: DISK OK [11:55:11] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 3 processes with args ircecho [12:07:40] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 12:07:34 UTC 2013 [12:08:10] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [12:08:51] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 12:08:47 UTC 2013 [12:09:10] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [12:10:00] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 12:09:52 UTC 2013 [12:10:11] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [12:10:50] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 12:10:46 UTC 2013 [12:11:10] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [12:11:40] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 12:11:37 UTC 2013 [12:11:50] PROBLEM - Puppet freshness on es3 is CRITICAL: Puppet has not run in the last 10 hours [12:12:10] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [12:12:30] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 12:12:20 UTC 2013 [12:13:10] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [12:13:30] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 12:13:27 UTC 2013 [12:14:11] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [12:14:53] PROBLEM - Puppet freshness on cp1003 is CRITICAL: Puppet has not run in the last 10 hours [12:45:54] New patchset: Aude; "Settings for deploying 
wikidata to more wikipedias" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [12:51:48] New review: Aude; "fywikipedia does weird stuff for sorting... need to see what we can do about it" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/52351 [12:57:14] fywikipedia? [12:57:36] * boxofjuice looks [13:09:22] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [13:21:56] New review: Daniel Kinzler; "(1 comment)" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/52351 [13:36:08] boxofjuice, after so many years with Wikipedia you don't remember ISO 939 by heart? lucky you:P [13:36:22] heheh [13:36:31] initially i thought it was a typo of "pywikipedia" [13:42:12] PROBLEM - Puppet freshness on virt1005 is CRITICAL: Puppet has not run in the last 10 hours [14:02:36] New review: Aude; "(1 comment)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [14:03:19] New review: Aude; "(1 comment)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [14:11:57] PROBLEM - Puppet freshness on amssq46 is CRITICAL: Puppet has not run in the last 10 hours [14:11:57] PROBLEM - Puppet freshness on ms6 is CRITICAL: Puppet has not run in the last 10 hours [14:11:57] PROBLEM - Puppet freshness on ssl3003 is CRITICAL: Puppet has not run in the last 10 hours [14:11:57] PROBLEM - Puppet freshness on amslvs1 is CRITICAL: Puppet has not run in the last 10 hours [14:12:57] PROBLEM - Puppet freshness on amslvs3 is CRITICAL: Puppet has not run in the last 10 hours [14:12:58] PROBLEM - Puppet freshness on amslvs4 is CRITICAL: Puppet has not run in the last 10 hours [14:12:58] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: Puppet has not run in the last 10 hours [14:12:58] PROBLEM - Puppet freshness on amssq32 is CRITICAL: Puppet has not run in the last 10 hours [14:12:58] PROBLEM - Puppet freshness on amssq36 is 
CRITICAL: Puppet has not run in the last 10 hours [14:14:57] PROBLEM - Puppet freshness on amssq31 is CRITICAL: Puppet has not run in the last 10 hours [14:14:58] PROBLEM - Puppet freshness on amssq33 is CRITICAL: Puppet has not run in the last 10 hours [14:14:58] PROBLEM - Puppet freshness on amssq40 is CRITICAL: Puppet has not run in the last 10 hours [14:14:58] PROBLEM - Puppet freshness on amssq35 is CRITICAL: Puppet has not run in the last 10 hours [14:14:58] PROBLEM - Puppet freshness on amssq34 is CRITICAL: Puppet has not run in the last 10 hours [14:14:59] PROBLEM - Puppet freshness on amssq52 is CRITICAL: Puppet has not run in the last 10 hours [14:15:00] PROBLEM - Puppet freshness on amssq42 is CRITICAL: Puppet has not run in the last 10 hours [14:15:00] PROBLEM - Puppet freshness on amssq56 is CRITICAL: Puppet has not run in the last 10 hours [14:15:01] PROBLEM - Puppet freshness on amssq61 is CRITICAL: Puppet has not run in the last 10 hours [14:15:01] PROBLEM - Puppet freshness on amssq53 is CRITICAL: Puppet has not run in the last 10 hours [14:15:02] PROBLEM - Puppet freshness on amssq59 is CRITICAL: Puppet has not run in the last 10 hours [14:15:02] PROBLEM - Puppet freshness on amssq62 is CRITICAL: Puppet has not run in the last 10 hours [14:15:03] PROBLEM - Puppet freshness on cp3009 is CRITICAL: Puppet has not run in the last 10 hours [14:15:03] PROBLEM - Puppet freshness on cp3021 is CRITICAL: Puppet has not run in the last 10 hours [14:15:04] PROBLEM - Puppet freshness on knsq16 is CRITICAL: Puppet has not run in the last 10 hours [14:15:04] PROBLEM - Puppet freshness on knsq18 is CRITICAL: Puppet has not run in the last 10 hours [14:15:05] PROBLEM - Puppet freshness on knsq19 is CRITICAL: Puppet has not run in the last 10 hours [14:15:05] PROBLEM - Puppet freshness on knsq21 is CRITICAL: Puppet has not run in the last 10 hours [14:15:06] PROBLEM - Puppet freshness on knsq24 is CRITICAL: Puppet has not run in the last 10 hours [14:15:06] PROBLEM 
- Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours [14:15:07] PROBLEM - Puppet freshness on ssl3002 is CRITICAL: Puppet has not run in the last 10 hours [14:15:53] New review: Aude; "serbian sorting depends on https://gerrit.wikimedia.org/r/#/c/52402/" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [14:15:57] PROBLEM - Puppet freshness on knsq26 is CRITICAL: Puppet has not run in the last 10 hours [15:13:14] Nikerabbit, around? [15:17:56] MaxSem: not in few hours [15:18:07] meh [15:19:10] Nikerabbit, when you're back, please take a look at your Solr's slow queries causing https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=vanadium&service=Solr [15:19:41] oki [15:19:47] causing what? [15:20:57] https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=vanadium&service=Solr ;) [15:21:44] yeah but i dont have a browser [15:21:52] ugh [15:22:04] poke me when you're back then:) [15:22:13] MaxSem: ideas on https://bugzilla.wikimedia.org/show_bug.cgi?id=44134 ? [15:23:03] no idea WTF it is [15:23:26] if it mentions Solr it doesn't mean that I'm the guilty party:) [15:23:52] yes it does :) [15:24:24] ops knows nothing about solr, someone just merged some manifest somewhere [15:24:44] and this means that you've just volunteered to maintain solr :) [15:24:54] (yes, this is dysfunctional and I'm half-joking) [15:25:30] there was a talk about handing this stuff to Chad anyway:) [15:26:05] paravoid, have you registered for OSD? 
[15:27:34] not yet, later today [15:27:42] mmm, I see only Nikerabbit's stuff at https://noc.wikimedia.org/cgi-bin/report.py?db=all&sort=real&limit=5000&prefix=Solr :) [15:27:43] saw the mail, thanks :) [15:27:46] New patchset: Asher; "setting swappiness to 0" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52418 [15:29:23] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52418 [15:29:30] binasher: :) [15:30:22] hey [15:39:15] hiiii paravoid! [15:39:21] are you still in sf or back in eurpoe? [16:11:10] New review: Ram; "Shouldn't the last line of lucene-common have a conditional to include either lucene-labs or lucene-..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [16:11:35] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [16:12:15] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [16:12:16] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [16:13:36] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [16:13:36] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [16:20:34] New review: Asher; "Ram - see multiversion/MWRealm.php:function getRealmSpecificFilename() - this is a correct example o..." [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/52340 [16:20:53] paravoid, I would love to have a discussion with you about the cdh4 module stuff, about potentially splitting it into several different modules, maybe even seeing if it is possible to make it non cdh4 specific [16:21:24] ok [16:21:25] when? [16:21:41] maybe we should finish what's already in gerrit first though? [16:21:50] yeah, would love to do that first! 
[16:22:23] i've got kafka module ready to push…just not sure how to push it for review since I already have a repo there [16:22:25] sorry [16:22:30] debian [16:22:31] not module [16:22:56] ah cool, chad responded to my email, I might be able to push it today [16:24:40] so, the things in gerrit now that need looking at [16:24:45] New review: Ram; "Ah, sorry, didn't see the function call around it for some reason." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [16:24:51] are the kafka module…ahh, i guess that is just waiting on the .deb... [16:24:52] and [16:24:56] the puppet-merge thing [16:25:02] we can do puppet-merge now [16:26:07] https://gerrit.wikimedia.org/r/#/c/50452/ [16:26:07] paravoid ^ [16:40:03] New patchset: Faidon; "annotate disabled root accounts on include" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50724 [16:41:28] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50724 [16:42:58] Change abandoned: Faidon; "Not going to happen :)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47514 [16:51:23] RECOVERY - MySQL disk space on neon is OK: DISK OK [16:51:33] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 3 processes with args ircecho [17:01:18] hello [17:07:50] New review: Ram; "Could "$wgEnableLucenePrefixSearch = true;" and "$wgLucenePort = 8123;" be moved into lucene-common...." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [17:17:55] xyzram: sure :-] [17:17:58] xyzram: good morning [17:22:39] hi hashar, can you tell me anything about putting Search on beta cluster? 
[17:22:47] anything [17:22:56] hello chrismcmahon [17:22:59] :-) [17:23:52] I've been going through the Search bugs in bugzilla and wonder if we can hack on some of them in beta labs eventually [17:23:55] chrismcmahon: so yeah we have it setup on labs [17:24:05] it is even running and the db might have been imported [17:24:25] I need to write some workarounds to adapt our crappy piece of software to fit in beta cluster [17:24:36] like settings that only applies to production [17:24:39] and hardcoded [17:25:04] Jeff_Green: thanks for the comments, i expect Yossie to follow up [17:25:30] mutante: cool, no problem [17:26:01] chrismcmahon: as for lucene itself, I guess xyzram and ^demon will attempt to fix the most trivial and critical bugs. But not that much. We probably want to invest toward replacing our backend with something a bit more modern. [17:26:15] chrismcmahon: maybe solr (a search system based on lucene). [17:26:23] !rt 4648 | notpeter [17:26:23] notpeter: http://rt.wikimedia.org/Ticket/Display.html?id=4648 [17:27:06] hashar: yes, I know about solr, seems like lucene would be a first step though [17:27:49] chrismcmahon: so currently we have two boxes setup, the MediaWiki config change is pending for review and I got to write a hack script. [17:27:52] let me open bugs for that [17:28:01] left myself a todo list to open myriad of bugs [17:28:21] hashar: great, could you put me on those bugs? 
[17:28:48] I am not sure you will fit on top of a bug [17:29:02] chrismcmahon: the tracking bug is at https://bugzilla.wikimedia.org/show_bug.cgi?id=34250 [17:29:28] thanks hashar got it [17:30:10] * hashar opens bug [17:31:47] New patchset: Hashar; "(bug 45784 ) adapt Lucene configuration file to support beta" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [17:32:55] mutante: yep [17:33:01] I saw it yesterday [17:33:06] cool:) [17:33:08] chrismcmahon: tracking bug 34250 should give you an overview of what is left to do for search on beta : https://bugzilla.wikimedia.org/showdependencytree.cgi?id=34250&hide_resolved=1 Will get more bugs added to that list. [17:33:10] I have a couple of index tickets to do [17:33:15] doing that today [17:33:20] great, thank you [17:33:21] New patchset: Hashar; "(bug 45784) adapt Lucene configuration file to support beta" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [17:36:02] New review: Hashar; "> Could "$wgEnableLucenePrefixSearch = true;" and" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [17:37:37] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 8.83242375 (gt 8.0) [17:39:20] New review: Ram; "Looks good." [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/52340 [17:40:51] Ryan_Lane: ping [17:47:12] preilly: not at his desk yet [17:47:42] hashar: okay thanks [17:50:25] hashar: Your patch looks good. [17:54:08] uh, was bayes deaded? [17:55:05] hashar: chrismcmahon: The current priority as laid out by Robla is to stabilize and fix the most searious issues in the current lucene-based search. Solr is more medium/long term. [17:56:04] xyzram: got it :-] [17:57:04] xyzram: I figured out a solution to the initialisesettings file being parsed by lucene. I will just insert at the top of it the labs version. 
Hopefully that will trick lucene-search-2 to recognize the beta configuration :-] [17:58:49] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52340 [17:59:00] Ok, seeems like it _might_ work. [17:59:23] !log deploying {{gerrit|52340}} adapt Lucene configuration file to support beta [17:59:32] Logged the message, Master [18:01:37] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 2.80695561151 [18:01:50] !log hashar synchronized wmf-config 'deploying {{gerrit|52340}} adapt Lucene configuration file to support beta' [18:01:56] Logged the message, Master [18:04:33] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [18:05:41] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [18:06:11] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [18:09:25] !log hashar synchronized wmf-config/InitialiseSettings.php 'clear config cache' [18:09:31] Logged the message, Master [18:10:14] !log hashar synchronized wmf-config/InitialiseSettings.php 'clear config cache' [18:10:19] Logged the message, Master [18:22:48] New patchset: Pyoungmeister; "removing broken and redundant enwiki jobqueue check" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52426 [18:24:45] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52426 [18:26:54] New patchset: Dzahn; "add exim and lighttpd redirects for mailman "allhands" list renaming (RT-4640)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52428 [18:27:37] New review: Krinkle; "* https://bits.wikimedia.org/w/extensions-1.17" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/50609 [18:28:21] New patchset: Aude; "Settings for deploying wikidata to more wikipedias" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [18:31:21] New patchset: Aude; "Settings for 
deploying wikidata to more wikipedias" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [18:32:20] New patchset: Jgreen; "puppetizing file_mover user for locke, including sudoers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52430 [18:34:36] New patchset: Pyoungmeister; "removing code for broken and no longer used sms dongle" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52431 [18:35:02] paravoid! paravoid let's hang out, whatcha doin, yeah yeah yeah, yeah? [18:35:06] !log aaron synchronized php-1.21wmf11/includes/DefaultSettings.php 'deployed cbc98b09fb0a13b387d82bafab825608e2bb0b3b' [18:35:12] Logged the message, Master [18:35:38] !log aaron synchronized php-1.21wmf11/includes/job/JobQueueGroup.php 'deployed cbc98b09fb0a13b387d82bafab825608e2bb0b3b' [18:35:44] Logged the message, Master [18:36:02] !log aaron synchronized php-1.21wmf11/maintenance/nextJobDB.php 'deployed cbc98b09fb0a13b387d82bafab825608e2bb0b3b' [18:36:08] Logged the message, Master [18:36:47] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52431 [18:39:24] RECOVERY - MySQL disk space on neon is OK: DISK OK [18:39:33] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 3 processes with args ircecho [18:39:37] New review: Daniel Kinzler; "(3 comments)" [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/52351 [18:39:43] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52430 [18:40:54] sbernardin: which disk on db44? [18:42:35] New patchset: Alex Monk; "Try to fix trwiki autopromotion config" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52433 [18:42:47] Jeff_Green: Can you merge this trivial change? 
https://gerrit.wikimedia.org/r/#/c/50609/ [18:43:44] New review: Daniel Kinzler; "(1 comment)" [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/52351 [18:44:35] !log renaming mailman list 'allhands' to 'wmfreqs' (RT-4640) [18:44:41] Logged the message, Master [18:45:25] Krinkle: yep [18:45:37] Change merged: Jgreen; [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/50609 [18:47:20] Jeff_Green: iirc I can deploy it myself, right? Or are you doing that already (fenari, /h/w/conf/httpd, update, sync-apache) [18:47:32] i'm not sure actually [18:47:47] !log regenerating mailman list archives for wmfreqs [18:47:53] Logged the message, Master [18:47:55] i mean, I wasn't planning to do anything and I'm not sure how to merge that [18:48:03] ok [18:48:05] err 'deploy' rather [18:48:42] New patchset: Aude; "Settings for deploying wikidata to more wikipedias" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [18:49:23] New review: Dzahn; "RT-4640" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/52428 [18:49:38] New review: Aude; "(2 comments)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [18:49:47] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52428 [18:50:12] Krinkle: Ask mutante [18:56:22] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52433 [18:56:34] preilly: ? [18:58:43] !log reedy synchronized wmf-config/InitialiseSettings.php [18:58:49] Logged the message, Master [19:00:13] Ryan_Lane: can you give me access to the performance host? [19:00:21] I think it's been deleted [19:00:32] Ryan_Lane: did you back up the data? 
[19:01:54] asher had deleted it while we were at lunch [19:02:09] so, probably no backup [19:02:22] Ryan_Lane: WTF [19:02:25] dzahn is doing a graceful restart of all apaches [19:03:03] well, we had decided to delete it before lunch [19:03:06] !log dzahn gracefulled all apaches [19:03:12] Logged the message, Master [19:03:13] !log gracefulling apaches to remove 1.17 aliases (gerrit 50609) [19:03:18] Logged the message, Master [19:03:19] Ryan_Lane: You could have reached out to me first [19:03:19] not really his fault [19:03:42] it's a labs instance. it's not like it has anything important on it [19:04:52] maybe asher backed it up first [19:04:54] ask himn [19:05:02] *him [19:05:02] Ryan_Lane: Okay I had asked for it to be backed up [19:05:05] at lunch [19:05:08] Ryan_Lane: I sent him a text to ask [19:05:14] then maybe he did [19:05:21] Ryan_Lane: no biggie really [19:05:29] Ryan_Lane: but I want the hiphop stuff [19:05:41] Ryan_Lane: it's such a pain to set-up [19:07:45] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Closed, private and fishbowl to 1.21wmf11 [19:07:52] Logged the message, Master [19:08:25] apergos: Could you run sync-common as root on snapshot1002 please? 
[19:08:39] Should remove the couple of error lines seen in pushing files [19:09:57] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Special wikis to 1.21wmf11 [19:10:03] Logged the message, Master [19:14:01] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50379 [19:14:07] /me kicks logmsgbot_ [19:14:18] * Reedy kicks logmsgbot_ [19:14:40] New patchset: Jgreen; "puppetize existing banner log rotation script" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52436 [19:15:13] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikivoyage, wikiversity and wiktionary to 1.21wmf11 [19:15:18] Logged the message, Master [19:15:36] PROBLEM - Puppet freshness on constable is CRITICAL: Puppet has not run in the last 10 hours [19:16:45] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52436 [19:18:06] PROBLEM - Parsoid on constable is CRITICAL: Connection refused [19:18:15] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikibooks and wikinews to 1.21wmf11 [19:18:23] Logged the message, Master [19:19:56] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikimedia and wikiquote to 1.21wmf11 [19:20:02] Logged the message, Master [19:21:19] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: [19:21:28] Logged the message, Master [19:25:31] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: aawiki back to 1.21wmf10 for Aaron [19:25:37] Logged the message, Master [19:28:40] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikisource to 1.21wmf11 [19:28:46] Logged the message, Master [19:29:31] !log reedy synchronized wmf-config/InitialiseSettings.php [19:29:36] Logged the message, Master [19:30:35] New patchset: Reedy; "Everything non wikipedia to 1.21wmf11" [operations/mediawiki-config] (master) - 
https://gerrit.wikimedia.org/r/52440 [19:31:01] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52440 [19:31:38] * AaronSchulz gets redirected to labs from wikitech for the first time [19:32:01] Ryan_Lane: joy :) [19:32:35] AaronSchulz: :) [19:32:55] Fatal error: require_once(): Failed opening required '/usr/local/apache/common-local/php-1.21wmf1/extensions/EventLogging/EventLogging.php' (include_path='/usr/local/apache/common-local/php-1.21wmf1/extensions/TimedMediaHandler/handlers/OggHandler/PEAR/File_Ogg:/usr/local/apache/common-local/php-1.21wmf1:/usr/local/lib/php:/usr/share/php') in /usr/local/apache/common-local/wmf-config/CommonSetting [19:32:57] s.php on line 2625 [19:32:59] cawikisource enotifNotify [19:33:13] New patchset: Jgreen; "stuff for fundraising banner log pipeline" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52441 [19:33:17] wmf1? [19:33:30] something seems wtf [19:33:40] wikiversions.cdb need an update ? [19:34:13] ffs [19:34:15] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52441 [19:34:17] fixing again [19:34:18] FAIL [19:34:22] +azwikisource php-1.21wmf1 * [19:34:25] +arwikisource php-1.21wmf1 * [19:34:27] and so on [19:34:33] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: [19:34:33] !g I8b5cbe3bc07000d57bd378a732dab5fdec263358 [19:34:33] https://gerrit.wikimedia.org/r/#q,I8b5cbe3bc07000d57bd378a732dab5fdec263358,n,z [19:34:38] Logged the message, Master [19:34:39] fixed [19:34:55] you should get a unit test for that :-] [19:35:05] Based on what? [19:35:22] it failed a bit more gracefully as we have loads of staged versions again :| [19:35:24] making sure we only have two versions in the wikiversions.dat file ? 
[19:35:29] New patchset: Reedy; "Fix 1.21wmf1 -> 1.21wmf11 fail" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52442 [19:35:43] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52442 [19:35:58] https://bugzilla.wikimedia.org/show_bug.cgi?id=41861 :] [19:37:40] ok jobs-loop actually does stuff now [19:37:54] stuff is good [19:38:32] notpeter: can you restart all job runners? [19:38:38] only mw1006 is doing stuff [19:38:44] the others have procs but are sitting there [19:39:08] New patchset: Reedy; "Settings for deploying wikidata to more wikipedias" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [19:39:23] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52351 [19:41:37] AaronSchulz: it's a feature [19:41:40] (yes) [19:41:45] !Log restarting all jobrunners [19:41:48] New patchset: Jgreen; "cron for banner pipeline" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52443 [19:42:59] !log restarting all jobrunners [19:43:04] Logged the message, notpeter [19:43:06] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52443 [19:43:59] New review: Hashar; "As matmarex said, we need to use array_keys( IcuCollation::$tailoringFirstLetters ) . I have no id..." [operations/mediawiki-config] (master) C: -2; - https://gerrit.wikimedia.org/r/51313 [19:48:32] New review: Matmarex; "Coudln't we use a ghetto require_once() to make sure it is actually accessible whatever happens? The..." 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/51313 [19:49:00] @ [19:49:01] New patchset: Jgreen; "fix script permissions" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52446 [19:51:44] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52446 [19:56:29] !log reedy synchronized php-1.21wmf11/extensions/DataValues [19:56:35] Logged the message, Master [19:57:03] !log reedy synchronized php-1.21wmf11/extensions/Wikibase [19:57:08] Logged the message, Master [20:00:42] New patchset: Reedy; "Add CVE linker" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52253 [20:03:16] New patchset: Dereckson; "(bug 45636) Namespace configuration for it.wikivoyage" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52449 [20:03:19] !log reedy synchronized php-1.21wmf10/extensions/Diff [20:03:24] Logged the message, Master [20:03:53] !log reedy synchronized php-1.21wmf10/extensions/DataValues [20:03:59] Logged the message, Master [20:04:11] wmf10 code updated [20:04:13] nearly [20:04:25] !log reedy synchronized php-1.21wmf10/extensions/Wikibase [20:04:26] now [20:04:30] Logged the message, Master [20:07:15] !log reedy synchronized wmf-config/ [20:07:20] Logged the message, Master [20:08:54] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:08:45 UTC 2013 [20:09:00] New patchset: Reedy; "Enable Wikibase Client on all wikipedias" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52450 [20:09:25] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:11:04] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:10:57 UTC 2013 [20:11:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:12:03] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52221 [20:12:39] !log reedy synchronized 
wmf-config/ [20:12:44] Logged the message, Master [20:13:04] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:12:55 UTC 2013 [20:13:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:14:29] notpeter: looks like that gave them a good kick :) [20:14:44] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:14:39 UTC 2013 [20:15:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:15:38] New patchset: Ori.livneh; "Add config var for redis host to use with GettingStarted" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52453 [20:16:24] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:16:15 UTC 2013 [20:16:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:17:54] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:17:44 UTC 2013 [20:18:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:18:31] AaronSchulz: cool! I'm glad :) [20:19:34] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:19:30 UTC 2013 [20:20:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:20:44] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:20:36 UTC 2013 [20:21:16] !Log Creating and populating site_identifiers table on all Wikipedias [20:21:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:21:34] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:21:29 UTC 2013 [20:21:47] * Reedy kicks morebots [20:22:17] csteipp: can we just remove the checkbox, regarding the https login issue? 
[20:22:23] ^ [20:22:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:22:25] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:22:20 UTC 2013 [20:22:37] ugh. morebots is dead again? [20:22:45] I wonder if this is a conflict with the user its using [20:23:03] csteipp: if people can't use https, they'd never make it to that form to begin with :) [20:23:04] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 181 seconds [20:23:07] Ryan_Lane: So that all users just use ssl, and don't have a choice? [20:23:08] It worked ~11 minutes ago [20:23:12] csteipp: yep [20:23:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:23:28] Ryan_Lane: I like the idea [20:23:45] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Mar 6 20:23:41 UTC 2013 [20:23:45] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 182 seconds [20:23:47] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52453 [20:24:03] if we have issues with https, we can just disable the feature [20:24:04] The use case of someone logged in not wanting to use HTTPS must be extremely small [20:24:08] yep [20:24:14] and options suck [20:24:22] and cause bugs :) [20:24:24] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:24:26] I suggested last night we should make them a wiki for HTTP that works on ie6 [20:24:53] heh [20:24:59] http://en.upgradeyourbrowserpedia.org/ [20:25:27] You should just charge people using ie6 to view wikipedia and get rid of donations :P [20:25:40] Ryan_Lane: It would be a small code change, but I think that would be fine. Would someone in features / product have to be ok with it? 
[20:25:45] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 184 seconds [20:25:56] robla: ^^ [20:25:58] And we should drop the lock icons while we're at it [20:26:01] lol [20:26:05] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 185 seconds [20:26:12] lock icons will be hard to do [20:26:22] hm [20:26:29] wait, no, it shouldn't be [20:26:31] Isn't that just CSS fixable? [20:26:35] in a meeting...what am I supposed to be looking at? [20:26:45] also, is this something greg-g should be looking at? [20:26:58] robla: we want to just remove the checkbox for staying logged in with https [20:27:21] and have that be the only option [20:27:22] eerr. Going back to HTTP after logging in with HTTPS [20:27:31] robla: just wondering if we need an OK from product / features if we switch everyone to using https when they login [20:27:43] with no option to not use https [20:27:47] !log olivneh synchronized wmf-config/CommonSettings.php 'Adding redis host config. var for GettingStarted' [20:27:47] !Log Creating and populating sites table on all Wikipedias [20:27:52] Logged the message, Master [20:28:03] * Reedy kicks morebots again [20:28:20] Reedy: you're using Log [20:28:22] that doesn't work [20:28:26] you need to use log [20:28:32] srsl? [20:28:42] !og Creating and populating sites table on all Wikipedias [20:28:45] !log Creating and populating sites table on all Wikipedias [20:28:46] hahaha [20:28:51] Logged the message, Master [20:28:51] ... [20:28:54] see? :) [20:28:59] !log Creating and populating site_identifiers table on all Wikipedias [20:29:05] Logged the message, Master [20:29:07] I should change that [20:29:14] New patchset: Aaron Schulz; "Added de-duplication stats to jobq graphs." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/52458 [20:29:14] so that it's case insensitive [20:32:52] lol [20:34:46] elif line.startswith("!log "): => elif line.lower().startswith("!log "): [20:35:12] Will it let you chain calls like that? [20:35:20] robla: I don't know the answer to csteipp's clarification (if product should be involved). I'd lean to yes, just to make sure, but that's mostly because I don't know and I'd be overly cautious maybe [20:37:22] elif bool(re.match('\!log', line, re.I)): [20:37:28] Ryan_Lane: What's your preferred fix? :p [20:37:49] Reedy: if thats python, yes you can chain calls like that [20:37:54] Yeah [20:38:00] !log reedy synchronized php-1.21wmf11/extensions/DataValues/ [20:38:06] Logged the message, Master [20:40:28] New patchset: Reedy; "Make !log case insensitive" [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/52460 [20:41:01] Ryan_Lane: ^ There you go :p [20:42:58] New patchset: Alex Monk; "Stop trwiki sysops, bureaucrats and bots from being autopromoted" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52461 [20:43:31] New patchset: MaxSem; "WIP: OSM module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36222 [20:43:50] I'm sorry about all these autopromotion patches Reedy [20:43:55] The autopromotion config is really horrible [20:44:54] lol [20:46:44] Reedy: heh. thanks [20:47:42] Change merged: Ryan Lane; [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/52460 [20:54:47] !log reedy synchronized wmf-config/ [20:54:47] Logged the message, Master [20:59:45] RECOVERY - MySQL Replication Heartbeat on db71 is OK: OK replication delay seconds [21:01:06] csteipp: Ryan_Lane: greg-g: quick response on the HTTPS thing. I doubt Product has the cycles to think about this. 
I'd lean toward just doing it in a scheduled window, even though it's trivial from a deploy standpoing [21:01:14] *standpoint [21:02:00] scheduled window meaning next major rollout? [21:02:15] or just one we add to the calendar? [21:02:21] PROBLEM - Puppet freshness on constable is CRITICAL: Puppet has not run in the last 10 hours [21:02:41] let's put it on the deployment calendar (greg-g can do that), so that we have at least light notification for everyone. [21:02:41] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [21:02:43] PROBLEM - MySQL Replication Heartbeat on db71 is CRITICAL: CRIT replication delay 78513 seconds [21:02:47] cool [21:03:29] sometime next week-ish, so that it shows up in our Friday meetings [21:03:34] * Ryan_Lane nods [21:03:41] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [21:03:51] PROBLEM - Parsoid on constable is CRITICAL: Connection refused [21:05:07] robla: got it. [21:05:36] csteipp: Ryan_Lane preference on day? Monday? [21:06:32] New patchset: Jgreen; "make admins.pp handle additional unix groups, change ownership of script in fundraising.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52464 [21:06:50] To hide the checkbox, we need code updates-- so it would be easier to wait until after wmf 12 [21:07:21] should be just a matter of removing it from the form, right? [21:07:22] * Ryan_Lane looks [21:07:40] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52464 [21:07:53] csteipp: after 12? or after 11 (what's gong to be done deploying next week)? 
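A minimal sketch of the case-insensitive `!log` check discussed above (20:34 to 20:37), where `!Log` was silently ignored by the bot. adminbot's actual source is not shown in this log, so `LOG_PREFIX`, `is_log_command`, `is_log_command_re` and `log_message` are hypothetical names used only for illustration. Both variants proposed in the channel are shown; the regex one is written here as a raw string without the backslash (a bare `!` is not special in a regex) and with the trailing space, so that something like `!logfoo` does not match.

```python
import re

# Hypothetical names; this is not adminbot's real code.
LOG_PREFIX = "!log "

def is_log_command(line: str) -> bool:
    """Variant 1: lowercase the line, then compare the prefix."""
    return line.lower().startswith(LOG_PREFIX)

# Variant 2: a compiled, case-insensitive regex anchored at the start
# of the line by re.match().
LOG_RE = re.compile(r"!log ", re.IGNORECASE)

def is_log_command_re(line: str) -> bool:
    return LOG_RE.match(line) is not None

def log_message(line: str):
    """Return the text after the prefix, or None if this is not a log command."""
    if not is_log_command(line):
        return None
    return line[len(LOG_PREFIX):].strip()

if __name__ == "__main__":
    for sample in ("!log restarting all jobrunners",
                   "!Log restarting all jobrunners",
                   "!og Creating and populating sites table"):
        print(repr(sample), "->", log_message(sample))
```

Chaining `line.lower().startswith(...)` works because `str.lower()` returns a new string, which answers the "will it let you chain calls like that?" question from the channel.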
[21:08:42] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [21:08:52] I get a weekly email from mozilla that's really annoying about https search, and this is needed before it, so I'm pretty stoked to get it done sooner rather than later :) [21:09:00] unless we do it in css, or we backport the changes to wmf 11, we would need to wait until wmf 12 is out on all wikis [21:09:13] we can backport into this branch [21:09:34] sorry, I thought you meant doing it *after* 12 is deployed, I was confused by that. So you mean *with* the wmf12 deploy? [21:09:39] ok. then yeah, maybe mon or tues? [21:09:53] ok, with backporting to 11, mon/tues [21:09:55] well, let me make sure the change is actually simple first :) [21:10:03] true [21:10:04] :) [21:10:08] greg-g: no, i meant after-- but if we backpoert then no holdup. [21:10:26] I'd hate to say "let's backport" to find out it's a giant pain in the ass. heh [21:10:33] csteipp: I'm confused, 12 isn't made yet, so it could be a part of it, no? [21:10:41] s/made/branched/ [21:11:25] right. if we fix today, then we have to wait for 12 to be deployed to all wikis before we can turn on the feature for all wikis. [21:11:39] New patchset: Jgreen; "Revert "make admins.pp handle additional unix groups, change ownership of script in fundraising.pp". Grr this is going to trip over broken things." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/52465 [21:12:19] * Ryan_Lane grumbles [21:12:23] stupid non-htmlform forms [21:12:34] csteipp: I see, we're meaning the same thing, I think, I just wasn't aware it would be a deploy, then turn the toggle [21:12:50] no problem :) [21:12:53] New review: Hashar; "That might break any while loop existing in that script :(" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52185 [21:13:08] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52465 [21:13:13] Ryan_Lane: Making it a hidden instead of checkbox may be the easiest [21:13:36] well, I was going to remove the checkbox from the code altogether [21:13:37] To remove the checkbox and default everything will be a pain. [21:13:42] * Ryan_Lane nods [21:15:51] Guest16642: ;) [21:16:22] !log reedy synchronized php-1.21wmf10/extensions/ [21:16:28] Logged the message, Master [21:16:43] AaronSchulz: you rang? [21:16:56] just snickering at your nick changes [21:17:26] csteipp: hm. you aren't kidding. there's a lot of code for this [21:17:39] AaronSchulz: it's not me it's freenode [21:17:50] it's not me, it's her! [21:19:04] preilly: have you ever thought about working on hhvm? [21:19:23] AaronSchulz: what do you mean? [21:19:41] like actually coding it [21:19:52] AaronSchulz: yes [21:20:21] AaronSchulz: why do you ask? 
[21:21:48] preilly: just curious [21:22:14] heh, domas was trolling us last week about trying to get people to work at fb [21:25:44] New patchset: Hashar; "adapts lucene classes for beta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/51677 [21:25:49] AaronSchulz: heh heh [21:27:41] RECOVERY - LVS HTTP IPv4 on parsoidcache.svc.pmtpa.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 1314 bytes in 0.062 second response time [21:32:45] New patchset: Lcarr; "switching constable to internal ip" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52467 [21:37:32] New patchset: Hashar; "adapts lucene classes for beta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/51677 [21:41:01] New review: Apmon; "Looks like it is working well now and does what it is supposed to. At least it configured a fresh la..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36222 [21:43:01] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 186 seconds [21:54:06] New patchset: Reedy; "Enable Wikibase Client on all wikipedias" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52450 [21:54:16] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52450 [21:54:51] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 24 seconds [21:55:07] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 19 seconds [21:55:17] !log reedy synchronized wmf-config/InitialiseSettings.php 'Enable Wikibase Client on all wikipedias' [21:55:23] Logged the message, Master [21:55:29] !log reedy cleared profiling data [21:55:35] Logged the message, Master [22:01:28] RoanKattouw_away: parsercache pmtpa is alive with constable!! 
[22:01:29] :) [22:01:36] now to destroy celsus [22:02:16] !log reedy synchronized wmf-config/InitialiseSettings.php [22:02:22] Logged the message, Master [22:06:21] PROBLEM - LVS Lucene on search-pool4.svc.pmtpa.wmnet is CRITICAL: Connection timed out [22:07:07] oh :-( [22:07:14] binasher around? [22:07:39] !log reedy synchronized wmf-config/InitialiseSettings.php [22:07:45] Logged the message, Master [22:09:03] PROBLEM - Lucene on search13 is CRITICAL: Connection timed out [22:09:04] !log reedy synchronized wikidataclient.dblist [22:09:10] Logged the message, Master [22:10:01] !log reedy synchronized wmf-config/CommonSettings.php [22:10:07] Logged the message, Master [22:11:13] notpeter: you about? [22:11:18] yeah [22:11:19] looking at it [22:11:22] !log reedy synchronized wmf-config/InitialiseSettings.php [22:11:33] Logged the message, Master [22:11:33] cool, was worried i was gonna have to figure it out. [22:11:36] i have the wikitech page up i was about to wade in ;] [22:11:58] well, asher and I have been doing a bunch of weird shit to the search cluster [22:12:04] and I have a todo to update that page ;) [22:12:14] New patchset: Reedy; "Rename wikidata.dblist to wikibase client" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52534 [22:12:31] PROBLEM - Puppet freshness on es3 is CRITICAL: Puppet has not run in the last 10 hours [22:13:00] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52534 [22:13:11] RECOVERY - LVS Lucene on search-pool4.svc.pmtpa.wmnet is OK: TCP OK - 0.027 second response time on port 8123 [22:14:01] RECOVERY - Lucene on search13 is OK: TCP OK - 0.027 second response time on port 8123 [22:14:12] notpeter: thx dude [22:15:23] I restarted the daemons a bit ago to pick up a new index, but the frontends seem to have died [22:15:26] woooo [22:15:40] PROBLEM - Puppet freshness on cp1003 is CRITICAL: Puppet has not run in the last 10 hours [22:15:54] New review: Nemo bis; "If 
it does what described (no SQL knowledge here), it makes sense: currently it's quite obscure." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52390 [22:18:34] New review: Nemo bis; "Haven't we just changed it the other way round?" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/52393 [22:19:28] New patchset: Lcarr; "removing cronspam on successful run, subscribing to correct files" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52537 [22:20:06] !log reedy Started syncing Wikimedia installation... : Rebuild message cache for Wikibase deploy [22:20:12] Logged the message, Master [22:20:18] !log reedy synchronized php-1.21wmf11/extensions/Wikibase [22:20:19] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52467 [22:20:23] Logged the message, Master [22:23:03] New review: Nemo bis; "Probably it was just forgotten because we use UNCONFIRMED so little. Stats shouldn't be skewed too m..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52395 [22:26:25] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52537 [22:27:07] New review: Nemo bis; "Dunno A34A3, but language looks good." [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/52400 [22:29:00] !log authdns-update [22:29:05] Logged the message, RobH [22:30:48] Reedy: sync-common takes forever ;p [22:30:59] (looking at snapshot1002) [22:31:17] not really suprising when it has to sync 11 copies of mediawiki + 2 localisation caches [22:31:23] * Reedy coughs [22:32:09] Anyone know why none of the esams squids are calling into puppet? 
[22:32:20] well, not none, but a ton are 3+ days since puppet run [22:33:17] RobH: is more likely they are not getting the snmp messages to icinga [22:33:30] nah, refuses to connect, testing now [22:33:30] try running one by hand [22:33:31] err: Could not retrieve catalog from remote server: Connection refused - connect(2) [22:33:38] yea, leslie is also checking now [22:34:02] the puppetmaster for esams is "puppet" [22:34:09] so it looks like they are going ot brewster for their puppetmaster [22:34:15] which i guess is supposed to proxy the connections through [22:34:22] since sockpuppet and stafford are internal [22:34:34] so let's check out brewster and see how it is(should be?) passing these requests on [22:35:06] brewster ran out of disk a couple days ago, but it was back to 77% now [22:35:17] maybe something died then [22:35:37] hrm [22:35:47] is it an unpuppetized undocumented critical service ? [22:35:51] cuz we don't have any of those ;) [22:36:13] it has to be [22:36:29] cuz the install server puppet class has nothign to do with puppet services redirection [22:36:41] New patchset: Aude; "Enable Wikibase dispatching changes to clients" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52540 [22:39:02] New patchset: Reedy; "Update size dblists" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52541 [22:39:06] !log reslaving db36 and db38 [22:39:11] Logged the message, notpeter [22:39:27] !log reedy Finished syncing Wikimedia installation... 
: Rebuild message cache for Wikibase deploy [22:39:29] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52541 [22:39:32] Logged the message, Master [22:39:41] PROBLEM - mysqld processes on db36 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [22:40:00] PROBLEM - mysqld processes on db38 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [22:40:44] !log starting haproxy on brewster [22:40:50] Logged the message, Master [22:41:01] root@amssq38:~# puppetd -tv [22:41:01] info: Loading facts in default_gateway [22:41:05] that works better now :p [22:42:16] ......oh look an undocumented and unpuppetized critical service! [22:42:16] New patchset: Reedy; "Enable Wikibase dispatching changes to clients" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52540 [22:42:18] RECOVERIES ,you may begin [22:42:18] sigh. [22:42:23] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52540 [22:43:43] mutante: gimme that command in here! [22:43:46] ;] [22:44:43] RobH: upgrade-helper [22:44:56] we created a new dsh group called "ams" you can use it now [22:45:40] New patchset: Ram; "Bug: 45795 Ignore non-existent host; reduces noise" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/52543 [22:48:00] RECOVERY - mysqld processes on db38 is OK: PROCS OK: 1 process with command name mysqld [22:53:13] !log reedy synchronized wmf-config/ [22:53:19] Logged the message, Master [22:53:54] !log reedy synchronized wmf-config/ [22:54:00] Logged the message, Master [22:54:42] anyone around to vet something I want to do to transfer a few files from stat1 to labs? [22:55:25] so we actually on purpose restrict prod <-> labs [22:55:26] what're you thinking of ? [22:55:53] LeslieCarr: so, this is data (csv files) needing to go from stat1 to labs [22:56:13] are they scrubbed of any personal information ? 
[22:56:19] LeslieCarr: so, idea is 1. create special purpose user on github / gerrit 2. have them push code to https://github.com/wikimedia/limn-mobile-data (they will have no other rights) 3. labs instance pulls from that [22:56:29] LeslieCarr: yes. they are just aggregate data. [22:56:45] stat1 has a webserver and labs instances have wget .. just saying [22:57:14] we can make an exception in the acl for stat1 [22:57:17] if the data is scrubbed that is [22:57:17] if desired [22:57:19] rt ticket [22:57:39] yes, because labs isn't approved for storage of any personally identifiable information [22:58:01] YuviPanda: .csv and really public? just put in the document root? [22:58:33] mutante: I... was not aware that stat1 had a web server. [22:58:37] http://stat1.wikimedia.org/ [22:58:54] it doesnt appear to have content, but there is one.. [22:59:06] haha [22:59:17] that should work [22:59:26] and yes, it is scrubbed. [22:59:46] thank you mutante, LeslieCarr [23:00:06] also sample data is https://github.com/wikimedia/limn-mobile-data/tree/master/data/datafiles, just in case you want to be doubly sure [23:00:11] (that it has no personal data) [23:00:25] (I pushed those by scping from stat1 to my local and then pushing) [23:00:31] but I'll replace those with a script [23:01:59] ok, if you make an rt ticket we can allow an exception to stat1 port 80 [23:02:02] from labs [23:02:08] New patchset: Ram; "Bug: 45795 Add explicit property identifying null host." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/52547 [23:03:35] LeslieCarr: hmm, I already see a few other folders there [23:05:46] !log reedy synchronized wmf-config 'touch' [23:05:52] Logged the message, Master [23:06:43] !log reedy synchronized php-1.21wmf11/resources/ 'touch' [23:06:49] Logged the message, Master [23:07:25] !log reedy synchronized php-1.21wmf10/resources/ 'touch' [23:07:30] Logged the message, Master [23:08:48] New patchset: Lcarr; "updating upgrade script and moving celsus to internal" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52548 [23:14:24] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52548 [23:18:30] PROBLEM - Host celsus is DOWN: PING CRITICAL - Packet loss = 100% [23:19:06] !log reinstalling celsus as an internal host [23:19:11] Logged the message, Mistress of the network gear. [23:19:54] New patchset: Lcarr; "moving celsus to internal" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52552 [23:27:02] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52552 [23:38:33] New review: Reedy; "Ganglia? You're confused with icinga replacing nagios ;)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/37441 [23:42:40] PROBLEM - Puppet freshness on virt1005 is CRITICAL: Puppet has not run in the last 10 hours [23:54:14] New patchset: Ori.livneh; "Add $wgGettingStartedCategories config var" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/52555