[00:00:51] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Tue 11 Mar 2014 08:47:37 PM UTC
[00:01:13] bblack: it's just labs! i promise to wait at least a week before suggesting it graduate to prod
[00:03:37] (PS1) Reedy: Decommission ssl[1-4] [operations/puppet] - https://gerrit.wikimedia.org/r/118643
[00:05:11] (PS1) Reedy: Remove ssl[1-4]. Leave mgmt [operations/dns] - https://gerrit.wikimedia.org/r/118644
[00:08:31] PROBLEM - ElasticSearch health check on elastic1003 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 1287: active_shards: 3800: relocating_shards: 0: initializing_shards: 2: unassigned_shards: 4
[00:08:31] PROBLEM - ElasticSearch health check on elastic1001 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 1287: active_shards: 3800: relocating_shards: 0: initializing_shards: 2: unassigned_shards: 4
[00:09:31] RECOVERY - ElasticSearch health check on elastic1001 is OK: OK - elasticsearch (production-search-eqiad) is running. status: green: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 1309: active_shards: 3866: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 0
[00:09:31] RECOVERY - ElasticSearch health check on elastic1003 is OK: OK - elasticsearch (production-search-eqiad) is running.
status: green: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 1309: active_shards: 3866: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 0
[00:13:11] (CR) Dzahn: Initial commit of pmacct module and role (5 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/115345 (owner: Jkrauska)
[00:25:42] ori: (or anyone else) easy peasy CR: https://gerrit.wikimedia.org/r/118421
[00:28:48] (CR) Manybubbles: [C: 1] "So long as we don't allow thousands of people access to the script we're just fine." [operations/puppet] - https://gerrit.wikimedia.org/r/117647 (owner: Ori.livneh)
[00:39:51] (CR) Manybubbles: "Another useful thing: you can add ?timeout=30s to the search url and Elasticsearch will make an effort to stop and return then. It isn't " [operations/puppet] - https://gerrit.wikimedia.org/r/117647 (owner: Ori.livneh)
[01:04:41] (PS5) Dzahn: turn RT from misc/* into puppet module [operations/puppet] - https://gerrit.wikimedia.org/r/116064
[01:05:08] (CR) Dzahn: turn RT from misc/* into puppet module (3 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/116064 (owner: Dzahn)
[01:17:14] (PS1) Gerrit Patch Uploader: Add namespace aliases for shwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118654
[01:17:20] (CR) Gerrit Patch Uploader: "This commit was uploaded using the Gerrit Patch Uploader [1]." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118654 (owner: Gerrit Patch Uploader)
[01:20:10] (PS6) Dzahn: turn RT from misc/* into puppet module [operations/puppet] - https://gerrit.wikimedia.org/r/116064
[01:35:57] (CR) Hoo man: [C: -1] "This also contains aliases which should almost certainly rather go into MediaWiki.
See also https://bugzilla.wikimedia.org/show_bug.cgi?id" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118654 (owner: Gerrit Patch Uploader)
[01:41:43] (CR) Reedy: Make puppet cronjob to run AbuseFilter/maintenance/purgeOldLogIPData.php (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/81257 (owner: Reedy)
[01:42:40] (CR) Reedy: Make puppet cronjob to run SecurePoll/cli/purgePrivateVoteData.php (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/74592 (owner: Reedy)
[01:53:08] (PS2) Gerrit Patch Uploader: Add namespace aliases for shwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118654
[01:53:10] (CR) Gerrit Patch Uploader: "This commit was uploaded using the Gerrit Patch Uploader [1]." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118654 (owner: Gerrit Patch Uploader)
[01:54:39] (PS3) Gerrit Patch Uploader: Add namespace aliases for shwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118654
[01:54:40] (CR) Gerrit Patch Uploader: "This commit was uploaded using the Gerrit Patch Uploader [1]." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118654 (owner: Gerrit Patch Uploader)
[01:56:20] (CR) Ori.livneh: [C: 2] Fix title display in mwgrep [operations/puppet] - https://gerrit.wikimedia.org/r/118421 (owner: Hoo man)
[02:00:29] Thanks for the web fonts post, paravoid.
[02:04:34] (CR) Kolega2357: [C: 1] Add namespace aliases for shwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118654 (owner: Gerrit Patch Uploader)
[02:12:54] !log LocalisationUpdate completed (1.23wmf17) at 2014-03-14 02:12:53+00:00
[02:13:12] Logged the message, Master
[02:47:13] !log LocalisationUpdate completed (1.23wmf18) at 2014-03-14 02:47:13+00:00
[02:47:22] Logged the message, Master
[02:47:51] !log ori synchronized php-1.23wmf18/includes/resourceloader/ResourceLoaderStartUpModule.php 'Emit as JavaScript config variable'
[02:47:59] Logged the message, Master
[02:48:27] !log ori synchronized php-1.23wmf17/includes/resourceloader/ResourceLoaderStartUpModule.php 'Emit as JavaScript config variable'
[02:48:36] Logged the message, Master
[03:01:51] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Tue 11 Mar 2014 08:47:37 PM UTC
[03:20:15] !log LocalisationUpdate ResourceLoader cache refresh completed at Fri Mar 14 03:20:12 UTC 2014 (duration 20m 11s)
[03:20:23] Logged the message, Master
[03:29:35] !log springle synchronized wmf-config/db-eqiad.php 's1 depool db1034, xtrabackup clone to db1062'
[03:29:44] Logged the message, Master
[03:35:00] !log ori synchronized php-1.23wmf18/includes/resourceloader/ResourceLoaderStartUpModule.php 'Ibeda834e9: Emit $wgSearchType as JavaScript config variable'
[03:35:08] Logged the message, Master
[03:35:22] !log ori synchronized php-1.23wmf17/includes/resourceloader/ResourceLoaderStartUpModule.php 'Ibeda834e9: Emit $wgSearchType as JavaScript config variable'
[03:35:31] Logged the message, Master
[03:48:28] (CR) Chad: [C: 2] Small wikis done building [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118638 (owner: Chad)
[03:48:39] (Merged) jenkins-bot: Small wikis done building [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118638 (owner: Chad)
[03:51:53] !log demon synchronized wmf-config/InitialiseSettings.php
'small wikis done building'
[03:52:02] Logged the message, Master
[03:53:27] <^d> springle: I had to git stash and then reapply your db-eqiad change. Might want to commit if you don't want it lost :)
[03:54:31] oh sorry
[03:54:40] * springle cleans up after himself
[04:07:50] (PS1) Springle: Misc DB-related scripts. [operations/software] - https://gerrit.wikimedia.org/r/118660
[04:09:00] (PS1) coren: Tool Labs: prepare mailrelay for tool processing [operations/puppet] - https://gerrit.wikimedia.org/r/118661
[04:09:26] (CR) Springle: [C: 2] Misc DB-related scripts. [operations/software] - https://gerrit.wikimedia.org/r/118660 (owner: Springle)
[04:12:17] (CR) coren: [C: 2] "Relatively simple change." [operations/puppet] - https://gerrit.wikimedia.org/r/118661 (owner: coren)
[04:18:42] (PS2) Springle: db10: decom [operations/dns] - https://gerrit.wikimedia.org/r/118388 (owner: Matanya)
[04:19:42] (CR) Springle: [C: 2] db10: decom [operations/dns] - https://gerrit.wikimedia.org/r/118388 (owner: Matanya)
[04:42:11] (PS1) Chad: Opt everyone into Cirrus as a beta feature by default [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118663
[05:07:44] ori: fine, I really haven't had time to look at it in depth, but if the Get/Set header thing is a hard varnish limitation, we may as well try it on labs and see how it goes. I really wish there were a better way to fix it though.
[06:02:51] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Tue 11 Mar 2014 08:47:37 PM UTC
[06:04:57] bblack: <3
[06:06:07] (PS3) Ori.livneh: Emit GeoIP cookie using dedicated Set-Cookie header [operations/puppet] - https://gerrit.wikimedia.org/r/117375
[06:07:22] (CR) Ori.livneh: [C: 2 V: 2] " ori: fine, I really haven't had time to look at it in depth, but if the Get/Set header thing is a hard varnish limitation, we may" [operations/puppet] - https://gerrit.wikimedia.org/r/117375 (owner: Ori.livneh)
[06:07:39] (PS2) Ori.livneh: Re-enable GeoIP Set-Cookie on Labs [operations/puppet] - https://gerrit.wikimedia.org/r/117612
[06:07:44] (CR) Ori.livneh: [C: 2 V: 2] Re-enable GeoIP Set-Cookie on Labs [operations/puppet] - https://gerrit.wikimedia.org/r/117612 (owner: Ori.livneh)
[06:34:38] fatal: unable to access 'https://gerrit.wikimedia.org/r/operations/puppet/': Empty reply from server
[06:34:38] odd
[06:52:23] (PS2) Ori.livneh: mwgrep: add '--user' option for searching NS_USER [operations/puppet] - https://gerrit.wikimedia.org/r/117647
[06:53:48] (PS1) Springle: S1 pool db1062, depool db1034 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118668
[06:54:08] (PS3) Ori.livneh: mwgrep: add '--user' option for searching NS_USER [operations/puppet] - https://gerrit.wikimedia.org/r/117647
[06:54:20] (CR) Springle: [C: 2] S1 pool db1062, depool db1034 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118668 (owner: Springle)
[06:54:29] (Merged) jenkins-bot: S1 pool db1062, depool db1034 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118668 (owner: Springle)
[06:57:18] (CR) Ori.livneh: [C: 2] mwgrep: add '--user' option for searching NS_USER [operations/puppet] - https://gerrit.wikimedia.org/r/117647 (owner: Ori.livneh)
[06:58:53] !log springle synchronized wmf-config/db-eqiad.php 's1 pool db1062'
[06:59:02]
Logged the message, Master
[07:18:11] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[07:21:11] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (201256)
[07:26:11] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[07:32:11] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (201579)
[07:33:11] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[07:33:16] does shell access to a server listed in a nova resource require more steps than an admin adding someone as a member on the relevant wikitech page?
[07:40:34] if you're already in labs, no, that should be it
[07:40:47] apergos: so it can be done entirely on-wiki, right?
[07:41:16] yes, wher by 'wiki' we mean 'a special page that does a lot of back end stuff behind the scenes' :-)
[07:41:28] sure, thanks :)
[07:42:10] one day we'll be setting up skynet on-wiki and someone will make a bad edit
[07:42:18] and that will be how it all begins...
[07:42:41] or ends
[08:06:10] (CR) Alexandros Kosiaris: [C: 2] nfs: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109081 (owner: Matanya)
[08:18:18] (PS1) Ori.livneh: mwgrep: Improve in-line help [operations/puppet] - https://gerrit.wikimedia.org/r/118671
[08:19:12] (CR) Ori.livneh: [C: 2 V: 2] mwgrep: Improve in-line help [operations/puppet] - https://gerrit.wikimedia.org/r/118671 (owner: Ori.livneh)
[08:41:06] (CR) Matanya: Initial commit of pmacct module and role (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/115345 (owner: Jkrauska)
[08:58:08] (CR) Alexandros Kosiaris: [C: 2] certs: lint [operations/puppet] - https://gerrit.wikimedia.org/r/110366 (owner: Matanya)
[09:03:51] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Tue 11 Mar 2014 08:47:37 PM UTC
[09:22:38] (CR) Hashar: [C: 1] Tools: Install package joe [operations/puppet] - https://gerrit.wikimedia.org/r/118595 (owner: Tim Landscheidt)
[09:26:02] !log upgraded php5 on apt.wikimedia.org to php5_5.3.10-1ubuntu3.10+wmf1.
[09:26:11] Logged the message, Master
[09:57:56] !log springle synchronized wmf-config/db-eqiad.php 's1 db1062 full steam'
[09:58:05] Logged the message, Master
[09:59:58] (PS3) Hashar: adding Amsterdam Museum to the wgCopyUploadsDomains array. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118342 (owner: Dan-nl)
[10:00:13] (CR) Hashar: [C: 2] "Deploying. Thanks dan-nl && aude!" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118342 (owner: Dan-nl)
[10:00:20] (Merged) jenkins-bot: adding Amsterdam Museum to the wgCopyUploadsDomains array.
[operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118342 (owner: Dan-nl)
[10:01:11] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (204172)
[10:01:50] !log hashar synchronized wmf-config/InitialiseSettings.php 'adding Amsterdam Museum to the wgCopyUploadsDomains {{gerrit|118342}}'
[10:01:59] Logged the message, Master
[10:02:04] (CR) Hashar: "deployed" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118342 (owner: Dan-nl)
[10:03:02] enwiki has: ParsoidCacheUpdateJobOnDependencyChange: 134858 queued;
[10:03:03] :-(
[10:10:00] looks like that job is lagging behind, there is a bunch of entries which are 2+ days old
[10:35:11] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[10:39:11] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (200365)
[10:40:11] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[10:40:33] (PS3) Hashar: deployment::target does not work in labs, skip it [operations/puppet] - https://gerrit.wikimedia.org/r/115624
[10:40:56] could I possibly get https://gerrit.wikimedia.org/r/#/c/115624/ merged in please? It is to get rid of deployment::target in applicationserver when being run on labs.
[10:41:43] (CR) Alexandros Kosiaris: [C: 2] deployment::target does not work in labs, skip it [operations/puppet] - https://gerrit.wikimedia.org/r/115624 (owner: Hashar)
[10:42:14] i hate trusty already :-(
[10:42:29] hashar: done
[10:48:35] thanks !
[10:48:49] akosiaris: did one of you managed to create a labs image for trusty ?
[10:48:57] or are you guys still working on making puppet work on it?
[10:52:33] i am trying to even get puppet installed on trusty as we speak
[10:52:36] and it is not cooperating
[10:53:01] I am this close to forward porting the entire universe to trusty
[10:55:26] hey
[10:55:38] it used to work, then trusty removed more packages
[10:56:29] I debugged it a bit, it just needs libxmlrpc-ruby and libopenssl-ruby which are transitional/empty packages I think
[10:56:56] virtual
[10:57:01] provided by libruby
[10:57:15] yeah whatever, just equivs it :)
[10:57:51] I don't think that alone will work though
[10:58:35] ruby conflicts with ruby1.8 and facter wants ruby and puppet wants 1.8 and blah blah. tired ...
[11:00:18] no, we have a forward-ported facter
[11:00:20] that's not a problem
[11:00:38] we did not
[11:00:41] I just did that
[11:00:46] ah
[11:00:48] even without that
[11:00:56] facter depends on ruby | ruby-interpreter
[11:01:02] ruby1.8 Provides: ruby-interpreter
[11:01:02] I think if i have puppet depend on ruby
[11:01:07] so the dependency would be satisfied
[11:01:09] and not ruby1.8 it might solve this
[11:01:29] what is the problem though?
[11:01:39] besides libxmlrpc-ruby and libopenssl-ruby
[11:02:49] puppet-common : Depends: libxmlrpc-ruby which is a virtual package.
[11:02:49] Depends: libopenssl-ruby which is a virtual package.
[11:02:50] Depends: libshadow-ruby1.8 but it is not going to be installed.
[11:02:50] Depends: libaugeas-ruby1.8 but it is not going to be installed.
[11:02:50] Depends: facter but it is not going to be installed.
[11:02:51] ruby : Conflicts: ruby1.8 but 1.8.7.358-8ubuntu3 is to be installed.
[11:03:33] some of these i already solved by forward porting facter and remove the virtual ones from being a dependency
[11:04:00] i am left with libaugeas-ruby1.8 which is a transitional package
[11:04:12] just reprepro include all of augeas?
[11:04:15] from precise
[11:04:21] no need to build anything
[11:04:24] hmmm
[11:04:29] ok let's try this
[11:04:33] I doubt augeas is being used anywhere but puppet :)
[11:04:55] i think we will solve it for now, I am worried what is going to happen in 20 days again
[11:05:05] hopefully trusty will be more stable by then ?
[11:05:11] let's finish with puppet3 by then? :)
[11:05:12] it is like a month before the release date ?
[11:05:44] niah. Mir is a better option :P
[11:06:00] oh wait. It aint gonna be here till 16.04
[11:06:23] why not?
[11:06:24] ;)
[11:07:03] http://www.phoronix.com/scan.php?page=news_item&px=MTYyODg
[11:07:43] lol okay
[11:09:36] !log maxsem synchronized php-1.23wmf17/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/118681'
[11:09:45] Logged the message, Master
[11:10:08] Ubuntu's not pleased with the scale of Unity's failure, wants it to fail even more with Mir?
[11:10:25] exactly
[11:10:58] !log maxsem synchronized php-1.23wmf18/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/118681'
[11:11:06] Logged the message, Master
[11:12:14] as some of you know, I'm relocating to SF. initially I thought I'd choose an Ubuntu laptop as my WMF machine. but after playing with Unity and KDE in VMs, I'd rather choose a Mac
[11:15:23] MaxSem: have you tried Gnome 3 ?
[11:15:32] I did
[11:15:42] and ?
[11:15:52] oh! I didn't know!
[11:15:56] relocating, how come?
[11:16:05] and a bunch of Dock copycats
[11:16:38] reedy said he is as well, our timezone is going to be more quiet :(
[11:16:44] if Linux can't even reproduce W7 in usability, it's doomed
[11:17:07] even microsoft can not reproduced W7 in usability
[11:17:13] reproduce*
[11:17:48] paravoid, but we're hiring 2 Italians to replace us, right?:P
[11:18:19] we are, but it's opsens
[11:18:23] that's monoculture ;)
[11:18:53] Nice, Greek ops cabal expanding
[11:19:03] To Magna Graecia this time
[11:19:09] haha
[11:21:38] I'm devastated for the loss of Reedy from our continent :(
[11:21:39] (PS1) Hashar: dataset: code hygiene for rsync-dumps.py [operations/puppet] - https://gerrit.wikimedia.org/r/118687
[11:21:49] OTOH he only followed reedy-TZ
[11:21:58] yeah :(
[11:22:21] platform should really hire someone in EUR
[11:22:25] gtg
[11:22:41] are you going to the US _now_?
[11:22:42] :P
[11:25:08] MaxSem: are you moving as retortion against Russia for Crimea or something?
[11:30:01] Nemo_bis: well reedy was already working more or less on on SF timezone
[11:30:03] same for timo :/
[11:32:25] what SF timezone?
[11:33:03] hahaha
[11:44:09] ...
[11:44:18] ori: have a good night :-]
[11:44:23] heading out for lunch
[11:47:12] good night hashar
[11:47:26] hashar: i re-enabled the geoip cookie thing with bblack's blessing on beta
[11:47:39] let me know if there are any troubles. it should not clobber any existing cookies any more
[12:03:09] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[12:03:59] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Tue 11 Mar 2014 08:47:37 PM UTC
[12:05:09] PROBLEM - Redis on tantalum is CRITICAL: Connection refused
[13:00:09] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[13:37:31] (CR) Hashar: [C: 1] "Thanks!
That should fix up the deployment class on the beta cluster eqiad instance (ex: deployment-apache01.eqiad.wmnet )" [operations/puppet] - https://gerrit.wikimedia.org/r/118071 (owner: Hashar)
[13:38:40] (CR) coren: [C: 2] "This should fix all the l10nupdate issues." [operations/puppet] - https://gerrit.wikimedia.org/r/118071 (owner: Hashar)
[13:42:58] (PS1) coren: beta: fix really broken dependencies [operations/puppet] - https://gerrit.wikimedia.org/r/118696
[13:44:08] (CR) Hashar: [C: 1] "The usual un spottable puppet mistakes :]" [operations/puppet] - https://gerrit.wikimedia.org/r/118696 (owner: coren)
[13:45:14] * Coren (im)patiently waits for Jenkins.
[13:45:30] (CR) coren: [C: 2] "D'oh!" [operations/puppet] - https://gerrit.wikimedia.org/r/118696 (owner: coren)
[13:52:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1582.766724
[13:56:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[14:08:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 2106.833252
[14:13:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[14:26:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1270.400024
[14:27:29] RECOVERY - DPKG on virt1006 is OK: All packages OK
[14:30:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[14:31:53] (PS1) Hashar: beta: skip mwdeploy user creation [operations/puppet] - https://gerrit.wikimedia.org/r/118699
[14:33:12] (PS1) Alexandros Kosiaris: Fix bug introduced in 577ea6a [operations/puppet] - https://gerrit.wikimedia.org/r/118700
[14:34:47] (CR) Alexandros Kosiaris: [C: 2] Fix bug introduced in 577ea6a
[operations/puppet] - https://gerrit.wikimedia.org/r/118700 (owner: Alexandros Kosiaris)
[14:35:21] (CR) coren: [C: 2] "This should do the trick." [operations/puppet] - https://gerrit.wikimedia.org/r/118699 (owner: Hashar)
[14:36:47] argh.... Error 400 on SERVER: Duplicate definition: Group[l10nupdate] is already defined in file /etc/puppet/manifests/admins.pp at line 99; cannot redefine at /etc/puppet/modules/generic/manifests/systemuser.pp:9 on node fenari.wikimedia.org
[14:37:19] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (200821)
[14:40:07] akosiaris: Hm. Probably related to what hashar has been doing. The patches I've seen are /removing/ some user creations, but he's doing something related to l10nupdate
[14:43:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1273.0
[14:44:13] (PS1) Hashar: beta: fix l10nupdate homedir [operations/puppet] - https://gerrit.wikimedia.org/r/118701
[14:45:35] (PS1) Alexandros Kosiaris: Fix second bug introduced in 577ea6a [operations/puppet] - https://gerrit.wikimedia.org/r/118702
[14:45:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[14:48:07] (CR) Alexandros Kosiaris: [C: 2] Fix second bug introduced in 577ea6a [operations/puppet] - https://gerrit.wikimedia.org/r/118702 (owner: Alexandros Kosiaris)
[14:48:21] (PS2) Alexandros Kosiaris: beta: fix l10nupdate homedir [operations/puppet] - https://gerrit.wikimedia.org/r/118701 (owner: Hashar)
[14:48:27] thx :]
[14:48:44] look at the 118702
[14:48:57] I had to undo the group creation
[14:48:58] :-(
[14:49:08] fenari was complaining
[14:49:31] the idea was to get rid of the hardcoded 10002
[14:49:40] (CR) Alexandros Kosiaris: [C: 2] beta: fix l10nupdate homedir [operations/puppet] -
https://gerrit.wikimedia.org/r/118701 (owner: Hashar)
[14:50:56] well, then some more changes at least for fenari are needed
[14:54:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1276.199951
[14:57:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[14:58:13] akosiaris: idea why redis on tantalum is down yet?
[14:58:53] mutante: tantalum is trusty
[14:58:55] oh, JeffGreen setting it up,
[14:59:03] yeah
[14:59:09] yea, right, Redis monitoring shows up in Icinga already
[14:59:18] wonder if i should start it
[15:00:05] ACKNOWLEDGEMENT - Redis on tantalum is CRITICAL: Connection refused daniel_zahn still being setup
[15:04:59] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Tue 11 Mar 2014 08:47:37 PM UTC
[15:04:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1286.800049
[15:06:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[15:14:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1470.800049
[15:15:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[15:16:10] huhm hum
[15:21:59] (PS1) Hashar: contint: remove dsc sudo from servers [operations/puppet] - https://gerrit.wikimedia.org/r/118706
[15:23:15] (CR) Dzahn: [C: 2] contint: remove dsc sudo from servers [operations/puppet] - https://gerrit.wikimedia.org/r/118706 (owner: Hashar)
[15:23:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 625.06665
[15:26:52] (PS1) Hashar: beta: define /a on bastion [operations/puppet] - https://gerrit.wikimedia.org/r/118707
[15:26:59] RECOVERY - Varnishkafka
Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[15:31:03] (CR) Dzahn: [C: 2] "that's /a just like on tin" [operations/puppet] - https://gerrit.wikimedia.org/r/118707 (owner: Hashar)
[15:33:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 974.200012
[15:35:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[15:36:48] (CR) Manybubbles: [C: 1] Opt everyone into Cirrus as a beta feature by default [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118663 (owner: Chad)
[15:42:59] PROBLEM - Puppet freshness on tantalum is CRITICAL: Last successful Puppet run was Fri 14 Mar 2014 12:42:11 PM UTC
[15:43:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 935.866638
[15:45:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[15:47:03] greg-g: The source of the udp2log explosion on fluorine isn't as obvious this time as it was before. So far analytics isn't too excited about fixing the bug that makes the demuxer barf: https://bugzilla.wikimedia.org/show_bug.cgi?id=62082
[15:49:38] <^d> I love it when people sign their bug comments.
[15:51:11] ^d: me too! -Greg
[15:51:37] bd808: pinged on it ;)
[15:53:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 666.666687
[15:55:36] (PS1) Hashar: beta: get rid of /home/wikipedia/logs shortcut [operations/puppet] - https://gerrit.wikimedia.org/r/118708
[15:57:36] (CR) coren: [C: 2] "Seems sane."
[operations/puppet] - https://gerrit.wikimedia.org/r/118708 (owner: Hashar)
[15:58:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[16:06:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 641.56665
[16:10:00] (PS1) Hashar: Fix error: timidity service can not be stopped [operations/puppet] - https://gerrit.wikimedia.org/r/118709
[16:10:43] (CR) Hashar: "Not sure who can review this. I feel it is going to be safe :-]" [operations/puppet] - https://gerrit.wikimedia.org/r/118709 (owner: Hashar)
[16:10:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[16:19:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 939.466675
[16:22:30] (CR) Hashar: "An alternative would be to ensure timidity-daemon package is installed and then ensure the service is stopped.
That sounds dumb to me :-]" [operations/puppet] - https://gerrit.wikimedia.org/r/118709 (owner: Hashar)
[16:22:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[16:24:19] RECOVERY - SSH on carbon is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[16:32:11] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1502.866699
[16:33:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[16:39:59] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 2315.733398
[16:40:18] (PS1) Cmjohnson: Adding mgmt dns for ms-be10[13-15] [operations/dns] - https://gerrit.wikimedia.org/r/118713
[16:41:59] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[16:47:45] (CR) Hashar: generic: lint clean (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/107037 (owner: Matanya)
[16:50:38] (CR) Cmjohnson: [C: 2] Adding mgmt dns for ms-be10[13-15] [operations/dns] - https://gerrit.wikimedia.org/r/118713 (owner: Cmjohnson)
[16:52:27] (PS1) Hashar: Restore generic::upstart_job parameters [operations/puppet] - https://gerrit.wikimedia.org/r/118714
[16:52:36] (CR) Hashar: "Back compatibility hack with https://gerrit.wikimedia.org/r/118714" [operations/puppet] - https://gerrit.wikimedia.org/r/107037 (owner: Matanya)
[16:52:40] (PS1) Cmjohnson: Fixing ms-be1011 typo [operations/dns] - https://gerrit.wikimedia.org/r/118715
[16:53:02] (CR) Cmjohnson: [C: 2] Fixing ms-be1011 typo [operations/dns] - https://gerrit.wikimedia.org/r/118715 (owner: Cmjohnson)
[16:54:03] I found out an issue with generic::upstart_job() puppet define no more honoring its install and start parameter.
The function uses boolean internally when the call are made using strings :-( https://gerrit.wikimedia.org/r/#/c/118714/
[16:54:14] roots: ^^^ :-]
[16:54:38] that cause for example the twemproxy to never be properly installed / started
[16:54:39] :(-
[16:58:54] boolean as string? oh wow, that's always been "could cause problems" but this is very real, nice
[16:59:31] and yea, exmaple of lint changes being more than just style
[17:03:39] (PS1) Hashar: openstack: generic_upstart now use boolean values [operations/puppet] - https://gerrit.wikimedia.org/r/118716
[17:04:08] (PS1) Hashar: lvs: generic_upstart now use boolean values [operations/puppet] - https://gerrit.wikimedia.org/r/118717
[17:04:34] (PS1) Hashar: twemproxy: generic_upstart now use boolean values [operations/puppet] - https://gerrit.wikimedia.org/r/118718
[17:05:13] (CR) Hashar: "lvs, twemproxy and openstack calls are fixed in child changes." [operations/puppet] - https://gerrit.wikimedia.org/r/118714 (owner: Hashar)
[17:05:28] mutante: I think I fixed them all
[17:05:32] +added some back compatibility
[17:05:37] might want to handle that on monday
[17:05:37] :-]
[17:06:06] starting new instances is nice to find out bugs in our puppet manifests :]
[17:10:04] (CR) CSteipp: "We really need to get these merged. If the syntax is really a blocker, can we get someone in ops to help correct it to whatever they prefe" [operations/puppet] - https://gerrit.wikimedia.org/r/81257 (owner: Reedy)
[17:11:58] hashar: thanks!
[17:16:06] disappearing, see you on monday folks
[17:17:51] hashar: have a good weekend
[17:18:01] bye hashar
[17:19:26] greg-g: got good progress on migrating beta to eqiad :]
[17:19:49] greg-g: will look at migrating some data to eqiad next week. And our favorite dba should create the sql instance next week!
[17:19:52] have a good afternoon!
[17:20:10] hashar: awesome!......
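Editorial note on the generic::upstart_job exchange above (16:54 to 17:06): the problem described is callers passing strings where the define now expects booleans. The sketch below is only an analogy in Python, not the Puppet code from change 118714. It assumes the failure mode is the usual one, where a non-empty string such as 'false' evaluates as true. `naive_flag` and `str2bool` are hypothetical names introduced for illustration.

```python
def naive_flag(value):
    # Mirrors the assumed pitfall: any non-empty string counts as true,
    # so the string "false" silently enables the option.
    return bool(value)


def str2bool(value):
    """Hypothetical back-compatibility helper.

    Accepts real booleans as well as the strings 'true' / 'false'
    (case-insensitive) and rejects everything else loudly.
    """
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("true", "false"):
            return lowered == "true"
    raise ValueError(f"not a boolean: {value!r}")


if __name__ == "__main__":
    for raw in ("false", "true", False, True):
        print(repr(raw), "naive:", naive_flag(raw), "normalized:", str2bool(raw))
```

Normalizing at the boundary is one way to keep old string-style callers working while the body of the define uses real booleans. Whether the actual patch took this approach is not stated in the log.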
[18:00:09] PROBLEM - MySQL InnoDB on db1038 is CRITICAL: CRIT longest blocking idle transaction sleeps for 2147483647 seconds [18:03:09] RECOVERY - MySQL InnoDB on db1038 is OK: OK longest blocking idle transaction sleeps for 0 seconds [18:05:59] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Tue 11 Mar 2014 08:47:37 PM UTC [18:16:30] (03Abandoned) 10Tim Landscheidt: WIP: Tools: Add infrastructure for AWStats [operations/puppet] - 10https://gerrit.wikimedia.org/r/80332 (owner: 10Tim Landscheidt) [18:17:52] ^d: manybubbles: last sanity check: solr1-3 in Tampa can die, right, physically shutting down [18:18:27] <^d> they're not mine :p [18:18:37] mutante: huh..... I don't use them Nikerabbit probably doesn't either [18:18:43] they aren't for geo, I imagine [18:18:46] I think we're out [18:19:00] include role::solr::geodata [18:19:04] but the change would be [18:19:09] node /^solr100[1-3]\.eqiad\.wmnet/ { [18:19:11] intead of [18:19:16] node /^solr(100)?[1-3]\.(eqiad|pmtpa)\.wmnet/ { [18:19:56] https://gerrit.wikimedia.org/r/#/c/118632/3 [18:22:07] * ^d is double checking wmf config [18:22:19] thank you [18:23:06] <^d> GeoData, not TM. [18:23:13] <^d> And we already point at eqiad. 
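Editorial sketch of what the node regex change quoted at 18:19 does. The two patterns are copied from the log; the host names below are assumed examples of the naming scheme, not a list taken from the puppet repository. Puppet evaluates node regexes with Ruby, but these patterns use only syntax that Python's re module treats the same way.

```python
import re

# Patterns as quoted in the log (old node definition, then the proposed one).
OLD = re.compile(r"^solr(100)?[1-3]\.(eqiad|pmtpa)\.wmnet")
NEW = re.compile(r"^solr100[1-3]\.eqiad\.wmnet")

# Assumed example host names: three Tampa boxes and three eqiad boxes.
HOSTS = [
    "solr1.pmtpa.wmnet",
    "solr2.pmtpa.wmnet",
    "solr3.pmtpa.wmnet",
    "solr1001.eqiad.wmnet",
    "solr1002.eqiad.wmnet",
    "solr1003.eqiad.wmnet",
]


def matching(pattern, hosts):
    """Return the hosts that a node regex would select."""
    return [h for h in hosts if pattern.search(h)]


if __name__ == "__main__":
    print("old:", matching(OLD, HOSTS))
    print("new:", matching(NEW, HOSTS))
```

Under these assumed names the old pattern selects all six hosts and the new one only the three eqiad hosts, which is the point of the decommission change.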
[18:25:46] great, let's save some more Wh in ptmpa then [18:26:05] removes from puppet [18:26:35] pmtpa wow, i typed it too much [18:27:38] (03PS4) 10Dzahn: Decommission solr[1-3] [operations/puppet] - 10https://gerrit.wikimedia.org/r/118632 (owner: 10Reedy) [18:30:16] plop [18:38:55] (03CR) 10Dzahn: [C: 032] Decommission solr[1-3] [operations/puppet] - 10https://gerrit.wikimedia.org/r/118632 (owner: 10Reedy) [18:40:31] wow, puppet-merge is super slow today [18:40:48] i see it counting up objects for a while [18:43:38] !log solr[12] - disable puppet, puppetstoredconfigclean, remove from icinga [18:43:45] Logged the message, Master [18:43:59] PROBLEM - Puppet freshness on tantalum is CRITICAL: Last successful Puppet run was Fri 14 Mar 2014 12:42:11 PM UTC [18:45:03] ACKNOWLEDGEMENT - Puppet freshness on tantalum is CRITICAL: Last successful Puppet run was Fri 14 Mar 2014 12:42:11 PM UTC daniel_zahn still being installed with new distro [18:48:27] mutante, the solr hosts are mine and yes, you can kill them [18:48:50] well, not kill but ship to eqiad, the HW's deliciously beefy [18:48:56] MaxSem: thanks! right in time, removing from monitoring but did not shut down yet [18:49:11] ohh.. good that you mention that if you want it shipped [18:49:17] will note that on the followup ticket [18:49:36] for the pmtpa queue [18:50:02] MaxSem: _after_ disk wiping? 
[18:50:19] I think they're still covered by warranty so makes a lot of sense to ship them and give to ^d and manybubbles [18:50:27] nods [18:50:41] <^d> mmm, beefy boxes [18:50:59] mutante, GeoData had nothing private so wipe only if ops stuff needs wiping like puppet keys or I don't know what [18:51:10] ok [18:52:06] Dell PowerEdge R420 [18:52:18] 2012-12-05 [18:52:21] 64G ram [18:53:04] mutante: sounds useful [18:53:07] we should have 3 years warranty [18:53:19] so 2015-12-05 then [19:08:26] MaxSem: yea but we wipe every disk that has had our OS on it before it goes anyplace, just standard operating procedure =] [19:08:46] so why ask me then? :} [19:09:00] RECOVERY - Kafka Broker Messages In on analytics1021 is OK: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate OKAY: 1542.63428728 [19:09:31] I dunno what was asked, just walked into irc conversation about wiping and if it needs wiping before ship [19:10:00] just noting that any disk that has data on it, unless otherwise specified, gets wiped before it gets shipped =] [19:10:11] i asked to make sure the data on it is not expected to be on the disks anymore, actually [19:10:31] since the disks should be reused in eqiad [19:10:39] i assumed thats what was usually asked but https://xkcd.com/1339/ [19:10:44] heh [19:11:31] hehe [19:14:50] <^d> robh: That's my new favorite xkcd. [19:17:17] i figure in a few years opsen worldwide wont need to have actual conversations [19:17:28] we'll just link the relevant xkcd comic. 
[19:19:22] robh++ [19:22:24] (03PS1) 10Ori.livneh: Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 [19:24:05] !log jgage ran kafka preferred-replica-election on analytics1021 to rebalance [19:24:13] Logged the message, Master [19:25:14] (03PS1) 10Ori.livneh: Configure the scap repository to be shared by wikidev group [operations/puppet] - 10https://gerrit.wikimedia.org/r/118746 [19:25:45] !log shut down solr1/2 [19:25:52] Logged the message, Master [19:32:29] (03CR) 10Hashar: "We had such issue with the scap repository introducing a new directory which ended up not being group writable by wikidev group. From my c" [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 (owner: 10Ori.livneh) [19:32:44] (03CR) 10Hashar: "Follow up at https://gerrit.wikimedia.org/r/#/c/118745/ by Ori" [operations/puppet] - 10https://gerrit.wikimedia.org/r/115851 (owner: 10Dzahn) [19:33:05] (03CR) 10jenkins-bot: [V: 04-1] Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 (owner: 10Ori.livneh) [19:33:30] (03CR) 10jenkins-bot: [V: 04-1] Configure the scap repository to be shared by wikidev group [operations/puppet] - 10https://gerrit.wikimedia.org/r/118746 (owner: 10Ori.livneh) [19:35:04] (03PS2) 10Ori.livneh: Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 [19:35:24] (03PS2) 10Ori.livneh: Configure the scap repository to be shared by wikidev group [operations/puppet] - 10https://gerrit.wikimedia.org/r/118746 [19:36:33] hashar: fun discovery of the day: both 'git init' and 'git clone' take a --shared argument, but they mean completely different things :D cf [19:36:54] yeah :-] [19:36:56] (03PS1) 10John F. 
Lewis: Enable GuidedTours on Wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/118747 [19:37:25] ori: have a read at my comment, that might make the git::clone command simpler by using git init && git fetch [19:37:30] err git init --shared [19:38:16] (03PS2) 10John F. Lewis: Enable GuidedTours on Wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/118747 [19:38:54] I hate it when things fail because of a damn space. Especially in commit messages >.> [19:39:03] hashar: no need, since you can git -c core.sharedRepository=group clone .... [19:39:15] :-] [19:39:43] ori: and you can pass the mode to it ::] [19:40:17] hashar: so should it be "git -c core.sharedRepository=${mode} clone ..." ? [19:40:32] I would said it depends on the use case [19:40:46] I dont think 'group' would set the group setuid bit [19:40:57] which we might want to enforce on some repos clones [19:41:06] aka making sure files always belong to the wikidev group [19:41:19] yeah, plus mode => can actually be set to things like 'ug=rw,o=r' in puppet [19:41:39] whereas i *think* the sharedRepository arg would only accept octal [19:41:48] I guess so [19:42:14] also you want to double check the git version we have with Precise supports the commands [19:42:19] but Que sais-je? 
, as your compatriot Montaigne said [19:42:21] there might be some fun differences [19:43:07] he wore a pendant with that (well, 'Que sçay-je') inscribed on it, i want to get one too [19:47:11] ori: I have never read Montaigne :/ [19:47:56] oh he's really wonderful, and some of his essays are quite short so you don't need to panic (as i do) by the very thick volumes in which his work is published [19:48:26] hashar: probably your daughter is already reading them behind your back :P [19:48:35] I guess Iwould have to read a bit of it one day [19:48:37] ahah [19:48:59] she is reading https://en.wikipedia.org/wiki/Mr._Men [19:51:43] off for now, will be back later on for some emails handling [19:51:44] "DE L'INSTITUTION DES ENFANTS": http://www.ed4web.collegeem.qc.ca/prof/rthomas/pm/montaigne.htm short and relevant :) [19:51:55] great [19:51:58] will read that :-] [19:51:59] * ori wishes he knew french [19:52:03] good night! [19:52:05] relocate here ! [19:52:16] hashar: i'm already building a boat [19:52:49] that perfectly illustrate what I said to WMF HR: "can you please stop hiring awesome people?" [19:52:52] boat+++ [19:52:55] L'ordinateur [19:53:09] :-] [19:53:23] daughter calling for a Mr. Men story, bbl [19:55:40] oh i used to love those [19:57:47] http://en.wikipedia.org/wiki/The_Very_Hungry_Caterpillar#Synopsis [19:58:11] mutante: NO SPOILERS! :D [19:58:19] haha:) [19:58:59] my daughter complained that her mom won't let her read Bukowski yet :p [20:01:55] [20:02:05] For God's sake, does anyone have a son here? [20:02:09] [20:02:12] * greg-g raises his hand [20:02:12] :-D [20:03:30] greg-g: didn't you say your son was too small to read yet? 
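Editorial sketch for the git clone exchange at 19:39 to 19:41: building the "git -c core.sharedRepository=... clone" command and rejecting values git would not understand. The helper name and the example URL and path are hypothetical. The accepted spellings (group/true, all/world/everybody, umask/false, or a 0xxx octal mode) follow the git-config documentation as best recalled, so check them against the git version shipped with Precise, as hashar suggests.

```python
import re

# Named values documented for core.sharedRepository (assumed complete here).
NAMED_MODES = {"group", "true", "all", "world", "everybody", "umask", "false"}


def shared_clone_command(origin, directory, shared="group"):
    """Build the argument list for a clone with core.sharedRepository set.

    Symbolic modes such as 'ug=rw,o=r' (valid for a Puppet file mode) are
    rejected, since git only takes a named value or an octal 0xxx number.
    """
    if shared not in NAMED_MODES and not re.fullmatch(r"0[0-7]{3}", shared):
        raise ValueError(f"unsupported core.sharedRepository value: {shared!r}")
    return ["git", "-c", f"core.sharedRepository={shared}", "clone", origin, directory]


if __name__ == "__main__":
    # Hypothetical origin and target directory, for illustration only.
    print(" ".join(shared_clone_command("https://example.org/repo.git", "/srv/repo")))
```

Returning a list rather than a shell string keeps the value from being re-parsed by a shell if the result is later passed to subprocess.run.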
[20:03:42] yeah, he's 2 [20:03:57] oh, not so long then [20:04:22] ohhh http://www.wikiforkids.ws/ [20:04:31] we had that suggestion too, didnt we [20:04:32] I remember listening to an audiobook recently, and they recommended that kids listen to them [20:04:37] and it was said it's like "simple" [20:04:51] "Safe Wiki Search":P [20:04:59] simple.wikipedia? [20:05:27] https://simple.wikipedia.org/wiki/Special:Search works too [20:05:28] greg-g: yea [20:05:52] The United States of America (USA), commonly referred to as the United States (US), America, or simply the States is a country in North America. It is made up of 50 states, a federal district, and five territories. The United States was on the winning side of two world wars (World War I and World War II) and became one of the world's superpowers. It is famous for its influence over finance, trade, culture, military, politics, and technology [20:05:58] Not that simple... [20:06:01] https://simple.wikipedia.org/wiki/Wikimedia [20:06:06] neither [20:06:24] Ting Chen will be the current Chair of the Wikimedia Foundation Board until July 2012. 
[20:06:41] :) [20:06:48] I except it was true at the time :) [20:06:52] expect* [20:06:55] (03CR) 10BryanDavis: Git::Clone: support shared repositories (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 (owner: 10Ori.livneh) [20:07:46] !log shut down solr3, almost forgot [20:07:54] Logged the message, Master [20:11:24] (03CR) 10BryanDavis: Configure the scap repository to be shared by wikidev group (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/118746 (owner: 10Ori.livneh) [20:29:55] (03CR) 10Ori.livneh: Git::Clone: support shared repositories (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 (owner: 10Ori.livneh) [20:38:20] (03PS3) 10Ori.livneh: Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 [20:39:06] (03CR) 10jenkins-bot: [V: 04-1] Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 (owner: 10Ori.livneh) [20:43:01] (03PS4) 10Dzahn: Make puppet cronjob to run AbuseFilter/maintenance/purgeOldLogIPData.php [operations/puppet] - 10https://gerrit.wikimedia.org/r/81257 (owner: 10Reedy) [20:44:39] (03PS4) 10Ori.livneh: Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 [20:48:08] (03PS5) 10Ori.livneh: Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 [20:50:31] (03PS3) 10Ori.livneh: Configure the scap repository to be shared by wikidev group [operations/puppet] - 10https://gerrit.wikimedia.org/r/118746 [20:51:20] (03PS4) 10Ori.livneh: Configure the scap repository to be shared by wikidev group [operations/puppet] - 10https://gerrit.wikimedia.org/r/118746 [20:54:09] hashar: is Mr. Happy still happy? [20:54:17] yup [20:54:32] plus we have are "The Little Mermaid" [20:56:01] hm hm, Mr. Happy better keep his hands to himself since the Mermaid is little! 
[20:56:38] you mean petite sirène [20:56:41] cya later [20:58:43] I need to hire an assistant [20:58:46] got too many email [21:00:30] hashar: I'll be your PA and mess everything up :D [21:01:22] JohnLewis: as long as you handle the hate replies that follow up, that sounds good to me [21:01:38] hashar: Sure - I get hate anyway [21:06:53] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Tue 11 Mar 2014 08:47:37 PM UTC [21:09:02] (03CR) 10Bene: [C: 031] "Consensus is there in fact." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/118747 (owner: 10John F. Lewis) [21:18:17] (03PS1) 10coren: Tool Labs: update mail relay to allow incoming [operations/puppet] - 10https://gerrit.wikimedia.org/r/118765 [21:19:02] (03PS6) 10Ori.livneh: Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 [21:20:54] (03CR) 10coren: [C: 032] "Already tested." [operations/puppet] - 10https://gerrit.wikimedia.org/r/118765 (owner: 10coren) [21:25:31] (03PS7) 10Ori.livneh: Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 [21:25:37] (03PS1) 10coren: Tool Labs: minor fix to mailrelay.pp [operations/puppet] - 10https://gerrit.wikimedia.org/r/118767 [21:26:37] (03CR) 10BryanDavis: [C: 031] Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 (owner: 10Ori.livneh) [21:27:04] (03CR) 10Ori.livneh: [C: 032] Git::Clone: support shared repositories [operations/puppet] - 10https://gerrit.wikimedia.org/r/118745 (owner: 10Ori.livneh) [21:27:52] (03CR) 10coren: [C: 032] "Yeah, so puppet URI sucks" [operations/puppet] - 10https://gerrit.wikimedia.org/r/118767 (owner: 10coren) [21:34:13] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [21:37:13] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total 
(202301) [21:49:13] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [21:56:13] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (200776) [22:12:00] (03CR) 10Chad: [C: 032] Opt everyone into Cirrus as a beta feature by default [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/118663 (owner: 10Chad) [22:12:12] (03Merged) 10jenkins-bot: Opt everyone into Cirrus as a beta feature by default [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/118663 (owner: 10Chad) [22:12:28] I have pinged parsoid folks about the job queue [22:12:33] they are parsoid jobs [22:13:26] !log demon synchronized noncirrus.dblist [22:13:34] Logged the message, Master [22:13:42] random job queue commands: http://paste.openstack.org/show/73531/ (top wikis) and http://paste.openstack.org/show/73532/ (enwiki jobs) [22:13:45] !log demon synchronized wmf-config/CommonSettings.php [22:13:53] Logged the message, Master [22:14:30] !log demon synchronized wmf-config/InitialiseSettings.php [22:14:40] Logged the message, Master [22:17:44] ... [22:18:07] "Except 151 wikipedias that still need scheduling and planning." whew [22:18:29] <^d> hashar: The job queue one-week chart looks nice. [22:18:29] <^d> Downward trend. [22:18:32] where "everyone" != "everyone" ;) [22:19:07] <^d> greg-g: Hehe, subtitle mattered :) [22:19:30] <^d> "everyone" == "no one" [22:19:30] <^d> I was just reversing the config. [22:19:33] <^d> Since 730 wikis now use it, it makes more sense to track the 151 that don't. 
[22:20:07] * greg-g nods [22:20:12] scared me a little, tbh [22:29:53] (03PS1) 10Ori.livneh: git::clone: Include umask as part of command [operations/puppet] - 10https://gerrit.wikimedia.org/r/118776 [22:30:13] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [22:31:40] (03CR) 10Ori.livneh: [C: 032] git::clone: Include umask as part of command [operations/puppet] - 10https://gerrit.wikimedia.org/r/118776 (owner: 10Ori.livneh) [22:32:11] (03CR) 10Ori.livneh: [C: 032] Configure the scap repository to be shared by wikidev group [operations/puppet] - 10https://gerrit.wikimedia.org/r/118746 (owner: 10Ori.livneh) [22:33:05] (03PS1) 10Chad: Update symlinks for notcirrus [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/118778 [22:33:13] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (202326) [22:40:06] (03PS1) 10Ori.livneh: Fix typo from I96287df09 [operations/puppet] - 10https://gerrit.wikimedia.org/r/118779 [22:40:25] (03CR) 10Ori.livneh: [C: 032 V: 032] Fix typo from I96287df09 [operations/puppet] - 10https://gerrit.wikimedia.org/r/118779 (owner: 10Ori.livneh) [22:59:12] (03PS1) 10Ori.livneh: Set core.sharedRepository to 'group' for shared clones [operations/puppet] - 10https://gerrit.wikimedia.org/r/118783 [23:00:26] (03CR) 10Ori.livneh: [C: 032 V: 032] Set core.sharedRepository to 'group' for shared clones [operations/puppet] - 10https://gerrit.wikimedia.org/r/118783 (owner: 10Ori.livneh) [23:02:54] greg-g: BTW, I assume the Hovercards feature is now in Beta Labs (if it's going live to all wikis on Wednesday, it needs at least a few days' worth of testing)… [23:03:07] James_F: it is, yeah [23:03:17] greg-g: Awesome. :-) [23:04:22] <^d> greg-g: I e-mailed engineering and ops about that config reversal. So nobody else gets surprised/confused :) [23:05:58] ^d: good deal :) [23:16:19] Hovercards reminds me of Hovrboard ad.. 
what did that turn out to be for [23:30:06] (03CR) 10Ori.livneh: [C: 032] Add scap-purge-l10n-cache to /usr/local/bin [operations/puppet] - 10https://gerrit.wikimedia.org/r/118338 (owner: 10BryanDavis) [23:30:35] (03PS2) 10Ori.livneh: Add scap-purge-l10n-cache to /usr/local/bin [operations/puppet] - 10https://gerrit.wikimedia.org/r/118338 (owner: 10BryanDavis) [23:30:45] (03CR) 10Ori.livneh: [C: 032 V: 032] Add scap-purge-l10n-cache to /usr/local/bin [operations/puppet] - 10https://gerrit.wikimedia.org/r/118338 (owner: 10BryanDavis)