[00:12:42] New review: Demon; "(no comment)" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/5547
[00:40:34] !log deploying change 5593 to virt0 for fixing non-global puppet group management
[00:41:04] where/s the damn bot?
[01:00:19] New patchset: MarkAHershberger; "add start bugzilla" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5595
[01:00:37] New patchset: MarkAHershberger; "lint warnings" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/4734
[01:00:53] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5595
[01:00:54] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/4734
[01:44:07] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 270 seconds
[01:46:58] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 7 seconds
[03:22:58] PROBLEM - Puppet freshness on amslvs4 is CRITICAL: Puppet has not run in the last 10 hours
[05:31:58] PROBLEM - Puppet freshness on nfs2 is CRITICAL: Puppet has not run in the last 10 hours
[05:53:52] PROBLEM - Puppet freshness on nfs1 is CRITICAL: Puppet has not run in the last 10 hours
[08:50:08] New review: Hashar; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/5444
[09:10:11] PROBLEM - swift-container-auditor on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[09:22:32] New review: Hashar; "I guess you should split both fixes as they are unrelated. We probably want to avoid the l10n-bot sp..." [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/5547
[09:22:38] RECOVERY - swift-container-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[09:22:54] New patchset: Dzahn; "various fixes to parsing function, enhance error handling and log output" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5604
[09:22:55] New patchset: Dzahn; "add custom status codes for parser errors / url fetching" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5605
[09:22:56] New patchset: Dzahn; "get rid of textstats column, fix sorting by http, nicer ${variables}" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5606
[09:22:56] New patchset: Dzahn; "fix http status code checking and retab" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5607
[09:22:57] New patchset: Dzahn; "add id and method columns to tables, fix sorting, get rid of unneeded sort options" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5608
[09:22:58] New patchset: Dzahn; "fix link creation in html tables, step 1" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5609
[09:22:59] New patchset: Dzahn; "more general cleanup, make HTML more readable with heredoc syntax, de-uglification, remove unused stuff" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5610
[09:22:59] New patchset: Dzahn; "less repetition of the same strings" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5611
[09:23:00] New patchset: Dzahn; "get serialized PHP format data from MW API rather than XML use maxlag per API etiquette simplify parsing, tabs, ..." [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5612
[09:25:44] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5604
[09:25:46] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5604
[09:26:10] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5605
[09:26:12] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5605
[09:26:43] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5606
[09:26:45] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5606
[09:27:18] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5607
[09:27:20] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5607
[09:27:57] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5608
[09:28:00] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5608
[09:28:30] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5609
[09:28:33] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5609
[09:29:04] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5610
[09:29:06] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5610
[09:29:31] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5611
[09:29:33] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5611
[09:32:01] New patchset: Dzahn; "get serialized PHP format data from MW API rather than XML use maxlag per API etiquette simplify parsing, tabs, ..." [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5612
[09:32:29] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5612
[09:32:31] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5612
[09:35:20] New review: Hashar; "Patchset 2 still has the issues I have mentioned previously. Mainly png files missing!" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/3285
[09:35:59] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: Puppet has not run in the last 10 hours
[09:40:09] mutante: hi :)
[09:40:30] mutante: will you be available this afternoon to write the Jenkins upgrade process?
[09:40:56] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours
[09:43:14] hashar: hi. yeah, we can do that soon. have a page already? will be there in a few
[09:43:27] mutante: I am going out for lunch soon
[09:43:33] ok
[09:43:55] mutante: I will write a first draft then ping you to enhance it :-D
[09:44:09] ok, that's fine
[09:51:08] PROBLEM - Host srv278 is DOWN: PING CRITICAL - Packet loss = 100%
[09:52:11] RECOVERY - Host srv278 is UP: PING OK - Packet loss = 0%, RTA = 0.29 ms
[09:56:14] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused
[10:14:42] !log searchidx1 is in site.pp and decom.pp at the same time. breaks puppet runs on spence. cannot override local resource. should it be gone or not?
[10:15:22] RIP morebots
[10:18:26] searchidx1 has been gone for a long time, it was out of space and it still is
[10:18:27] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.042 second response time
[10:18:47] arr, do you know where the bot was
[10:19:02] i see your edit on its wiki page, saying "really? zwinger is gone" :)
[10:19:11] morebots?
[10:19:30] yes
[10:19:35] inode, wikitech
[10:19:38] "Morebots (as opposed to Moarbots)" :p
[10:19:57] ignore the stuff about identi.ca
[10:20:01] gotcha
[10:21:58] there is an init script, but "morebots~" and no "morebots"
[10:22:58] and its broken
[10:25:46] don't use the init script
[10:26:03] You can start it with ~werdnum/start_morebots.sh
[10:26:08] first shoot anything runnign with
[10:26:11] "morebots" in it
[10:26:13] then run that
[10:26:53] people keep breaking morebots
[10:26:56] wikitech: actually accurate for restarting
[10:26:57] back in the day it was rocksolid!
[10:27:05] didn't need restarting
[10:27:14] !log killed a couple morebots processes on wikitech and it came back by itself :p
[10:27:19] Logged the message, Master
[10:27:50] it was running multiple times
[10:28:12] btw: if [ "$1" -eq "start" ] -> "start: integer expression expected"
[10:28:15] :p
[10:29:19] heh
[10:30:29] !log searchidx1 was in site.pp and decom.pp at the same time. breaks puppet runs on spence. cannot override local resource. removing from site
[10:30:31] Logged the message, Master
[10:31:29] #mediawiki has lost the gerrit-wm bot :/
[10:32:14] bot days :p
[10:32:29] just now?
[10:33:22] no idea
[10:33:34] I have just noticed that right now
[10:33:53] also we have a CIA-89 bot emitting commits messages from the KDE project :-]
[10:34:44] New patchset: Dzahn; "remove searchidx1 from site.pp, long dead and in decom.pp already. having it in both breaks puppet on spence" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5615
[10:34:51] wth @KDE
[10:35:01] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5615
[10:35:25] the host itself is still around but hasn't been used for anything, I wonder if it's still in warranty
[10:35:39] if so it could be reinstalled for somethingorother
[10:37:50] New review: Dzahn; "per apergos this could be reinstalled for something else (if it's still in warranty?)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5615
[10:37:52] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5615
[10:43:16] apergos: mutante do you have the
[10:43:20] err
[10:43:30] do you have any information / account for CIA bot?
[10:47:09] well I have just banned CIA bots :-D
[10:47:21] that solve the issue for now and I have sent a wikitech-l message
[10:47:55] I got nothin
[10:48:15] !log banned CIA bots from #mediawiki IRC channel. It started spamming us with notifications from KDE and mandriva projects. See http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/60905
[10:48:18] Logged the message, Master
[10:48:38] sry, i didnt. i was updating docs on the other bot and stuff: http://wikitech.wikimedia.org/view/Special:Contributions/Dzahn
[10:49:44] how about the gerrit one
[10:51:22] gerrit-wm is here and fine, was it not wanted on #mediawiki anymore maybe?
[10:51:34] I am sure we want it there
[10:51:40] though maybe it was kicked out form the channel :/
[10:51:47] and does not rejoin automatically
[10:52:11] the script is for sure still alive since gerrit-wm is in this channel
[10:52:11] PROBLEM - swift-container-auditor on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[10:54:23] hashar: makes sense. there you go. "force-reload" ftw
[10:54:33] adding another docs page :p
[10:54:52] someone must have kicked it :-(
[10:55:07] !log force-reload ircecho on manganese to make gerrit-wm rejoin #mediawiki
[10:55:10] Logged the message, Master
[10:55:27] where is ircecho source code?
[10:56:53] somewhere deep in wikitech I think
[10:57:11] seems to be a debian package already
[10:57:17] http://wikitech.wikimedia.org/view/Ircecho
[10:57:42] first edit 2006, last 2008, 3 edits . dont trust it
[10:58:12] I never trust wikitech-l
[10:58:14] er wikitech
[10:58:24] it is full of misleading and out of date "doc"
[10:58:40] the one on the server has "modified by Ryan Lane" in it and a lot longer
[10:58:49] svn.wikimedia.org/svnroot/mediawiki/trunk/debs/ircecho
[10:59:06] please update that page with a copy/paste if you found it
[11:00:16] adding {{fixme|outdated}} to these pages
[11:00:57] makes them show up in the "Fix me!" category linked in navigation
[11:01:58] I have updated it https://wikitech.wikimedia.org/view/Ircecho
[11:02:10] with some generic informations that refers to the SVN repo
[11:02:17] thanks, adding gerrit-wm page
[11:11:41] RECOVERY - swift-container-auditor on ms-be2 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[11:14:51] !bots
[11:14:51] http://www.mediawiki.org/wiki/Wikimedia_Labs/Create_a_bot_running_infrastructure proposal for bots
[11:16:16] !bots del
[11:16:16] Successfully removed bots
[11:16:54] !bots is bot down? http://wikitech.wikimedia.org/view/Category:Bots | proposal for new bot infra: http://www.mediawiki.org/wiki/Wikimedia_Labs/Create_a_bot_running_infrastructure
[11:16:54] Key was added!
[11:20:50] PROBLEM - Host mw21 is DOWN: PING CRITICAL - Packet loss = 100%
[11:23:56] !log mw21 powercycling mw21 - it died with this http://etherpad.wikimedia.org/mw21
[11:23:59] Logged the message, Master
[11:24:46] μμ
[11:24:53] looks similar to one of the deadlocks I saw on mw4
[11:25:05] no call trace there to get any further info though
[11:25:36] if you look in the syslog you'll probably see some related cruft
[11:26:23] RECOVERY - Host mw21 is UP: PING OK - Packet loss = 0%, RTA = 0.90 ms
[11:29:23] PROBLEM - Apache HTTP on mw21 is CRITICAL: Connection refused
[11:30:45] apergos: not really in syslog, but the most interesting line in the one above is probably "BUG: soft lockup - CPU#1 stuck for 61s! [apache2:13582]" <-- apache2
[11:31:16] the question is what it was doing, unfortunately we don't have that
[11:31:27] these soft lockkups can happen for a bunch of different reasons
[11:32:52] yeah, last one in syslog is just normal executing cron.hourly, then the few minutes break and reboot
[11:36:08] :q
[11:36:15] wrong window
[11:36:26] RECOVERY - Apache HTTP on mw21 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.039 second response time
[11:38:42] Reedy: hi? re: refreshLinks. it ran, and we have logs. short version: looks good afaict on s2-s7, but not on s1. it lost connection to mysql server during query :(
[11:39:39] Reedy: hume:/home/mwdeploy/refreshLinks/*.log
[11:40:30] the other ones look good. "Retrieving illegal entries from xxx .. and always getting 0..0
[11:41:34] well, nothing illegal sounds good at least
[12:00:19] I would have thought it would take years to run refreshLinks on enwiki
[12:00:48] well, it worked on all except enwiki
[12:01:07] how long did it take?
[12:02:22] it ran for 03:39 and then Lost connection to MySQL server during query
[12:04:34] ah right, you ran it with --dfn-only
[12:04:39] yes
[12:04:52] hence hours not years
[12:05:52] you know I wrote a gearman version of the real refreshLinks, I wonder where that got to
[12:05:53] yeah, that's what i've been told, "make sure to --dfn-only" when it comes to performance issues
[12:06:50] I think it would be OK for site performance, it's just that you'd be a lot older when it finished
[12:07:27] heh,ok ;p. the original request is to run it once a month
[12:08:04] how often would you think makes sense for --dfn-only, i had it at daily for the last 2 days to see
[12:08:49] and maybe i should temp. remove s1 and just leave the other ones in
[12:09:45] depends on how broken the site is, I guess
[12:10:25] I mean, something must be broken if a lot of such links are being created
[12:10:42] all i saw was "0..0" after "Retrieving illegal entries"
[12:10:50] not a single one that wasnt "0..0"
[12:11:02] on s2 to s7 that is
[12:13:23] you can just run it again on s1 to resume where you left off
[12:14:40] ah, cool. then i will just leave the daily cron jobs in there for another day and check tomorrow
[12:14:51] looks like reedy deleted my gearman script just a couple of months ago
[12:15:10] TimStarling: I hear you must be very excited about all the new Performance Team
[12:15:24] :(? @ deleted script
[12:15:38] https://www.mediawiki.org/wiki/Special:Code/MediaWiki/110958
[12:15:43] domas: wheee
[12:15:45] TimStarling: I disclosed one low hanging fruit to performance engineering at wikimedia already
[12:15:54] that would make databases much faster
[12:15:55] \o/
[12:16:06] deploy innodb compression!
[12:16:07] one that I was always too lazy to work on
[12:16:12] no, mediawiki-side change
[12:16:23] put revision metadata into parser cache, so you don't have to read 'revision' on page reads
[12:16:58] generally, making revision not needed for readonly ops would help quite a bit
[12:17:09] timstarling: innodb compression is pure fun too
[12:17:15] oops, Python Traceback at https://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/maintenance/gearman/?pathrev=110957&view=markup
[12:17:19] timstarling: our latest freshest bestest story about mysql
[12:17:31] … just need april tree
[12:18:21] uh huh
[12:18:22] TimStarling: did you see my latest post about mysql future? it is considered to be my best post ever by some people :)
[12:18:36] not yet, no
[12:18:45] the 'mysql is doomed' one
[12:21:32] sounds very cloudy
[12:21:45] I should go, I have to be up early tomorrow
[12:22:47] night
[12:23:19] thanks Tim, night
[12:30:41] New patchset: Dzahn; "refreshLinks, just run the s1 cron job alone which failed yesterday, others have just been refreshed succsecfully" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5618
[12:30:58] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5618
[12:32:00] New review: Dzahn; "let's see how long it takes, per Tim it can resume where it stopped after 3:39 so far" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5618
[12:32:03] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5618
[13:04:57] PROBLEM - Host mw25 is DOWN: PING CRITICAL - Packet loss = 100%
[13:05:51] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Can not connect to 10.0.11.25:11000 (Connection timed out)
[13:07:24] !log fix puppet run on spence by removing searchidx1 resources from db9 (was in weird state being in site but also decommissioned)
[13:07:27] Logged the message, Master
[13:09:18] RECOVERY - Puppet freshness on spence is OK: puppet ran at Mon Apr 23 13:09:12 UTC 2012
[13:09:32] !log powercycling frozen mw25, looks like mw21 above but no console output to paste here
[13:09:35] Logged the message, Master
[13:12:36] RECOVERY - Host mw25 is UP: PING OK - Packet loss = 0%, RTA = 0.33 ms
[13:13:49] PROBLEM - Apache HTTP on mw25 is CRITICAL: Connection refused
[13:14:16] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[13:14:25] PROBLEM - NTP on mw25 is CRITICAL: NTP CRITICAL: Offset unknown
[13:15:10] RECOVERY - Apache HTTP on mw25 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.051 second response time
[13:15:55] RECOVERY - NTP on mw25 is OK: NTP OK: Offset -0.01898825169 secs
[13:23:34] PROBLEM - Puppet freshness on amslvs4 is CRITICAL: Puppet has not run in the last 10 hours
[13:40:39] mutante: here is a very basic page about Jenkins : http://etherpad.wikimedia.org/jenkins-upgrading
[14:14:33] New patchset: Pyoungmeister; "re-enabling new varnishncsa setup on mobile varnish boxxies to see if it will push out the right init file this time..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5619
[14:14:51] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5619
[14:16:30] notpeter: I claimed that searchidx1 could be in decommed servers and not in site.pp (since it's beenin the same dead state for ages now). so if that's wrong feel free to stab me
[14:16:39] nope!
[14:16:56] I actually have a ticket to rename it to some philosopher, or something
[14:17:01] great
[14:17:15] or just catch it on fire and roll it into the bay
[14:17:16] I guess puppet was whining about it being in both places
[14:17:18] all good options
[14:17:21] yeah
[14:17:22] makes sense
[14:17:26] cool
[14:17:59] !log temp stopping puppet on cp1042-1044
[14:18:01] Logged the message, notpeter
[14:18:41] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5619
[14:18:44] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5619
[14:18:47] PROBLEM - swift-container-auditor on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[14:20:51] New patchset: Dzahn; "check_all_memcached: check all, then tell if multiple are down instead of exiting after first error" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5620
[14:21:08] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5620
[14:21:54] searchidx1 may be old enough to decommission
[14:22:31] probably so
[14:28:41] RECOVERY - swift-container-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[14:29:40] bahhh, why is my nick not available
[14:29:51] stupid irc
[14:31:30] /msg nickserv release nick password
[14:33:54] yayyyyy
[14:33:56] !log stopping puppet on cp1041 as well
[14:33:57] closedmouth, thanks =]
[14:33:58] Logged the message, notpeter
[14:34:05] i didnt feel like me.
[14:35:03] ok, going afk, gotta rig the wiring in new row c.
[14:35:08] :)
[14:57:49] New patchset: Pyoungmeister; "jeff is awesome." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5622
[14:58:06] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5622
[15:00:04] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5622
[15:00:07] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5622
[15:04:01] Who can merge my puppet changes so I can test my bz package?
[15:04:55] actually just getting someone to look at it and tell me if they see something wonky would help
[15:05:01] https://gerrit.wikimedia.org/r/#change,5595
[15:06:42] New review: Dzahn; "tested on spence" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5620
[15:06:44] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5620
[15:09:56] New review: Dzahn; "did you really want the gerrit::ircbot stuff in the bugzilla file?" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/5595
[15:25:25] New patchset: MarkAHershberger; "add start bugzilla" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5595
[15:25:43] New patchset: MarkAHershberger; "lint warnings" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/4734
[15:25:59] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5595
[15:25:59] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/4734
[15:26:53] New review: MarkAHershberger; "removed commented-out ircbot stuff" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/5595
[15:28:42] New patchset: Alex Monk; "Don't log patchsets submitted by L10n-bot or comments/merges made by gerrit2" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5547
[15:28:59] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5547
[15:32:29] PROBLEM - Puppet freshness on nfs2 is CRITICAL: Puppet has not run in the last 10 hours
[15:37:17] New patchset: Pyoungmeister; "reenabling new varnishncsa for eqiad upload varnish instances now that mobile is looking good." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5624
[15:37:34] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5624
[15:39:29] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5624
[15:39:31] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5624
[15:54:32] PROBLEM - Puppet freshness on nfs1 is CRITICAL: Puppet has not run in the last 10 hours
[17:23:40] http://julien.danjou.info/blog/2012/openstack-swift-consistency-analysis
[17:23:50] that's quite interesting
[17:50:49] New patchset: Ottomata; "Including misc::statistics::plotting on stat1 for RT 2163." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5625
[17:51:07] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5625
[17:53:01] New review: Ottomata; "See http://rt.wikimedia.org/Ticket/Display.html?id=2163." [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/5625
[17:56:36] Change abandoned: Ottomata; "This actually HAS already been done in https://gerrit.wikimedia.org/r/#change,3513. At this time c..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5625
[17:57:43] New review: Ottomata; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/3513
[17:58:28] can someone approve https://gerrit.wikimedia.org/r/3513 please?
[17:58:36] Erik Z needs this to start using stat1
[18:02:02] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5033
[18:02:05] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5033
[18:06:57] woosters: can you help approve this one? https://gerrit.wikimedia.org/r/3513
[18:07:05] we are in an ops meeting
[18:07:18] ok cool
[18:08:02] no worries, i was just asked to see if I could help with some RT tickets, and the main hold up I can see is this change has been waiting since March 22
[18:08:25] so, i'm just going to be squeaky about it, thank youuuuuuu!
[18:16:23] New patchset: Ryan Lane; "Revert "Link rXXXXX to CodeReview"." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5626
[18:16:41] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5626
[18:16:45] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5626
[18:16:48] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5626
[18:38:08] hi folks
[18:38:17] Ryan_Lane: did you see http://code.google.com/p/gerrit/issues/detail?id=1124
[18:38:45] <^demon> preilly: I already mentioned it :)
[18:39:14] ^demon: oh, I didn't see that you did
[18:39:28] <^demon> In #-labs a bit ago
[18:39:43] yeah, just saw an email come through about it
[18:41:20] <^demon> In other random Gerrit news, turns out the "forbid self-reviews" feature is more or less possible, just completely undocumented :p
[18:44:20] Hmm speaking of that
[18:44:27] I need to work on the wmf/* permissions a bit more
[18:45:30] Such that only the wmf-deployment group can +2 revisions in refs/for/wmf/*
[18:48:07] ^demon: Why does JenkinsBot have CR -2/+2 rights on refs/* ? Why would Jenkins ever CR anything?
[18:48:45] Oh wait nm
[18:48:54] It has -2/0 rights, I guess that sort of makes sense. Maybe.
[18:52:03] <^demon> RoanKattouw: I think Antoine's plan is to make the linter do a CR-2
[18:52:10] <^demon> But ask him.
[18:54:39] hi guys
[18:54:48] i'm trying to set up a local VM test environment for puppet stuff
[18:55:00] apt-get update is complaining about apt.wikimedia.org pubkey
[18:55:01] GPG error: http://apt.wikimedia.org oneiric-wikimedia Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 09DBD9F93F6CD44A
[18:55:29] is there a pubkey for apt.wikimedia.org apt?
[18:55:39] ottomata: don't use oneiric
[18:55:44] use lucid
[18:55:52] ok, i think puppet did that when i ran it and it created the sources
[18:55:53] or wait for precise
[18:55:58] ohhhh
[18:56:01] the OS you mean
[18:56:03] ergh
[18:56:12] oneiric is there for testing (and for openstack development)
[18:56:15] you guys are running lucid?
[18:56:20] of course
[18:56:22] in prod?
[18:56:25] ok
[18:56:33] we only run LTS, in general
[18:56:52] also, best to use #wikimedia-labs for this kind of discussion :)
[18:57:14] ok, thanks, i wasn't using labs, which is why I didn't ask there, but will in the future
[18:57:15] thank you!
[18:57:16] (lucid is set as the default on purpose ;) )
[18:57:20] ahhhh. ok
[18:57:28] I thought this was a labs instance
[18:57:34] it is too difficult to test puppet in labs
[18:57:40] i want to make changes and see what they do
[18:57:44] so I am setting up a vm locally
[18:57:50] faidon is working on that
[18:58:03] aye cool
[19:11:38] New patchset: Demon; "Improve truncation in IRC/logging" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5629
[19:11:55] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5629
[19:36:56] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: Puppet has not run in the last 10 hours
[19:56:54] New review: Demon; "Same issue as with patchset 1, I just didn't review it well enough." [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/5547
[20:01:14] PROBLEM - Host mw1 is DOWN: PING CRITICAL - Packet loss = 100%
[20:25:43] New patchset: Dzahn; "now that i added the memcached group also need to use @monitor_group to unbreak icinga" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5667
[20:26:00] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5667
[20:27:58] New review: Dzahn; "need for icinga fix for now, but should look at better group just containing the ones defined in mc...." [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5667
[20:28:01] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5667
[20:40:10] LeslieCarr: icinga unbroken, sry, just though about spence. upraded libssl on neon as well. bbl
[20:41:24] !log neon - upgraded libssl, started icinga after adding monitor group
[20:41:26] Logged the message, Master
[20:41:39] * mutante out
[20:44:55] Hi Dzahn: could you maybe have a look at this changeset: https://gerrit.wikimedia.org/r/#change,3513 (it is yours but the analytics team would like to have this merged)
[20:49:27] guru meditation: https://bugzilla.wikimedia.org/36166
[20:52:39] yay upgrading
[20:53:31] hexmode: lolbug
[20:53:58] right, but how to fix?
[20:54:11] Looks like this has a long history: https://en.wikipedia.org/wiki/Guru_Meditation
[20:59:13] it's probably a memory error
[20:59:50] it's a very large thumbnail
[21:04:38] Ryan_Lane: http://bitfieldconsulting.com/puppet-vs-chef
[21:08:10] several of those reasons really collapse into one
[21:09:49] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/3513
[21:09:58] I'n not utterly sold by that article
[21:10:18] if it talked more about things that weren't "it has a larger user base and all that entails, for longer"
[21:10:21] heh
[21:10:22] I might be more sold
[21:10:59] hahaha
[21:11:07] they're saying it's the same technology google uses ?
[21:11:09] that's a riot
[21:16:21] New patchset: Lcarr; "fixed conflicts add role class for statistics server, move includes from site.pp to role class and add plotting class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/3513
[21:16:38] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/3513
[21:17:22] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/3513
[21:17:25] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/3513
[21:17:48] I'm just terrified that the documentation for puppet is better....
[21:18:34] hahaha
[21:20:09] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Could not connect: 10.0.11.27:11000 (timeout)
[21:23:00] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[21:31:33] LeslieCarr: google does use puppet
[21:32:28] i'm guessing in corp ?
[21:32:55] yep
[21:33:19] apparently to control all google controlled desktops/laptops
[21:37:24] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Could not connect: 10.0.11.24:11000 (timeout)
[21:37:34] who killed memcached?
[21:38:54] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online [21:46:22] New patchset: Ryan Lane; "Remove project groups from sudo, add ops group" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5216 [21:46:39] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5216 [21:47:51] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5216 [21:47:54] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5216 [21:48:46] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/4988 [21:48:49] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/4988 [21:57:44] what's the puppet://volatile source? [22:14:30] Change abandoned: Alex Monk; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5547 [22:17:57] New review: Demon; "Were you going to submit a new patchset for this?" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5547 [22:20:35] New review: Alex Monk; "No, I just couldn't figure out how I was supposed to find the owner of a change." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5547 [22:25:28] Change restored: Demon; "Well I plan to still fix it and I'll keep digging, so let's restore this :)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5547 [22:38:45] New patchset: Ryan Lane; "Removing nimish and awjrichards from sudo on locke and emery." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5679 [22:39:02] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5679 [22:39:27] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5679 [22:39:30] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5679 [22:54:02] New review: Asher; "this looks good, provided that testswarm continues to use vanilla mysql pkgs which these configs bui..." [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/4395 [22:54:33] New review: Asher; "see inline comment" [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/4400 [23:18:34] New patchset: Catrope; "Update mwmultiversion scripts from SVN" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5680 [23:18:50] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5680 [23:19:11] New patchset: Catrope; "Fix /usr/local/bin/refreshWikiVersionsCDB" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/4652 [23:19:28] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/4652 [23:24:38] PROBLEM - Puppet freshness on amslvs4 is CRITICAL: Puppet has not run in the last 10 hours [23:36:06] binasher: just did a MobileFrontend deployment - can you flush the varnish cache? [23:37:51] Ryan_Lane: what was the rule you were talking about for this? ^^^ :D [23:37:57] awjr: sure, just a sec [23:38:21] should be flushed [23:38:28] heh thanks binasher :D [23:39:28] :D [23:39:56] awjr: every time you guys ask to have the cache flushed, you also need to check in a piece of code that reduces the number of times you need the cache flushed [23:40:10] otherwise, we won't flush it ;) [23:41:32] lol