[00:00:06] adding the cname now [00:00:25] then need to check out the rsyslogd setup in puppet [00:03:04] New patchset: Reedy; "Move wgDebugLogGroups from CommonSettings.php to InitialiseSettings.php" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46651 [00:03:06] AaronSchulz: ^ You're welcome ;) [00:03:51] you remembered ;) [00:04:21] New patchset: Reedy; "Move wgDebugLogGroups from CommonSettings.php to InitialiseSettings.php" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46651 [00:05:20] RoanKattouw_away: i changed my user agent to fake Internet Explorer 8 and the VisualEditor appeared [00:05:24] Reedy: it's added but will take some time to propagate (syslog.eqiad.wmnet is an alias for fenari.wikimedia.org.) [00:06:01] Great, thanks. It shouldn't be a major rush now [00:06:02] mutante: Well there you go, then it's a bug with us not recognizing Iceweasel [00:06:27] want Bugzilla? [00:06:32] ^demon: I'm trying to get the strace stuff pushed for you [00:06:37] New patchset: RobH; "sudo for strace for demon in role appservers (RT-4066)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42791 [00:06:39] who knows if folks will approve [00:09:32] notpeter: wanna take a gander at the above patchset and see if its sane? [00:09:49] daniel listed you as a reviewer on it (dunno why) [00:09:50] New patchset: Reedy; "Move wgDebugLogGroups from CommonSettings.php to InitialiseSettings.php" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46651 [00:09:59] i assume cuz you deal with apaches a lot recently. [00:10:36] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46651 [00:10:37] RobH: sounds legit [00:11:34] hrmm, i wonder if i should push it or just poke mark about it tomorrow am [00:11:48] notpeter: it seems legit enough that i can just merge it dontcha think [00:11:49] ? 
[00:11:55] * RobH wants other folks on the firing line with him [00:12:16] the linked patchset looks like it will accomplish the task stated in that rt ticket [00:13:03] https://gerrit.wikimedia.org/r/#/c/42791/3 [00:13:28] rfaulkner: https://gerrit.wikimedia.org/r/#/q/status:open+project:sartoris,n,z [00:13:55] !log reedy synchronized wmf-config/ [00:13:56] Logged the message, Master [00:15:28] New patchset: RobH; "sudo for strace for demon in role appservers (RT-4066)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42791 [00:17:56] wtf: host syslog.eqiad.wmnet recursor0 => "not found: 3(NXDOMAIN)" vs. host syslog.eqiad.wmnet ns0 => "syslog.eqiad.wmnet is an alias for fenari.wikimedia.org" [00:18:04] recursor0 and ns0 both == dobson [00:19:13] !log restarted pdns-recursor on dobson [00:19:14] Logged the message, Master [00:20:40] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:22:17] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:22:57] New review: RobH; "I made the changes requested by Faidon, and this now looks legit to me. However, as it touches ever..." 
[operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/42791 [00:23:30] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 2.141 second response time [00:23:56] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.458 seconds [00:29:21] PROBLEM - Host labstore3 is DOWN: PING CRITICAL - Packet loss = 100% [00:30:00] RECOVERY - Host labstore3 is UP: PING OK - Packet loss = 0%, RTA = 26.65 ms [00:34:47] AaronSchulz: I have seen that, yes [00:36:03] New patchset: Dzahn; "remove setting $wgBlockDisablesLogin to true for foundationwiki (RT-690) (bug 44473)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46653 [00:42:58] Change merged: Dzahn; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46653 [00:43:56] !log sync InitialiseSettings.php for bug 44473 [00:43:57] Logged the message, Master [00:44:05] !log dzahn synchronized ./wmf-config/InitialiseSettings.php [00:44:06] Logged the message, Master [00:44:15] ty :) [00:44:23] New patchset: Pyoungmeister; "debug: adding some notifies for troubleshooting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46654 [00:45:12] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46654 [00:48:30] New patchset: Pyoungmeister; "Revert "debug: adding some notifies for troubleshooting"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46655 [00:49:09] rfaulkner: ping [00:49:19] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46655 [00:56:34] Ryan_Lane: I need your help on salt when your crisis is over [00:56:45] paravoid: what do you need help with? [00:56:49] [WARNING ] Setting up the Salt Minion "ms-be1003.eqiad.wmnet" [00:56:49] [CRITICAL] The Salt Master has rejected this minion's public key! 
[00:56:53] what I'm doing is a boring long shitty manual process [00:56:55] the box was reprovisioned [00:56:57] ah [00:57:03] so I'm guessing we don't revoke salt keys or something [00:57:10] salt-key -d ms-be1003.eqiad.wmnet [00:57:16] on sockpuppet [00:57:16] can we automate this? [00:57:21] yes [00:57:28] and sync states between puppet/salt? [00:57:38] I'd like to change our bootstrapping to install salt first [00:57:48] okay, if I do salt-key -d will puppet recreate the key then? [00:57:49] then to have salt run puppet on the server [00:57:52] yep [00:58:12] we can use salt reactors to automatically sign puppet certs based on salt certs [00:58:21] same with deleting them when salt keys are deleted [00:59:09] I started working on that last night for labs. it should be easy to rework it for production too [00:59:53] New patchset: Pyoungmeister; "some var cleanup per faidon's suggestion" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46658 [00:59:57] can't we just make puppet sync keys no matter what? [01:00:19] what do you mean? [01:00:20] for salt? 
[01:00:25] yeah [01:00:41] salt isn't x509 [01:01:01] so the pub/private keys can't be the same [01:02:04] with proper bootstraping we can actually avoid needing the new_install ssh key [01:03:38] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46658 [01:05:47] we could do exported resources for the pub key, so that the master knows the salt key is valid [01:05:47] that'll be slower and slightly less useful, though [01:27:06] New patchset: Pyoungmeister; "debug: more notifies for debugging" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46663 [01:28:18] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46663 [01:28:45] stupid SVN..arggggg [01:29:13] "unexpectedly changed special status" thing ...annoy [01:48:07] New patchset: Pyoungmeister; "using primary_site = "none" instead of primary_site = false for replication topology" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46667 [01:56:17] Change abandoned: Asher; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46667 [01:57:58] New patchset: Asher; "trying to fix slave monitoring logic for read-only masterless instances" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46670 [01:58:28] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46670 [02:02:10] oh hey [02:02:33] binasher: I was chatting with peter about that, did he leave for the day? [02:02:46] yeah, it was a simple thing to fix.. 
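The salt re-enrollment exchange above (around 00:57) boils down to deleting the stale minion key on the master before the reprovisioned box can re-submit a new one. A dry-run sketch of that flow follows; `salt-key -d` and the master host "sockpuppet" come from the conversation, but the wrapper function is hypothetical, and the accept step (`salt-key -a`) is an assumption — the log only shows the delete:

```shell
#!/usr/bin/env bash
# Sketch of the re-enrollment flow discussed above: when a box is
# reprovisioned, its old salt key must be deleted on the master
# before the new minion's key can be accepted. This helper only
# PRINTS the commands (dry run); on a real master they would be
# executed directly. Hypothetical wrapper, not a WMF tool.

reenroll_minion() {
    local minion="$1"
    # Delete the stale key left over from the old installation
    echo "salt-key -d ${minion}"
    # Accept the freshly generated key once the minion reconnects
    # (assumption: the master is not configured to auto-accept)
    echo "salt-key -a ${minion}"
}

reenroll_minion ms-be1003.eqiad.wmnet
```

The salt reactor approach Ryan mentions would replace the manual accept step entirely, triggering puppet cert signing from the salt key event.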
[02:03:02] yeah didn't initially see it either though [02:03:13] string interpolation into boolean [02:03:56] nah, was trying to treat something undef as a bool false [02:05:02] New patchset: Asher; "cleaning up debug notifies" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46671 [02:05:02] err [02:05:08] I don't see it [02:05:12] thought I did [02:05:20] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46671 [02:05:52] $dict { 'key1' => false, 'key2' => string } [02:06:38] $dict['key1'] -> doesn't actually get defined as a bool, it doesn't exist [02:06:59] puppet. [02:08:04] wtf [02:08:06] so you can't test if $dict['key1'] == false in the above case [02:08:07] didn't know that [02:08:07] http://projects.puppetlabs.com/issues/18234 [02:08:16] fixed in 3.0.2 it seems [02:08:54] hah, awesome. fixing that of course breaks everything based around years of puppets flawed logic [02:09:30] seems like a thing to change in a major version release only [02:09:48] oh wait, not merged yet [02:09:55] In Topic Branch Pending Review [02:10:23] i'd prefer it wait til 3.1.. 
but not enough to post [02:11:02] anyway [02:11:05] thanks for that [02:11:24] no prob [02:11:32] I didn't know that, although it does seem something that would have bitten me before in the past [02:14:59] PROBLEM - Host ms-be1003 is DOWN: PING CRITICAL - Packet loss = 100% [02:16:01] PROBLEM - Host ms-be1003 is DOWN: PING CRITICAL - Packet loss = 100% [02:16:15] that would be me, ignore [02:17:21] RECOVERY - Host ms-be1003 is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms [02:17:32] RECOVERY - Host ms-be1003 is UP: PING OK - Packet loss = 0%, RTA = 26.55 ms [02:19:01] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [02:21:19] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [02:21:22] PROBLEM - SSH on labstore4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:22:12] RECOVERY - SSH on labstore4 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [02:28:23] !log LocalisationUpdate completed (1.21wmf8) at Wed Jan 30 02:28:22 UTC 2013 [02:28:25] Logged the message, Master [02:38:43] dzahn is doing a graceful restart of all apaches [02:39:24] !log dzahn gracefulled all apaches [02:39:25] Logged the message, Master [02:39:26] dzahn: https://bugzilla.wikimedia.org/show_bug.cgi?id=44395 too [02:52:03] !log LocalisationUpdate completed (1.21wmf7) at Wed Jan 30 02:52:03 UTC 2013 [02:52:04] Logged the message, Master [02:54:52] catrope is doing a graceful restart of all apaches [02:55:05] !log catrope gracefulled all apaches [02:55:06] Logged the message, Master [02:56:34] catrope is doing a graceful restart of all apaches [03:06:49] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46011 [03:07:25] !log reloading Apaches in eqiad using manual dsh (because apache-graceful-all sanity check has a bug) [03:07:26] Logged the message, Master [03:09:13] that's lots of gracefulling [03:11:16] !log wikimania.asia redirect now 
works, it did not before because of a bug in apache-graceful-all not actually restarting Apaches in eqiad [03:11:17] Logged the message, Master [03:11:41] jeremyb: yea, manual gracefulling bypassing the sanity check, heh [03:11:56] because that check had a bug, thanks to Roan for finding it [03:12:58] mutante: oh, he wasn't trying to graceful for his own thing he was just debugging? [03:13:08] One of those was me typing less `apache-graceful-all` instead of less which `apache-graceful-all` [03:13:21] yea [03:13:21] Before that, I ran it once for real to see the output [03:13:25] neither of those [03:13:33] less `which apache-graceful-all` [03:13:50] Sorry yes, the latter [03:13:58] :) [03:14:26] * jeremyb wonders what wikimania.asia even means [03:15:01] wikimania2013 [03:15:37] jeremyb: try it in your browser [03:16:32] Works for me. That's neat. [03:16:49] ohhhhhhhhhhh [03:16:52] asia is a TLD [03:17:02] it is, just like .museum and .xxx :) [03:17:44] (didn't try it yet. you said try it so i just went straight to wikimania2013.wikimedia.org) [03:17:44] yea, this was about redirecting one to the other [03:17:44] now, tried and it does seem to work [03:17:46] right [03:17:53] i just thought it was a fake internal name [03:17:56] like smnet [03:17:58] wmnet* [03:18:23] Nope [03:18:27] heh.. i see.yeah. i expect we will be asked to redirect this one in every year Wikimania is actually in Asia:) [03:18:39] We just had a lot of fun because http://wikimania.asia is the first new redirect we set up after the eqiad migration [03:18:54] we should do Istanbul just to discuss whether it's Europe or Asia..haha [03:19:27] http://en.wikipedia.org/wiki/.asia [03:19:28] we could just redirect both europe and asia to istanbul [03:19:49] mutante: we already did a controversy: gdanzing [03:19:54] gdanzig* [03:20:09] hmm.. there is .eu but not .europe [03:20:40] oh, i can see that discussion yeah.. 
Gdansk [03:20:57] i think maybe there's an eu.org or something too [03:21:08] hey, another thing.. we have those US based state or city chapters, right [03:21:21] like Wikimedia NYC, and California etc [03:21:27] i think there's no city chapters [03:21:46] there is a plan to make a .nyc tld [03:21:46] there are [03:21:48] Wikimedia Boston [03:21:51] no [03:22:00] wikimedia new england not boston [03:22:06] and NYC is really not NYC [03:22:26] DC is maybe the closest to being just a city but it's also multiple states really [03:22:32] http://meta.wikimedia.org/wiki/Wikimedia_chapters [03:22:46] that table still lists it as Boston [03:22:54] but i see it's redirected..ok [03:23:35] well, state or city is secondary.. here's the suggestion though [03:23:49] that page needs updates! [03:23:58] it says 2012 in future tense [03:24:00] we have "pa.us.wikimedia" and stuff for those [03:24:08] pa is defunct [03:24:11] and the problem is that they can't support HTTPS [03:24:25] because we cant get a *.*.wikimedia.org certificate [03:24:31] i know [03:24:33] mobile [03:24:35] and at the same time we have wikimedia.us the domain [03:24:39] and it doesnt redirect anywhere [03:25:15] so my suggestion is to solve those both at once [03:25:15] by declaring wikimedia.us the domain for those US chapters [03:25:15] and use pa.wikimedia.us etc [03:25:19] yeah, except there's no PA activity [03:25:20] have nicer URLs, dont have a certificate problem AND have a use for that domain .. and it actually matches [03:25:32] pa is within the jurisdiction of NYC [03:25:33] it is wikimedia ...in the us [03:26:02] anyway, i can suggest it. i don't have a strong opinion [03:26:04] yea, it is not just that one.. it is about all of them in *.us.wikimedia [03:26:17] what are there? [03:26:22] nyc is nyc.wikimedia.org [03:26:27] not .us.wikimedia.org [03:26:43] pa is defunct [03:27:29] hmmm.. 
interesting ..maybe you are right and we can just close that ticket ..also nice:) [03:27:38] looking at DNS zone for all sub.sub domains [03:28:03] de.labs, flaggedrevs.labs, www.commons, noboard.chapters ... [03:28:19] but indeed no other US chapters as i expected [03:28:49] http://noboard.chapters.wikimedia.org/ ??? [03:28:54] what is that:) [03:29:33] mutante: norway? [03:29:37] rings a bell [03:31:17] yeah, Norwegian. nice skin though,hah [03:32:28] mutante: Looks like the "standard" private wiki skin from a while back. [03:32:40] compare with otrs-wiki [03:32:46] Yup. [03:33:24] ok, it's the only one that has *.chapters.wikimedia.org [03:33:29] let's clean that up some time [03:33:54] mutante: Ideally none of the wikis for chapters should be hosted by WMF, of course. [03:33:54] i hope they are fine moving to something else.. if its still used [03:34:16] James_F: that is different with every single domain name :p [03:34:22] mutante: I know. [03:34:33] mutante: But it's actively-problematic legally, for instance. [03:34:38] * James_F sighs. [03:34:45] all domain transfers go through legal ..shrug [03:35:17] Yeah. [03:35:27] sometimes the chapters own domains and have their nameservers [03:35:31] 'Cos they have to worry about trademarks, and have to worry about costs. [03:35:35] sometimes they own the domain but point to our NSes [03:35:42] sometimes we own it but redirect to them [03:35:47] Yeah. [03:35:48] and so on..you will find any combination [03:36:09] thats why we have a whole queue just for domains..sugh [03:36:10] I was the original owner of wiki{p,m}edia.{co,org}.uk. [03:36:29] That's interesting syntax. [03:36:35] sometimes it's owned by a complete stranger but redirects to a sensible place so no one has managed to get it transferred (not quite sure the entire story. thinking of wikimania.org) [03:36:45] I was going to transfer them to WMF, but WMF wasn't keen at the time, and WMUK (which I founded) then wanted them, so in the end WMUK got them. 
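The "interesting syntax" in that domain list is plain bash brace expansion: each comma group alternates, and multiple groups expand as a cross-product, so one pattern names all four domains. A quick demo:

```shell
# Bash brace expansion: {p,m} and {co,org} expand as a cross-product,
# so one pattern covers all four wiki[pm]edia UK domains.
echo wiki{p,m}edia.{co,org}.uk.
# -> wikipedia.co.uk. wikipedia.org.uk. wikimedia.co.uk. wikimedia.org.uk.

# Numeric ranges expand too:
echo {0..5}
# -> 0 1 2 3 4 5
```

Note this is a bash (and zsh/ksh) feature, not POSIX sh, and it happens before filename expansion, so it works on names that don't exist on disk.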
[03:37:00] Susan: try echo wiki{p,m}edia.{co,org}.uk. in bash [03:37:24] Susan: RISC OS PRM house style. Sticks so much, 20 years later. [03:37:56] Interesting. [03:38:05] I thought it was just mangled regex, heh. [03:38:08] no [03:38:18] Susan: echo {0..5} [03:38:28] I hate bash. [03:38:41] Susan: Bash is lovely, you. [03:41:07] James_F: well, if you wanna try again.. mpaulson :) [03:41:24] mutante: I no longer have the domains. WMUK does. :-) [03:41:32] mutante: As of 2005. [03:41:50] mutante: Sorry, should have been more clear. [03:44:26] alright, as long the chapter and legal is happy:) ehmm.. but wikipedia.co.uk is broken .. [03:44:50] * James_F sighs. [03:44:50] while i see remnants of it in our DNS .. why is this not surprising anymore..hrmm [03:44:56] Will ping them an e-mail. [03:45:41] Oh, wait. [03:45:47] wikipedia.co.uk and .org.uk went to WMF. [03:45:51] Or, rather, back then, to Bomis. [03:45:54] Who still have it. [03:46:02] Which is a bit of a problem. :-) [03:46:21] * James_F sighs. [03:46:24] Someone else can deal. [03:46:33] this is fun! [03:46:43] legal :) [03:48:13] oh, and it still says aude is president on the page mutante linked! [03:48:16] way out of date [03:48:21] (as of marchish last year) [03:49:14] NYC is also wrong [03:51:14] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46558 [03:51:44] * jeremyb runs away [03:51:48] Faster. [03:52:16] yep, me too. i'll be back from European timezone... 
cu ..out [03:59:21] New patchset: Spage; "Add a logbot class for #wikimedia-e3 channel" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46672 [04:34:14] PROBLEM - SSH on kaulen is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:35:14] PROBLEM - HTTP on kaulen is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:37:00] PROBLEM - HTTP on kaulen is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:37:09] PROBLEM - SSH on kaulen is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:47:24] PROBLEM - NTP on kaulen is CRITICAL: NTP CRITICAL: No response from NTP server [04:58:00] PROBLEM - NTP on kaulen is CRITICAL: NTP CRITICAL: No response from NTP server [04:59:19] Bugzilla is down. [05:01:24] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours [05:34:27] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 181 seconds [05:34:38] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 184 seconds [05:34:39] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 184 seconds [05:35:30] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 203 seconds [05:36:15] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [05:36:39] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [05:36:39] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [05:37:18] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [05:39:06] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [05:39:06] PROBLEM - Puppet freshness on msfe1002 is CRITICAL: Puppet has not run in the last 10 hours [05:39:06] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [05:39:06] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the 
last 10 hours [05:39:07] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [05:39:07] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [05:41:03] PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours [05:43:49] PROBLEM - Host kaulen is DOWN: PING CRITICAL - Packet loss = 100% [05:47:57] PROBLEM - Host kaulen is DOWN: CRITICAL - Host Unreachable (208.80.152.149) [05:52:51] I'm looking at kaulen [05:53:48] So am I, but you're a lot better at looking at things [05:53:56] I'm mostly just going "oh no". [05:57:18] RECOVERY - SSH on kaulen is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [05:57:28] RECOVERY - Host kaulen is UP: PING OK - Packet loss = 0%, RTA = 26.93 ms [05:57:39] RECOVERY - HTTP on kaulen is OK: HTTP OK: HTTP/1.1 302 Found - 489 bytes in 0.055 second response time [05:57:40] you don't need to be smart to press the reset button [05:57:49] you do need the password though, which I assume you don't have ;) [05:58:03] RECOVERY - HTTP on kaulen is OK: HTTP OK - HTTP/1.1 302 Found - 0.005 second response time [05:58:03] RECOVERY - SSH on kaulen is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [05:58:12] RECOVERY - Host kaulen is UP: PING OK - Packet loss = 0%, RTA = 0.54 ms [05:58:28] yup. 
[06:05:35] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [06:09:25] PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours [06:25:24] PROBLEM - Host mw1041 is DOWN: PING CRITICAL - Packet loss = 100% [06:25:55] PROBLEM - Host mw1041 is DOWN: PING CRITICAL - Packet loss = 100% [06:28:24] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 06:28:22 UTC 2013 [06:28:34] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [06:28:44] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 06:28:36 UTC 2013 [06:29:35] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [06:33:54] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 06:33:44 UTC 2013 [06:34:34] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [07:36:46] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Puppet has not run in the last 10 hours [07:43:05] PROBLEM - Puppet freshness on labstore3 is CRITICAL: Puppet has not run in the last 10 hours [07:45:20] New patchset: Ryan Lane; "Changes for newer glusterfs packages" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46680 [07:46:03] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46680 [07:51:02] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Puppet has not run in the last 10 hours [07:52:59] PROBLEM - Puppet freshness on labstore1 is CRITICAL: Puppet has not run in the last 10 hours [08:07:51] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 08:07:42 UTC 2013 [08:08:31] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [08:09:49] New patchset: Ryan Lane; "Split gluster into gluster-client/gluster-server" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46681 [08:10:28] Change merged: Ryan Lane; 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/46681 [08:11:44] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Jan 30 08:11:13 UTC 2013 [08:12:38] RECOVERY - Puppet freshness on labstore3 is OK: puppet ran at Wed Jan 30 08:12:22 UTC 2013 [08:19:14] RECOVERY - Puppet freshness on labstore1 is OK: puppet ran at Wed Jan 30 08:18:51 UTC 2013 [08:25:19] New patchset: Ryan Lane; "Pin labsconsole memcache to ubuntu version" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46684 [08:29:08] RECOVERY - Puppet freshness on labstore2 is OK: puppet ran at Wed Jan 30 08:28:51 UTC 2013 [09:05:16] New patchset: Silke Meyer; "Add a variable to enable/disable experimental Wikidata features in labsconsole" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46536 [09:16:15] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [09:24:45] PROBLEM - Memcached on virt0 is CRITICAL: Connection refused [09:26:17] PROBLEM - Memcached on virt0 is CRITICAL: Connection refused [09:26:39] mutante, paravoid, Ryan_Lane can you check https://bugzilla.wikimedia.org/show_bug.cgi?id=44499 [09:26:49] labsconsole is down [09:30:45] RECOVERY - Memcached on virt0 is OK: TCP OK - 0.027 second response time on port 11000 [09:31:32] RECOVERY - Memcached on virt0 is OK: TCP OK - 0.032 second response time on port 11000 [10:09:23] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Puppet has not run in the last 10 hours [10:38:30] paravoid or apergos could you review https://gerrit.wikimedia.org/r/#/c/46548/ please?:) [10:42:32] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46548 [10:47:41] done, puppet ran ok [10:47:47] back in a while (lunch) [10:54:11] thanks! 
[11:01:04] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [11:19:29] New patchset: Hashar; "(bug 44424) wikiversions.cdb for labs" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46726 [11:19:46] New review: Hashar; "Added back with https://gerrit.wikimedia.org/r/#/c/46726/" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46240 [12:07:38] RECOVERY - Puppet freshness on labstore2 is OK: puppet ran at Wed Jan 30 12:07:36 UTC 2013 [12:09:13] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [12:14:58] New patchset: Mark Bergsma; "Remove double slashes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44076 [12:14:58] New patchset: Mark Bergsma; "Set CORS header in vcl_deliver" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44078 [12:15:58] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44076 [12:19:03] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [12:22:19] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [12:22:56] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44078 [12:23:30] ololo, someone's attempting an SQL injection attack on us:) [12:23:42] Exception from line 549 of /usr/local/apache/common-local/php-1.21wmf8/includes/content/ContentHandler.php: Format text/x-wiki%' aNd 4795=4796-1 aNd 'VNwC'!=' is not supported for content model wikitext [12:24:06] lots of similar crap in exception logs [12:24:17] shall we ban this fucker? [12:25:24] will it stop them? ;) [12:25:58] if we ban on a Squid level, it will. 
for a time, if they're persistent:) [12:26:28] Format text/x-wiki'/**//**/> is not supported [12:26:42] well they can just change to a different ip [12:46:02] New patchset: Mark Bergsma; "Use $::mw_primary for bits app servers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46738 [12:47:56] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46738 [12:50:53] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [12:51:54] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.000 second response time on port 8123 [12:52:48] !log maxsem synchronized php-1.21wmf8/extensions/GeoData [12:52:49] Logged the message, Master [12:54:56] !log maxsem synchronized php-1.21wmf7/extensions/GeoData [12:54:56] Logged the message, Master [13:11:14] PROBLEM - Lucene on search1015 is CRITICAL: Connection timed out [13:13:14] RECOVERY - Lucene on search1015 is OK: TCP OK - 3.003 second response time on port 8123 [13:57:25] New patchset: Mark Bergsma; "Add backends hash to role::cache::configuration" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46745 [13:57:25] New patchset: Mark Bergsma; "Replace some backend definitions by $::role::cache::configuration hash counterparts" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46746 [13:59:27] New patchset: Hashar; "(bug 44506) raise throttle for an Israel editthon" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46747 [13:59:50] New patchset: Mark Bergsma; "Replace some backend definitions by $::role::cache::configuration hash counterparts" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46746 [13:59:50] New patchset: Mark Bergsma; "Add backends hash to role::cache::configuration" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46745 [14:00:34] Change merged: Mark Bergsma; [operations/puppet] (production) - 
https://gerrit.wikimedia.org/r/46745 [14:00:55] !! [14:00:56] ! [14:01:26] there's much more to come [14:01:38] that is the poor man Hiera [14:01:47] that is going to make stuff a bit cleaner indeed :-] [14:02:00] i have to do this slowly, step by step [14:03:56] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46746 [14:06:04] New patchset: Mark Bergsma; "include role::cache::configuration" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46749 [14:06:34] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46749 [14:07:13] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [14:08:12] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 3.004 second response time on port 8123 [14:10:43] New patchset: Mark Bergsma; "Define test backends again" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46750 [14:11:12] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [14:11:31] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46750 [14:12:03] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.000 second response time on port 8123 [14:18:08] New patchset: Mark Bergsma; "Remove unnecessary $multiple_backends intermediary hash" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46751 [14:20:37] New patchset: Mark Bergsma; "Fix test_wikipedia in beta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46752 [14:28:40] New patchset: Mark Bergsma; "Remove unnecessary $multiple_backends intermediary hash" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46751 [14:28:40] New patchset: Mark Bergsma; "Fix test_wikipedia in beta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46752 [14:31:32] New patchset: Mark Bergsma; "Remove unnecessary 
$multiple_backends intermediary hash" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46751 [14:32:54] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46751 [14:33:25] New patchset: Mark Bergsma; "Fix test_wikipedia in beta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46752 [14:34:04] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46752 [14:36:32] New patchset: Mark Bergsma; "Remove unused backend "test"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46755 [14:37:30] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46755 [14:40:58] so hashar [14:41:10] we now have the new variable $::mw_primary which is set to the active mediawiki cluster [14:41:31] what is its content ? [14:41:33] for labs, which doesn't even have a second site yet, shall we just always set that to equal $::site ? [14:41:36] now it's 'eqiad' [14:41:44] but it can flip back to pmtpa when we switch back [14:41:57] I have no idea where the labs machine are [14:41:59] I guess in eqiad [14:42:04] no they're in pmtpa [14:42:06] oh [14:42:07] ;-D [14:42:42] so yeah just always set it to $::site [14:42:44] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 181 seconds [14:42:49] since beta does not have a cluster in eqiad [14:43:01] and it is unlikely we will have set it up one day with multiple datacenters [14:43:14] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 182 seconds [14:44:08] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 201 seconds [14:44:09] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 201 seconds [14:44:59] when we do, it's easy to change [14:45:09] and it allows me to simplify stuff now [14:45:55] New patchset: Mark Bergsma; "Simplify $test_hostname definition" 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/46756 [14:45:56] New patchset: Mark Bergsma; "Set $::mw_primary to 'eqiad' in realm production, to $::site elsewhere" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46757 [14:47:30] i keep forgetting selectors don't support multiple values [14:48:55] New patchset: Mark Bergsma; "Simplify $test_hostname definition" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46756 [14:48:56] New patchset: Mark Bergsma; "Set $::mw_primary to 'eqiad' in realm production, to $::site elsewhere" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46757 [14:49:54] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46756 [14:50:35] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46757 [14:59:18] New patchset: Mark Bergsma; "Change to case statements" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46758 [15:00:43] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [15:01:13] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [15:01:41] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [15:01:50] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [15:02:03] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours [15:04:27] New patchset: Mark Bergsma; "Change to case statements" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46758 [15:06:06] New patchset: Mark Bergsma; "Change to case statements" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46758 [15:08:29] New patchset: Mark Bergsma; "Change to case statements" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46758 [15:09:11] Change merged: Mark Bergsma; [operations/puppet] (production) - 
https://gerrit.wikimedia.org/r/46758 [15:15:45] New patchset: Mark Bergsma; "$::mw_primary can now be used in realm labs as well" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46761 [15:15:46] New patchset: Mark Bergsma; "Remove backends/director backends duplication" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46762 [15:17:46] New patchset: Mark Bergsma; "Remove backends/director backends duplication" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46762 [15:19:40] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46762 [15:19:41] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46761 [15:22:34] ah oh [15:23:00] my bits labs ends up with a duplicate definition of the ipv4_10_4_0_166 backend [15:23:01] :D [15:23:54] New patchset: Mark Bergsma; "Cleanup" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46763 [15:23:54] New patchset: Mark Bergsma; "Move the logging into its own sub class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46764 [15:24:36] right [15:24:46] because the test server is the same as a normal bits app server [15:25:06] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46763 [15:25:33] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46764 [15:31:26] New patchset: Mark Bergsma; "Use stdlib values() to calculate varnish backends from directors" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46765 [15:32:24] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46765 [15:34:22] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46747 [15:34:41] New patchset: Mark Bergsma; "Remove duplicate backends" [operations/puppet] (production) - 
https://gerrit.wikimedia.org/r/46766 [15:35:30] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46766 [15:37:30] !log hashar synchronized wmf-config/throttle.php '(bug 44506) raise throttle for an Israel editthon {{gerrit|46747}}' [15:37:31] Logged the message, Master [15:37:32] New review: Hashar; "I have deployed the change." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46011 [15:38:24] !log hashar synchronized wmf-config/InitialiseSettings.php '(bug 44395) Allow bureaucrats to remove the translateadmin group on wikidatawiki {{gerrit|46011}}' [15:38:25] Logged the message, Master [15:38:45] New review: Hashar; "deployed" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46747 [15:38:49] New review: Alex Monk; "recheck" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/46547 [15:40:13] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [15:40:13] PROBLEM - Puppet freshness on msfe1002 is CRITICAL: Puppet has not run in the last 10 hours [15:40:13] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [15:40:13] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [15:40:13] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [15:40:14] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [15:40:32] New patchset: Mark Bergsma; "Remove duplicate cluster_options definitions using stdlib merge()" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46767 [15:41:21] yeah that is starting up again :-) [15:41:29] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46767 [15:42:01] role::cache::bits looks a lot better now :) [15:42:08] there's still room for improvement, but i'm gonna look at the other 
clusters first [15:42:19] PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours [15:43:05] I will have my patch rebased later tonight [15:43:23] that's mobile, haven't touched that yet [15:43:24] but [15:43:34] $cluster_options['test_server'], is that used anywhere? [15:43:37] it seems not [15:48:13] New patchset: Mark Bergsma; "Remove $cluster_options[test_server], always set [test_hostname] as it shouldn't hurt" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46768 [15:49:01] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46768 [15:56:49] mark: I am leaving now. If you ever want to test out on labs, the bits instance is deployment-cache-bits03.pmtpa.wmflabs [15:58:10] ok [15:58:12] *wave* [15:58:23] bye! [16:03:57] New patchset: Mark Bergsma; "Calculate the backends parameter from values($directors) if not explicitly given" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46770 [16:04:38] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46770 [16:04:51] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [16:07:42] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 3.003 second response time on port 8123 [16:08:01] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 16:07:55 UTC 2013 [16:08:42] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [16:08:52] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 16:08:42 UTC 2013 [16:09:22] New patchset: Mark Bergsma; "Remove superfluous backends parameters" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46771 [16:09:41] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [16:09:51] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 16:09:49 UTC 2013 [16:10:11] 
PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours [16:10:25] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46771 [16:10:41] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [16:10:42] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 16:10:36 UTC 2013 [16:11:41] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [16:13:08] New patchset: Mark Bergsma; "Define the API backend for mobile the proper way" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46773 [16:13:52] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46773 [16:21:02] New patchset: Mark Bergsma; "Apparently mobile frontend contacts test directly, redefine" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46774 [16:21:26] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46774 [16:27:24] New patchset: Mark Bergsma; "Simplify mobile cache backend configuration" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46775 [16:28:05] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46775 [16:30:05] New patchset: Mark Bergsma; "$::role::cache::configuration::active_nodes is not per-realm yet" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46776 [16:30:40] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46776 [16:34:58] !log mw1072 going to down to check h/w state currently failed rt4381 [16:34:59] Logged the message, Master [16:36:50] New review: Nemo bis; "Reedy tested it: https://bugzilla.wikimedia.org/show_bug.cgi?id=15434#c59" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/33713 [16:39:58] Reedy: I think now ops should not be too worried about merging it 
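The stdlib-based refactors named in the patchsets above ("Use stdlib values() to calculate varnish backends from directors", "Remove duplicate cluster_options definitions using stdlib merge()") follow a pattern that can be sketched roughly like this; the variable names and option values are illustrative, not the actual role::cache manifests, and both functions come from puppetlabs-stdlib:

```puppet
# Rough sketch only; names and values are illustrative, not the real
# role::cache manifests. values() and merge() are puppetlabs-stdlib functions.

$directors = {
    'backend' => 'appservers.svc.eqiad.wmnet',
    'api'     => 'api.svc.eqiad.wmnet',
}

# Derive the backends list from the directors hash instead of keeping a
# second, hand-maintained copy of the same hosts in sync:
$backends = values($directors)

# Merge shared defaults with per-cluster overrides instead of repeating
# whole option hashes for every cluster:
$default_options = { 'retry503' => 1, 'test_hostname' => 'test.example.org' }
$cluster_options = merge($default_options, { 'retry503' => 4 })
```

The win in both cases is the same: one authoritative data structure, with the derived or overridden values computed rather than duplicated.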
[16:40:11] As long as it's run on a pmtpa host.. [16:41:41] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 188 seconds [16:41:46] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 190 seconds [16:42:12] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 202 seconds [16:42:32] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 203 seconds [16:44:26] New patchset: Mark Bergsma; "Use lvs::configuration::service_ips for role::cache::upload backends" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46779 [16:45:05] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46779 [16:46:12] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [16:46:41] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [16:47:01] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [16:47:37] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [16:50:19] New patchset: Mark Bergsma; "backends parameter is now unneeded for upload-backend" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46780 [16:51:08] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46780 [16:51:50] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [16:54:57] New patchset: Reedy; "Add a sqldump script wrapper around mysqldump" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43844 [16:55:55] !log Created sites/site_identifier tables on hewiki and itwiki [16:55:57] Logged the message, Master [16:58:26] New patchset: Reedy; "Wikibase Client config for hewiki and itwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46782 [17:01:46] New patchset: Reedy; "Wikibase Client config for hewiki and itwiki" 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46782 [17:02:53] New patchset: Mark Bergsma; "Remove unused probes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46783 [17:03:44] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46783 [17:10:48] New patchset: Reedy; "Add wikidata poll for changes for itwiki and hewiki" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46784 [17:11:43] New patchset: Reedy; "Add wikidata poll for changes for itwiki and hewiki" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46784 [17:13:19] New patchset: Mark Bergsma; "Use the varnish probe for 2nd tier bits servers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46785 [17:13:47] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46785 [17:16:58] hi [17:17:04] I'm looking for RobH [17:17:08] anyone know where he is ? [17:17:33] robh probably wont be in until 1p EST [17:18:42] !log Created wb_changes_dispatch on wikidatawiki [17:18:42] Logged the message, Master [17:24:12] PROBLEM - Puppet freshness on dataset1001 is CRITICAL: Puppet has not run in the last 10 hours [17:33:20] PROBLEM - Varnish HTTP bits on strontium is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:34:00] PROBLEM - SSH on strontium is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:34:30] PROBLEM - SSH on arsenic is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:34:40] PROBLEM - Varnish HTTP bits on arsenic is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:34:50] RECOVERY - SSH on strontium is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [17:35:11] RECOVERY - Varnish HTTP bits on strontium is OK: HTTP OK: HTTP/1.1 200 OK - 637 bytes in 0.001 second response time [17:35:44] PROBLEM - Varnish HTTP bits on arsenic is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:37:09] New 
patchset: Mark Bergsma; "500ms is a bit tight at times" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46787 [17:37:23] PROBLEM - SSH on arsenic is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:39:01] New patchset: Mark Bergsma; "esams doesn't use the bis probe anymore, 500ms is a bit tight at times" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46787 [17:39:36] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46787 [17:40:50] cmjohnson1: Ok, lets work in here ;] [17:41:18] k [17:41:38] so i am on the asw-c [17:42:42] sorry, got pinged for rt triage, back [17:43:03] Some basic juniper info gathering commands [17:43:18] show interface descriptions [17:43:30] you can tab complete most juniper commands [17:43:44] yeah i know that [17:44:50] ok, so run that command and you can see all the ports already labeled [17:45:02] yep [17:45:04] you'll notice it doesn't show ports that are blank/brandnew/neversetup [17:45:11] can also show vlans [17:46:05] so we are putting these in c6 correct? [17:46:25] correct [17:46:36] so the cool part is these are all in a single range of ports [17:46:42] so adding to the proper vlan is a single command [17:46:54] the not so cool part, im not sure we can use shell logic to label the ports [17:47:02] so you may have to just label each one individually [17:47:06] (not that big a deal) [17:47:11] nah [17:47:11] so lets go over labels first [17:47:41] first command: configure [17:47:49] second command: edit interfaces [17:48:04] networking gear is all about scope =P [17:48:20] so once you are in edit interfaces, you can edit them directly, in this case [17:49:09] ahh, these are also disabled, interesting (easy to fix) [17:49:16] so edit ge-6/0/0 [17:49:19] then: show [17:49:27] and you'll see it has no name, but is disabled. [17:49:36] We'll name them all first, then put into proper vlan, then enable. 
[17:49:44] New patchset: Ottomata; "Adding kafka module for review." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46618 [17:49:55] so once you do the show, you see the 'disabled' part [17:50:10] yep [17:50:20] you can: set description "name" or name without " if its a single line of characters [17:51:01] we are at mw1161 [17:51:07] for the first of this batch [17:51:17] correct [17:51:38] RECOVERY - SSH on arsenic is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [17:51:39] hrmm, port shows enabled now when i run show [17:51:49] did you enable it, or did it do that automatically when you set a name? [17:51:53] yes [17:51:59] no i enabled it [17:52:11] and set description to mw1161 [17:52:16] nothing is plugged into it yet right? [17:52:20] no [17:52:29] ok, if there was would be best to not enable it until it was in proper vlan [17:52:32] but since its not, you are fine [17:52:43] since there isnt (anything plugged in) i mean [17:53:00] So you have the port naming down, drop back to edit interfaces [17:53:07] (can type in up) [17:53:20] RECOVERY - SSH on arsenic is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [17:53:25] since you have that part done, we'll do vlan for them all now, then you can work on the labels for the rest [17:53:30] RECOVERY - Varnish HTTP bits on arsenic is OK: HTTP OK: HTTP/1.1 200 OK - 635 bytes in 0.001 second response time [17:53:35] RECOVERY - Varnish HTTP bits on arsenic is OK: HTTP OK HTTP/1.1 200 OK - 635 bytes in 0.053 seconds [17:53:36] then i can review it before you do first commit [17:53:41] then you'll be set to do this when needed [17:54:08] so we assume we can load up all power plugs with servers [17:54:17] so thats 40 servers. 
(since we lose 2 pair for switches) [17:54:49] so ge-6/0/0 through ge-6/0/39 [17:55:42] k [17:55:55] hrmm [17:56:02] i have not added via range, trying to tab out the command [17:56:15] oh thats right, not setup, wont tab, heh [17:56:28] so i think [17:56:28] set interface-range vlan-private1-c-eqiad member-range ge-6/0/0 to ge-6/0/39 [17:56:35] do that, then we can show | compare [17:56:37] to see your changes [17:57:51] lemme know when you do it so i can check [17:58:19] there is no way to add descriptions as a group....:-\ [17:58:39] well, there prolly is with a for loop. [17:59:25] but meh. [18:00:32] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [18:00:49] cmjohnson1: so when you finish them all, you can show | compare [18:00:57] and see changes (lemme see em too before you do first commit) [18:01:02] so ping me when done [18:01:03] k [18:01:05] k [18:01:06] (or if you have questions) [18:08:31] PROBLEM - Puppet freshness on dataset1001 is CRITICAL: Puppet has not run in the last 10 hours [18:10:48] robh: okay...check it [18:11:47] checking [18:12:14] cmjohnson1: I don't see a vlan change [18:12:50] oops..ok [18:12:52] now [18:12:59] cmjohnson1: So you should go ahead and do the vlan change and enable the ports [18:13:03] then commit it all together [18:13:37] you may be able to enable in a range [18:13:39] i would try. [18:13:47] going to [18:14:10] i was mistaken about what order to do it [18:14:15] since its all committed at the same time it doesnt matter. 
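Pulled together, the sequence walked through above amounts to something like the following Junos configuration session. This is a simplified sketch, not a verbatim capture: the prompt is invented, output is elided, and only the port range and interface-range name come from the discussion itself.

```
user@asw-c> configure
user@asw-c# edit interfaces
[edit interfaces]
user@asw-c# set ge-6/0/0 description mw1161
user@asw-c# delete ge-6/0/0 disable
user@asw-c# set interface-range vlan-private1-c-eqiad member-range ge-6/0/0 to ge-6/0/39
user@asw-c# show | compare
user@asw-c# commit
```

Nothing takes effect until `commit`, which is why the order of the name/vlan/enable steps doesn't matter: `show | compare` shows the whole pending candidate configuration, and the commit applies it atomically.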
[18:14:27] (the name, enable, vlan, blah) sorry dude [18:18:42] cmjohnson1: if not sure how to do as a range [18:18:49] we could always beg LeslieCarr for a hint ;] [18:18:57] not yet [18:19:14] hah [18:28:23] preilly: https://gerrit.wikimedia.org/r/#/c/46678/ from last night [18:28:36] i think your last two commits need to be merged into that [18:29:56] in sartoris.py all that remains unimplemented is the revert() method [18:30:05] aiming to have that finished today. Although testing can move forward on the other pieces [18:30:06] https://gerrit.wikimedia.org/r/#/c/42791/ [18:32:44] ok lesliecarr: how do i enable multiple interfaces at one time [18:33:18] (robh) ^ [18:34:13] New review: Asher; "Not going to work as is:" [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/42791 [18:36:21] cmjohnson1: good question, trying to figure it out now [18:36:25] i hope its possible! =P [18:38:32] New patchset: Reedy; "Fixup usages of constants in wgNamespacesWithSubpages" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46792 [18:39:21] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46792 [18:40:32] who knows about how the blog is set up? e.g. i'm wondering if you use hyperdb. looks like hyperdb isn't mentioned in the latest production puppet (just did a pull) [18:40:42] i guess maybe binasher [18:41:35] cmjohnson1: So how did you set the ones enabled that are there now? [18:41:52] one at a time [18:41:57] with what command? [18:42:04] i wanna make sure im doing same and see if i can loop it. [18:42:19] (though i duno if it will let me in juniper cli) [18:42:40] set enable ge-6/0/0 [18:43:09] hrmm [18:43:41] cmjohnson1: in what scope? 
[18:43:48] cuz that doesnt work for me in interfaces scope [18:47:26] robh: sorry this is the command in edit set interfaces ge-6/0/2 enable [18:47:45] rfaulkner: you need to rebase https://gerrit.wikimedia.org/r/#/c/46678/ [18:47:46] ahh, ok [18:48:00] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46684 [18:48:41] !log downgrading memcached to ubuntu version on virt0 [18:48:42] Logged the message, Master [18:53:10] cmjohnson1: Soooo there is no range command like on vlans [18:53:20] leslie is in office now so i ambushed her with questions the second she walked in. [18:53:25] gotta for loop it [18:53:34] awesome [18:53:41] New patchset: Reedy; "Remove setting of NS_SPECAIAL to have no subpages" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46798 [18:54:19] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [18:54:30] cmjohnson1: arghhhhh, ok now i get it [18:54:36] so LeslieCarr does a for loop output into bash [18:54:41] to generate a huge ass list of commands [18:54:47] which she then cuts and pastes into the juniper cli. [18:54:51] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46798 [18:55:03] this makes more sense, as i have spent the past 5 minutes trying to make it run in juniper cli. 
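A loop of the kind described, run in a regular shell with its output pasted into the Junos CLI, might look like this for the c6 batch. The port range is from the example above; the mw1161-onward numbering of the descriptions is an assumption for illustration.

```shell
#!/bin/bash
# Sketch: generate one set of Junos commands per port, for pasting into
# the CLI. Assumes ports ge-6/0/0..39 map to servers mw1161..mw1200.
for i in {0..39}; do
    echo "set interfaces ge-6/0/$i description mw$((1161 + i))"
    echo "delete interfaces ge-6/0/$i disable"
done
```

Brace expansion (`{0..39}`) is bash-native and avoids the subshell that backtick-`seq` forks, which is the same tip given later in the channel for the `seq 0 47` variant.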
[18:55:10] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.000 second response time on port 8123 [18:55:56] robh: ok [18:56:02] !log reedy synchronized wmf-config/InitialiseSettings.php [18:56:04] Logged the message, Master [18:58:54] hey cmjohnson1 [18:59:02] hey lesliecarr [18:59:10] for i in `seq 0 47` ; do echo "delete ge-1/0/$i disable" ; done [18:59:14] New patchset: Reedy; "Remove setting of NS_MAIN to false in $wgNamespacesWithSubpages" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46799 [18:59:18] would be the loop i'd use to generate the 5 million delete commands [18:59:22] rob just told me about today [18:59:49] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46799 [19:00:31] preilly: https://gerrit.wikimedia.org/r/46800 [19:01:17] LeslieCarr: you don't need to fork a sub shell to run the seq command. bash native would be: for i in {0..47} [19:01:20] rfaulkner: Change has been successfully merged into the git repository. [19:01:32] awesome thanks. [19:01:51] rfaulkner: did you fix https://gerrit.wikimedia.org/r/#/c/46678/ ? [19:02:23] rfaulkner: you need to rebase it [19:02:26] hmm. that shouldn't have happened [19:02:41] rfaulkner: Project policy requires all submissions to be a fast-forward. [19:02:41] Please rebase the change locally and upload again for review. [19:02:55] will do [19:06:06] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Everything else to 1.21wmf8 [19:06:07] Logged the message, Master [19:06:23] rebased, looks like a pep8 violation creeped in. 
submitted for review: https://gerrit.wikimedia.org/r/46801 [19:06:36] preilly ^ [19:07:19] ottomata: can you deploy the latest from https://gerrit.wikimedia.org/r/gitweb?p=analytics/E3Analysis.git;a=summary to stat1001 [19:07:26] rfaulkner: what's the deal with: https://gerrit.wikimedia.org/r/#/c/46800/1/sartoris/sartoris.py [19:08:21] rfaulkner: I mean https://gerrit.wikimedia.org/r/#/c/46678/2/sartoris/sartoris.py [19:09:31] ah right. so log_deploys() requires that the last "n" deployments be logged [19:09:50] this option is a way of specifying "n" via commandline [19:10:05] rfaulkner: you still need to rebase: https://gerrit.wikimedia.org/r/#/c/46678/2 as well [19:10:11] New patchset: Reedy; "Everything else over to 1.21wmf8" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46802 [19:10:11] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [19:10:19] rfaulkner: and submit a new change set [19:10:28] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46802 [19:10:28] rfaulkner: with a git commit --amend [19:11:11] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.000 second response time on port 8123 [19:12:11] New patchset: Aude; "Enable WikibaseClient on hewiki and itwiki, update settings" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46803 [19:12:47] !log reedy synchronized php-1.21wmf8/extensions/Wikibase/ [19:12:48] Logged the message, Master [19:14:23] rfaulkner: nevermind I fixed https://gerrit.wikimedia.org/r/#/c/46678/ for you ;-) [19:14:55] rfaulkner: zero open change-sets https://gerrit.wikimedia.org/r/#/q/status:open+project:sartoris,n,z [19:14:56] preilly: thanks. sorry get pulled in several directions here. 
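The fix-up flow preilly describes (amend the commit so the pep8 fix stays in the same Gerrit change-set, then rebase so the upload is a fast-forward) is, in plain git terms, roughly the following. This is a sketch played out in a throwaway repo; the branch names, files, and commit messages are invented, and in real Gerrit usage it is the Change-Id footer in the commit message that ties the amended commit back to the same change-set.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name dev; git config user.email dev@example.org
trunk=$(git symbolic-ref --short HEAD)      # 'master' or 'main', per git version

echo base > app.py
git add app.py; git commit -q -m "base"     # upstream history

git checkout -q -b my-change                # the original patchset
echo "x=1" >> app.py
git add app.py; git commit -q -m "my change"

git checkout -q "$trunk"                    # trunk moves on while in review
echo other > other.py
git add other.py; git commit -q -m "upstream change"

git checkout -q my-change
echo "x = 1" > fixed.py                     # the pep8 fix...
git add fixed.py
git commit -q --amend --no-edit             # ...folded into the same commit
git rebase -q "$trunk"                      # now a fast-forward on top of trunk
git log --format=%s                         # newest first
```

After the rebase the branch is one amended commit on top of the current trunk, which is exactly what a "fast-forward only" Gerrit project will accept as a new patchset.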
[19:15:02] New patchset: Reedy; "Remove unneeded wgNamespacesWithSubpages entries from CommonSettings" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46804 [19:15:10] wfh right now, about to head [19:15:11] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46804 [19:15:11] in [19:15:35] preilly: will come to chat with you in the early afternoon [19:17:28] rfaulkner: okay sounds good [19:20:05] New patchset: Reedy; "Wikibase Client config for hewiki and itwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46782 [19:20:07] Change abandoned: Reedy; "(no reason)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46782 [19:22:02] !log reedy synchronized wmf-config/ [19:22:03] Logged the message, Master [19:22:37] New patchset: Reedy; "Remove stray sitenotice comment" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46806 [19:23:31] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46806 [19:24:04] New patchset: Aude; "Enable dispatch changes script for wikidata, disable pollforchanges" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46807 [19:24:30] Change abandoned: Reedy; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46784 [19:27:11] New review: Reedy; "You want to make sure the old crons are removed. 
Need something like" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/46807 [19:34:50] New patchset: Reedy; "Enable WikibaseClient on hewiki and itwiki, update settings" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46803 [19:35:53] PROBLEM - MySQL Slave Delay on db32 is CRITICAL: CRIT replication delay 186 seconds [19:36:26] New patchset: Aude; "Enable dispatch changes script for wikidata, disable pollforchanges" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46807 [19:36:27] silly pmtpa slave [19:37:41] RECOVERY - MySQL Slave Delay on db32 is OK: OK replication delay 0 seconds [19:38:31] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46803 [19:39:25] binasher: what ext is calling PopulateFundraisingStatistics::updateDays ? [19:39:28] New patchset: Aude; "Enable dispatch changes script for wikidata, disable pollforchanges" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46807 [19:39:34] seems to be on a cron [19:41:44] adding crap to dberrors.log [19:42:23] New patchset: Silke Meyer; "Definition of a function that gets MW extensions with less code" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46809 [19:42:43] !log added updated udplog_1.8-5 .debs to apt.wikimedia.org. No real changes, these now include the packet-loss program in the package. [19:42:44] Logged the message, Master [19:42:59] New patchset: Asher; "unmodified from mha4mysql-node, by Yoshinori Matsunobu (here to be modified to meet a security requirement)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46810 [19:43:09] AaronSchulz: ContributionReporting [19:43:47] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46810 [19:45:09] New review: Silke Meyer; "This is not yet working. mediawiki.pp is okay but something is missing in wikidata.pp to "see" the n..." 
[operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/46809 [19:45:12] !log reedy synchronized wmf-config/ 'Enabling wikidata client on itwiki and hewiki' [19:45:13] Logged the message, Master [19:46:05] New patchset: Reedy; "(bug 44411) Added new import sources to dewikivoyage" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46187 [19:48:23] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46187 [19:51:06] New patchset: Asher; "build dsn from a root mysql defaults file instead of from cli options." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46815 [19:51:22] Is anyone about to review and deploy https://gerrit.wikimedia.org/r/#/c/46807 for the wikidata deployment? [19:51:32] Please [19:52:23] New patchset: Asher; "gdash graphs for the api" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46816 [19:52:29] AaronSchulz: i'll take a look [19:53:41] AaronSchulz: about the job duplicate insert stat, were you looking into that stats call never being passed a positive int? [19:54:23] oh yeah, PopulateFundraisingStatistics::updateDays is a cron [19:54:35] wtf [19:54:42] please :) [19:54:52] we're ready for the new dispatcher script [19:57:08] AaronSchulz: extensions/ContributionReporting -- the cron has been in place for over a year [19:57:50] * AaronSchulz filed a bug [19:58:16] doesn't look like the extension has been modified lately either [20:01:11] i wonder if the schema was changed on the public_reporting_days table on the fr db [20:01:17] Jeff_Green: ^^ [20:02:32] certainly possible. fr-tech was converting some tables to innodb. checking... [20:02:40] Jeff_Green: see hume:/tmp/PopulateFundraisingStatistics-updatedays.log - this job is failing and spamming the dberror log every 15min [20:02:46] k [20:04:32] PopulateFundraisingStatistics::updateDays makes an insert for many days with their fr totals, etc but day is the primary key, so fail. 
doesn't look like it would have ever worked without changes on the code (or using insert ignore) or with a different schema [20:05:37] Jeff_Green: binasher: we didn't touch it AFAIK. unfortunately adam last touched that a month or so ago and he is … not here [20:05:54] He'll be here soon. [20:06:39] k, it might be easiest to have him take a look when he gets in [20:06:49] maybe its been broken for a while but something else broken prevented the query errors from making it to our central dberror.log [20:07:13] entirely possible [20:07:39] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 20:07:33 UTC 2013 [20:07:40] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [20:07:49] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 20:07:46 UTC 2013 [20:08:14] Heja ops, is anybody willing to investigate further HTCP cache purging issues, or are you already tired? ;) [20:08:15] "htcp cache purges for images do not seem to clear europe upload squid caches" https://bugzilla.wikimedia.org/show_bug.cgi?id=44508 which has a great initial analysis, no ranting mob attached (yet), and a very helpful reporter, and I can reproduce the problem (here in Europe). [20:08:15] And if you're all tired of investigating this I should probably fire an RT ticket, or an email to ops@? [20:08:39] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [20:08:49] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 30 20:08:40 UTC 2013 [20:09:06] !log reedy synchronized wmf-config/InitialiseSettings.php [20:09:06] Logged the message, Master [20:09:17] andre__: bawolff is a treasure, remember ;) [20:09:35] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46815 [20:09:39] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [20:09:44] Nemo_bis: yeah. Still it's lovely how he tracked it down. 
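The updateDays failure mode analyzed above (a cron repeatedly re-inserting rows whose primary key is the day) can be illustrated with a minimal, hypothetical MySQL schema; the real public_reporting_days columns may well differ:

```sql
-- Hypothetical minimal schema for illustration only.
CREATE TABLE public_reporting_days (
  day   INT UNSIGNED  NOT NULL PRIMARY KEY,
  total DECIMAL(12,2) NOT NULL
);

INSERT INTO public_reporting_days (day, total) VALUES (20130130, 1000.00);

-- The cron re-runs and re-inserts the same day: duplicate-key error,
-- which is what spams dberror.log every 15 minutes.
INSERT INTO public_reporting_days (day, total) VALUES (20130130, 1200.00);

-- The workarounds mentioned above: swallow the duplicate...
INSERT IGNORE INTO public_reporting_days (day, total) VALUES (20130130, 1200.00);
-- ...or turn the repeat into an update (upsert):
INSERT INTO public_reporting_days (day, total) VALUES (20130130, 1200.00)
  ON DUPLICATE KEY UPDATE total = VALUES(total);
```

For a stats-refresh cron the upsert form is usually what's wanted, since later runs carry newer totals that should replace the earlier row rather than be discarded.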
[20:10:28] binasher: I'm confused--I don't even see a public_reporting_days table on the fundraising db's [20:10:33] huh... [20:10:59] New patchset: Aaron Schulz; "Only sleep if no jobs where found." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46819 [20:11:12] New patchset: Reedy; "Enable subpages in NS_PROJECT by default" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46820 [20:11:30] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46820 [20:11:45] binasher: https://gerrit.wikimedia.org/r/#/c/46819/ [20:13:56] andre__: please to be filing ticket and email to ops does not hurt [20:14:06] and after lunch [20:14:11] LeslieCarr, alright, thanks! [20:15:13] why is the apaches dsh node missing so many apaches =P (also the script references tampa directly for some reason still) [20:16:52] anyone want to help us with https://gerrit.wikimedia.org/r/#/c/46807/ ? [20:16:59] PROBLEM - MySQL Replication Heartbeat on db33 is CRITICAL: CRIT replication delay 183 seconds [20:17:26] PROBLEM - MySQL Slave Delay on db33 is CRITICAL: CRIT replication delay 195 seconds [20:17:36] PROBLEM - MySQL Replication Heartbeat on db33 is CRITICAL: CRIT replication delay 201 seconds [20:17:53] PROBLEM - MySQL Slave Delay on db33 is CRITICAL: CRIT replication delay 209 seconds [20:17:55] New patchset: Reedy; "Make NS_PROJECT have subpages for de/it wikivoyage" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46821 [20:18:14] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46821 [20:18:51] * RobH discovers they are missing cuz they are offline, but have no tickets or references to being offline [20:20:24] !log mw23 was stuck in installer, restarting and setting up. 
[20:20:25] Logged the message, RobH [20:21:32] !log reedy synchronized wmf-config/InitialiseSettings.php [20:21:33] Logged the message, Master [20:25:04] cmjohnson1 & sbernardin : I am pushing a racktables data output into a gdoc spreadsheet [20:25:13] so the three of us can start populating racktables with the missing info. [20:25:22] k [20:25:29] ie: something to do when not onsite, but needs to happen faster than I can do alone. [20:25:43] OK [20:25:44] I'll email you both with the details, but the three of us will need to chip away at this thing [20:26:35] cool [20:32:05] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46807 [20:32:26] RECOVERY - MySQL Slave Delay on db33 is OK: OK replication delay 0 seconds [20:32:29] RECOVERY - MySQL Replication Heartbeat on db33 is OK: OK replication delay 0 seconds [20:32:36] RECOVERY - MySQL Replication Heartbeat on db33 is OK: OK replication delay 0 seconds [20:32:56] RECOVERY - MySQL Slave Delay on db33 is OK: OK replication delay 0 seconds [20:32:56] PROBLEM - Host mw1072 is DOWN: PING CRITICAL - Packet loss = 100% [20:33:36] RECOVERY - Host mw1072 is UP: PING OK - Packet loss = 0%, RTA = 0.28 ms [20:34:27] RobH: osm-db1, osm-db2, and osm-cp1 - osm-cp4 are complete [20:34:50] sbernardin: awesome, i'll take a look at them shortly, thanks! [20:35:17] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 181 seconds [20:35:17] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 182 seconds [20:36:00] PROBLEM - Apache HTTP on mw1072 is CRITICAL: Connection refused [20:36:00] New patchset: Aaron Schulz; "Removed maxdelay hack and tweaked the values." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46824 [20:36:00] New patchset: Aaron Schulz; "Only sleep if no jobs where found." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/46819 [20:36:10] notpeter: https://gerrit.wikimedia.org/r/#/c/46819/ [20:37:25] AaronSchulz: hah, awesome [20:37:29] stachanovism [20:38:17] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 185 seconds [20:38:17] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 186 seconds [20:38:47] AaronSchulz: should I deploy and restart teh job runnarz? [20:39:03] sure [20:39:41] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 217 seconds [20:39:59] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 225 seconds [20:41:21] * AaronSchulz gets loads of fingerprint errors on sync [20:41:36] sbernardin: so https://rt.wikimedia.org/Ticket/Display.html?id=4330 is all done? [20:41:50] if so pls resolve ticket with details, thx =] [20:42:13] cmjohnson1: uhhh [20:42:23] Yes [20:42:24] when you guys just changed them around, did the ip addresses for mgmt still link up [20:42:33] sbernardin: thats for you too, that question [20:42:35] Will resolve it now [20:42:41] AaronSchulz: you're a fingerprint error on sink [20:42:41] ie: you guys swapped osm-cp and osm-db around [20:42:43] sync [20:42:44] man [20:42:48] does osm-cp1 still go to osm-cp1? [20:42:50] etc? [20:42:52] jetlag is a hell of a drug...
[20:43:07] * aude thinks osm = openstreetmap [20:43:16] confused but knows the difference here :) [20:43:19] No ...just relabeled in racktables [20:43:27] * AaronSchulz gets food and hopes that stuff works on his return [20:43:27] sbernardin: yea, thats not going to work [20:43:36] so we need to ensure the drac is right [20:43:39] OK [20:43:46] ie: if i ssh into osm-cp1 it goes into the right server [20:43:57] robh: the dracs are going to be wrong [20:44:15] i forgot about that [20:44:39] so the network cfg will need to be fixed on the servers [20:44:48] sbernardin ^ [20:45:06] robh is probably diligently getting the right ip addys now [20:45:06] So I need to update the drac ips [20:45:17] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 194 seconds [20:45:17] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 195 seconds [20:45:18] notpeter: thanks! [20:45:20] yes [20:45:30] im actually logging into drac on them and determining which ones are wrong [20:45:38] compareing service tag in drac to racktables [20:45:56] can you, Reedy or whoever keep an eye on the job runner logs, now that we have wikidata jobs on a few wikipedias [20:45:59] :-] [20:46:00] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46819 [20:47:07] aude: What are they called? 
[20:47:07] New patchset: Reedy; "Use overriding to muchly simplify wgNamespacesWithSubpages" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46826 [20:47:11] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 23 seconds [20:47:16] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [20:47:17] Nemo_bis: https://gerrit.wikimedia.org/r/46826 [20:47:17] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [20:47:22] aude: sure [20:47:42] notpeter: when you are done looking at mw1072 can you add back to dsh group plz (assuming all is good) [20:47:56] Reedy: no idea about the job runner, but dispatcher is dispatcher.log [20:48:05] in the wikidata directory (where we had pollofrchanges) [20:48:09] aude: No, the jobs. What class are they? [20:48:13] Reedy: wow, does it work also with those nasty arrays? :) [20:48:17] sbernardin: you are just going to have to reaudit them all, cuz they are all fucked up. [20:48:20] Reedy: the worst are search defaults [20:48:23] aude: pollfor or pollofr? [20:48:25] lemme get ips into the ticket i linked for you [20:48:35] OK [20:48:38] Nemo_bis: That's on my todo list next [20:48:41] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [20:48:57] Reedy: :D and how did you do it? [20:49:04] you mentioned a script? [20:49:15] http://p.defau.lt/?_qfVhVqaMppPh9HNfJjaqQ [20:49:20] Essentially that [20:49:22] Reedy: ChangeNotificationJob [20:49:24] with the arrays populated [20:49:33] heh, if I knew PHP :) [20:50:17] !log authdns-update [20:50:18] Logged the message, RobH [20:50:19] jeremyb: it's a bit late here.... [20:50:31] Achievements: made InitialiseSettings.php prettier. [x] [20:50:44] aude: i think it's nearly 10pm :) [20:50:50] jeremyb: yes [20:51:00] late to still be at the office [20:51:02] aude: remember my clock is on UTC :) [20:51:10] aude: ouch! [20:51:17] aude: No sign of any yet...
[20:51:23] none in the queues either [20:51:24] hmmm [20:51:26] takes 5 minutes [20:51:27] sbernardin: https://rt.wikimedia.org/Ticket/Display.html?id=4330 is updated with IP addresses [20:51:33] for dispatching changes to run [20:51:45] I'm AFK for a 10 minutes or so [20:51:46] has it been 5 minutes yet? [20:51:50] ok [20:51:52] Puppet is slow [20:51:56] yeah [20:52:03] puppet and then 5 min [20:52:52] cmjohnson1: yep [20:57:26] cmjohnson1: mw1072 is happy now [20:57:33] sometimes puppet just needs to run a couple of times [20:57:37] awesome [20:57:38] okay [20:57:47] thx for looking into it for me [20:57:49] there can be ordering issues, but they usually sort themselves out [20:57:55] yep! [20:59:07] !log reedy cleared profiling data [20:59:07] Logged the message, Master [21:01:46] New patchset: Reedy; "Use overriding to muchly simplify wgNamespacesToBeSearchedDefault" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46873 [21:01:51] Nemo_bis: ^ [21:02:57] Around 19KB removed from InitialiseSettings [21:03:15] AaronSchulz: https://noc.wikimedia.org/cgi-bin/report.py Do you know how to fix the /0? [21:03:26] I'm wondering if no data is getting there [21:03:31] * Reedy will look when he returns [21:03:56] RECOVERY - Apache HTTP on mw1072 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.059 second response time [21:05:48] Dereckson: perfect time to sneak a small config change in, so that Reedy has to do some nasty rebasing [21:07:05] Reedy: I'm wondering, now where do we find a rogue op to deploy the special pages update? 
:) [21:08:11] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 186 seconds [21:08:12] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 186 seconds [21:08:17] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 190 seconds [21:08:17] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 191 seconds [21:09:51] icinga-wm: you look a bit redundant [21:10:13] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46816 [21:10:16] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 184 seconds [21:10:26] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 186 seconds [21:11:39] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 240 seconds [21:11:56] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 255 seconds [21:18:16] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 219 seconds [21:18:25] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 221 seconds [21:18:50] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 244 seconds [21:18:50] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 245 seconds [21:19:43] notpeter: Reedy do we know if our cronjob is enabled now? e.g. propagated via puppet? [21:19:48] any issues? [21:20:00] Nemo_bis: now nagios is the redundant one :) [21:20:15] * aude doesn't see anything in recentchanges yet on itwiki [21:22:15] aude: looking [21:22:24] thanks [21:22:37] * aude checks hungarian wikipedia to see if the old cronjob has stopped [21:23:20] or it might actually take some time to catch up, perhaps [21:24:24] aude: the new crons are in place on hume [21:24:33] hmmm, ok [21:24:39] what would the output look like? 
[21:24:49] nothing coming up in a grep of the runJobs log [21:24:54] is there stuff in the wb_changes_dispatch table on wikidatawiki? [21:24:56] for just wikidata [21:25:11] hmmm [21:25:41] aude: should they display on histories too? [21:25:46] Nemo_bis: not yet [21:25:54] ok [21:26:32] on our todo but inserting stuff into revision history can have a lot of unintended side effects if we don't get it 100% right [21:26:44] soon i hope though [21:27:24] multicast is not broken on dobson [21:28:28] awwww toolserver does not have the job table :/ [21:28:38] probably private details in it [21:28:47] aude: that table is populated by 4 rows [21:29:00] notpeter: that's good [21:29:07] yeah, looked right [21:29:18] hewiki, huwiki, itwiki, and test2wiki [21:29:20] then the issue is are the jobs send to the clients [21:29:22] yep [21:29:28] are they just in the queue [21:29:29] ? [21:29:37] any errors in any logs? [21:29:49] do we just have to be patient? is the queue long? [21:30:08] * aude could check the recentchanges table on the toolserver [21:30:09] the queue shouldn't be long [21:30:29] ok [21:31:16] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [21:31:25] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [21:31:36] nothing for itwiki recentchanges yet [21:32:16] the jobrunner log hasn't seen any activity for wikidatawiki in about 1.5 hrs [21:32:21] soooo, I'd say something isn't right [21:32:30] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [21:32:39] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [21:33:48] RobH: drac settings have been updated for the 6 osm servers [21:33:51] hmmm [21:33:55] the jobs should be in the clients [21:34:08] sbernardin: awesome, thanks! 
[21:34:17] itwiki, huwiki, hewiki [21:34:22] ah, ok [21:34:48] it might be going through everything in the wb_changes table, so it's possible it just takes time to catch up [21:35:11] there will be no duplicates inserted in the clients, though [21:35:40] if the jobrunner has not crashed (e.g. other jobs are fine) and the error log is quiet, then i assume we just need to be patient [21:36:22] the script does not impact the links appearing in the sidebar, except pages won't automatically get purged until the script handles it [21:36:26] but manual purges work [21:36:53] multicast is not broken on hooft either [21:37:42] ah, i see wikidata changes in my watchlist on itwiki [21:37:55] looks like it is working :D [21:40:10] the runjobs.log isn't showing any job running in the last 1.5 hrs or so [21:40:20] binasher: am i right, there are no indexes on the recentchanges table? [21:40:26] notpeter: in the clients? [21:40:33] the wikipedias? [21:41:35] notpeter: jobs are running [21:41:43] binasher: notpeter ok [21:41:54] notpeter: no logfile on fluorine has been updated in the last 1.5 hrs [21:42:12] oh, ok [21:42:14] phew [21:42:16] hmmmm.... no logfile updates but jobs are running anyway [21:43:17] hmm log packets are hitting fluorine [21:43:23] !log restarting udp2log on fluorine [21:43:24] Logged the message, Master [21:44:08] csteipp: when's the ganglia bugfixes coming ? [21:44:49] LeslieCarr: Most are merged into master, but they hadn't put together a release last time I checked [21:44:56] ok [21:45:19] oh yeah, someone broke udplog [21:45:33] who added new udplog packages on brewster at 11:40am today? [21:45:45] is there nothing in SAL ? [21:46:40] 19:42 ottomata: added updated udplog_1.8-5 .debs to apt.wikimedia.org. No real changes, these now include the packet-loss program in the package. [21:46:57] ah [21:47:17] no real change, except working [21:47:41] is feature?
[21:49:00] !log restarting varnishhtcpd on cp3010 - it appears to be locked up , netstat -l shows a huge Recv-Q for port 4827 and appears to be the previously discovered issue [21:49:01] Logged the message, Mistress of the network gear. [21:50:32] anyone in europe right now? andre__ ? [21:51:19] * Nemo_bis is [21:51:31] Ditto [21:51:32] LeslieCarr: zeljkof would be if he's around [21:51:48] can you try some sort of image purge like https://bugzilla.wikimedia.org/show_bug.cgi?id=44508 ? [21:52:01] LeslieCarr, chrismcmahon: I am around [21:52:05] !log started udplog with correct mw config on fluorine [21:52:05] Logged the message, Master [21:52:22] h [21:52:22] aude: [21:52:23] 2013-01-30 21:52:07 mw1012 huwiki: ChangeNotification Speciális:ChangeNotificationJob repo= changeIds=Array t=42 error= [21:52:31] it looks like upload purging was only broken on one upload varnish cache - but wanna make sure [21:52:37] Reedy: looks good [21:52:39] no error? [21:52:52] LeslieCarr: what I need to do? [21:53:00] * aude sees stuff in the recentchanges and watchlist on both huwiki and itwiki (and presumably hewiki) [21:53:12] LeslieCarr: how about cp3009 ? it sent me some junk previously [21:53:24] do note that there is a discrepancy (sp?) between what i see in itwiki and huwiki (watching the same article) [21:53:33] reedy@fluorine:/a/mw-log$ tail -n 100000 runJobs.log | grep -c .ChangeNotification [21:53:33] 250 [21:53:36] some changes seem missing in itwiki [21:53:44] hmmmm, good [21:53:51] I can only see huwiki jobs though [21:54:00] huh? [21:54:01] hi? [21:54:05] i do see stuff in itwiki [21:54:25] cp3009 appeared to not be locked up [21:54:44] and it's not like itwiki is just showing me the first three chronologically changes [21:54:48] binasher, sorry was afk for a sec [21:54:55] it seems to have skipped some, but it's all the same article [21:54:56] is puppet installing latest? [21:54:59] udplog? 
[21:55:04] crap checking [21:55:13] X-Cache: cp1030 hit (2), cp3010 hit (2), cp3009 frontend miss (0) [21:55:15] we have a deploy scheduled for tomorrow, am preparing [21:55:22] could be an issue with the dispatching script but if there are no errors,, then might be okay at least for tonight [21:55:32] LeslieCarr: purge didn't seem to work, see above [21:55:35] # make sure the udplog package is installed [21:55:35] package { udplog: [21:55:35] ensure => latest; [21:55:35] } [21:55:35] } [21:55:37] AHHH FOILED [21:55:48] that's a bad way to do it, grrr [21:55:53] Nemo_bis: damn, new purge and it didn't get purged out by 3010 ? [21:56:09] LeslieCarr: indeed [21:56:16] aude: Yeah, no itwiki jobs. I see for hu and he though [21:56:36] lol, AaronSchulz [21:56:37] 2013-01-30 21:53:22 mw1009 hewiki: ChangeNotification מיוחד:ChangeNotificationJob repo= changeIds=Array STARTING [21:56:43] Nemo_bis: grrrrrr [21:56:58] LeslieCarr: now trying clicking purge a couple dozens times – sometimes it works [21:57:01] Reedy: hm? [21:57:06] hmmm [21:57:12] Ryan_Lane: I am attempting at cleaning up tickets. https://rt.wikimedia.org/Ticket/Display.html?id=1839 is for pushing the mediawiki tarballs via CT integeration [21:57:25] binasher says some job runner log is not updating [21:57:28] New patchset: Ottomata; "Ensuring udp2log package is present, not just latest." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46877 [21:57:29] btw, does anyone know why cp3003 and cp3004 aren't in the lvs configs ? [21:57:31] you were the last person to comment, was wondering who you thought should be followed up on to clear this ticket. [21:57:33] aude: i fixed it [21:57:34] but doesn't mean the jobs are not running, i guess? 
[21:57:37] binasher: ok [21:57:48] The hewiki job show the date backwards in runJobs.log [21:57:52] aude: re: recentchange, it does have several indexes [21:58:04] binasher: hmmm [21:58:15] !log mw23 offline, reinstalled with precise and signed into puppet, puppet throwing errors on initial run. [21:58:16] LeslieCarr: maybe it worked now X-Cache: cp1030 miss (0), cp3010 miss (0), cp3010 frontend miss (0) [21:58:17] Logged the message, RobH [21:58:21] whew [21:58:29] though [21:58:33] that's all cp3010 [21:58:42] the receive queue hasn't gone wonky yet [21:58:51] aude: http://p.defau.lt/?ut0nI3655DIn__MYwDC6PQ [21:59:00] hrm though [21:59:02] binasher: i see [21:59:05] actually it looks like it didn't reload [21:59:07] none on rc_type though [21:59:08] oh tim said that [21:59:26] LeslieCarr: but if I try a simpler wget I get X-Cache-Lookup: HIT from sq84.wikimedia.org:3128 [21:59:33] aude: PRIMARY KEY (`rc_id`), KEY `rc_timestamp` (`rc_timestamp`), KEY `rc_namespace_title` (`rc_namespace`,`rc_title`), KEY `rc_cur_id` (`rc_cur_id`), KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`), KEY `rc_ip` (`rc_ip`), KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`), KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`) [21:59:53] aude: right, none on rc_type [21:59:54] !log killing varnishhtcpd on cp3009 for realzies this time [21:59:56] Logged the message, Mistress of the network gear. [21:59:56] binasher: ok [22:00:13] also X-Cache: HIT from amssq62.esams.wikimedia.org [22:00:29] binasher: since we are filtering wikidata ('external' changes), might that make sense and how feasible would it be to add an index there? 
[22:00:30] cool, should be happy now :) [22:00:34] LeslieCarr: the result is different at every request, as reported in the bugs [22:00:36] so new installs on apache with puppet result in a shitton of package install errors [22:00:43] wtf happened to the config ;P [22:00:47] rc_type = 5 [22:00:50] well those are for squids and not varnish [22:00:56] i'm specifically concerned about upload [22:00:57] and also would be nice for log entries [22:00:59] since that seemed to be the problem [22:01:01] ? [22:01:51] LeslieCarr: I mean the result of a wget -S 'http://upload.wikimedia.org/wikipedia/commons/c/c2/Wappen_Landkreis_Aurich.svg' [22:02:10] of course, we should come up with a better way than external type though, since other stuff might use that someday [22:02:45] binasher: Are you familiar with https://noc.wikimedia.org/cgi-bin/report.py ? It's showing a /0 error [22:02:47] LeslieCarr: when there's sq84 or amssq62 in the headers I get junk http://p.defau.lt/?7fEpDhZKokakt6R7OhlZug [22:02:59] but okay for now and maybe makes sense not to have a bazillion filters on recentchanges page [22:03:14] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46877 [22:03:39] neek: it's interesting that you would get those in your headers [22:03:47] because i don't think that they are in upload-lb load balancing [22:03:50] lemme double check [22:04:00] that wasn't a thumb though [22:04:23] change tag perhaps......
[22:04:35] PROBLEM - Varnish traffic logger on cp1044 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [22:04:36] PROBLEM - Varnish traffic logger on cp1043 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [22:06:19] New patchset: Ottomata; "Fixing vodaphone india filter" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46879 [22:06:39] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46879 [22:06:42] PROBLEM - Varnish traffic logger on cp1044 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [22:07:54] PROBLEM - Varnish traffic logger on cp1043 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [22:08:01] so that's weird Nemo_bis - i just tried and I got what looks like an old version from that upload [22:08:03] so weird [22:08:26] different md5, from july [22:08:35] ok, binasher, notpeter, LeslieCarr, I just checked up on udp2log things. The only weirdness I could find was a bad filter definition on oxygen [22:08:35] hrm, lemme look at lvs configs [22:08:39] everything else looks ok... [22:08:43] what was broken? [22:08:45] RECOVERY - udp2log log age for oxygen on oxygen is OK: OK: all log files active [22:08:52] maybe it got upgraded and not restarted properly? [22:09:01] ottomata: i'm guessing it's possible that the updating just required a restart of udp2log ? i dunno, i am looking at varnish [22:09:04] yeah, probably so [22:09:06] i thought i was done with your bug!!! [22:09:14] grrr for unintended upgrades [22:09:22] man, ensure => latest on packages is a baaaaad idea [22:09:25] on dewiki, select count(distinct rc_type) from recentchanges == 3, over 1.2mil rows. hmm.. not sure if an index there would be very useful [22:09:33] RECOVERY - udp2log log age for oxygen on oxygen is OK: OK: all log files active [22:09:59] hrm, so amssq62 was an upload cache server, but running squid - i thought we were over that...
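[Editor's note: the udp2log breakage above comes from `ensure => latest` in the manifest quoted earlier, which lets apt silently upgrade the package under a running daemon whenever a new .deb lands on the repo. Ottomata's follow-up patch ("Ensuring udp2log package is present, not just latest", r46877) pins the resource; a sketch of the before/after, paraphrased from the snippet in the log rather than copied from the actual change:]

```puppet
# Before: apt upgrades udplog the moment a new .deb appears on
# apt.wikimedia.org, replacing files under the running daemon.
package { 'udplog':
    ensure => latest,
}

# After: install once; upgrades happen only via a deliberate deploy.
package { 'udplog':
    ensure => present,
}
```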
[22:10:02] ottomata: on fluorine, udp2log was running with no config [22:10:09] flourine!? [22:10:11] never even heard of it! [22:10:31] flourine is the newish log server [22:10:37] maybe 6-9 weeks old? [22:10:39] LeslieCarr: I also have hits with sq86, amssq49 [22:10:53] so those hits are correctly redirecting to tampa backends [22:10:57] if they can't find it [22:11:04] but why [22:11:18] ok lvs configs, show me what you got [22:11:31] crazy! its running udp2log? ahhh, ok its not a webrequest udp2log instance [22:11:49] !log aaron synchronized php-1.21wmf8/includes/filerepo/file/OldLocalFile.php 'deployed f884911f5562fc41980664c424e7037bde6ba110' [22:11:50] Logged the message, Master [22:11:58] binasher, is fluorine ok now? [22:12:04] yep [22:12:07] ok phew [22:12:09] phewwwwww [22:12:10] !log aaron synchronized php-1.21wmf8/includes/filerepo/file/LocalFile.php 'deployed f884911f5562fc41980664c424e7037bde6ba110' [22:12:11] Logged the message, Master [22:12:18] oh! [22:12:21] man bad week for me [22:12:25] the squids are still in the rotation [22:12:33] just with a low weight [22:13:13] binasher: would there be technical issues/ would mark murder me if i turned on the two new varnish upload hosts in esams and disabled the squids which seem to be the ones ignoring the cache purge requests ? [22:13:30] LeslieCarr: mark hasn't finished esams uploads to varnish migration due to the servers having h310 controllers [22:13:35] RECOVERY - Varnish traffic logger on cp1044 is OK: PROCS OK: 3 processes with command name varnishncsa [22:13:45] RECOVERY - Varnish traffic logger on cp1044 is OK: PROCS OK: 3 processes with command name varnishncsa [22:14:22] LeslieCarr: there might be issues.. 
i think it would be better to try to fix the purge handling issue for now, even though those squids will be gone very soon [22:16:00] the varnish hosts are only getting 1/3rd of the eu upload traffic and i don't think they can take the rest [22:16:07] binasher: those squids are only serving junk [22:16:20] Nemo_bis: all of them? [22:16:20] ah h310 controllers [22:16:23] or some of them? [22:16:39] i figured that with the extra 2 it should be not a problem, however h310's, blecch [22:16:40] thanks [22:16:44] binasher: all I encounter when some file should be updated [22:16:47] probably better serving some stale thumbnails than for uploads.wikimedia.org to be down in europe [22:17:04] probably [22:17:09] binasher: https://bugzilla.wikimedia.org/show_bug.cgi?id=42652 should be gone now [22:17:44] Memcached error for key "commonswiki:allpages:ns:6:US_Navy_090731-N-5700G-127_First_lady_Michelle_Obama_delivers_remarks_to_Sailors_and_their_families_at_Naval_Station_Norfolk_during_a_homecoming_celebration_for_the_Dwight_D._Eisenhower_Carrier_Strike_Group.jpg:US_Navy_090810-N-9950J-038_Aviation_Boatswain's_Mate_(Handling)_1st_Class_Michael_Quintos_launches_an_AV-8B_Harrier_jet_aircraft_during_th [22:17:46] e_fly_off_of_Marine_Attack_Squadron_(VMA)_211,_31st_Marine_Expeditionary_Unit_(31st_MEU).jpg" on server ":": A BAD KEY WAS PROVIDED/CHARACTERS OUT OF RANGE [22:17:54] binasher: why do we keep having these? why? :) [22:18:07] hah [22:18:08] looking on amssq62, it's at least receiving plenty of purge messages [22:18:21] AaronSchulz: grrr! [22:19:04] yeah, i'm on amssq47 - it's joined to the group, no big receive queue [22:19:19] which is what i saw on cp3010 (the recv-q) [22:19:26] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [22:19:40] binasher: so someone will fix PopulateFundraisingStatistics::updateDays soonish? 
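[Editor's note: the "BAD KEY WAS PROVIDED/CHARACTERS OUT OF RANGE" error above is the memcached client rejecting the key itself, most likely on length: the memcached protocol caps keys at 250 bytes and forbids whitespace and control characters, and a key built by concatenating two full-length Commons file titles blows well past the cap. A sketch with a shortened stand-in key (the real key is abbreviated; the validator below is illustrative, not MediaWiki's actual check):]

```python
# memcached's text protocol limits keys to 250 bytes and rejects
# whitespace/control bytes; keys built from long page titles can
# exceed the limit. Stand-in key for illustration only.
MEMCACHED_MAX_KEY = 250

key = ("commonswiki:allpages:ns:6:"
       "US_Navy_090731-N-5700G-127_" + "x" * 300 + ".jpg")

def is_valid_memcached_key(k: str) -> bool:
    data = k.encode("utf-8")
    if len(data) > MEMCACHED_MAX_KEY:
        return False
    # space (0x20), control chars, and DEL are all out of range
    return not any(b <= 32 or b == 127 for b in data)

print(is_valid_memcached_key(key))                  # over-long key -> False
print(is_valid_memcached_key("commonswiki:stats"))  # short key -> True
```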
[22:19:50] * AaronSchulz hates having noise in the logs [22:20:40] New patchset: Reedy; "Update wmgUseTemplateSandbox loader code" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46881 [22:20:58] pgehres: who was going to look at PopulateFundraisingStats again? [22:21:04] maybe i'll just disable the cron job [22:21:05] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46881 [22:21:19] binasher: adam wight, but I guess I can [22:21:47] binasher: any ideas on why squid is being bad ? [22:22:08] do you have a minute to walk through with me or are you busy ? [22:22:21] RECOVERY - Varnish traffic logger on cp1043 is OK: PROCS OK: 3 processes with command name varnishncsa [22:22:31] RECOVERY - Varnish traffic logger on cp1043 is OK: PROCS OK: 3 processes with command name varnishncsa [22:22:32] Morning TimStarling. Would you have any time to look at/fix the /0 error on https://noc.wikimedia.org/cgi-bin/report.py? Thanks [22:23:12] oh, mark responded to email about it [22:23:15] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [22:23:18] !log tstarling cleared profiling data [22:23:19] Logged the message, Master [22:24:01] LeslieCarr: is there a bugzilla ticket or are there a few example urls around? [22:24:11] Reedy: ok [22:24:11] wget -S --header 'host: upload.wikimedia.org' 'http://upload-lb.esams.wikimedia.org/wikipedia/commons/c/c2/Wappen_Landkreis_Aurich.svg' [22:24:18] LeslieCarr: and have you taken a look at if the tampa squids are getting purges yet? [22:24:43] basically if x-cache is cp3009/10 it servs an new image (see Age header), else serves an old image [22:24:57] good idea to doulbe check they are listening [22:25:44] LeslieCarr: that example.. 
[22:25:45] X-Cache-Lookup: HIT from sq84.wikimedia.org:3128 [22:25:46] Age: 777534 [22:25:48] yep, checked one of the "bad" squids and it is definitely subscribed 4827, seeing the traffic [22:25:58] checked sq84 [22:26:06] the varnishes that are ok use eqiad as their backend [22:26:11] New patchset: Reedy; "Combine CustomData loader code" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46883 [22:26:33] tcpdump -c 10 host 239.128.0.112 and port 4827 is a good way to tell [22:26:43] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46883 [22:26:58] well they're listening, i guess that says nothing about properly acting on requests [22:27:43] notpeter: can you deploy https://gerrit.wikimedia.org/r/#/c/46824/1 and I can monitor ganglia/gdash? [22:30:44] paravoid: are you planning to merge that small swift patch if you get the chance? [22:32:30] LeslieCarr: wait, for the example you gave above (Datei:Wappen_Landkreis_Aurich.svg), the long cached version in a pmtpa squid has X-Object-Meta-Sha1base36: fdy5w7grhgehvqu7gt0yv8c0epe8606 [22:32:42] and the newer version in an eqiad varnish does too [22:33:00] i md5'ed them and they are different file sthough [22:33:09] oh shall I solve the puzzle? [22:33:11] what did the ETag say? [22:33:15] squid.conf.php:htcp_clr_access allow tiertwo [22:33:15] for $500, alex ! [22:33:29] for $500 i'll bet eqiad is not included in acl tiertwo :/ [22:33:45] hah [22:34:11] squid.conf.php:printf("%-50s %s", "acl tiertwo src $subnet", "# $destCluster\n" ); [22:34:19] ah :) [22:34:31] PROBLEM - NTP on mw23 is CRITICAL: NTP CRITICAL: Offset unknown [22:34:41] so is this hand created on sockpuppet ? 
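[Editor's note: the purge puzzle resolves to a one-line ACL gap. Squid gates HTCP CLR (purge) messages with `htcp_clr_access allow tiertwo`, and the generated `acl tiertwo src` list covered the pmtpa private subnet but not eqiad's 10.64.0.0/16, so purges originating from eqiad app servers were silently dropped and the European squids kept serving stale images. The subnet mismatch is easy to check (the specific host addresses below are illustrative):]

```python
import ipaddress

# The squid ACL before the fix: only the pmtpa private subnet.
tiertwo = [ipaddress.ip_network("10.0.0.0/16")]

def acl_allows(src: str) -> bool:
    """Would a purge from this source IP pass the tiertwo ACL?"""
    ip = ipaddress.ip_address(src)
    return any(ip in net for net in tiertwo)

print(acl_allows("10.0.5.1"))    # pmtpa source: purge accepted -> True
print(acl_allows("10.64.0.12"))  # eqiad source: purge dropped -> False

# After adding 10.64.0.0/16 (the fix deployed below), eqiad purges pass:
tiertwo.append(ipaddress.ip_network("10.64.0.0/16"))
print(acl_allows("10.64.0.12"))  # -> True
```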
[22:34:42] acl tiertwo src 10.0.0.0/16
[22:34:43] New patchset: Reedy; "Remove unused $wgDebugLogGroups entries" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46884
[22:34:44] heh
[22:34:56] add 10.64.0.0/16 to that
[22:35:18] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/46884
[22:36:00] sad how _purging_ broke in at least 3 different ways after the eqiad migration ;(
[22:36:25] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44281
[22:36:30] yeah, i'm adding it
[22:37:12] mark: we're still counting them indeed
[22:37:28] * mark goes offline again
[22:38:28] !log reedy synchronized wmf-config/
[22:38:29] Logged the message, Master
[22:38:50] thanks mark :)
[22:38:51] bye
[22:39:35] mark: thanks
[22:40:25] !log deployed new text/upload squid confs with eqiad in proxySubnets
[22:40:26] Logged the message, Master
[22:40:33] thanks binasher
[22:40:36] thanks mark!
[22:42:13] New patchset: Tim Starling; "Profiling: guard against division by zero" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46886
[22:43:07] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46886
[22:43:40] PROBLEM - Backend Squid HTTP on sq60 is CRITICAL: Connection refused
[22:43:50] PROBLEM - Backend Squid HTTP on cp1010 is CRITICAL: Connection refused
[22:43:50] PROBLEM - Backend Squid HTTP on sq49 is CRITICAL: Connection refused
[22:44:00] PROBLEM - Backend Squid HTTP on sq75 is CRITICAL: Connection refused
[22:44:20] PROBLEM - Backend Squid HTTP on sq71 is CRITICAL: Connection refused
[22:44:21] PROBLEM - Backend Squid HTTP on sq42 is CRITICAL: Connection refused
[22:44:30] PROBLEM - Backend Squid HTTP on sq77 is CRITICAL: Connection refused
[22:44:42] PROBLEM - Backend Squid HTTP on sq60 is CRITICAL: Connection refused
[22:45:09] PROBLEM - Backend Squid HTTP on cp1010 is CRITICAL: Connection refused
[22:45:50] RECOVERY - Backend Squid HTTP on sq49 is OK: HTTP OK: Status line output matched 200 - 495 bytes in 0.053 second response time
[22:45:54] Reedy: there's a temporary fix in place, still working on the root cause
[22:46:02] Thanks
[22:46:20] RECOVERY - Backend Squid HTTP on sq42 is OK: HTTP OK: Status line output matched 200 - 495 bytes in 0.054 second response time
[22:46:21] PROBLEM - Backend Squid HTTP on sq77 is CRITICAL: Connection refused
[22:46:22] PROBLEM - Backend Squid HTTP on sq75 is CRITICAL: Connection refused
[22:46:22] PROBLEM - Backend Squid HTTP on sq71 is CRITICAL: Connection refused
[22:46:32] TimStarling: operations-software/udpprofile/web/extractprofile.py and puppet/files/cgi-bin/noc/extractprofile.py are now rather out of sync
[22:47:51] pity puppet doesn't support git as a file source
[22:48:15] we could use a submodule, I guess...
[22:49:48] PROBLEM - NTP on mw23 is CRITICAL: NTP CRITICAL: Offset unknown
[22:51:39] binasher: AaronSchulz: this should fix the populateFundraiserWhataver bug https://gerrit.wikimedia.org/r/#/c/46887/1
[22:52:50] pgehres: wtf to the left side of the diff, and thank you! for the right side
[22:52:50] TimStarling: supports rsync
[22:53:12] oh, that's an improvement
[22:53:42] binasher: yeah, my thoughts too and all i did was poke adam. if that is proper MW db stuff, then pls merge and push to hume
[22:53:47] pgehres: I don't get that diff
[22:53:58] there is no 'REPLACE' option in db.php
[22:54:07] lol
[22:54:10] sadness
[22:54:13] pgehres: You know there's a replace function?
[22:54:16] dbw->replace
[22:54:24] * pgehres did not write this
[22:54:35] * pgehres wonders if adam is stuck in drupal world
[22:55:19] awight: is that drupal syntax and not MW syntax?
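[Editor's note] The fix being suggested here, using `$dbw->replace` rather than a nonexistent 'REPLACE' option, relies on SQL REPLACE semantics: on a unique-key collision the old row is deleted and the new one inserted, so re-running the report for the same day is safe. A minimal illustration of that semantic using Python's stdlib sqlite3; the table and values are invented for the sketch, and this is not MediaWiki's or the fundraising script's actual code.

```python
import sqlite3

# Illustration (not MediaWiki code) of REPLACE semantics: on a unique-key
# conflict the old row is replaced rather than raising a duplicate-key error.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE totals (day TEXT PRIMARY KEY, amount INTEGER)")
db.execute("INSERT INTO totals VALUES ('2013-01-28', 100)")

# Re-running the report for the same day with updated numbers: REPLACE
# overwrites the stale row, which is why the script wants it (donation
# totals keep changing for 3-40 days after the fact).
db.execute("INSERT OR REPLACE INTO totals VALUES ('2013-01-28', 250)")
row = db.execute("SELECT amount FROM totals WHERE day = '2013-01-28'").fetchone()
print(row[0])  # 250
```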
[22:55:39] supposedly it should be "dbw->replace"
[22:55:54] wgDebugLogGroups
[22:55:55] ok, thanks
[22:55:57] fail
[22:56:00] function replace( $table, $uniqueIndexes, $rows, $fname = 'DatabaseMysql::replace' ) {
[22:56:01] return $this->nativeReplace( $table, $rows, $fname );
[22:56:01] }
[22:56:04] "supposedly"? ;)
[22:56:23] * pgehres has barely touched core
[22:56:27] passing any MySQL options to insert as the 4th param will work as well, but i like the "apized" thing
[22:57:37] does this thing calculate fr totals, etc for every day on every run?
[22:57:50] looks like it might ...
[22:57:56] INSERT REPLACE..
[22:58:04] not doing that would also fix it :)
[22:58:23] sure, but i imagine that was done because values change, a lot
[22:58:39] we have a 3-40 day delay on donations :-(
[22:58:47] ah, makes sense
[22:58:53] replace away!
[22:59:08] i would, however, think that we could limit it to the past 6mo
[22:59:14] or less
[23:01:48] RECOVERY - Backend Squid HTTP on cp1010 is OK: HTTP OK: HTTP/1.0 200 OK - 1249 bytes in 0.002 second response time
[23:02:19] RECOVERY - Backend Squid HTTP on sq71 is OK: HTTP OK: HTTP/1.0 200 OK - 1249 bytes in 0.057 second response time
[23:02:24] RECOVERY - Backend Squid HTTP on sq71 is OK: HTTP OK HTTP/1.0 200 OK - 1249 bytes in 0.062 seconds
[23:03:00] RECOVERY - Backend Squid HTTP on cp1010 is OK: HTTP OK HTTP/1.0 200 OK - 1258 bytes in 0.055 seconds
[23:03:08] RECOVERY - Backend Squid HTTP on sq75 is OK: HTTP OK: HTTP/1.0 200 OK - 1249 bytes in 0.107 second response time
[23:03:18] RECOVERY - Backend Squid HTTP on sq60 is OK: HTTP OK: HTTP/1.0 200 OK - 1249 bytes in 0.109 second response time
[23:03:28] RECOVERY - Backend Squid HTTP on sq77 is OK: HTTP OK: HTTP/1.0 200 OK - 1249 bytes in 0.107 second response time
[23:03:47] RECOVERY - Backend Squid HTTP on sq77 is OK: HTTP OK HTTP/1.0 200 OK - 1258 bytes in 0.004 seconds
[23:03:54] RECOVERY - Backend Squid HTTP on sq75 is OK: HTTP OK HTTP/1.0 200 OK - 1258 bytes in 0.011 seconds
[23:03:55] RECOVERY - Backend Squid HTTP on sq60 is OK: HTTP OK HTTP/1.0 200 OK - 1258 bytes in 0.004 seconds
[23:09:25] well it looks like icinga is doing well, it's time for phase 2 of kilnagios
[23:17:03] New patchset: Lcarr; "adding in ganglios check to icinga checkcommands" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46889
[23:17:23] LeslieCarr: will icinga notify us that nagios is dead?
[23:17:26] hehe
[23:17:34] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46889
[23:19:27] hi LeslieCarr
[23:19:38] was looking for you yesterday
[23:19:43] oh ?
[23:19:45] what's up ?
[23:19:48] i was out yesterday
[23:19:51] making up for the weekend a bit
[23:19:54] fine, thanks
[23:20:29] well, I'm adding some code to the nagios part of puppet
[23:20:46] but I'm a nagios expert, not a puppet expert
[23:21:00] wanted to know if you would like to give a hand there
[23:21:17] i can a little bit - however we're moving to icinga
[23:21:24] i'm actually planning on trying to move puppet over
[23:21:32] it is the same for me
[23:21:33] (well, if i hadn't realized that the ganglios module wasn't working)
[23:22:00] icinga is backwards compatible anyway
[23:23:17] yeah
[23:25:09] so do you have time to look into it?
[23:26:33] not really right now - if you send me a changeset i might be able to ? i'm trying to debug ganglios so we can do the switchover
[23:26:40] :(
[23:26:46] that's ok
[23:27:00] I'll try to find a mentor :)
[23:28:26] :)
[23:28:32] have you tried #wikimedia-labs ?
[23:31:07] I did, my instance isn't ready
[23:31:33] the labs are down, due to a gluster issue
[23:31:41] oh doh
[23:33:46] though speaking of needing help - oh python folks --- so the ganglios module is in /usr/share/pyshared on neon, yet for some reason python cannot "find" it ? "ImportError: No module named ganglios.ganglios"
[23:33:55] and of course, on spence, working great
[23:34:01] this is 2.7 versus 2.6
[23:40:14] LeslieCarr: python -c 'import sys;print repr(sys.path)'
[23:40:31] * jeremyb doesn't know offhand what pyshared is
[23:40:41] how did ganglios get there? from a package?
[23:41:04] LeslieCarr: can you replicate in labs?
[23:41:11] from a package
[23:41:20] comparing neon and spence now
[23:41:34] which package?
[23:41:36] weird, it's not in either
[23:41:37] ganglios
[23:42:06] run that line i gave you? :)
[23:42:18] yeah i did - neon ['', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/pymodules/python2.7']
[23:42:25] spence - ['', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2', '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old', '/usr/lib/python2.6/lib-dynload', '/usr/lib/python2.6/dist-packages', '/usr/lib/pymodules/python2.6', '/usr/local/lib/python2.6/dist-packages']
[23:43:18] oh wait
[23:43:27] it's in /usr/lib/pymodules/python2.6 on spence
[23:43:30] lemme check on neon
[23:44:26] oh weird, in neon it's in /usr/share/pyshared/ganglios
[23:44:29] how did that happen
[23:44:32] bad package
[23:44:34] no cookie
[23:45:32] thank you jeremyb :)
[23:45:45] idk what i did :)
[23:46:09] your amazing presence ... :P
[23:46:49] * jeremyb decides it's nearly 2am in matanya's land
[23:47:02] it is
[23:47:05] * jeremyb is sleepy here
[23:47:15] bye
[23:47:17] best time for server upgrades
[23:47:21] night jeremyb
[23:53:28] g'night
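[Editor's note] The sys.path diagnostic used above generalizes: Python can only import a package whose directory sits under a sys.path entry, which is why ganglios in /usr/share/pyshared (absent from neon's path) raised ImportError while /usr/lib/pymodules/python2.6 (on spence's path) worked. A small modern sketch of checking this; stdlib json stands in for ganglios, and nothing here is the actual ganglios package.

```python
import sys
import importlib.util

# A module is importable only if it sits under some entry of sys.path;
# importlib can report exactly which file a module would be loaded from.
def locate(name):
    """Return the file a module would load from, or None if not importable."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None

print(sys.path[:3])       # the first few search entries, in order
print(locate("json"))     # a path under one of the sys.path entries
print(locate("ganglios")) # None here, unless the package is installed
```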