[00:02:20] RECOVERY - Varnish traffic logger on cp1041 is OK: PROCS OK: 3 processes with command name varnishncsa [00:05:00] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [00:05:20] PROBLEM - Puppet freshness on db1012 is CRITICAL: No successful Puppet run in the last 10 hours [00:06:00] !log maxsem synchronized php-1.22wmf1/extensions/MobileFrontend/includes/MobileFrontend.hooks.php 'https://gerrit.wikimedia.org/r/#/c/58436/' [00:06:06] Logged the message, Master [00:07:27] !log maxsem synchronized php-1.21wmf12/extensions/MobileFrontend/includes/MobileFrontend.hooks.php 'https://gerrit.wikimedia.org/r/#/c/58436/' [00:07:34] Logged the message, Master [00:09:00] RECOVERY - Puppet freshness on db1012 is OK: puppet ran at Wed Apr 10 00:08:52 UTC 2013 [00:09:10] RECOVERY - Puppet freshness on xenon is OK: puppet ran at Wed Apr 10 00:09:08 UTC 2013 [00:09:20] PROBLEM - Puppet freshness on db1012 is CRITICAL: No successful Puppet run in the last 10 hours [00:10:00] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [00:10:20] PROBLEM - Varnish traffic logger on cp1041 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [00:10:50] RECOVERY - Puppet freshness on db1012 is OK: puppet ran at Wed Apr 10 00:10:43 UTC 2013 [00:11:00] RECOVERY - Puppet freshness on xenon is OK: puppet ran at Wed Apr 10 00:10:59 UTC 2013 [00:11:20] PROBLEM - Puppet freshness on db1012 is CRITICAL: No successful Puppet run in the last 10 hours [00:12:00] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [00:12:31] RECOVERY - Puppet freshness on db1012 is OK: puppet ran at Wed Apr 10 00:12:28 UTC 2013 [00:12:50] RECOVERY - Puppet freshness on xenon is OK: puppet ran at Wed Apr 10 00:12:43 UTC 2013 [00:13:00] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [00:13:20] PROBLEM - Puppet freshness on 
db1012 is CRITICAL: No successful Puppet run in the last 10 hours [00:13:37] WTF is going on with these checks? [00:14:10] RECOVERY - Puppet freshness on db1012 is OK: puppet ran at Wed Apr 10 00:14:05 UTC 2013 [00:14:20] PROBLEM - Puppet freshness on db1012 is CRITICAL: No successful Puppet run in the last 10 hours [00:14:20] RECOVERY - Puppet freshness on xenon is OK: puppet ran at Wed Apr 10 00:14:19 UTC 2013 [00:15:00] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [00:15:40] RECOVERY - Puppet freshness on db1012 is OK: puppet ran at Wed Apr 10 00:15:37 UTC 2013 [00:16:00] RECOVERY - Puppet freshness on xenon is OK: puppet ran at Wed Apr 10 00:15:51 UTC 2013 [00:16:00] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [00:16:20] PROBLEM - Puppet freshness on db1012 is CRITICAL: No successful Puppet run in the last 10 hours [00:17:10] RECOVERY - Puppet freshness on db1012 is OK: puppet ran at Wed Apr 10 00:17:04 UTC 2013 [00:17:20] PROBLEM - Puppet freshness on db1012 is CRITICAL: No successful Puppet run in the last 10 hours [00:17:30] RECOVERY - Puppet freshness on xenon is OK: puppet ran at Wed Apr 10 00:17:21 UTC 2013 [00:18:00] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [00:18:30] RECOVERY - Puppet freshness on db1012 is OK: puppet ran at Wed Apr 10 00:18:23 UTC 2013 [00:18:40] RECOVERY - Puppet freshness on xenon is OK: puppet ran at Wed Apr 10 00:18:35 UTC 2013 [00:19:00] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [00:21:05] !log removing xenon and db1012 from icinga configs, running puppetstoredconfigclean.rb on them, restarting icinga [00:21:12] Logged the message, Master [00:21:44] MaxSem: ^ some issue with decom'ed hosts that don't get removed from monitoring, known issue, but not solved yet, should stop now [00:22:19] PROBLEM - SSH on cp1043 is CRITICAL: 
Server answer: [00:23:19] RECOVERY - SSH on cp1043 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [00:32:42] New patchset: Dzahn; "turn planet into a puppet module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54493 [00:32:42] New patchset: Dzahn; "rename planet class, per docs init.pp must exist and contain a class matching the module name" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54494 [00:33:40] PROBLEM - Puppet freshness on virt1000 is CRITICAL: No successful Puppet run in the last 10 hours [00:53:12] New review: Dzahn; "recheck" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54493 [00:56:11] New patchset: Dzahn; "turn planet into a puppet module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54493 [00:57:41] PROBLEM - SSH on lvs6 is CRITICAL: Server answer: [00:58:41] RECOVERY - SSH on lvs6 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [00:58:46] New patchset: Dzahn; "turn planet into a puppet module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54493 [01:01:41] RECOVERY - Varnish traffic logger on cp1041 is OK: PROCS OK: 3 processes with command name varnishncsa [01:05:20] PROBLEM - Puppet freshness on db1012 is CRITICAL: No successful Puppet run in the last 10 hours [01:06:11] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [01:20:31] New patchset: Dzahn; "turn planet into a puppet module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54502 [01:23:29] New patchset: Dzahn; "turn planet into a puppet module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54502 [01:46:44] PROBLEM - Varnish traffic logger on cp1041 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [01:46:52] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54502 [01:47:43] New review: MZMcBride; "\o/" 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/54502 [01:57:34] New patchset: Dzahn; "remove duplicate definition of locales package" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58445 [01:58:19] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58445 [02:01:44] RECOVERY - Varnish traffic logger on cp1041 is OK: PROCS OK: 3 processes with command name varnishncsa [02:05:26] PROBLEM - Puppet freshness on db1012 is CRITICAL: No successful Puppet run in the last 10 hours [02:06:06] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [02:17:20] !log LocalisationUpdate completed (1.22wmf1) at Wed Apr 10 02:17:19 UTC 2013 [02:17:27] Logged the message, Master [02:26:46] PROBLEM - SSH on gadolinium is CRITICAL: Server answer: [02:27:38] New patchset: Dzahn; "resource references should now be capitalized" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58446 [02:27:46] RECOVERY - SSH on gadolinium is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [02:27:47] PROBLEM - Varnish traffic logger on cp1041 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [02:28:06] !log LocalisationUpdate completed (1.21wmf12) at Wed Apr 10 02:28:06 UTC 2013 [02:28:10] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58446 [02:28:13] Logged the message, Master [02:29:46] RECOVERY - Varnish traffic logger on cp1041 is OK: PROCS OK: 3 processes with command name varnishncsa [02:30:26] PROBLEM - LVS HTTP IPv4 on appservers.svc.pmtpa.wmnet is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1600 bytes in 2.276 second response time [02:30:29] PROBLEM - Apache HTTP on mw1103 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:36] PROBLEM - Apache HTTP on mw1171 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:36] PROBLEM - Apache HTTP on mw1113 is CRITICAL: 
CRITICAL - Socket timeout after 10 seconds [02:30:36] PROBLEM - MySQL Slave Delay on db1017 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [02:30:36] PROBLEM - LVS HTTP IPv4 on rendering.svc.pmtpa.wmnet is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1600 bytes in 2.254 second response time [02:30:38] PROBLEM - Apache HTTP on mw1061 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:38] PROBLEM - Apache HTTP on mw1066 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:38] PROBLEM - Apache HTTP on mw1042 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:38] PROBLEM - Apache HTTP on mw1049 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:38] PROBLEM - Apache HTTP on mw1057 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:38] PROBLEM - Apache HTTP on mw1044 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:38] PROBLEM - Apache HTTP on mw1167 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:39] PROBLEM - Apache HTTP on mw1215 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:39] PROBLEM - Apache HTTP on mw1175 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:40] PROBLEM - Apache HTTP on mw1173 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:46] PROBLEM - Apache HTTP on mw1039 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:46] PROBLEM - Apache HTTP on mw1078 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:46] PROBLEM - Apache HTTP on mw1027 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:46] PROBLEM - Apache HTTP on mw1041 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:46] PROBLEM - Apache HTTP on mw1111 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:46] PROBLEM - Apache HTTP on mw1084 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:46] PROBLEM - Apache HTTP on mw1069 is CRITICAL: CRITICAL - Socket timeout after 10 
seconds [02:30:47] PROBLEM - Apache HTTP on mw1063 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:47] PROBLEM - Apache HTTP on mw1060 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:48] PROBLEM - Apache HTTP on mw1021 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:48] PROBLEM - Apache HTTP on mw1050 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:49] PROBLEM - Apache HTTP on mw1026 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:49] PROBLEM - Apache HTTP on mw1092 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:50] PROBLEM - Apache HTTP on mw1178 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:50] PROBLEM - Apache HTTP on mw1083 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:51] PROBLEM - Apache HTTP on mw1080 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:51] PROBLEM - Apach [02:33:23] LeslieCarr: I'll update the VPs, sure, once I can reach en.wikipedia.org again... :-/ [02:33:34] well, it's back up [02:33:40] works for me [02:33:51] localization again [02:33:54] apache process also running on a random one (mw1080) [02:33:59] ugh, ok [02:34:33] yeah, works now again [02:35:16] Is there any chance to get something more useful than "(Cannot contact the database server: Unknown error (10.64.16.6))"? [02:36:21] errr, wtf, icinga-wm stopped mid msg? 
[02:36:53] (never seen that before) [02:37:13] andre__: it's related to localization updates and happened yesterday too, already has an Ops thread [02:37:26] ah, saw that one [02:37:29] eh, and should be bug 27320 [02:37:58] jeremyb_: yea, usually it gets kicked at that point :p [02:38:23] for flooding [02:38:54] right [03:00:29] New patchset: Dzahn; "make the planet logo a "per language" thing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58448 [03:01:07] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58448 [03:10:24] Coren: I just copied you on https://bugzilla.wikimedia.org/show_bug.cgi?id=47067 [03:10:38] I wasn't sure if uberbox was your work Bugzilla account. [03:11:08] Susan: It is; it predates my indentu^W staff-like status. :-) [03:11:26] Susan: OpenID is in the short-term plans, btw. [03:12:05] I don't touch Labs very much. [03:12:05] Coren: you can always change email addresses in bz if you desire [03:12:12] without breaking anything [03:12:24] Coren: I imagine some people will be too lazy to switch from the Toolserver. [03:12:29] I'm not sure I'm moving everything over. [03:12:34] There's a lot of random shit. [03:12:52] * Susan shrugs. [03:13:01] Susan: They'll eventually hit a brick wall; the final posts on the roadmap is tarballs of what's left and /sbin/poweroff [03:13:11] Some projects are sustained by scripts that people have forgotten about. [03:13:15] Wikisource, for example. [03:13:19] arr, so how do i debug more if puppetd -tv just doesn't do anything for minutes and nothing in syslog and it used to finish in seconds until my last change :p [03:13:39] Susan: No doubt. Silke is hard at working trying to ferret those out -- but it's going to be a pain to track the authors down. [03:13:40] Coren: Of the Toolserver? Not in this decade, I don't imagine. [03:13:57] Coren: I'm saying that the authors aren't going to want to move shit over. [03:14:01] Like me. 
[03:14:02] well, at some point they'll just go away, then [03:14:20] It seemed like replication would break one day. [03:14:28] And that would be the beginning of the end for the Toolserver. [03:14:39] mutante: maybe puppetd -td ? [03:14:57] iirc they're not the same [03:15:11] Susan: We're making a welcoming home for refugees, not just new maintainers. The problem is the tools that have been abandoned; we can't just grab them. [03:15:34] mutante: self-hosted puppet? [03:15:34] You can if they're under an open license. [03:15:52] Susan: Yes, but the actual licensing info is often not there at all. [03:15:59] if no one is willing to maintain something, it should go away [03:16:08] Coren: no, the production one, it is just on this node though [03:16:10] One benefit of the Toolserver being so unstable is that tools are starting to use the API more often. [03:16:23] Ryan_Lane: That hurts small projects pretty badly. [03:16:40] Ryan_Lane: I wish it were that simple; but there are projects out there that rely on unmaintained tools. :-( [03:16:40] Like I said, some tools that are very important to the Global South projects aren't maintained. [03:16:46] Even some deployed extensions. [03:16:52] Coren: then someone better step up and maintain them [03:16:57] The API doesn't suffer replay very often, which is nice. [03:16:58] Coren: do you plan on maintaining them? :) [03:16:59] It's much rarer, at least. [03:17:17] localization again [03:17:24] TimStarling: was it not? [03:17:37] Ryan_Lane: Agreed. AFAIK, that's Silke's primary task (enumerate old tools, see what can be saved (has the right licenses), find adopters) [03:17:47] Are localization updates now breaking the site?
[03:17:51] I considered just running LU manually, during off peak yesterday, to generate the errors I needed to isolate the problem [03:18:04] but I thought I might get yelled at for not having a deployment window [03:18:09] hahaha [03:18:16] so I just set up some logging and let it happen by itself [03:18:17] * Coren yells at TimStarling. Just because. [03:18:42] TimStarling: I got your new and improved(R) log2udp(tm) by the way. Where do you want me to stick it? :-) [03:19:27] ® [03:19:28] ™ [03:19:31] They're easy on a Mac. [03:19:33] Coren: excellent [03:20:06] On Linux too. ™ ® Compose key FTW. [03:20:13] I was just lazy. :-) [03:20:15] mutante: tried -td ? [03:20:28] can you push it to gerrit for review? [03:20:30] Susan: you added yourself to a closed bug? :D [03:20:36] TimStarling: New repo or somewhere specific? [03:20:40] I've been doing that lately. [03:20:43] I'm not really sure why. [03:20:45] Coren: log2udp? is this for sending to ircecho? [03:20:50] In case it gets reopened, I guess. [03:20:56] Sometimes I'm sad to have missed the bug. [03:20:59] I éńábĺéd the compose key on my desktop recently [03:21:19] jeremyb_: yea, it stops after "Using cached certificate.." [03:21:30] mutante: strace? :) [03:21:44] https://en.wikipedia.org/wiki/Compose_key # Interesting. [03:21:50] i've had some good results with it. but not tried to strace puppet yet [03:21:52] it let's you type some things on IRĉ that would take a long time with a character map [03:22:04] Alt codes seem like the kind of thing MediaWiki would do. :v [03:22:15] TimStarling: you know ctrl-shift-u ? [03:23:01] TimStarling: new repo or did you have somewhere in mind? [03:24:04] Coren: did you write it from scratch, rather than starting from the existing code? [03:24:13] TimStarling: Scratch. [03:24:23] maybe a new project under analytics then [03:24:25] there's a project called scratch... :P [03:24:46] should it be called something different, to avoid confusion? 
[03:25:00] jeremyb_: the puppetmaster is just that busy.. (stafford) [03:25:15] I can put it in analytics/udplog since I named it (imaginatively) log2udp2 :-) [03:25:27] what's the change? [03:25:35] mutante: is it a particularly heavy catalog generation? e.g. neon [03:25:42] tell ori-l about it, he will probably care [03:26:12] I can make an analytics/log2udp2 project [03:26:13] TimStarling: i've developed an unreasonable attachment to it. stockholm syndrome. [03:26:41] ori-l: A bit more flexible, fixed footprint, should be blazingly fast. Has a numbering and prefix option, and merges lines into packets when possible. [03:26:44] I think udplog should mostly be for things which use that C++ library [03:26:54] kk. I'll make a new repo [03:27:40] merges lines into packets is a good thing and will spare a lot of unneeded udp frames, but be careful with it [03:27:54] lots of scripts that assume 1 datagram = 1 line [03:28:10] can someone review this varnish configuration change? https://gerrit.wikimedia.org/r/#/c/58269/ [03:28:36] ori-l: Need it to be optional, then? When Tim gave me the requirements, it seems like it was core. [03:29:04] tim knows better than me, but i think it's just a matter of socializing the change, esp. to erik zachte [03:29:23] So much socialization lately. [03:29:26] who loyally shepherds a flock of perl scripts that munge udplog data [03:29:30] if it has a different program name then you can migrate to it at your leisure [03:30:32] jeremyb_: it's compiling catalogs for all kinds of servers successfully.. but it's also not getting finished on neon. yea [03:30:36] obviously 1 datagram = 1 line is not a great convention [03:30:51] that's why only log2udp uses it [03:30:59] mutante: i didn't know which node you were focusing on [03:31:03] log3udp [03:32:13] TimStarling: re: that change -- doesn't it just duplicate line 495 in the same file?
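[editor's note] The datagram convention debated above — log2udp2 merging lines into packets versus downstream scripts assuming one datagram = one line — can be sketched as follows. This is an illustrative sender only; the function name, host, and port are made up here and are not log2udp2's actual interface:

```python
import socket

def send_log_lines(lines, host="127.0.0.1", port=8420, merge=False):
    """Send log lines over UDP.

    merge=False follows the 1-datagram-=-1-line convention the chat says
    udplog consumers rely on; merge=True packs the lines into a single
    packet (fewer UDP frames, but it breaks those consumers).
    Returns the number of datagrams sent.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if merge:
            sock.sendto(("\n".join(lines) + "\n").encode(), (host, port))
            return 1
        for line in lines:
            sock.sendto((line + "\n").encode(), (host, port))
        return len(lines)
    finally:
        sock.close()
```

A consumer written against the one-line-per-datagram convention would misparse the merged form, which is why the chat suggests making merging optional or giving the new tool a different program name.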
[03:33:18] jeremyb_: zirconium, shouldn't have that much to do, it's planet, i'll just wait a bit, getting too late here [03:33:18] that's for upload [03:33:33] otherwise yes [03:33:44] it duplicates line 495 and line 703 [03:33:59] because I think bits should be logged as well as upload and mobile [03:34:04] mutante: ok, gute nacht. please poke me about rt 822 tomorrow? [03:34:07] getting late here too [03:35:00] jeremyb_: yep, don't have a reply yet, maybe the owner email has changed since 2011.. ttyl, good night [03:35:34] mutante: no, i mean about keeping status quo functionality and the confusing terminology "shortening service" [03:36:03] jeremyb_: if it can be done with a change to redirects.conf we should be good [03:36:10] mutante: it can be... [03:36:17] great..ok. [03:36:23] it's a simple regex subst [03:36:35] i wasn't sure either what the term includes and what it doesnt [03:36:40] TimStarling: well, aren't you duplicating the resource name, though? [03:36:44] "service" sounded like a bit more [03:36:53] right [03:37:01] if something requires => Varnish::logging['locke'] or whatever [03:38:09] the same problem would exist for mobile and upload, right? [03:38:47] TimStarling: let me trace this through [03:38:55] s/would/does, maybe [03:40:16] it looks like this change is fine to me [03:40:26] TimStarling, ori-l: https://gerrit.wikimedia.org/r/#/c/58449/ [03:40:58] oooo your code is purty [03:41:38] thanks Coren [03:41:58] i presume "quasi-circular" means pizza slice [03:42:15] * ori-l brbs [03:42:19] Heh. I explain what it means the next statement. But pizza slice sounds good. :-) [03:43:15] ori-l: it would only be a problem if that resource was defined on the same system [03:43:23] Coren: you should add reviewers... [03:43:26] ori-l: and in that case, you'd set instance_name [03:43:41] jeremyb_: I have no idea who I should add. :-) ori and Tim I suppose. [03:43:58] I just added a few. [03:44:11] Coren: that's a big problem with gerrit IMO. 
you can't request review from "whoever feels like reviewing right now" [03:44:14] And Domas, why not. [03:44:47] jeremyb_: unless you add a mailing list as a reviewer? [03:45:11] I hate my overcomplicated makefile. :-) [03:45:13] mutante: uhuh? [03:45:25] mutante: which lists are gerrit users? [03:45:40] jeremyb_: not saying to add ops :p just in general, if you had an appropriate list or alias, you could add it like a user [03:45:59] does gerrit allow adding non-users? [03:46:29] i don't think it would be able to tell the difference, it would just be Foo Bar with a certain email address..no? [03:47:10] 1 way to find out ... [03:47:59] New patchset: Ryan Lane; "Remove labstore3/4 bricks from management (DON'T MERGE)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58451 [03:49:36] jeremyb_: yep, out for real, cya [03:52:25] Oh, I just realized I used my own code format. I hope I didn't commit some WMF sin there. :-) [03:52:50] Uses cino={.5s:.5s=.5sl1g.5sh.5s(0u0U1 [03:55:10] Coren: errr, where are you talking about? [03:58:20] jeremyb_: Indenting and whitespace conventions in C-like languages is... religious. :-) There's the church of K&R, the cult of Allman, the GNU cabal, etc. And then there are the iconoclasts like me that do something a bit different. Mine is close to 1TBS. :-) [03:58:35] the cino= is the vim settings to cindent to make that. :-) [03:59:10] ahhh [03:59:27] well we don't have much compiled stuff. 
AFAIK [03:59:35] Ryan_Lane: well, it's still a mistake [03:59:55] the fact that by chance it doesn't actually engender conflict doesn't mean it isn't worth the extra second it would take to make the names distinct and informative [04:01:12] the fact that tim put 'locke' as the resource name probably betrays a kind of half-conscious belief that this value specifies the destination host name, or that it ought to be the destination host name, or something like that [04:01:23] it's confusing and funny looking [04:02:22] sooner or later there will be the occasion to ask 'which?' -- when some logger indicates a failure of that resource, or something [04:02:30] not a big deal, but i'm sticking to my guns [04:08:19] Coren: re code format, I don't think it matters much for something like this [04:09:03] when I hear "sticking to my guns" I always have the mental image of GWB :) [04:09:11] ori-l: then all of them need to be fixed [04:09:12] ori-l: Besides, 1TBF is Good. :-) [04:09:54] and it should be done as a refactoring [04:10:06] yeah, not suggesting it block this change [04:10:16] can't blame someone for staying consistent with what's there, and can't expect them to refactor everything for it :) [04:10:34] * Ryan_Lane nods [04:10:38] I agree with you otherwise [04:12:25] * ori-l retires to his texas ranch [04:14:50] Aaron|home: BGW? [04:14:55] err, GWB* ? [04:16:37] the ahh, former presi-dent... of Amurica [04:16:46] ahhh [04:16:49] 43 [04:17:09] not GHWB [04:17:22] GWB is also George Washington Bridge [04:30:53] New patchset: Tim Starling; "Add UDP log to bits" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58269 [04:31:03] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58269 [04:38:20] is stafford down? 
[04:38:42] Connection timed out during banner exchange [04:39:06] ah yes, ganglia says swapdeath [04:40:35] oh, I will need the root password for this, I don't think anyone ever gave it to me [04:41:02] faidon was going to send it by email to my PGP key, but then he didn't [04:41:24] how mean [04:41:30] sleeping? fine, I will page your asses in retaliation for not giving me the root password ;) [04:41:53] * Aaron|home can't recall Tim saying "ass" before [04:42:09] * Aaron|home does a whois [04:42:12] checks outs [04:42:33] I had to specially translate it into american so that they would understand [04:42:48] no point insulting americans in english english [04:45:44] what's the english english version? [04:46:07] asses -> arses [04:46:20] i think that's understandable. at least in Brooklyn [04:46:30] have you got it, jeremyb? [04:46:39] the passwd? no [04:46:48] maybe ssh will work, I am trying with -oConnectTimeout=10000 [04:46:56] hehe [04:46:59] TimStarling: one sec [04:47:40] console: Serial Device 2 is currently in use [04:47:40] note: never been to britain and i think it's 20ish years since i was in the commonwealth [04:47:42] * Ryan_Lane grumbles [04:47:55] yeah, that's me [04:48:02] waiting for the root password to arrive in my inbox [04:48:13] but don't bother, I got an ssh shell and I'm killing everything now [04:48:17] ok [04:48:36] was there a specific reason you didn't get root? [04:48:41] I don't believe so [04:48:55] logistics are hard? :) [04:49:23] I think I wasn't in the room at the relevant times [04:49:30] off at some platform meeting or something [04:50:06] only yourself to blame, working with platform instead of ops…. 
[04:50:09] :) [04:50:29] Ryan_Lane: don't you dare [04:50:43] or maybe it's because I actually want to maliciously destroy data and the lack of the root password is the only thing that is stopping me is the lack of the root password [04:50:58] and therefore CT wisely decided not to give it to me [04:50:58] that's definitely the reason [04:51:07] * jeremyb_ detects some redundancy in that statement [04:51:15] and of course he didn't tell you because it would get back to me and then I would DDoS everything [04:51:20] :D [04:51:39] TimStarling: err, not just `halt` ? [04:52:00] CDOS <--- Centralized DOS [04:52:22] where's the fun in that? [04:52:49] !log stafford went into swapdeath. I killed all ruby processes a couple of times, and then eventually stopped apache2 for a while, while it recovered [04:52:56] Logged the message, Master [04:53:02] i didn't realize fun was in the spec [04:53:37] * jeremyb_ wonders if stafford had ever swapdeath'd before [04:55:17] unlikely [04:55:32] ganglia says memory usage massively increased around the end of March [04:56:27] from 3GB to 8GB [04:56:50] so maybe that is related [04:58:34] Beware the ides of March? [05:00:13] right, the load has increased too [05:00:42] so maybe the memory usage per process hasn't increased [05:01:47] syslog:Apr 10 04:10:17 stafford kernel: [24359322.661058] ntpd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0 [05:01:58] obviously it's all ntpd's fault [05:02:50] not 150 ruby processes using 150MB each [05:04:24] the syslog doesn't go back far enough to tell if it happened before [05:04:26] hi, TimStarling or someone, if you have a sec, could you merge https://gerrit.wikimedia.org/r/#/c/58265/ pls [05:05:20] it has already been posted to the community http://meta.wikimedia.org/wiki/Meta:Babel#Zero_configuration_namespace_coming_to_meta_near_you [05:05:43] o_O What was the OOM killer smoking to kill ntpd? It it tuned? [05:06:04] Coren: core [05:06:04] yurik: go sleep! 
[05:06:12] * jeremyb_ too [05:06:15] jeremyb_, my flight to s africa in 10 hours [05:06:22] need to finish a whole bunch of crap before that [05:06:30] 20 hours on the plane [05:06:44] yurik: hah, you're following amit or kul? :) (wild guess) [05:06:46] i'll go insane... getting there already [05:06:50] both [05:07:08] oh, not amit - kul & dan [05:07:08] * Coren goes to sleep too, actually. [05:07:18] conference [05:07:23] New patchset: Ori.livneh; "Added "Zero" & "Zero talk" namespaces to metawiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58265 [05:07:29] learning USSD [05:07:32] the joy! [05:07:37] ahhhh [05:07:39] fun [05:08:14] any stopover? [05:08:22] jeremyb_, usually we try to work with newer tech, not the antiquated custom protocols... [05:08:37] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58265 [05:08:39] TimStarling, 1 in johannesburg [05:08:41] yurik: i still use ussd! [05:08:57] jeremyb_, to pay bills? or to browse the web? ;) [05:09:24] so like 18 hours to johannesburg and then 1 hour to wherever you're going? [05:09:30] ussd is what telcos supposedly use to show you your network/minute usage [05:09:40] yurik: neither. to check usage. yeah, that [05:09:47] TimStarling, yep - cape town, 2 hrs+ [05:10:00] you could visit stellenbosh! [05:10:10] what/who is that? [05:10:36] ahh, spelled it wrong [05:10:50] https://meta.wikimedia.org/wiki/Wikimania_2012/Bids/Stellenbosch [05:10:57] i'm going to sync yurik's config change, unless anyone objects [05:10:58] ori-l, always lurking ;) [05:10:59] https://gerrit.wikimedia.org/r/#/c/58265/4/wmf-config/InitialiseSettings.php [05:11:10] lol, i started typing that before ori said anything :) [05:11:12] meta folk seem OK with it [05:11:24] thanks ori-l ! 
[05:12:21] i'm going to wait 10 minutes to see if anyone says no [05:12:38] ori-l, i was thinking of introducing a new sec group, but couldn't find a proper way to do it [05:12:57] what would be the right way to do it? [05:12:59] i looked at http://www.mediawiki.org/wiki/Manual:$wgNamespaceProtection [05:15:06] i *think* if you do $wgNamespaceProtection[your_namespace] = array('zero-config-edit'); [05:15:49] and then $wgGroupPermissions['zero-folk'] = array( 'zero-config-edit' => true ) [05:15:59] the assignment will effectively create the group [05:16:03] but i can't remember exactly [05:18:05] right, that seems like the right way to do it, but for some reason initialiseSettings.php doesn't have it [05:18:15] commonSettings does [05:19:41] hmm, seems like wgAddGroups is what is being used for that [05:20:27] i think wgAddGroups is who can add / remove groups for users [05:20:42] lets see [05:20:52] New patchset: Tim Starling; "NRPE on bits caches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58455 [05:21:15] ori-l, i guess no one objects :) [05:21:23] Zero, here we come :) [05:21:32] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58455 [05:24:48] yurik: you may want groupOverrides or groupOverrides2 [05:25:10] too tired to decide if that's right or not [05:25:55] yep, that's what it looks like, but not sure how it should be set up: [05:26:34] bon voyage [05:26:46] when do you get back? [05:26:54] 'metawiki' => array( 'zeroadmin' => [05:27:47] !log olivneh synchronized wmf-config/InitialiseSettings.php 'Add 'Zero' & 'Zero talk' namespaces to metawiki (If6b3ce5e4)' [05:27:54] Logged the message, Master [05:28:00] jeremyb_, 23rd i think [05:28:12] but i should be online [05:28:14] hopefully [05:28:16] :) [05:28:42] (i sound like a true new yorker... they have internet in s africa, right?)
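[editor's note] A minimal LocalSettings-style sketch of the mechanism ori-l outlines above. The 'zero-config-edit' right and 'zero-folk' group are the names proposed in the chat, not anything deployed; the namespace constants and IDs here are hypothetical:

```php
// Hypothetical namespace IDs for illustration only.
define( 'NS_ZERO', 480 );
define( 'NS_ZERO_TALK', 481 );

$wgExtraNamespaces[NS_ZERO] = 'Zero';
$wgExtraNamespaces[NS_ZERO_TALK] = 'Zero_talk';

// Editing pages in the Zero namespace requires the 'zero-config-edit'
// right...
$wgNamespaceProtection[NS_ZERO] = array( 'zero-config-edit' );

// ...and granting that right to a group is enough to make the group
// exist -- the assignment "effectively creates the group", as the chat
// puts it.
$wgGroupPermissions['zero-folk']['zero-config-edit'] = true;
```

In wmf-config this would be expressed through the InitialiseSettings.php override arrays (groupOverrides, as yurik finds) rather than direct assignments, but the underlying settings are the same.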
:D [05:29:02] (and something tells me they have much better internet than what i got at my home) [05:29:25] New patchset: Tim Starling; "Varnish bits logging: set port and instance name" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58456 [05:30:03] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58456 [05:31:58] yurik: synced, checked, and i noted so on meta:babel [05:32:08] ori-l, you rock! [05:32:10] thank you!!! [05:32:40] http://meta.wikimedia.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces [05:32:42] confirmed! :) [05:33:28] ori-l, would you know by any chance if it's possible to restrict editing in a namespace without changing settings files? [05:33:44] you can do it in your extension [05:33:58] yes, but I'm pretty far from being ready to deploy it [05:34:32] !log on brewster: proxy down due to full root partition, again. Will fix it properly this time. [05:34:39] Logged the message, Master [05:35:10] well, you don't need to deploy the whole thing [05:35:21] create a deployment branch that has a few minimal pieces [05:36:30] anyways, i don't think you need to worry about restricting the namespace [05:36:37] true, not just yte [05:36:39] yet [05:36:44] but pretty soon - for sure [05:36:53] if you're far from being ready to deploy it, then vandalism on that ns won't really have terrible consequences [05:37:03] exactly [05:37:18] and it's not a very attractive target for vandalism [05:37:21] why are you deploying the namespace if you're not ready to use it? [05:37:52] peachey|laptop__, who said i am not ready to use it?
[05:38:09] i am already using it in a way - in all the testing [05:38:22] those are settings, they need to be verified before the new extension goes live [05:38:26] on all wikis [05:38:49] so step 1 - convert existing settings system into Zero pages [05:39:01] and step 2 - once all checks out, deploy extension to all sites [05:39:10] not the other way around :) [05:41:14] New patchset: Tim Starling; "On brewster: disable access.log and store.log" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58457 [05:41:34] microscopic, lol [05:42:50] microscopic as in, you can't buy an SD card that small anymore, they've stopped making them [05:43:04] ori-l, http://meta.wikimedia.org/wiki/Zero:250-99 :) [05:43:27] 5GB [05:52:56] New patchset: Tim Starling; "Bits varnish logging: set port correctly" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58460 [05:53:07] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58460 [05:57:03] !log on stafford: swapdeath repeat narrowly averted via killall ruby && /etc/init.d/apache2 stop [05:57:10] Logged the message, Master [06:29:59] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: Connection refused [06:32:59] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.130 second response time [06:51:04] New patchset: Tim Starling; "On stafford: reduce passenger pool size" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58461 [06:51:26] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58461 [06:56:22] * yurik throws lots of heavy objects at jeremyb_ ! 
[07:03:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:03:59] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 7.194 second response time [07:08:05] New review: Ori.livneh; "recheck" [operations/debs/python-jsonschema] (debian/wikimedia) - https://gerrit.wikimedia.org/r/58311 [07:24:06] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [07:24:31] New patchset: Isarra; "Update wikipedia favicon" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58463 [07:26:16] PROBLEM - Varnish traffic logger on cp3020 is CRITICAL: NRPE: Command check_varnishncsa not defined [07:26:16] PROBLEM - Varnish traffic logger on sq69 is CRITICAL: NRPE: Command check_varnishncsa not defined [07:26:36] PROBLEM - Varnish traffic logger on sq70 is CRITICAL: NRPE: Command check_varnishncsa not defined [07:26:46] PROBLEM - Varnish traffic logger on cp3022 is CRITICAL: NRPE: Command check_varnishncsa not defined [07:26:46] PROBLEM - Varnish traffic logger on niobium is CRITICAL: NRPE: Command check_varnishncsa not defined [07:26:46] PROBLEM - Varnish traffic logger on sq67 is CRITICAL: NRPE: Command check_varnishncsa not defined [07:26:56] PROBLEM - Varnish traffic logger on sq68 is CRITICAL: NRPE: Command check_varnishncsa not defined [07:29:35] New patchset: Isarra; "Update wikipedia favicon" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58463 [07:31:15] New review: Ori.livneh; "Duh, Jenkins isn't enabled for this repository. I'm being stupid. This has to be merged manually." [operations/debs/python-jsonschema] (debian/wikimedia) - https://gerrit.wikimedia.org/r/58311 [07:32:16] PROBLEM - RAID on stafford is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [07:32:56] New review: Isarra; "What the crap is going on here?" 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58463 [07:41:55] New patchset: Isarra; "Update wikipedia favicon" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58463 [07:46:57] New patchset: Isarra; "(Bug 15716) Update wikipedia favicon" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58463 [07:53:04] !log krinkle synchronized php-1.22wmf1/resources/startup.js 'Ia54dd738b3ce0995fa' [07:53:12] Logged the message, Master [08:06:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:19:40] PROBLEM - Puppet freshness on cp3003 is CRITICAL: No successful Puppet run in the last 10 hours [08:26:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 8.328 second response time [08:29:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:34:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 7.000 second response time [08:37:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:41:10] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 9.444 second response time [08:45:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:47:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 5.204 second response time [08:51:10] PROBLEM - RAID on stafford is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[08:52:58] New patchset: Tim Starling; "On stafford: reduce PassengerMaxRequests to 5" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58470 [08:53:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:53:23] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58470 [08:54:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 1.646 second response time [08:57:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:06:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 6.668 second response time [09:10:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:33:41] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [09:53:58] New patchset: ArielGlenn; "make sure mysql client is availavble for en wp job queue checks" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58473 [09:56:04] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58473 [09:57:54] apergos: thanks! 
there's also https://gerrit.wikimedia.org/r/#/c/58079/ for the other [09:58:25] ah that was going to be next [09:58:31] lemme make sure this one is good first [10:33:40] PROBLEM - Puppet freshness on virt1000 is CRITICAL: No successful Puppet run in the last 10 hours [10:59:40] PROBLEM - Host cp3006 is DOWN: PING CRITICAL - Packet loss = 100% [11:04:00] RECOVERY - Host cp3006 is UP: PING OK - Packet loss = 0%, RTA = 88.78 ms [11:06:00] PROBLEM - Varnish traffic logger on cp3006 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa [11:06:10] PROBLEM - Varnish HTTP upload-frontend on cp3006 is CRITICAL: Connection refused [11:25:10] PROBLEM - RAID on stafford is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [11:36:06] New patchset: Hashar; "zuul: typo in labs status_url" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58488 [11:36:33] root: super easy change to merge in https://gerrit.wikimedia.org/r/58488 :-] [11:36:41] fix a typo :-] [11:40:12] meh, I shot two puppet processes over there that were taking 2 and 5 gb respectively (long running times of over a half hour each), so we have some free, but I don't know what that will have done to those two jobs [11:56:11] Nemo_bis: /usr/local/bin/mwscript is on hume (as it should be) [11:56:17] re: https://gerrit.wikimedia.org/r/#/c/58079/1/manifests/ganglia.pp [11:56:36] New patchset: Hashar; "jenkins: actually enable verbose mode for git plugin" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58489 [11:57:12] apergos: so /usr/local/bin/mwscript works, although mwscript doesn't? [11:57:25] apergos: can you possibly approve the two changes I have submitted? 
They are fairly simple and I have tested them out :-] https://gerrit.wikimedia.org/r/58488 https://gerrit.wikimedia.org/r/58489 [11:57:31] seems to, it looks like I'm getting cronspam only from one of these but not the other [11:57:38] (so much cronspam it's hard to tell though) [11:57:58] * apergos can't wait for puppet to apply the right config on hume and will add the package now out of desperation [11:58:21] New review: Hashar; "The switch needed to be applied to java (ie in JAVA_ARGS). It seems I forgot to verify that change ..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/49814 [11:59:13] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58488 [12:02:56] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58489 [12:03:03] apergos: thank you :-]]]] [12:03:14] yw [12:03:21] I'm not going to run puppet though [12:03:31] so just wait a bit for it to go around [12:03:47] apergos: I can do it now :-] [12:03:53] excellent [12:04:22] I should specify it I guess [12:11:08] New patchset: ArielGlenn; "comment out enwikijobqueue monitoring on hume til fixed" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58490 [12:12:59] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58490 [12:14:10] PROBLEM - RAID on stafford is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [12:17:39] Nemo_bis: since you are the one that did changeid I4b67f60a62a370ea327f7fa68eea9ca444baa3bc maybe you know about the en wp jobqueue check brokenness [12:17:59] ERROR 1146 (42S02) at line 1: Table 'enwiki.job' doesn't exist [12:20:56] apergos: no I don't, I just restored a previous code [12:20:59] I see [12:21:24] all sorts of brokenness from those jobqueue changes, 1.20/21 is such a sad release [12:22:10] PROBLEM - RAID on stafford is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[12:22:52] !log removed outright puppet.log on neon, was > 7gb and no room left on device [12:22:59] Logged the message, Master [12:30:56] New patchset: Mark Bergsma; "Revert "make the planet logo a "per language" thing"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58491 [12:31:05] New patchset: Mark Bergsma; "Revert "resource references should now be capitalized"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58492 [12:31:15] New patchset: Mark Bergsma; "Revert "remove duplicate definition of locales package"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58493 [12:31:38] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58427 [12:31:51] New patchset: Mark Bergsma; "Revert "turn planet into a puppet module"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58494 [12:37:28] grr [12:37:30] grrrr [12:37:46] mark: I was about to do that [12:37:53] strace shows planet on the top [12:37:57] that and mysql lib [12:38:14] I didn't find anything wrong with the module that would cause that though [12:38:18] i did [12:38:24] a recursive definition [12:38:27] ah! 
[12:38:32] that would explain this [12:38:58] yes [12:39:00] New patchset: Mark Bergsma; "Revert "make the planet logo a "per language" thing"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58495 [12:39:04] ok, I see it now too [12:39:07] but I would also like a better review of that module [12:39:09] you are correct sir [12:39:21] yeah, I saw a few things that struck me as odd too [12:39:25] like removing the role class [12:40:20] 104 [pid 22996] stat("/var/lib/git/operations/puppet/modules/stdlib/lib/puppet/type", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 [12:40:23] 104 [pid 22996] stat("/var/lib/git/operations/puppet/modules/mysql/lib/puppet/type", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 [12:40:26] 104 [pid 22996] stat("/var/lib/git/operations/puppet/modules/apache/lib/puppet/type", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 [12:40:30] 54 [pid 22996] stat("/var/lib/git/operations/puppet/modules/planet/manifests/init.pp", {st_mode=S_IFREG|0644, st_size=688, ...}) = 0 [12:40:30] New patchset: Mark Bergsma; "Revert "make the planet logo a "per language" thing"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58495 [12:40:38] brb [12:41:07] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58495 [12:43:04] Change abandoned: Mark Bergsma; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58491 [12:43:12] Change abandoned: Mark Bergsma; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58492 [12:43:19] Change abandoned: Mark Bergsma; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58493 [12:43:26] Change abandoned: Mark Bergsma; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58494 [12:45:33] New patchset: Mark Bergsma; "Revert "On stafford: reduce passenger pool size"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58496 
[12:46:01] New patchset: Mark Bergsma; "Revert "On stafford: reduce PassengerMaxRequests to 5"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58497 [12:46:45] New patchset: Mark Bergsma; "Revert "On stafford: reduce passenger pool size"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58496 [12:46:54] the constant rebasing is quite annoying [12:47:46] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58496 [12:47:59] New patchset: Mark Bergsma; "Revert "On stafford: reduce PassengerMaxRequests to 5"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58497 [12:48:52] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58497 [12:55:54] New patchset: MaxSem; "Fix test.m load.php domain, enable $wgMFVaryResources on test2" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58501 [12:58:23] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58501 [12:59:00] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 9.60232285714 (gt 8.0) [13:19:00] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 17.1655526891 (gt 8.0) [13:20:10] RECOVERY - Varnish HTTP upload-frontend on cp3006 is OK: HTTP OK: HTTP/1.1 200 OK - 675 bytes in 0.176 second response time [13:21:00] RECOVERY - Varnish traffic logger on cp3006 is OK: PROCS OK: 3 processes with command name varnishncsa [13:21:14] New patchset: J; "Install timidity for Score extension" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58504 [13:23:40] New patchset: J; "Install timidity for Score extension" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58504 [13:29:29] New patchset: Ottomata; "Disabling filters on locke except for fundraising." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/58505 [13:30:00] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58505 [13:32:33] New patchset: Ottomata; "Excluding private from git clean, just in case." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58506 [13:32:44] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58506 [13:39:00] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 8.68819320896 (gt 8.0) [13:40:08] New patchset: Nemo bis; "Global jobqueue check: mwscript path fix" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58079 [13:40:45] apergos: as you said, I hope I did it right this time... [13:42:17] Change merged: Mark Bergsma; [operations/software] (master) - https://gerrit.wikimedia.org/r/46950 [13:42:34] Change merged: Mark Bergsma; [operations/software] (master) - https://gerrit.wikimedia.org/r/46976 [13:42:49] Change merged: Mark Bergsma; [operations/software] (master) - https://gerrit.wikimedia.org/r/47023 [13:47:50] PROBLEM - Varnish HTTP upload-backend on cp3006 is CRITICAL: Connection refused [13:48:50] RECOVERY - Varnish HTTP upload-backend on cp3006 is OK: HTTP OK: HTTP/1.1 200 OK - 632 bytes in 0.175 second response time [13:51:40] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [13:51:40] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [13:51:40] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [13:54:39] * jeremyb_ throws lots of heavy objects at yurik!! [14:03:50] argh [14:03:58] just pushed directly into a repo instead of via gerrit [14:04:56] <^demon|sick> :( [14:05:08] how can I reset that? 
[14:06:30] <^demon|sick> Reset your local branch to that commit, then `git push -f ` [14:06:44] doesn't accept it as it's not ff [14:06:52] <^demon|sick> Which repo/branch? [14:06:56] <^demon|sick> I'll grant +force. [14:07:18] operations/debs/varnish [14:07:22] i can do it myself first, whichever you prefer ;) [14:07:36] s/first/also/ [14:07:52] <^demon|sick> Done. [14:07:58] i'd rather not have direct push access, but sometimes you need it :( [14:07:58] thanks [14:08:08] seems to work [14:08:12] <^demon|sick> That's what I do. I tend to grant direct pushing, then revoke it when I'm done. [14:08:20] New review: Jeremyb; "errr, no it wasn't. it was reverted in I093cfc85d1ca8550c19a246ae9ed8fb0e91149e3" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54502 [14:08:30] New patchset: Mark Bergsma; "varnish (3.0.3plus~rc1-wm9) precise; urgency=low" [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/58514 [14:08:30] New patchset: Mark Bergsma; "Update streaming range patch with Martin's token deadlock fixes" [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/58515 [14:08:30] New patchset: Mark Bergsma; "Fix race in persistent storage loading of segments" [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/58516 [14:08:31] New patchset: Mark Bergsma; "varnish (3.0.3plus~rc1-wm10) precise; urgency=medium" [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/58517 [14:08:32] uh, why we don't have a test2.*m*.wikipedia.org? [14:09:00] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is -0.449792592593 [14:09:07] MaxSem: because a redundant test wiki where you can't actually test anything is enough? [14:09:26] heh [14:09:34] it's not redundant;) [14:10:05] maybe.. 
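[Editor's note: the recovery ^demon|sick describes above — reset the local branch, then force-push once +force is granted on the ref — can be demonstrated safely against a throwaway remote. The repository paths below are invented for the demo; they are not operations/debs/varnish.]

```shell
#!/bin/sh
# Demo of undoing an accidental direct push: point the local branch back at
# the last good commit, then force-push (non-fast-forward, so -f is needed;
# on a Gerrit-managed repo this also requires +force on the ref).
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"        # stand-in for the real remote
git clone -q "$tmp/origin.git" "$tmp/work"
cd "$tmp/work"
git config user.email demo@example.org
git config user.name demo
git checkout -qb master
echo good > file; git add file; git commit -qm 'good commit'
git push -q origin master
echo oops >> file; git commit -qam 'accidental direct push'
git push -q origin master                   # the push we want to undo
git reset --hard -q HEAD~1                  # local branch back to the good commit
git push -qf origin master                  # force-push rewrites the remote ref
```

Note that this rewrites published history, which is why it is normally locked down: anyone who already fetched the bad commit must rebase or reset as well.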
but I never was able to test anything on test2 [14:10:21] although labs is worse, the one time I tried to test something I got an unrelated fatal [14:10:23] do you know the difference between test and test2? [14:10:31] I know the supposed difference :) [14:10:41] it's not supposed [14:10:41] doesn't seem to be that reflected in reality [14:10:53] oki, just personal experience, nothing more [14:11:09] which is what? [14:12:00] !log Inserted varnish 3.0.3plus-rc1-wm10 packages into the APT repository [14:12:07] Logged the message, Master [14:14:29] New patchset: Jeremyb; "(Bug 15716) Update wikipedia favicon" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58463 [14:15:20] New patchset: MaxSem; "Uh-oh, no test2.m" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58518 [14:15:57] Change merged: MaxSem; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58518 [14:17:00] fraek... [14:18:23] is there already a deb package in operations/debs where upstream and pristine-tar are commited via gerrit? can only find packages without Change-Id commits [14:18:53] !log maxsem synchronized wmf-config 'Enable $wgMFVaryResources on test' [14:19:00] Logged the message, Master [14:19:11] j^: no, initial push we usually do direct [14:19:16] as we're not gonna review all those in gerrit anyway [14:19:19] j^: i imagine it would be direct push for upstream branches where we are not upstream. [14:19:50] mark: whats the workflow to get this pushed since i dont have the right permissions [14:20:19] publish repository elsewhere and ask here for someone to clone and push? [14:20:40] there's not really a working workflow yet i'm afraid [14:20:49] if you're not in ops and already have full push access [14:20:53] mark: i am trying to change vlan for labsdb's from private to labs in row c ge-3/0/8-9. i tried changing membership and deleting membership. The commit goes okay but the change doesn't happen [14:20:59] any suggestions? 
[14:21:02] i guess asking us to do those pushes is easiest [14:21:27] cmjohnson1: checking [14:21:31] thx [14:22:37] cmjohnson1: you made a typo and therefore created a new interface-range that doesn't do anything [14:22:48] interface-range vlan-labs-hosts1-c-eqiad { [14:22:48] member ge-2/0/0; [14:22:48] member-range ge-2/0/0 to ge-2/0/1; [14:22:48] member-range ge-3/0/0 to ge-3/0/1; [14:22:48] unit 0 { [14:22:49] family ethernet-switching { [14:22:49] vlan { [14:22:50] members labs-hosts1-c-eqiad; [14:22:50] } [14:22:51] } [14:22:51] } [14:22:52] } [14:22:52] interface-range vlan-labs-host1-c-eqiad { [14:22:53] member ge-3/0/8; [14:22:59] !paste | mark [14:23:02] :D [14:23:08] * peachey|laptop__ looks at mark [14:23:15] thx mark for looking [14:23:31] why do people always get so worked up about a few lines of pasted code in irc, seriously [14:23:44] if it's under 30 lines i usually don't bother using a pastebin [14:24:40] PROBLEM - Puppet freshness on search34 is CRITICAL: No successful Puppet run in the last 10 hours [14:24:40] PROBLEM - Puppet freshness on search13 is CRITICAL: No successful Puppet run in the last 10 hours [14:24:40] PROBLEM - Puppet freshness on cp1033 is CRITICAL: No successful Puppet run in the last 10 hours [14:24:40] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [14:24:59] cmjohnson1: do you know how to correct it? 
[14:25:14] i will have to delete the typo'd range [14:25:18] yes [14:25:23] be careful ;) [14:25:34] k [14:25:40] PROBLEM - Puppet freshness on cp1024 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:40] PROBLEM - Puppet freshness on db1030 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:40] PROBLEM - Puppet freshness on db1029 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:40] PROBLEM - Puppet freshness on lvs6 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:40] PROBLEM - Puppet freshness on cp1023 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:40] PROBLEM - Puppet freshness on mw1013 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:40] PROBLEM - Puppet freshness on gadolinium is CRITICAL: No successful Puppet run in the last 10 hours [14:25:41] PROBLEM - Puppet freshness on mw121 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:41] PROBLEM - Puppet freshness on search17 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:42] PROBLEM - Puppet freshness on search36 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:42] PROBLEM - Puppet freshness on lvs5 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:43] PROBLEM - Puppet freshness on search22 is CRITICAL: No successful Puppet run in the last 10 hours [14:25:43] PROBLEM - Puppet freshness on search21 is CRITICAL: No successful Puppet run in the last 10 hours [14:29:00] New patchset: MaxSem; "Proper fix for testwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58519 [14:29:48] ^demon|sick, ERROR: Possible problem with your *.gwt.xml module file. The compile time user.agent value (safari) does not match the runtime user.agent value (gecko1_8). Expect more errors. [14:30:14] <^demon|sick> Bleh. 
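[Editor's note: the fix cmjohnson1 describes above amounts to deleting the misnamed interface-range ("vlan-labs-host1-c-eqiad", missing the "s", which silently created a new range instead of extending the existing one) and adding the ports to the correctly named range. A hypothetical Junos CLI sketch, with names taken from the paste above; the exact member-range is assumed:]

```
# Delete the typo'd range, then add the ports to the real one and commit:
delete interfaces interface-range vlan-labs-host1-c-eqiad
set interfaces interface-range vlan-labs-hosts1-c-eqiad member-range ge-3/0/8 to ge-3/0/9
commit
```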
[14:30:35] Change merged: MaxSem; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58519 [14:33:28] !log maxsem synchronized wmf-config/mobile.php [14:33:34] Logged the message, Master [14:35:07] mark: can you clone/push http://r-w-x.org/wmf/debs/libvpx.git/ to operations/debs/libvpx whats currently at operations/debs/libvpx can be removed/reset [14:36:12] ok [14:36:18] (that would be the master/upstream/pristine-tar branch importing libvpx from ubuntu 12.10, will push patch through gerrit once its in) [14:37:40] PROBLEM - Puppet freshness on virt1005 is CRITICAL: No successful Puppet run in the last 10 hours [14:39:27] <^demon|sick> MaxSem: What browser were you using? [14:40:12] Opera likely ;) [14:40:41] j^: should be done [14:41:14] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58079 [14:43:44] New review: J; "this is outdated and should be abandoned." [operations/debs/libvpx] (upstream) C: -1; - https://gerrit.wikimedia.org/r/58071 [14:44:26] <^demon|sick> Heh, opera and gerrit don't get along. [14:45:46] Change abandoned: J; "(no reason)" [operations/debs/libvpx] (master) - https://gerrit.wikimedia.org/r/58074 [14:46:00] Change abandoned: J; "(no reason)" [operations/debs/libvpx] (master) - https://gerrit.wikimedia.org/r/58072 [14:46:09] Change abandoned: J; "(no reason)" [operations/debs/libvpx] (master) - https://gerrit.wikimedia.org/r/58070 [14:46:19] ^demon|sick: i would say you have redundant words in that, but people will hurt me [14:46:22] Change abandoned: J; "(no reason)" [operations/debs/libvpx] (master) - https://gerrit.wikimedia.org/r/58073 [14:53:45] New patchset: J; "Import 1.1.0-1+wmf1" [operations/debs/libvpx] (master) - https://gerrit.wikimedia.org/r/58520 [14:54:47] mark: thanks, looks good now, pushed patch to https://gerrit.wikimedia.org/r/#/c/58520/ once that is reviewed, building the package is an ops task again? 
[14:56:28] I suppose so [14:57:39] ok [15:21:06] ^demon|sick, Reedy, FF - gerrit doesn't like opera [15:22:30] <^demon|sick> Meh, gwt does some stuff opera doesn't like. [15:22:33] <^demon|sick> gwt was like fix it. [15:22:37] <^demon|sick> opera was like stop doing that. [15:22:53] <^demon|sick> everyone stopped caring because nobody uses opera. [15:24:33] ^demon|sick: you make me want an xkcd to go with that [15:28:27] New patchset: Demon; "Switch nostalgiawiki to use Nostalgia from extension" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/56402 [15:31:58] New patchset: Demon; "DO NOT MERGE UNTIL 1.22WMF2 IS ON ALL WIKIS" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58523 [15:32:08] lol [15:34:37] WHAT ARE YOU SAYING? [15:36:01] <^demon|sick> I'M SAYING DON'T MERGE KTHNX. [15:36:05] <^demon|sick> ;-) [15:36:40] PROBLEM - DPKG on cp1033 is CRITICAL: DPKG CRITICAL dpkg reports broken packages [15:37:40] RECOVERY - DPKG on cp1033 is OK: All packages OK [15:38:05] !log demon synchronized php-1.22wmf1/extensions/Nostalgia [15:38:12] Logged the message, Master [15:39:32] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/56402 [15:40:02] !log demon synchronized wmf-config/extension-list 'Nostalgia ext' [15:40:08] Logged the message, Master [15:40:28] !log demon synchronized wmf-config/CommonSettings.php 'Nostalgia ext' [15:40:34] Logged the message, Master [15:44:36] ^demon|sick: you should -1 yourself or something [15:45:15] <^demon|sick> -2'd. [15:47:43] !log Ran dist-upgrade (for varnish upgrade) on cp1021-1036 [15:47:50] Logged the message, Master [15:52:50] PROBLEM - Frontend Squid HTTP on cp1005 is CRITICAL: Connection refused [16:00:41] csteipp: 2 ideas about SQL box. 1) make it use a read-only DB user for queries from that box. idk if that's a supported option though. 2) what about stored XSS in a message received by the system? 
[16:00:50] RECOVERY - Frontend Squid HTTP on cp1005 is OK: HTTP OK: HTTP/1.0 200 OK - 1283 bytes in 0.003 second response time [16:01:21] jeremyb_: I would hope the developers thought about #2.... but it wouldn't surprise me. [16:01:45] csteipp: well we've had stored XSS issues with this package before IIRC [16:01:49] But, the attacker would probably have to csrf it to make it useful, so it would be a pretty difficult attack [16:02:08] It's common [16:02:37] i don't follow. couldn't they just inject arbitrary JS and then do almost anything? [16:02:40] For #1, that would be great if it was possible. I honestly haven't looked much at the code. I'm not sure if that's an easy thing to implement or not. [16:02:59] jeremyb_: They could, if they knew that an admin would run that query [16:03:11] oh, yeah, sure :) [16:03:24] To make it reliable, they would need to get the admin to run the query... csrf, or plead with an admin :) [16:03:27] i was mostly thinking about mail already in the system though [16:05:49] New patchset: ArielGlenn; "fix typo in field name rev_content_model, allow NULL for rev_parent_id and rev_text_len" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/58530 [16:06:13] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/58530 [16:07:19] Change merged: Mark Bergsma; [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/58514 [16:08:11] Change merged: Mark Bergsma; [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/58515 [16:08:25] New patchset: ArielGlenn; "mwxml2sql no longer reads text data from stdin, update usage message" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/58533 [16:08:43] Change merged: Mark Bergsma; [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/58516 [16:08:47] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/58533 [16:09:02] Change merged: 
Mark Bergsma; [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/58517 [16:10:40] PROBLEM - Puppet freshness on db1058 is CRITICAL: No successful Puppet run in the last 10 hours [16:10:40] PROBLEM - Puppet freshness on db1051 is CRITICAL: No successful Puppet run in the last 10 hours [16:11:20] mark: 58514 and 58517 touch only changelog, nothing else? [16:11:47] why do you ask? [16:12:01] well where is the actual change? [16:12:21] in the commits before [16:13:30] New patchset: ArielGlenn; "fix handling of mysql password option" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/58534 [16:13:47] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/58534 [16:14:03] mark: oh is this the one where you were complaining that you direct pushed? [16:14:15] no [16:14:42] https://gerrit.wikimedia.org/r/gitweb?p=operations/debs/varnish.git;a=commit;h=6ef9caabbf2e6c4826ba4432cc3a7e4c3a649955 links to https://gerrit.wikimedia.org/r/#/q/6ef9caabbf2e6c4826ba4432cc3a7e4c3a649955,n,z which is an empty result set [16:15:11] oh maybe that was pushed direct [16:15:13] I guess it was [16:15:39] ok [16:23:20] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 1001.0546 (gt 1000) [16:38:30] drdee: udp-filter does not run on the edge [16:39:15] it runs on locke/emery/oxygen/etc. ? [16:43:52] New patchset: Andrew Bogott; "Modify adminbot for use with a grid engine." [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/58538 [16:47:25] Change merged: Andrew Bogott; [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/58538 [16:53:38] New patchset: Andrew Bogott; "Bumped to 1.7" [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/58539 [16:53:56] Change merged: Andrew Bogott; [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/58539 [16:59:24] New patchset: Ottomata; "Refactoring puppetmaster::self to allow for puppet clients." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [17:00:08] Somebody around for yet another cache purging issue? https://bugzilla.wikimedia.org/show_bug.cgi?id=46976#c11 [17:00:16] * andre__ wonders if LeslieCarr is already around ^^ [17:00:23] nope [17:00:26] not around [17:01:11] plus there is another issue reported in https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#SVG_image_thumbnail_caching_broken.3F , but that looks different (constantly getting the wrong image even after purging) [17:06:47] New patchset: Ottomata; "Refactoring puppetmaster::self to allow for puppet clients." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [17:08:30] New patchset: Ottomata; "Refactoring puppetmaster::self to allow for puppet clients." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [17:10:20] RECOVERY - Solr on vanadium is OK: All OK [17:11:52] New review: Faidon; "I didn't look in depth, but I don't understand why we need separate classes for the two cases (i.e. ..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [17:14:33] !log authdns-update [17:14:39] Logged the message, RobH [17:18:02] New review: Catrope; "Regarding MZ's comment about scalability: maybe the $wgDBname check can be replaced with a $wmgUseFo..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/57649 [17:21:18] mark: i know udp-filter does not run on the edge nor was I implying it should [17:22:19] is udp-filter too slow to anonymize the 1:1000 sampled stream? [17:23:36] !log restart unresponsive cp1037 [17:23:43] Logged the message, Mistress of the network gear. 
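The "Puppet freshness" bot lines above flip between OK and CRITICAL based on a simple age threshold: no successful run in the last 10 hours. A minimal sketch of that rule in Python; the 10-hour window is taken from the alert text, but the function name and return strings are illustrative, not the actual Nagios check:

```python
from datetime import datetime, timedelta

# "No successful Puppet run in the last 10 hours" -- threshold from the alerts
FRESHNESS_WINDOW = timedelta(hours=10)

def puppet_freshness(last_run: datetime, now: datetime) -> str:
    """Return "OK" if the last successful Puppet run is within the
    freshness window, "CRITICAL" otherwise (illustrative only)."""
    if now - last_run <= FRESHNESS_WINDOW:
        return "OK"
    return "CRITICAL"
```

A host that ran Puppet minutes ago reports OK; one whose last run is older than ten hours goes CRITICAL, which is why a single successful run clears the alert immediately.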
[17:24:00] RECOVERY - Puppet freshness on cp1023 is OK: puppet ran at Wed Apr 10 17:23:54 UTC 2013 [17:25:00] RECOVERY - Puppet freshness on cp1024 is OK: puppet ran at Wed Apr 10 17:24:50 UTC 2013 [17:25:35] andre__: all purging should be happy again - i double checked all the htcpd daemons on upload varnishes [17:25:43] andre__: can you respond ? [17:25:50] I'll check [17:26:50] RECOVERY - Puppet freshness on cp1033 is OK: puppet ran at Wed Apr 10 17:26:43 UTC 2013 [17:27:25] LeslieCarr, I can confirm that https://bugzilla.wikimedia.org/show_bug.cgi?id=46976#c11 is fixed [17:27:26] mark: no, udp-filter is, AFAIK, not too slow for 1:1000 sampled streams [17:27:37] LeslieCarr, but https://bugzilla.wikimedia.org/show_bug.cgi?id=47087 is still an issue (and a different problem) [17:27:56] did you try repurging ? [17:28:14] New review: Brion VIBBER; "Confirmed this favicon shows high-resolution in Chrome on a Retina MacBook Pro, and still works in F..." [operations/mediawiki-config] (master); V: 2 C: 1; - https://gerrit.wikimedia.org/r/58463 [17:28:19] just repurged [17:28:20] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 1000.4376 (gt 1000) [17:28:23] double check ?
[17:28:52] !log aaron synchronized php-1.22wmf1/maintenance/copyJobQueue.php 'deployed 54f74111901ddebdc69f26847e3904140b3723d5' [17:28:58] Logged the message, Master [17:29:43] LeslieCarr, https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/BirdRespiration.svg/220px-BirdRespiration.svg.png is still wrong for me after purging [17:29:46] LeslieCarr, should be https://en.wikipedia.org/wiki/File:BirdRespiration.svg [17:30:06] I went to https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/BirdRespiration.svg/220px-BirdRespiration.svg.png?whatever and then to https://en.wikipedia.org/wiki/File:BirdRespiration.svg?action=purge [17:32:20] RECOVERY - Solr on vanadium is OK: All OK [17:36:38] ok this is different [17:36:51] so purging from varnish is working [17:37:11] it's just either the imagescalers somehow fuck up the rescaling or the purging isn't purging that image from swift and it gets resent [17:39:32] yeah, different issue (judging from its outcome) [17:40:20] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 1000.56415 (gt 1000) [17:41:20] RECOVERY - Solr on vanadium is OK: All OK [17:42:07] New patchset: RobH; "RT 4920 stat1002 deployment" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58543 [17:45:22] !log authdns-update to rename db1012 to stat1002 [17:45:28] Logged the message, RobH [17:46:44] New review: RobH; "kirk! wait, i mean picard!" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/58543 [17:46:47] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58543 [17:54:30] jdlrobson: ping [17:54:37] hey preilly! :) [17:55:48] jdlrobson: do you have lunch plans today? [17:56:00] ummm nope [17:56:09] oh shit maybe.. 
[17:56:23] we have a lunch hangout - i'm not sure if that was cancelled or not - will have to ask tomasz when he's back [17:57:18] jdlrobson: okay well let me know [17:57:25] rfaulkner: ping [17:57:29] * preilly heh heh [17:57:40] preilly: hey [17:57:50] rfaulkner: do you have lunch plans today? [17:58:04] * preilly is running down his list…  [17:58:29] i'm, wfh. not in until 2pm … could do a late lunch [17:58:54] or .. meet somewhere around 1:30/45 [17:58:58] preilly: will do [17:59:26] rfaulkner: okay [17:59:36] rfaulkner: I'm at home so anything works for me [17:59:46] New patchset: Aaron Schulz; "Enabled use of redis for null test jobs." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58545 [17:59:57] ok. i'll be cycling in around that time and will give you a call [18:01:05] rfaulkner: sounds good [18:02:20] New patchset: Aaron Schulz; "Switched all remaning wikis to 1.22wmf1." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58546 [18:04:27] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58546 [18:06:23] !log aaron rebuilt wikiversions.cdb and synchronized wikiversions files: Switched all remaning wikis to 1.22wmf1 [18:06:29] Logged the message, Master [18:10:20] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 1001.68463 (gt 1000) [18:12:20] RECOVERY - Solr on vanadium is OK: All OK [18:14:17] notpeter: https://gerrit.wikimedia.org/r/#/c/58545/1/wmf-config/jobqueue-eqiad.php [18:14:21] do those IPs look sane? 
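The "do those IPs look sane?" question about the redis addresses in jobqueue-eqiad.php (e.g. 10.64.32.76) is essentially a sanity check that job-queue servers sit in internal address space. A quick way to automate that first-pass check with the Python standard library; the function name and the "private = internal" assumption are illustrative, not part of the actual config review:

```python
import ipaddress

def looks_internal(addr: str) -> bool:
    """Rough sanity check: does addr parse as an IPv4 address in
    private (RFC 1918) space? Not a substitute for confirming which
    host the address actually belongs to."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False  # hostnames, typos, garbage
    return ip.version == 4 and ip.is_private
```

It only catches gross mistakes (a public IP or a typo pasted into the config); "those IPs are what they claim to be" still requires checking DNS or the host itself.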
[18:17:53] Aaron|home: http://www.youtube.com/watch?v=RijB8wnJCN0 [18:18:05] but yes, those IPs are what they claim to be [18:18:26] might be nice if there were vips for such things [18:20:40] PROBLEM - Puppet freshness on cp3003 is CRITICAL: No successful Puppet run in the last 10 hours [18:21:15] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58545 [18:23:41] !log aaron synchronized wmf-config/jobqueue-eqiad.php 'Enabled use of redis for null test jobs' [18:23:49] Logged the message, Master [18:25:36] hrm, i think we can fix that :) [18:35:03] Change abandoned: Dzahn; "already squashed into 54502 (and reverted)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54493 [18:35:15] New patchset: Jdlrobson; "Update mobile.uploads.schema and add modules to mobile OutputPage" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58551 [18:35:29] New patchset: Ottomata; "Refactoring puppetmaster::self to allow for puppet clients." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [18:35:45] New review: Jdlrobson; "Merge dependency first." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/58551 [18:36:23] paravoid! hehe you review toO fast! not ready! :) [18:36:26] still testing! 
[18:36:30] Change abandoned: Dzahn; "already squashed into 54502 (and reverted)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54495 [18:36:47] :) [18:37:15] Change abandoned: Dzahn; "already squashed into 54502 (and reverted)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54494 [18:38:16] Change abandoned: Dzahn; "already squashed into 54502 (and reverted)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54501 [18:38:20] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 1000.5937 (gt 1000) [18:40:39] Change abandoned: Dzahn; "already squashed into 54502 (and reverted)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54496 [18:41:34] New review: MaxSem; "(1 comment)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58551 [18:42:27] Change abandoned: Dzahn; "already squashed into 54502 (and reverted)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54497 [18:42:30] RECOVERY - Puppet freshness on db1029 is OK: puppet ran at Wed Apr 10 18:42:19 UTC 2013 [18:42:51] Change abandoned: Dzahn; "already squashed into 54502 (and reverted)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54498 [18:43:08] notpeter: is rdb1001 in ganglia? [18:44:07] negatory [18:45:23] Change abandoned: Dzahn; "already squashed into 54502 (and reverted)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54499 [18:45:40] Change abandoned: Dzahn; "already squashed into 54502 (and reverted)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54500 [18:46:03] Aaron|home: lemme patch that up, yo [18:46:11] and rdb1002 while at it [18:46:24] seems to work fine, but I want to watch memory usage [18:46:48] yeah [18:48:28] New review: Dzahn; "even though i don't have a very strong opinion about it i still tend to agree with Krinkle, already ..." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/57752 [18:49:41] New patchset: Pyoungmeister; "finishing up making an rdb eqiad ganglia group" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58552 [18:51:28] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58552 [18:51:32] Change abandoned: Ori.livneh; "No consensus" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50306 [18:52:07] Aaron|home: there are like 3 things you have to do to actually get things into ganglia, and it's not uncommon for people to forget one of them... [18:53:29] $pool = RedisConnectionPool::singleton( array( 'connectTimeout' => 1, 'persistent' => false, 'password' => '' ) ); [18:53:30] var_dump( $pool->getConnection( '10.64.32.76' )->info( 'MEMORY' ) ); [18:53:43] that's useful :) [18:54:28] ottomata: wanna do the two debs or should I? [18:55:17] two debs...? [18:55:30] 2 debs 1 build? [18:55:31] eeeewwwww [18:55:33] haha [18:55:40] PROBLEM - Puppet freshness on virt8 is CRITICAL: No successful Puppet run in the last 10 hours [18:55:40] PROBLEM - Puppet freshness on mw7 is CRITICAL: No successful Puppet run in the last 10 hours [18:55:40] PROBLEM - Puppet freshness on mw1019 is CRITICAL: No successful Puppet run in the last 10 hours [18:55:40] PROBLEM - Puppet freshness on mw3 is CRITICAL: No successful Puppet run in the last 10 hours [18:55:46] (i'm in a meeting right now) [18:56:35] oh email! [18:56:35] reading [18:57:12] paravoid, I can probably do it in an hour or something [18:58:40] PROBLEM - Puppet freshness on searchidx2 is CRITICAL: No successful Puppet run in the last 10 hours [18:58:59] notpeter: labsdb1002-3 are fixed...they were in wrong vlan [18:59:04] raid cfg was okay [18:59:25] cmjohnson1: woo! thank you [19:00:03] cmjohnson1: could you check in on labsdb1001 to verify that raid is configured correctly? 
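The RedisConnectionPool one-liner above dumps the MEMORY section of Redis's INFO command to watch memory usage. The raw INFO payload is just "key:value" lines with "#"-prefixed section headers, so it is easy to pull apart in any language. A sketch of such a parser in Python; it reflects the INFO wire format only, not MediaWiki's RedisConnectionPool API:

```python
def parse_redis_info(raw: str) -> dict:
    """Parse the key:value lines of a raw Redis INFO response into a
    dict of strings. Section headers ('# Memory') and blanks are skipped."""
    info = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info
```

Polling `used_memory` from this dict every few minutes is enough to spot the growth the conversation below is worried about.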
[19:00:07] feel free to shut it down [19:00:27] yep...np [19:00:45] PROBLEM - Puppet freshness on mw4 is CRITICAL: No successful Puppet run in the last 10 hours [19:00:45] PROBLEM - Puppet freshness on mw6 is CRITICAL: No successful Puppet run in the last 10 hours [19:00:45] PROBLEM - Puppet freshness on virt1007 is CRITICAL: No successful Puppet run in the last 10 hours [19:00:45] PROBLEM - Puppet freshness on virt5 is CRITICAL: No successful Puppet run in the last 10 hours [19:02:40] PROBLEM - Puppet freshness on virt1 is CRITICAL: No successful Puppet run in the last 10 hours [19:03:40] PROBLEM - Puppet freshness on mw11 is CRITICAL: No successful Puppet run in the last 10 hours [19:05:30] PROBLEM - Host labsdb1001 is DOWN: PING CRITICAL - Packet loss = 100% [19:05:40] PROBLEM - Puppet freshness on mw1018 is CRITICAL: No successful Puppet run in the last 10 hours [19:05:40] PROBLEM - Puppet freshness on mw16 is CRITICAL: No successful Puppet run in the last 10 hours [19:07:40] PROBLEM - Puppet freshness on mw1017 is CRITICAL: No successful Puppet run in the last 10 hours [19:07:40] PROBLEM - Puppet freshness on mw14 is CRITICAL: No successful Puppet run in the last 10 hours [19:08:40] PROBLEM - Puppet freshness on mw9 is CRITICAL: No successful Puppet run in the last 10 hours [19:09:24] sbernardin: the cable on search23 did not fix the problem..i will have to submit a network ticket [19:09:40] PROBLEM - Puppet freshness on mw5 is CRITICAL: No successful Puppet run in the last 10 hours [19:09:41] notpeter: how much disk space does rdb1001 have btw? [19:09:41] notpeter: the raid cfg was fine on labsdb1001....are you having trouble with it? [19:10:20] cmjohnson1: nope. just wanted to double check. thank you! [19:10:27] cmjohnson1: easier to do before is in prod :) [19:10:33] :-P [19:10:53] cmjohnson1: will put the original cable back....do you need port numbers for search23? 
[19:11:13] yes please [19:11:20] Aaron|home: / is 433G, /a is 278G, /tmp is 19G [19:12:23] ottomata: So I am spinning up the new stat1002 now =] [19:12:33] its identical to stat1 in hardware, actually prolly slightly faster [19:12:35] as both are R510s [19:12:40] PROBLEM - Puppet freshness on searchidx1001 is CRITICAL: No successful Puppet run in the last 10 hours [19:12:43] !log running puppetd --enable on all nodes via salt [19:12:47] oh nice! [19:12:48] cool! [19:12:50] Logged the message, notpeter [19:13:02] i put this on internal ip, i dont recall if we discussed [19:13:06] but i think we did and internal was fine. [19:13:21] yes, internal ip is correct [19:13:27] Ryan_Lane: paravoid any ideas as to how puppet might still be getting randomly disabled on nodes? [19:13:30] RECOVERY - Host labsdb1001 is UP: PING OK - Packet loss = 0%, RTA = 0.46 ms [19:13:32] cool, so i wasnt sure what can run in both sites at same time [19:13:41] so the manifest for stat1002 is pretty much just include you as sudo [19:13:41] this has been happening for a while [19:13:43] no idea [19:13:46] and you can add what you need to site.pp [19:13:46] perfect [19:13:47] yeah thank you [19:13:49] i'll add as needed [19:13:51] danke [19:13:54] paravoid: I was hoping that not using the agent would help [19:13:57] might clean up a bit as I do too (more role classes) [19:13:59] quite welcome, it'll be ready for hand off in a bit [19:14:09] but lots stopped running in the last 24 hours :( [19:14:31] we had a puppet outage this morning [19:14:37] well, european morning I mean [19:14:40] PROBLEM - Puppet freshness on mw8 is CRITICAL: No successful Puppet run in the last 10 hours [19:14:53] there was a change merged that had a recursive define [19:15:19] and puppet was stupid enough that stafford melted [19:15:28] so I'm guessing lots of clients got timeouts/500s [19:16:08] paravoid: oh, hrm [19:16:12] that makes sense [19:18:31] cmjohnson1: Search23 in port 24 (search22 in port 23 and search24 in port 25)
on asw-b3-sdtpa [19:18:38] thx [19:18:40] PROBLEM - Puppet freshness on mw1091 is CRITICAL: No successful Puppet run in the last 10 hours [19:18:40] PROBLEM - Puppet freshness on mw2 is CRITICAL: No successful Puppet run in the last 10 hours [19:20:40] PROBLEM - Puppet freshness on analytics1004 is CRITICAL: No successful Puppet run in the last 10 hours [19:21:40] PROBLEM - Puppet freshness on mw10 is CRITICAL: No successful Puppet run in the last 10 hours [19:21:42] so, it looks like puppet run times are for the most part up to about 1500 seconds or so [19:21:47] and lots are failing [19:22:30] RECOVERY - Puppet freshness on lvs6 is OK: puppet ran at Wed Apr 10 19:22:29 UTC 2013 [19:22:40] PROBLEM - Puppet freshness on mw75 is CRITICAL: No successful Puppet run in the last 10 hours [19:24:59] yeah puppet is dying all over itself [19:29:10] RECOVERY - Puppet freshness on mw1091 is OK: puppet ran at Wed Apr 10 19:29:06 UTC 2013 [19:29:40] PROBLEM - Puppet freshness on mw1139 is CRITICAL: No successful Puppet run in the last 10 hours [19:29:40] PROBLEM - Puppet freshness on mw15 is CRITICAL: No successful Puppet run in the last 10 hours [19:31:40] PROBLEM - Puppet freshness on mw1 is CRITICAL: No successful Puppet run in the last 10 hours [19:31:40] PROBLEM - Puppet freshness on mw1009 is CRITICAL: No successful Puppet run in the last 10 hours [19:31:40] PROBLEM - Puppet freshness on ms-fe4 is CRITICAL: No successful Puppet run in the last 10 hours [19:31:49] New patchset: RobH; "have to include a group since no services are listed that do" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58556 [19:32:39] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58556 [19:33:50] RECOVERY - Puppet freshness on db1030 is OK: puppet ran at Wed Apr 10 19:33:44 UTC 2013 [19:34:40] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [19:34:50] PROBLEM - MySQL Slave Delay on db1025 is 
CRITICAL: CRIT replication delay 184 seconds [19:35:40] RECOVERY - Puppet freshness on search34 is OK: puppet ran at Wed Apr 10 19:35:35 UTC 2013 [19:36:10] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Wed Apr 10 19:36:00 UTC 2013 [19:36:50] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 19 seconds [19:39:05] RECOVERY - Puppet freshness on search17 is OK: puppet ran at Wed Apr 10 19:38:56 UTC 2013 [19:39:10] RECOVERY - Puppet freshness on search13 is OK: puppet ran at Wed Apr 10 19:39:06 UTC 2013 [19:39:40] RECOVERY - Puppet freshness on lvs5 is OK: puppet ran at Wed Apr 10 19:39:37 UTC 2013 [19:40:10] RECOVERY - Puppet freshness on search21 is OK: puppet ran at Wed Apr 10 19:40:02 UTC 2013 [19:40:10] RECOVERY - Puppet freshness on search22 is OK: puppet ran at Wed Apr 10 19:40:02 UTC 2013 [19:41:50] RECOVERY - Puppet freshness on search36 is OK: puppet ran at Wed Apr 10 19:41:44 UTC 2013 [19:43:40] PROBLEM - Puppet freshness on stat1 is CRITICAL: No successful Puppet run in the last 10 hours [19:43:40] RECOVERY - Puppet freshness on mw121 is OK: puppet ran at Wed Apr 10 19:43:36 UTC 2013 [19:43:50] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 228 seconds [19:45:00] RECOVERY - Puppet freshness on gadolinium is OK: puppet ran at Wed Apr 10 19:44:51 UTC 2013 [19:46:04] New patchset: Ottomata; "Refactoring puppetmaster::self to allow for puppet clients." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [19:46:20] New patchset: Ryan Lane; "Split openstack manager away from controller" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58558 [19:46:50] RECOVERY - Puppet freshness on analytics1004 is OK: puppet ran at Wed Apr 10 19:46:42 UTC 2013 [19:47:43] New review: Hashar; "The issue is about bots spamming #mediawiki and rendering that channel useless and annoying for supp..." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/57752 [19:47:50] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 25 seconds [19:48:50] RECOVERY - Puppet freshness on mw75 is OK: puppet ran at Wed Apr 10 19:48:43 UTC 2013 [19:49:41] New review: Hashar; "I disagree Ori, we have a consensus :-]? Just that mark asked to move that utility to another class..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50306 [19:50:40] PROBLEM - Puppet freshness on mw13 is CRITICAL: No successful Puppet run in the last 10 hours [19:52:29] New patchset: Ottomata; "Refactoring puppetmaster::self to allow for puppet clients." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [19:54:40] PROBLEM - Puppet freshness on db1045 is CRITICAL: No successful Puppet run in the last 10 hours [19:54:40] PROBLEM - Puppet freshness on virt4 is CRITICAL: No successful Puppet run in the last 10 hours [19:55:24] RECOVERY - Puppet freshness on mw1139 is OK: puppet ran at Wed Apr 10 19:55:12 UTC 2013 [19:55:50] notpeter: let me know when the ganglia stuff is up [19:56:22] New patchset: RobH; "including base statistics role in stat1002" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58559 [19:56:36] New review: Krinkle; "@Hashar: You are working from the (imho incorrect) assumption that most developers working on MediaW..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/57752 [19:56:37] New patchset: Aaron Schulz; "Moved async upload jobs to redis." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58561 [19:57:04] sbernardin: around? 
[19:59:01] RECOVERY - Puppet freshness on mw1009 is OK: puppet ran at Wed Apr 10 19:58:59 UTC 2013 [19:59:16] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58561 [19:59:32] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58559 [19:59:42] New patchset: Ottomata; "Refactoring puppetmaster::self to allow for puppet clients." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [20:00:27] !log aaron synchronized wmf-config/jobqueue-eqiad.php 'Moved async upload jobs to redis' [20:00:34] Logged the message, Master [20:01:00] RECOVERY - Puppet freshness on ms-fe4 is OK: puppet ran at Wed Apr 10 20:00:50 UTC 2013 [20:01:05] New patchset: Ottomata; "Refactoring puppetmaster::self to allow for puppet clients." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [20:03:50] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 182 seconds [20:05:01] RECOVERY - Puppet freshness on mw1013 is OK: puppet ran at Wed Apr 10 20:04:58 UTC 2013 [20:05:50] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 14 seconds [20:12:31] New review: Ottomata; "Ok, so. The need for puppetmaster::self is for backwards compatibility only. There are instances o..." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/58540 [20:13:40] PROBLEM - Puppet freshness on db1052 is CRITICAL: No successful Puppet run in the last 10 hours [20:18:40] PROBLEM - Puppet freshness on amslvs1 is CRITICAL: No successful Puppet run in the last 10 hours [20:24:28] Change merged: Ottomata; [operations/debs/python-jsonschema] (debian/wikimedia) - https://gerrit.wikimedia.org/r/58311 [20:24:54] New patchset: Ryan Lane; "Split openstack manager away from controller" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58558 [20:24:58] Change abandoned: Ottomata; "This has been done here:" [operations/debs/python-jsonschema] (debian/experimental) - https://gerrit.wikimedia.org/r/56064 [20:25:05] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58558 [20:26:23] Change abandoned: RobH; "included in another patchset" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/54582 [20:29:50] New patchset: ArielGlenn; "version 0.0.2, scripts that prepare subset of wiki content for import" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/58568 [20:32:40] PROBLEM - Puppet freshness on db1043 is CRITICAL: No successful Puppet run in the last 10 hours [20:33:44] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/58568 [20:34:40] PROBLEM - Puppet freshness on virt1000 is CRITICAL: No successful Puppet run in the last 10 hours [20:46:22] !log added python-jsonschema to wikimedia apt repo: ori-l :) [20:46:29] Logged the message, Master [20:47:41] Aaron|home: https://ganglia.wikimedia.org/latest/?c=Redis%20eqiad&m=cpu_report&r=hour&s=by%20name&hc=4&mc=2 [20:48:55] nice [20:49:23] though the only calling it 'Redis eqiad' is a bit funny since mc1-16 are redis too :) [20:49:36] Change merged: Ottomata; [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/56168 [20:50:53] Change merged: Ottomata; 
[operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/57263 [20:52:39] they're in the memcache eqiad group :) [20:52:44] I don't really care how they're grouped [20:52:51] I was just finishing off what asher started to code up [20:55:38] New patchset: Jgreen; "fundraising.pp file owner cleanup" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58603 [20:57:17] New patchset: Anomie; "Comment out call to clearMessageBlobs.php in l10nupdate-1" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58604 [21:03:56] anybody know wtf is going with gerrit? ^demon doesn't seem to be around. gerrit has been super slow for me today, and now i'm getting things like 'code review - error; server unavailable; 0' or 5xx errors [21:04:46] +1 what awjr said [21:05:35] ^demon: is sick [21:05:40] New patchset: Ottomata; "Release 0.6.1-4 precise-wikimedia for WMF." [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/58605 [21:05:57] gerrit-wm also appears to be sick :( [21:06:05] Change merged: Ottomata; [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/58605 [21:06:43] !log added python-voluptuous to apt repo (for hashar) [21:06:50] Logged the message, Master [21:09:32] New patchset: Aaron Schulz; "Set $wgJobQueueMigrationConfig for use by scripts." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58606 [21:09:35] Change merged: Ottomata; [operations/debs/python-statsd] (master) - https://gerrit.wikimedia.org/r/56602 [21:11:30] New review: Ottomata; "I see at least one 'nice job' comment from Faidon, and he poked me about this today." [operations/debs/python-statsd] (master); V: 2 C: 2; - https://gerrit.wikimedia.org/r/55069 [21:11:41] Change merged: Ottomata; [operations/debs/python-statsd] (master) - https://gerrit.wikimedia.org/r/55069 [21:15:04] jdlrobson: gerrit or gerrit-wm ? [21:15:23] jeremyb_: sorry gerrit. 
it auto completed for me- seems better now though [21:15:32] right :) [21:15:40] New patchset: Ottomata; "1.5.2-2 release for WMF" [operations/debs/python-statsd] (master) - https://gerrit.wikimedia.org/r/58609 [21:15:40] PROBLEM - Puppet freshness on ms-fe3001 is CRITICAL: No successful Puppet run in the last 10 hours [21:15:55] Change merged: Ottomata; [operations/debs/python-statsd] (master) - https://gerrit.wikimedia.org/r/58609 [21:16:03] New patchset: Pyoungmeister; "adding db1001 to eqiad mq shard" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58610 [21:17:18] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58606 [21:18:29] !log aaron synchronized wmf-config/jobqueue-eqiad.php [21:18:33] !log added python-statsd to apt repo for hashar [21:18:36] Logged the message, Master [21:18:42] Logged the message, Master [21:19:25] New patchset: Pyoungmeister; "adding db1001 to eqiad mq shard" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58610 [21:20:38] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58610 [21:29:10] PROBLEM - LVS HTTP IPv4 on wikimedia-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:29:49] only one -lb? [21:29:53] that's odd [21:30:10] RECOVERY - LVS HTTP IPv4 on wikimedia-lb.esams.wikimedia.org is OK: HTTP OK: HTTP/1.0 200 OK - 94858 bytes in 4.635 second response time [21:30:24] I EXPECT FULL PAGERSTORM FOR ANY LVS ISSUE [21:30:40] I do :) [21:32:50] New patchset: Aaron Schulz; "Moved htmlCacheUpdate jobs to redis." 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58613 [21:33:10] New patchset: Jgreen; "more permissions/ownership hackery" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58614 [21:34:10] PROBLEM - LVS HTTP IPv4 on wikimedia-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:34:16] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58613 [21:34:41] ok, who is killing esams [21:34:44] Ryan_Lane: is it you ? [21:35:10] RECOVERY - LVS HTTP IPv4 on wikimedia-lb.esams.wikimedia.org is OK: HTTP OK: HTTP/1.0 200 OK - 94856 bytes in 8.377 second response time [21:35:11] me? :D [21:35:13] no [21:35:21] it's weird that it's only a single -lb reporting [21:35:34] !log aaron synchronized wmf-config/jobqueue-eqiad.php 'Moved htmlCacheUpdate jobs to redis' [21:35:41] Logged the message, Master [21:37:30] PROBLEM - LVS HTTPS IPv4 on wikinews-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:38:20] this is weird [21:38:20] RECOVERY - LVS HTTPS IPv4 on wikinews-lb.esams.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 75904 bytes in 0.767 second response time [21:39:30] RECOVERY - Puppet freshness on amslvs1 is OK: puppet ran at Wed Apr 10 21:39:23 UTC 2013 [21:40:46] so puppet had the magically disabled thing [21:40:53] LeslieCarr: yeah :( [21:40:57] but couldn't find anything else seriously wrong - doulbe checked the network [21:41:04] it's also taking like 30 minutes to run on every host [21:44:11] New patchset: Jgreen; "fix users/groups for civicrm" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58616 [21:44:50] PROBLEM - Packetloss_Average on analytics1004 is CRITICAL: CRITICAL: packet_loss_average is 8.80552976378 (gt 8.0) [21:44:51] PROBLEM - Packetloss_Average on analytics1005 is CRITICAL: CRITICAL: packet_loss_average is 9.22065704918 (gt 8.0) [21:45:00] PROBLEM - Packetloss_Average on locke is 
CRITICAL: CRITICAL: packet_loss_average is 9.54963076336 (gt 8.0) [21:45:14] yeah [21:45:30] PROBLEM - Packetloss_Average on analytics1006 is CRITICAL: CRITICAL: packet_loss_average is 8.392666 (gt 8.0) [21:45:40] PROBLEM - Packetloss_Average on emery is CRITICAL: CRITICAL: packet_loss_average is 10.4121939024 (gt 8.0) [21:46:00] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58616 [21:46:00] PROBLEM - Packetloss_Average on oxygen is CRITICAL: CRITICAL: packet_loss_average is 9.05165941667 (gt 8.0) [21:46:01] PROBLEM - Packetloss_Average on analytics1003 is CRITICAL: CRITICAL: packet_loss_average is 10.9583460656 (gt 8.0) [21:46:02] PROBLEM - Packetloss_Average on gadolinium is CRITICAL: CRITICAL: packet_loss_average is 8.38687435484 (gt 8.0) [21:47:09] also surprisingly enough cpu not pegged on staffor [21:47:10] d [21:47:58] stafford had enough attention for one day [21:50:15] PROBLEM - MySQL Replication Heartbeat on db1001 is CRITICAL: NRPE: Unable to read output [21:50:15] New patchset: Aaron Schulz; "Moved refreshLinks jobs to redis." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58617 [21:51:40] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: No successful Puppet run in the last 10 hours [21:52:11] hmmm, is there a weirdness now I don't know about? packet loss alerts on all udp2log instances at once [21:53:09] New review: Mwalker; "I'll push this tomorrow with my CentralNotice deploy" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58463 [21:53:25] LeslieCarr: any network problem right now? 
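When packet-loss alerts fire on every udp2log instance at once, as above, it helps to machine-read the bot's lines rather than eyeball them. The "PROBLEM/RECOVERY - check on host is STATE" shape is regular enough for a small parser; the regex below is inferred from the log format in this channel, not taken from the bot's source:

```python
import re

# Matches lines like:
#   PROBLEM - Packetloss_Average on locke is CRITICAL: ...
#   RECOVERY - Puppet freshness on db1012 is OK: ...
ALERT_RE = re.compile(
    r"(?:PROBLEM|RECOVERY) - (?P<check>.+?) on (?P<host>\S+) is "
    r"(?P<status>OK|CRITICAL|WARNING)"
)

def parse_alert(line: str):
    """Return (check, host, status) for an icinga-wm line, or None
    for chat, !log entries, and gerrit-wm lines."""
    m = ALERT_RE.search(line)
    if not m:
        return None
    return m.group("check"), m.group("host"), m.group("status")
```

Grouping the parsed tuples by check name would show instantly that all six CRITICALs here are Packetloss_Average on udp2log hosts, pointing at a shared cause (the overloaded puppetmaster) rather than six independent failures.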
[21:54:33] not that i can tell [21:54:40] lemme check some more [21:57:40] RECOVERY - Packetloss_Average on emery is OK: OK: packet_loss_average is 3.03045722222 [21:58:00] RECOVERY - Packetloss_Average on analytics1003 is OK: OK: packet_loss_average is 2.55244773438 [21:58:48] I'm going to try restarting apache on stafford [21:59:00] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 2.42737580882 [21:59:08] !log gracefulling apache on stafford [21:59:14] Logged the message, notpeter [22:00:06] so [22:00:10] I'd guess [22:00:48] that tim reduced the PassengerMaxPoolSize on stafford and restarted apache [22:01:13] later ma_rk reverted that change, but I don't think that he restarted apache [22:01:35] after they found the root cause of stafford's swap-death inducing issue [22:02:03] stafford is now pegging its cpu nicely trying to handle all of the queued puppet run requests [22:02:09] should even out in a little bit [22:02:10] RECOVERY - Puppet freshness on virt1007 is OK: puppet ran at Wed Apr 10 22:02:02 UTC 2013 [22:02:10] RECOVERY - Puppet freshness on virt5 is OK: puppet ran at Wed Apr 10 22:02:02 UTC 2013 [22:02:14] will keep an eye on it [22:02:43] also, will make sure to run puppetd --enable on all the nodes that need it [22:02:48] LeslieCarr: ^^ [22:03:00] RECOVERY - Puppet freshness on virt1 is OK: puppet ran at Wed Apr 10 22:02:52 UTC 2013 [22:03:10] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 6.128 second response time [22:03:10] RECOVERY - Puppet freshness on mw6 is OK: puppet ran at Wed Apr 10 22:03:03 UTC 2013 [22:03:10] RECOVERY - Puppet freshness on mw4 is OK: puppet ran at Wed Apr 10 22:03:03 UTC 2013 [22:03:50] RECOVERY - Puppet freshness on mw11 is OK: puppet ran at Wed Apr 10 22:03:44 UTC 2013 [22:05:00] RECOVERY - Puppet freshness on mw16 is OK: puppet ran at Wed Apr 10 22:04:56 UTC 2013 [22:05:35] :) [22:06:00] RECOVERY - Puppet freshness on mw1017 is OK: puppet
ran at Wed Apr 10 22:05:56 UTC 2013 [22:07:10] RECOVERY - Puppet freshness on mw14 is OK: puppet ran at Wed Apr 10 22:07:08 UTC 2013 [22:07:10] RECOVERY - Varnish traffic logger on cp1041 is OK: PROCS OK: 3 processes with command name varnishncsa [22:07:18] gerrit is slow again… any ideas why? It's really harming my ability to do anything today :( [22:07:30] RECOVERY - Packetloss_Average on analytics1006 is OK: OK: packet_loss_average is 0.922189756098 [22:07:50] RECOVERY - Puppet freshness on mw9 is OK: puppet ran at Wed Apr 10 22:07:48 UTC 2013 [22:07:50] RECOVERY - Puppet freshness on mw5 is OK: puppet ran at Wed Apr 10 22:07:48 UTC 2013 [22:08:00] RECOVERY - Packetloss_Average on oxygen is OK: OK: packet_loss_average is -0.0117669918699 [22:08:01] RECOVERY - Packetloss_Average on gadolinium is OK: OK: packet_loss_average is 0.335521818182 [22:08:37] Change restored: Tim Starling; "I think it should be installed on all servers. I think it may be possible to convince Mark of the me..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50306 [22:08:50] RECOVERY - Packetloss_Average on analytics1004 is OK: OK: packet_loss_average is 0.331661322314 [22:08:51] RECOVERY - Packetloss_Average on analytics1005 is OK: OK: packet_loss_average is 0.57254632 [22:09:00] RECOVERY - Puppet freshness on mw1018 is OK: puppet ran at Wed Apr 10 22:08:50 UTC 2013 [22:09:01] slow gerrit [22:10:00] RECOVERY - Puppet freshness on stat1 is OK: puppet ran at Wed Apr 10 22:09:56 UTC 2013 [22:10:00] RECOVERY - Puppet freshness on mw8 is OK: puppet ran at Wed Apr 10 22:09:56 UTC 2013 [22:10:04] load average is evening back out on stafford [22:10:14] Aaron|home: slow doesn't quite describe it… [22:10:30] RECOVERY - Puppet freshness on searchidx1001 is OK: puppet ran at Wed Apr 10 22:10:22 UTC 2013 [22:11:49] Aaron|home: i'm going to mail wikitech [22:12:00] RECOVERY - Puppet freshness on mw2 is OK: puppet ran at Wed Apr 10 22:11:53 UTC 2013 [22:14:13] Aaron|home: Slow? 
more like unresponsive. I get nothing more but the menu and [Working...] [22:14:16] !log temporarily disabling DynamicSidebar support in OpenStackManager [22:14:20] RECOVERY - Puppet freshness on mw10 is OK: puppet ran at Wed Apr 10 22:14:10 UTC 2013 [22:14:21] yep, same here [22:14:23] Logged the message, Master [22:14:40] PROBLEM - Puppet freshness on cp3007 is CRITICAL: No successful Puppet run in the last 10 hours [22:15:08] anyone remember whether we moved off jetty on gerrit yet? [22:15:20] !log enable DynamicSidebar support in OpenStackManager [22:15:26] apergos: we did not [22:15:27] Logged the message, Master [22:15:29] Krinkle: Aaron|home yeh i can't do anything right now. It's making me extremely frustrated as I have too many things to do [22:15:37] http://code.google.com/p/gerrit/wiki/Scaling I know it was discussed [22:16:20] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58617 [22:16:30] RECOVERY - Puppet freshness on mw13 is OK: puppet ran at Wed Apr 10 22:16:22 UTC 2013 [22:17:58] I also can't understand why 'r' is still broken (going from finishing an inline comment draft to the submit window, instead of manually mouse-clicking on "Up to change" and then "Review") [22:18:13] it just triggers [Working..] and does nothing, that's actually been a bug for almost a month now [22:18:24] funny thing is, this particular [Working..] label sticks [22:18:35] it'll stay there forever while navigating the site and everything else is broken. [22:18:49] May be unfair, but I think that's what you get for writing javascript with java. [22:18:59] It can be done, but why? [22:20:10] oh, and now we're out of service. [22:20:31] <^demon|sick> !log restarting gerrit, was hung [22:20:39] Logged the message, Master [22:20:41] it was working again. [22:20:49] (before the restart) [22:22:08] <^demon|sick> It was still pegged at 100% cpu, and was getting some OrmConcurrency exceptions.
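[Editor's note: the "should even out in a little bit" call on stafford earlier (a CPU-pegged puppetmaster working off runs queued while Passenger had a reduced pool size) is a back-of-the-envelope queue-drain estimate. A minimal sketch of that arithmetic follows; the backlog size, worker count, and per-run compile time are hypothetical numbers chosen for illustration, not measurements from stafford.]

```python
# Rough drain-time estimate for a puppetmaster that accumulated a backlog
# while its Passenger worker pool was undersized. All numbers below are
# made up purely to illustrate the calculation.

def drain_time_s(backlog, pool_size, secs_per_run):
    """Seconds until queued runs are worked off, assuming the pool
    completes `pool_size` catalog compiles in parallel per wave."""
    waves = -(-backlog // pool_size)  # ceiling division
    return waves * secs_per_run

# e.g. 300 queued agents, 15 Passenger workers, ~20 s per catalog compile:
print(drain_time_s(300, 15, 20))  # 400 seconds, i.e. "a little bit"
```

Under these assumed numbers the backlog clears in minutes, which matches the load average evening out by 22:10 in the log.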
[22:22:40] RECOVERY - Puppet freshness on virt4 is OK: puppet ran at Wed Apr 10 22:22:32 UTC 2013 [22:22:40] RECOVERY - Puppet freshness on mw15 is OK: puppet ran at Wed Apr 10 22:22:37 UTC 2013 [22:22:50] RECOVERY - Puppet freshness on db1045 is OK: puppet ran at Wed Apr 10 22:22:42 UTC 2013 [22:23:00] RECOVERY - Puppet freshness on mw1 is OK: puppet ran at Wed Apr 10 22:22:57 UTC 2013 [22:23:06] New review: Kaldari; "Personally, I don't think we need to worry about scalability until we have more than 1 wiki requesti..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/57649 [22:23:37] !log aaron synchronized wmf-config/jobqueue-eqiad.php 'moved refreshLinks to redis.' [22:23:40] RECOVERY - Puppet freshness on virt8 is OK: puppet ran at Wed Apr 10 22:23:39 UTC 2013 [22:23:44] Logged the message, Master [22:24:00] RECOVERY - Puppet freshness on mw7 is OK: puppet ran at Wed Apr 10 22:23:49 UTC 2013 [22:24:00] RECOVERY - Puppet freshness on mw3 is OK: puppet ran at Wed Apr 10 22:23:55 UTC 2013 [22:24:10] PROBLEM - Varnish traffic logger on cp1041 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [22:24:27] New patchset: Krinkle; "Add 'Contact Wikipedia' footer link on enwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/57649 [22:25:10] RECOVERY - Puppet freshness on mw1019 is OK: puppet ran at Wed Apr 10 22:25:06 UTC 2013 [22:25:13] !log added live hack to OpenStackManager on wikitech, ensuring that DynamicSidebar support is only enabled for logged-in users [22:25:21] Logged the message, Master [22:25:22] New review: Krinkle; "(1 comment)" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/57649 [22:28:50] RECOVERY - Puppet freshness on searchidx2 is OK: puppet ran at Wed Apr 10 22:28:44 UTC 2013 [22:29:35] New patchset: Kaldari; "Add 'Contact Wikipedia' footer link on enwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/57649 [22:32:10] 
RECOVERY - Varnish traffic logger on cp1041 is OK: PROCS OK: 3 processes with command name varnishncsa [22:34:09] New patchset: Jdlrobson; "Update mobile.uploads.schema and add modules to mobile OutputPage" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/58551 [22:49:08] RECOVERY - Host labsdb1003 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms [22:50:28] RECOVERY - Puppet freshness on amslvs2 is OK: puppet ran at Wed Apr 10 22:50:25 UTC 2013 [22:51:08] PROBLEM - RAID on stat1002 is CRITICAL: NRPE: Command check_raid not defined [22:51:18] PROBLEM - MySQL Idle Transactions Port 3308 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_idle_transaction_3308 not defined [22:51:18] PROBLEM - MySQL Slave Delay Port 3308 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_slave_delay_3308 not defined [22:51:28] PROBLEM - MySQL Slave Running Port 3306 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_slave_running_3306 not defined [22:51:28] PROBLEM - MySQL Recent Restart Port 3306 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_recent_restart_3306 not defined [22:51:38] PROBLEM - MySQL Slave Running Port 3307 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_slave_running_3307 not defined [22:51:38] PROBLEM - MySQL Recent Restart Port 3307 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_recent_restart_3307 not defined [22:51:38] PROBLEM - DPKG on stat1002 is CRITICAL: NRPE: Command check_dpkg not defined [22:51:48] PROBLEM - Disk space on labsdb1003 is CRITICAL: NRPE: Command check_disk_space not defined [22:51:48] PROBLEM - MySQL Slave Running Port 3308 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_slave_running_3308 not defined [22:51:48] PROBLEM - MySQL Recent Restart Port 3308 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_recent_restart_3308 not defined [22:51:48] PROBLEM - Disk space on stat1002 is CRITICAL: NRPE: Command check_disk_space not defined [22:51:58] PROBLEM - MySQL Slave Delay Port 3306 on 
labsdb1003 is CRITICAL: NRPE: Command check_mysql_slave_delay_3306 not defined [22:51:58] PROBLEM - MySQL Idle Transactions Port 3306 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_idle_transaction_3306 not defined [22:52:08] PROBLEM - MySQL Slave Delay Port 3307 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_slave_delay_3307 not defined [22:52:08] PROBLEM - MySQL Idle Transactions Port 3307 on labsdb1003 is CRITICAL: NRPE: Command check_mysql_idle_transaction_3307 not defined [22:53:48] PROBLEM - Varnish traffic logger on cp1041 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [23:01:48] RECOVERY - Disk space on labsdb1003 is OK: DISK OK [23:01:48] RECOVERY - MySQL Recent Restart Port 3308 on labsdb1003 is OK: OK seconds since restart [23:01:48] RECOVERY - MySQL Slave Running Port 3308 on labsdb1003 is OK: OK replication [23:01:58] RECOVERY - MySQL Idle Transactions Port 3306 on labsdb1003 is OK: OK longest blocking idle transaction sleeps for seconds [23:01:58] RECOVERY - MySQL Slave Delay Port 3306 on labsdb1003 is OK: OK replication delay seconds [23:02:08] RECOVERY - MySQL Idle Transactions Port 3307 on labsdb1003 is OK: OK longest blocking idle transaction sleeps for seconds [23:02:08] RECOVERY - MySQL Slave Delay Port 3307 on labsdb1003 is OK: OK replication delay seconds [23:02:18] RECOVERY - MySQL Idle Transactions Port 3308 on labsdb1003 is OK: OK longest blocking idle transaction sleeps for seconds [23:02:18] RECOVERY - MySQL Slave Delay Port 3308 on labsdb1003 is OK: OK replication delay seconds [23:02:28] RECOVERY - MySQL Recent Restart Port 3306 on labsdb1003 is OK: OK seconds since restart [23:02:28] RECOVERY - MySQL Slave Running Port 3306 on labsdb1003 is OK: OK replication [23:02:38] RECOVERY - MySQL Recent Restart Port 3307 on labsdb1003 is OK: OK seconds since restart [23:02:38] RECOVERY - MySQL Slave Running Port 3307 on labsdb1003 is OK: OK replication [23:02:48] RECOVERY - Varnish traffic logger on cp1041 is 
OK: PROCS OK: 3 processes with command name varnishncsa [23:04:45] PROBLEM - Puppet freshness on xenon is CRITICAL: No successful Puppet run in the last 10 hours [23:18:35] PROBLEM - Host bits-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:18:45] PROBLEM - Host wiktionary-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:18:47] PROBLEM - Host wikibooks-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:18:50] PROBLEM - Host upload-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:18:52] !log pgehres synchronized php-1.22wmf1/extensions/CentralAuth/maintenance/migrateAccount.php 'Updating CentralAuth maintenance script' [23:19:15] PROBLEM - Host wikiquote-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:19:17] PROBLEM - Host wikimedia-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:19:20] PROBLEM - Host wikiversity-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:19:22] PROBLEM - Host wikinews-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:19:24] oh shit [23:19:24] PROBLEM - Host wikipedia-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:19:26] PROBLEM - Host wikisource-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:19:28] PROBLEM - Host foundation-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:19:31] PROBLEM - Host mediawiki-lb.esams.wikimedia.org_ipv6 is DOWN: PING CRITICAL - Packet loss = 100% [23:20:33] PROBLEM - Host wikibooks-lb.esams.wikimedia.org_ipv6_https is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::4 [23:20:35] PROBLEM - Host upload-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:21:01] LeslieCarr: do you know what's up? [23:21:03] is this networking?
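[Editor's note: the burst of "NRPE: Command check_X not defined" alerts on labsdb1003 and stat1002 a few minutes earlier means the monitoring server asked those agents to run checks their local nrpe.cfg did not yet define; the definitions evidently arrived (presumably via puppet) before the 23:01 recoveries. A minimal sketch of the failing lookup, parsing nrpe.cfg-style `command[name]=plugin line` entries; the config text below is a hypothetical example, not the real file.]

```python
import re

# nrpe.cfg lists the commands an agent is willing to run; a request for
# any name absent from it yields "NRPE: Command <name> not defined".
NRPE_CFG = """
command[check_disk_space]=/usr/lib/nagios/plugins/check_disk -w 10% -c 5%
command[check_mysql_slave_delay_3306]=/usr/local/bin/check_mysql_slave_delay --port 3306
"""

def defined_commands(cfg_text):
    # Collect the names inside command[...] stanzas.
    return set(re.findall(r"^command\[([^\]]+)\]=", cfg_text, re.MULTILINE))

def run_check(cfg_text, name):
    if name not in defined_commands(cfg_text):
        return f"NRPE: Command {name} not defined"
    return "OK (would execute the configured plugin line)"

print(run_check(NRPE_CFG, "check_mysql_slave_delay_3308"))
# NRPE: Command check_mysql_slave_delay_3308 not defined
```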
[23:21:13] PROBLEM - Host wikinews-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:21:15] PROBLEM - Host wikimedia-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:21:16] dunno, can't reach the host [23:21:18] PROBLEM - Host wikipedia-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:21:20] PROBLEM - Host wiktionary-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:21:22] PROBLEM - Host wikisource-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:21:24] PROBLEM - Host wikiquote-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:21:27] PROBLEM - Host wikiversity-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:21:27] just IPv6 ? [23:21:40] yeah [23:21:44] amslvs3 is ipv6 [23:21:49] PROBLEM - Host bits-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:22:01] I can't see eqiad over ipv6 either [23:22:02] from here [23:22:05] so it's not esams [23:22:12] last hop is XO [23:22:19] PROBLEM - Host foundation-lb.esams.wikimedia.org_ipv6_https is DOWN: PING CRITICAL - Packet loss = 100% [23:22:21] cir1.ashburn-va.us.xo.net specifically [23:22:22] LeslieCarr: ^^ [23:22:27] thanks [23:22:31] back up [23:22:41] ping6 wikipedia-lb from fenari works though [23:22:57] that's over our link though [23:22:59] RECOVERY - Host wikipedia-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 86.75 ms [23:23:03] RECOVERY - Host wikiquote-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 85.65 ms [23:23:05] RECOVERY - Host foundation-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 85.83 ms [23:23:05] there it is again, yep [23:23:07] RECOVERY - Host wikiversity-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 86.98 ms [23:23:10] RECOVERY - Host 
wikinews-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 87.52 ms [23:23:12] RECOVERY - Host wikisource-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 87.74 ms [23:23:15] RECOVERY - Host wikimedia-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 87.00 ms [23:23:17] RECOVERY - Host mediawiki-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 86.70 ms [23:23:19] RECOVERY - Host wiktionary-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 87.65 ms [23:23:28] RECOVERY - Host bits-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 86.80 ms [23:23:31] RECOVERY - Host wikibooks-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 87.03 ms [23:23:33] RECOVERY - Host upload-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 89.10 ms [23:23:44] still lossy and with a higher latency than usual [23:24:05] and back down again [23:24:22] down as in lower latency [23:24:28] and that is why i have the unlimited texting plan. 
[23:24:28] haha [23:24:40] and back up [23:24:51] so 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=27 ttl=44 time=156 ms [23:24:54] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=28 ttl=44 time=157 ms [23:24:57] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=29 ttl=44 time=160 ms [23:25:00] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=30 ttl=44 time=217 ms [23:25:03] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=31 ttl=44 time=255 ms [23:25:06] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=32 ttl=44 time=247 ms [23:25:09] stable at ~250ms [23:25:12] compared to ~150ms which is the usual [23:25:15] same path [23:25:41] RECOVERY - Host wikibooks-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 86.85 ms [23:25:43] RECOVERY - Host upload-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 88.87 ms [23:26:21] RECOVERY - Host wikinews-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 87.59 ms [23:26:22] LeslieCarr: it's doing this 150->250->150 every few minutes [23:26:23] RECOVERY - Host wikimedia-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 86.97 ms [23:26:31] RECOVERY - Host wikipedia-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 86.68 ms [23:26:32] and lots of packet loss [23:26:34] RECOVERY - Host wiktionary-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 87.67 ms [23:26:36] RECOVERY - Host wikiquote-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 86.61 ms [23:26:38] RECOVERY - Host wikisource-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 87.89 ms [23:26:41] RECOVERY - Host wikiversity-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 88.10 ms [23:27:03] RECOVERY - Host bits-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 86.84 ms
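[Editor's note: the eyeball diagnosis above (a ~150 ms baseline jumping to ~250-280 ms on the same path) can be mechanized by parsing the ping output. A small sketch follows; the sample lines are taken from the paste above, and the 1.5x "degraded" threshold is an arbitrary choice for illustration.]

```python
import re

# Extract RTT samples from ping6-style output and flag a mean RTT well
# above a known baseline, as in the 150 ms -> 250 ms flapping seen here.
SAMPLE = """\
64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=27 ttl=44 time=156 ms
64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=30 ttl=44 time=217 ms
64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=31 ttl=44 time=255 ms
"""

def rtts(ping_output):
    """Pull the time=... values out of ping output, as floats (ms)."""
    return [float(m) for m in re.findall(r"time=([\d.]+) ms", ping_output)]

def degraded(ping_output, baseline_ms=150.0, factor=1.5):
    """True if the mean RTT exceeds baseline_ms by the given factor."""
    samples = rtts(ping_output)
    return sum(samples) / len(samples) > baseline_ms * factor

print(rtts(SAMPLE))      # [156.0, 217.0, 255.0]
print(degraded(SAMPLE))  # False (mean ~209 ms, threshold 225 ms)
```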
[23:27:06] home -> eqiad is over level3->xo, eqiad->home is tele2->level3 [23:27:33] RECOVERY - Host foundation-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 85.84 ms [23:27:48] LeslieCarr: ^^^ [23:28:14] hrm, i'll try taking down xo [23:28:45] as long as you're on top of this [23:29:34] i can login and start killing peers but I'd feel more comfortable if you do :) [23:29:34] hehe [23:30:24] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=277 ttl=44 time=182 ms [23:30:27] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=278 ttl=44 time=237 ms [23:30:30] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=279 ttl=44 time=277 ms [23:30:34] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=280 ttl=44 time=280 ms [23:30:37] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=282 ttl=44 time=281 ms [23:30:40] 64 bytes from 2620:0:861:2:7a2b:cbff:fe09:11ba: icmp_seq=284 ttl=44 time=280 ms [23:30:43] stable at 280ms now [23:30:45] !log deactivated XO ipv6 transit [23:30:46] jesus [23:30:48] I think I'm crossing the atlantic three times or something :) [23:30:52] Logged the message, Mistress of the network gear. [23:30:57] interesting traceroutes ? [23:31:12] now I'm going through XO [23:31:13] but over Tampa [23:31:14] haha [23:31:23] hehehe [23:31:26] back to 180ms now [23:31:30] well as long as that one is stable ... [23:31:33] oh via which route ? [23:31:46] ams level3 -> tampa xo [23:31:58] 280ms again [23:33:28] kill xo @ tampa too? [23:35:49] done [23:35:49] tinet now [23:35:51] any better ? 
[23:35:53] cool [23:35:55] 150ms [23:35:58] let's wait a bit [23:35:58] :) [23:37:40] looks stable [23:42:44] New patchset: Dzahn; "add metrics.wikimedia.org SSL cert for RT-4912" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58632 [23:45:38] PROBLEM - Varnish traffic logger on cp1041 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [23:45:53] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58632 [23:52:08] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [23:52:08] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [23:52:08] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [23:52:22] New patchset: Dzahn; "add frdata.wikimedia.org SSL cert for RT-4895" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58633 [23:55:46] New patchset: Ori.livneh; "Add python-jsonschema & pymongo to EL dependencies" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58635 [23:57:23] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58633
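[Editor's note: throughout this log the "Varnish traffic logger on cp1041" check flaps between OK (3 processes) and CRITICAL (2 processes): a process-count check that expects exactly three varnishncsa instances and goes critical whenever one of them dies. A minimal sketch of that logic, with a simulated process table in place of a real process-table walk; the function name and return codes follow Nagios plugin conventions (0 = OK, 2 = CRITICAL), but this is an illustration, not the actual check_procs plugin.]

```python
# Sketch of a Nagios-style PROCS check: count processes matching a
# command name and compare against the expected count. The process
# list is simulated; a real check would read the live process table.

def check_procs(process_names, command, expected):
    found = sum(1 for name in process_names if name == command)
    if found == expected:
        return (0, f"PROCS OK: {found} processes with command name {command}")
    return (2, f"PROCS CRITICAL: {found} processes with command name {command}")

# One logger has died, leaving 2 of the expected 3:
code, msg = check_procs(["varnishncsa", "varnishncsa", "varnishd"], "varnishncsa", 3)
print(code, msg)  # 2 PROCS CRITICAL: 2 processes with command name varnishncsa
```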