[00:06:57] ops, my buddies! [00:07:02] can you approve this one pretty please? [00:07:13] https://gerrit.wikimedia.org/r/#/c/12299/ [00:07:25] mr. maplebed can help me maybe? [00:08:07] ottomata: do we have ct approval ? [00:10:49] !log stopped puppet on cp1020 until tomorrow - testing new build of the squid mobile redirector on one server until tomorrow [00:10:54] Logged the message, Master [00:10:56] preilly: ^^ [00:11:08] binasher: SWEET [00:13:21] does robla approval != CT approval? [00:13:29] does not [00:13:33] iiiinteresting [00:13:37] I will tell diederik that for the future [00:13:40] mr. woosters? [00:13:41] cool [00:13:50] yes mr ottoman [00:13:53] do you know about mr. Evan Rosen wanting access to stat1 and some other stuff? [00:13:59] Diederik told me to create an account for him [00:14:16] https://rt.wikimedia.org/Ticket/Display.html?id=3119 [00:14:23] yes, i have asked notpeter to look at it tomorrow morning when he gets online [00:14:34] ok, i've already done puppetization of everything [00:14:38] except for internproxy [00:14:46] cause I don't know what that is, and I dont' see it in puppet [00:14:49] so it just needs approved/merged [00:15:18] i can merge it if i get your ok woosters (or we can wait until morn) [00:15:30] that is ok [00:15:40] but still need the acct setup in internproxy [00:16:20] leslie - if asher is there, he could tell u about internproxy [00:16:27] ah [00:16:54] ottomata: i was about to run to the gym, mind if we get this tomorrow (aka notpeter probably getting it) [00:17:51] LeslieCarr: its a thing that Ryan_Lane built :D [00:18:55] then definitely too much for a quick 30 second explanation ;) [00:19:05] yeah no probs [00:19:18] i think evan doesn't need it today [00:19:37] hey binasher, who else should I get to approve my mysql /var/run vs. /run stuff? [00:19:44] if you and mark both +1 ed it? [00:20:01] ottomata: do you have rights to merge it yourself? 
[00:20:35] New review: Lcarr; "Plus 1 pending internproxy setup" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/12299 [00:20:47] i'm off :) see you later [00:21:04] no [00:21:07] i don't have any merge powers [00:21:25] wouldn't it be nice if I did though? [00:21:28] oh yes indeedy [00:22:19] hah [00:22:39] i'll merge it for you [00:22:44] binasher: what did I build? [00:22:55] oh [00:22:58] internproxy? heh [00:23:05] I really want tesla to die [00:23:14] i blame… the analinterns [00:23:26] yay, thank you binasher [00:23:38] lemme know when it is on puppetmaster and I will run it on stat1 to see it do good [00:23:45] agreed [00:23:48] it shouldn't affect anywhere else [00:24:40] New review: Asher; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11296 [00:26:00] gah [00:26:01] Your change could not be merged due to a path conflict. [00:26:02] Please merge (or rebase) the change locally and upload the resolution for review. [00:38:58] ottomata: i'm about to get picked up from the office and won't be back on for a few hours. could you do the rebase? [00:47:58] brwwwwww [00:48:05] yeah, i will do it tomorrow [00:48:10] ergh, eating dinner now [00:48:14] path conflict [00:48:15] hmm [00:59:42] New patchset: Faidon; "Add virt6-8 to netboot.cfg" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12308 [01:00:14] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12308 [01:00:14] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12308 [01:00:20] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12308 [01:02:07] New patchset: Faidon; "Add virt6-8 to site.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12309 [01:02:39] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12309 [01:03:12] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12309 [01:03:15] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12309 [01:49:02] New patchset: Faidon; "partman: use sda-sdh for Ciscos instead of sdc-sdj" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12316 [01:49:36] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12316 [01:49:36] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12316 [01:53:30] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12316 [05:30:44] !log clearing mobile varnish cache - my friend can't expand some article categories on his iphone after rebooting and clearing cache [05:30:50] Logged the message, Master [07:48:14] what you guys think of providing a server to freenode to host their ircd as some kind of long term donation [09:11:54] petan, they only have to ask a decommissioned server, not hard [09:12:47] they can't have a server in the WMF datacentre because they want full access to it, basically a dedicated server they own, and this is not possible [09:13:28] Nemo_bis: it could be a virtual server on labs [09:13:33] that would be secure [09:13:49] or it could be a decommisioned server which would be firewalled [09:14:01] decommissioned servers go out [09:14:04] also they don't want full access, they need non root access to ssh to user who can run ircd [09:14:16] need or want? [09:14:23] need to maintain ircd [09:15:56] "we greatly prefer dedicated machines with root access" [09:15:59] http://freenode.net/hosting_ircd.shtml [09:16:06] prefer, but not demand [09:16:26] anyway if it was on labs or such, they could have root anyway. 
[09:16:44] machines there are virtual but so powerfull it would be enough for ircd [09:17:08] anyway I still think that they should be fine with non root [09:17:14] I can ask them [09:17:24] but I wanted some input from ops first, if it's even possible [09:17:42] no, they're usually not fine with non root [09:17:56] Nemo_bis: I can ask them [09:18:08] question now is if it's even possible [09:18:27] I think it's better to investigate whether they're interested first [09:18:51] there is a little point in investigating that if it wouldn't be even possible [09:19:01] I would prefer to ask someone from ops first [09:19:10] * Nemo_bis shrugs [09:19:16] if there was some option, then I ask freenode [09:20:45] suggestions in the past to actively (= financially) helping FreeNode have not been very favourably received (hosting donations ~ financial help) [09:31:15] yes, I disagree with financial support but I like the idea of providing a server resources to freenode [09:31:24] it seems to be a better option [09:31:33] more transparent as well [09:31:44] you can never know where the money would go to [09:32:08] but providing the servers, given there are many of them we could use is no problem at all, it's cheap and useful for freenode [09:39:20] oh you mean like irc.wikimedia.org would be linked into freenode? [09:41:13] http://freenode.net/hosting_ircd.shtml [09:42:57] sorry, i obviously should have read more backlog first [09:46:12] agree to petan (server is better than financial) if we really want to give them root on something misc. 
that is totally separated from other stuff and we can take the network requirements [09:47:51] in this moment I have no idea if they really require to have a root [09:48:07] I doubt so [09:48:34] but question is if someone from operation team would support that idea of providing some resources to freenode as kind of donation [09:52:20] but you can talk them out of requiring root if you tell them its all puppetized and they can use gerrit [09:52:24] maybe [09:54:30] i would support the idea of donating some resources to them as we indeed heavily use their network. ask network people about the requirements and separation though and hardware people about availability of "misc" hardware with these specs (maybe separate tickets is best) [11:14:32] mutante: right [11:27:01] mutante: ok, is there any decomissioned server that could be used for this purpose? they would like to know some technical details [11:27:30] petan: please ask robh about that part [11:27:34] ok [11:28:45] mutante: what is his email? [11:30:03] petan: let me create a ticket for it instead, what is your email to be added to cc: [11:30:15] benapetr@gmail.com [11:30:25] a ticket? decommissioned servers are always handled all together [11:33:48] petan: you now have 3174@rt.wikimedia.org. feel free to mail summary of the talk above, the specs, the link etc directly to that? ok? [11:34:05] ok [11:34:17] Nemo_bis: there is more than one, they need specific specs, and just one.. needs the context what for etc.. [11:35:41] a server to use for what? [11:35:49] a freenode irc server [11:35:55] did you talk to mark about this? :) [11:36:14] no, i am trying to get it into tickets so everybody can reply :) [11:36:17] ah ok [11:36:59] I think I'm fixing the "No credentials for your account" bug right now [11:37:09] or I'm breaking the ldap extension. 
one of the two [11:37:21] but exactly yea, i did point out maybe its best to have 2 tickets, one for hardware donation and one for networking (can we separate, do we even want to) and access request or not [11:37:32] if this means that we're setting up privileged channels to get decommissioned servers, I don't like it (but it's p to Rob) [11:37:41] maybe it turns out that we _just_ donate servers but dont run anything on our network.. but maybe not [11:38:01] why would we donate the server? [11:38:06] if we are running it on our network [11:38:16] well, all of a sudden it was about "decommissioned servers" [11:38:21] * Ryan_Lane nods [11:38:22] which sounds like hardware donation [11:38:29] we decommission them because they are old [11:38:34] but before it was about actually running a server that fits their specs [11:38:43] and we don't want to deal with them anymore, because they might break [11:38:45] yea, another reason to clear that all up in tickets:) [11:38:49] yeah [11:39:16] Ryan_Lane: any ideas why the %$%$%!$@ Ciscos do not boot? [11:39:24] paravoid: no idea [11:39:25] Nemo_bis: in the past we have had (iirc) decommissioned servers we could not find homes for [11:39:29] and how fucking long does it take for them to boot [11:39:35] the one I build manually with the same config worked [11:39:37] that said.... 
[11:39:40] it took a few tries [11:39:47] because the raid had to build itself [11:39:49] good luck with LDAP [11:39:53] and it timed out a few times [11:39:58] fucking software raid [11:40:11] paravoid: it takes ..forever [11:40:16] but eventually they do it [11:40:22] and yes, the ciscos take *forever* to boot [11:40:33] because it checks all of the ram every time [11:40:45] and with 185GB of ram, that's slow :) [11:41:22] it's not just that [11:41:23] apergos, not recently apparently https://blog.wikimedia.org/2011/06/13/server-decommission-donations/ [11:41:24] everything's slow [11:41:25] how could they think it was a good idea to not have a switch to skip the memory check in the bios?? [11:41:54] Nemo_bis: have to check with rob and mark, they will know about more recent decommissions [11:42:04] but I am pretty sure a batch went from esams without homes [11:43:19] petan: so there are several ways: nothing at all / just donate old servers / just donate money / donate a server that is actually new and they can still use / run on our network (with or without access for them) etc. first needs somebody to sort out what different people were expecting. [11:43:24] Ryan_Lane: I have no idea what their hw requirements are... but I am sure that ircd's would run fine even on some virtual server [11:43:41] mutante: I am writing a mail to ticket [11:43:56] I'd prefer not to run that in labs :) [11:43:58] mutante: donating money doesn't sound like a good idea to me [11:43:58] the requirements are here: http://freenode.net/hosting_ircd.shtml [11:45:40] yep, already agreed on that. and thanks for writing it all as a mail [11:48:20] we have several donated servers in Tampa that we could send. older PE 1950's and the old SUN search boxes....I don't think they will support virtualization as they sit. 
[11:48:38] meant severals servers for donation [11:53:26] I don't think they accept the hardware as it is, but rather a servers that are already on some network and running, but I don't really know that [11:53:42] i guess a plus for us actually running a server would be having more control over limits when running into issues at hackathons etc [11:54:10] including possibility to have our bots running on local ircd server with no limits :) [11:54:17] nagios etc [11:54:23] which sometimes get killed for spamming [11:56:19] Ryan_Lane: it doesn't need to run on labs, but if it was running on virtual system you would probably have better control over the network, even if they had root access [11:56:42] I just wanted to say that these ircd are usually lightweight programs [11:57:02] "Our servers must be able to open at least 15,000 concurrent connections, so per-user and system-wide file descriptor limits must be high enough to accomodate this. Suggested values are at least 24,000 for the freenode user account, and 25,000 system-wide." [11:57:20] mutante: but these connections have low traffix [11:57:22] traffic [11:57:29] irc protocol is just a text [11:58:20] petan: what do you mean by "wiki projects" [11:58:26] which ones are rejected by freenode? 
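The freenode figures quoted above (at least 15,000 concurrent connections, and a suggested 24,000 file descriptors for the freenode user account) can be compared against a host's actual limits. The following is only a minimal sketch, assuming a Unix host and Python's standard `resource` module; the names `nofile_shortfall`, `current_soft_nofile` and `SUGGESTED_USER_NOFILE` are made up for illustration and are not part of any freenode or WMF tooling.

```python
import resource

# Per-account value quoted from freenode's hosting page in the log above.
SUGGESTED_USER_NOFILE = 24_000


def nofile_shortfall(soft_limit: int, required: int = SUGGESTED_USER_NOFILE) -> int:
    """Return how many descriptors soft_limit falls short of required (0 if enough)."""
    if soft_limit == resource.RLIM_INFINITY:
        # An unlimited soft limit always satisfies the requirement.
        return 0
    return max(0, required - soft_limit)


def current_soft_nofile() -> int:
    """Read the soft RLIMIT_NOFILE of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    missing = nofile_shortfall(current_soft_nofile())
    if missing:
        print(f"soft nofile limit is {missing} descriptors short of the suggested value")
    else:
        print("soft nofile limit meets the suggested value")
```

This only covers the per-process limit of whichever account runs the script; the system-wide figure from the same quote (25,000) would have to be checked separately.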
[11:58:27] paravoid: where [11:58:54] wikitech [11:59:03] paravoid: oh this, that was some complaint by vargent, not me, I only know that he registered a channel on freenode and he was rejected because it's a wiki and these aren't allowed to be hosted on freenode [11:59:08] varnent [11:59:37] he talked to sumana regarding that, I don't know details [12:00:04] but he said that freenode told him that wikimedia is affiliated to mediawiki and that is open source, thus we can have channels here or something like that [12:00:42] I think he was generally talking about 3rd wiki projects using mediawiki as their sw [12:01:33] nearly positive I just fixed the no nova creds for your account problem [12:01:40] on my test wiki [12:01:42] Ryan_Lane: yay [12:01:55] :) [12:01:59] now to think about the consequences of the solution [12:02:03] Nemo_bis: about hardware donation of decom'ed servers in general. there are no privileged channels. once it has been announced publicly and people all sent their request for hardware into RT and this is just a ticket in the queue like the others [12:03:00] so, the only time that the domain needs to be re-fetched is when a login occurs from a token [12:03:04] the long-lived one [12:03:10] otherwise it can get the domain from the session [12:03:45] the token is only valid in one login-session at a time. if you login again, with the "remember me" option selected, it overwrites your token [12:03:52] and your old long-lived session is gone [12:04:16] so, I set the domain in the user's options (which gets stored in the database) as well as setting it for the session [12:04:28] hm. 
[12:04:33] that's not going to work [12:04:40] I need to only do it if the token is also set [12:05:06] otherwise a non-token login can modify a token-login's domain [12:16:52] not at all -> i guess a plus for us actually running a server would be having more control over limits when running into issues at hackathons etc [12:27:12] \o/ [12:27:14] fixed it [12:27:29] and did it a way that is actually secure [12:30:05] pretty sure I fixed another bug by doing this too :) [12:38:05] of course, I need to make core changes for this [12:38:58] yay \o/! [12:39:04] 111oneoneone [12:40:57] heh [12:41:28] I'm going to backport my changes into the version of mediawiki we're using on labs, of course [12:41:36] well, on labsconsole itself, anyway [12:42:02] I was waiting on this to enforce two-factor ;) [12:46:18] mutante: btw, what's the mailman admin password? [12:46:33] or, to make my question more clear [12:47:02] we are getting bounces to gerrit@ because apparently it mails mediawiki-commits and mediawiki-commits doesn't allow it because of 'implicit destination' [12:47:09] if freenode needs root on the system it's likely a non-starter [12:47:25] (I have to do something while I wait for the @#%$ Ciscos to boot) [12:47:29] heh [12:47:36] paravoid: feel my pain [12:47:50] I rebooted one of those fuckers about 50 times when we were trying to get PXE working [12:49:02] the thing is, I can't understand why Andrew's machine booted and mine isn't [12:49:32] * Ryan_Lane nods [12:51:28] I have little experience with raid10 md, but how can grub boot from them? [12:51:34] they're striped (too), aren't they? 
[12:51:56] the same way it would boot from a raid1 [12:52:13] grub has support for booting from a software raid, I believe [12:53:18] oh, grub2 does that [12:53:19] didn't know that [12:53:24] I've always avoided grub2 :) [12:55:10] the other extremely annoying thing about the ciscos is their mgmt timeout [12:58:10] Nemo_bis: oh, i got you wrong first then, i thought you worry about privileges for receiving used hardware originally bought from donations [12:59:27] mutante, no, I just worry about Rob having to manage many single-server donation processes [12:59:32] :) [12:59:54] afraid he has to but this is not a new thing at all [13:00:17] with all these requests for hardware from public that arrived like a giant batch job [13:01:30] paravoid: the timeout is annoying, but connecting to the console is even more annoying [13:01:35] because you can't disconnect [13:01:40] yes, that too! [13:01:43] goddammit [13:01:56] who in their right minds doesn't include an escape sequence? [13:35:12] New patchset: Dereckson; "(bug 37778) Enable LiquidThreads on fi.wikimedia.org" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12366 [13:35:20] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12366 [13:37:01] something's seriously wrong with this partitioning [13:37:40] Ryan_Lane: btw, do you like /b? 
[13:37:41] I don't [13:38:09] I'd say md + LVM + ext4, one LV for / and perhaps another LV for instance storage [13:47:10] New review: Dereckson; "shellpolicy issue resolved" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/11843 [14:06:40] http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=&s=by+name&c=Application%2520servers%2520pmtpa&tab=m&vn= [14:06:41] hmm [14:07:33] https://gdash.wikimedia.org/dashboards/reqerror/ [14:07:41] fcking nagios [14:08:02] well I did get a page jus tnow [14:08:27] srv272 load spiking [14:08:32] seems like yesterday's fault again [14:08:46] got a page, ack [14:09:18] and the one its OK again [14:09:28] yeah [14:09:33] paravoid: it was originally /a [14:09:44] I was looking at 272 (as in the logs, couldn't actually get on it) [14:09:47] till the giant gluster splitbrain [14:09:54] then I moved everything to /b [14:10:09] nothing too special in the apache logs though [14:10:10] paravoid: this is a good time to change it to something you like [14:10:27] this is a good time to have a look at the downtime... [14:10:30] dammit [14:10:35] you could just mount it at /var/lib/nova/instances [14:10:40] downtime? [14:11:13] ugh [14:11:30] what happened to ms-be5 btw [14:11:31] !log powercycling srv272, unreachable due to load spike [14:11:37] Logged the message, Master [14:11:38] maplebed: ? [14:11:39] it must have swapdeathed [14:11:45] http://ganglia.wikimedia.org/latest/?c=Application%20servers%20pmtpa&h=srv272.pmtpa.wmnet&m=load_one&r=hour&s=by%20name&hc=4&mc=2 [14:11:50] memcache issues [14:12:08] lots and lots of memcache issues [14:12:40] just that one box [14:12:46] mark: same as yesterday [14:13:01] cache storm for one object? [14:13:27] where do you see the memcached being a problem? 
[14:13:28] I did that [14:13:52] PROBLEM - Apache HTTP on srv269 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:14:06] on that box [14:14:10] apergos: /home/w/logs/memcached.log [14:14:19] RECOVERY - Apache HTTP on mw42 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 7.838 second response time [14:14:24] "2012-06-21 14:14:13 srv259 plwiki: Error parsing memcached response" [14:14:28] RECOVERY - Apache HTTP on srv203 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 6.983 second response time [14:14:37] RECOVERY - Apache HTTP on srv236 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 1.605 second response time [14:14:46] RECOVERY - Apache HTTP on mw9 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.888 second response time [14:14:46] RECOVERY - Apache HTTP on mw19 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 6.690 second response time [14:14:55] RECOVERY - SSH on srv272 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [14:14:55] RECOVERY - LVS HTTPS IPv4 on wikiversity-lb.pmtpa.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 47321 bytes in 0.129 seconds [14:15:04] RECOVERY - LVS HTTP IPv4 on foundation-lb.pmtpa.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 38959 bytes in 0.147 seconds [14:15:04] RECOVERY - Apache HTTP on mw1 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.238 second response time [14:15:13] RECOVERY - Apache HTTP on mw10 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.026 second response time [14:15:13] RECOVERY - Apache HTTP on srv269 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.091 second response time [14:15:14] okay, the powercycle fixed it it seems [14:15:22] RECOVERY - Apache HTTP on mw7 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.162 second response time [14:15:22] ACKNOWLEDGEMENT - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100% daniel_zahn RT #2595: investigate ms-be5 for hardware failure [14:15:22] RECOVERY - Apache HTTP on mw11 is OK: HTTP OK - HTTP/1.1 301 Moved 
Permanently - 0.902 second response time [14:15:23] now let's look at atop to figure out why the fuck that machine died [14:15:24] annoying that one box can take us down [14:15:32] RECOVERY - Apache HTTP on mw16 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.034 second response time [14:15:32] RECOVERY - Apache HTTP on mw20 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.045 second response time [14:15:32] RECOVERY - Apache HTTP on mw26 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.455 second response time [14:15:32] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online [14:15:36] Ryan_Lane: very [14:15:40] RECOVERY - Apache HTTP on mw43 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.026 second response time [14:15:40] RECOVERY - Apache HTTP on mw55 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.029 second response time [14:15:40] RECOVERY - Apache HTTP on mw5 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.024 second response time [14:15:40] RECOVERY - Apache HTTP on mw21 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.034 second response time [14:15:40] RECOVERY - Apache HTTP on mw24 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.036 second response time [14:15:41] RECOVERY - Apache HTTP on mw38 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.032 second response time [14:15:41] RECOVERY - Apache HTTP on mw49 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.035 second response time [14:15:42] RECOVERY - Apache HTTP on mw14 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.132 second response time [14:15:42] RECOVERY - Apache HTTP on mw37 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 1.789 second response time [14:15:43] RECOVERY - Apache HTTP on mw29 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 2.138 second response time [14:15:43] RECOVERY - Apache HTTP on mw52 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 2.314 second response time [14:15:49] RECOVERY - Apache HTTP on 
mw6 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.031 second response time [14:15:49] RECOVERY - Apache HTTP on mw32 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.023 second response time [14:15:49] RECOVERY - Apache HTTP on mw53 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.033 second response time [14:15:49] RECOVERY - Apache HTTP on mw41 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.030 second response time [14:15:49] RECOVERY - Apache HTTP on mw36 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.033 second response time [14:15:50] RECOVERY - Apache HTTP on mw34 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.041 second response time [14:15:50] RECOVERY - Apache HTTP on mw46 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.045 second response time [14:15:51] RECOVERY - Apache HTTP on mw25 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.021 second response time [14:15:51] RECOVERY - Apache HTTP on mw2 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.041 second response time [14:15:52] RECOVERY - Apache HTTP on mw4 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.192 second response time [14:15:52] RECOVERY - Apache HTTP on mw54 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.280 second response time [14:15:53] RECOVERY - Apache HTTP on mw3 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.046 second response time [14:15:53] RECOVERY - Apache HTTP on mw30 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.026 second response time [14:15:54] RECOVERY - Apache HTTP on mw17 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.033 second response time [14:15:54] RECOVERY - Apache HTTP on mw13 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.038 second response time [14:15:55] RECOVERY - Apache HTTP on mw31 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.044 second response time [14:15:55] RECOVERY - Apache HTTP on mw44 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.381 second response time [14:15:56] 
RECOVERY - Apache HTTP on mw57 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.871 second response time [14:15:56] RECOVERY - Apache HTTP on mw51 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 3.328 second response time [14:15:57] RECOVERY - Apache HTTP on mw18 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 4.217 second response time [14:15:58] RECOVERY - Apache HTTP on mw45 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.031 second response time [14:15:58] RECOVERY - Apache HTTP on mw28 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.055 second response time [14:15:58] RECOVERY - Apache HTTP on mw47 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.039 second response time [14:15:59] RECOVERY - Apache HTTP on mw33 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.034 second response time [14:15:59] RECOVERY - Apache HTTP on mw35 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.040 second response time [14:16:00] RECOVERY - Apache HTTP on mw48 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.034 second response time [14:16:00] RECOVERY - Apache HTTP on mw27 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.035 second response time [14:16:01] RECOVERY - Apache HTTP on mw12 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.395 second response time [14:16:01] RECOVERY - Apache HTTP on mw40 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.038 second response time [14:16:02] RECOVERY - Apache HTTP on mw15 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.027 second response time [14:16:02] RECOVERY - Apache HTTP on mw22 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.053 second response time [14:16:03] RECOVERY - Apache HTTP on mw39 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 3.477 second response time [14:16:07] RECOVERY - Apache HTTP on mw56 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.021 second response time [14:16:12] hopes the bot isnt banned [14:16:16] RECOVERY - Apache HTTP on mw8 is OK: HTTP OK - HTTP/1.1 
301 Moved Permanently - 0.029 second response time [14:16:27] I restarted the bot btw [14:17:01] RECOVERY - Apache HTTP on mw58 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.048 second response time [14:22:19] jobrunners & memcached on the same machine??? [14:22:21] W-T-F [14:22:30] maplebed: re: RT-2595 please select: (powercycle since still in rotation, take out of rotation and give up on it due to previous hardware issues [14:22:32] for a long time now [14:22:48] I thought we tried not to do that... [14:23:56] 7518 634667 0 7442K 218.8M 60800K 2388K 2796K 91% php [14:23:57] mutante: sorry, I forgot about that ticket. [14:23:59] 7517 607367 0 7442K 218.7M 60628K 1492K 1716K 91% php [14:24:02] 7516 34015 0 7442K 256.0M 97960K 24476K 24036K 86% php [14:24:06] that's the last minutes of srv272 [14:24:07] ms-be5 being down is intentional at the moment [14:24:08] maplebed: normal to get refused on mgmt? [14:24:13] (putting ssds in) [14:24:18] maplebed: alright, gotcha [14:24:23] 18315 14604 922 7442K 240.3M 69116K 19132K 6220K 30% php [14:24:26] 18316 43009 905 7442K 239.6M 68964K 18024K 4716K 29% php [14:24:29] 18313 94696 1110 7442K 244.2M 74892K 20664K 8596K 29% php [14:24:32] 18312 18207 1130 7442K 232.7M 65828K 10268K 1672K 28% php [14:24:35] 18314 22268 1093 7442K 275.3M 98.5M 56076K 37844K 27% php [14:24:38] 7516 134208 0 0K 0K 0K 0K 0K 100% [14:24:41] 7518 1011e3 0 0K 0K 0K 0K 0K 100% [14:24:44] 7517 960000 0 0K 0K 0K 0K 0K 100% [14:24:46] heh [14:25:08] paravoid: job runners and memcache should not be on the same machine [14:25:13] if they are, that's a misconfiguration [14:25:28] they are indeed [14:25:29] who fixes that? [14:25:37] fix it in puppet [14:27:20] # srv258 - srv280 are application servers, job runners, memcached [14:27:23] mutante: oh, and re: the refused management - it's because I'm still attached. 
[14:27:24] that stuff is a total mess atm [14:27:27] include applicationserver::homeless, [14:27:27] applicationserver::jobrunner, [14:27:27] memcached [14:27:46] I'm not going to remove 22 memcached nor 22 job runners :) [14:27:53] yeah I thought we had a bunch with both [14:27:54] * Ryan_Lane groans [14:27:59] asher is gonna make dedicated memcacheds [14:27:59] I thought we changed this a while ago [14:28:01] and wanted to move em off [14:28:01] maplebed: heh.ok:) just happened to see it when looking at Nagios after the unrelated page [14:28:42] I could have sworn I made a change like 8 months ago to ensure job runners and memcache weren't on the same systems [14:28:47] :( [14:29:02] have job runners somehow creeped back onto memcache systems? :( [14:29:02] thre was something that got moved off but I do not remember what any more [14:29:12] and I just got the page [14:29:14] how useful. [14:29:15] that would suck if true [14:29:25] ahhhh andrewbogott [14:29:28] hahaha atleast my page came on time, score one for COSMOTE :-P [14:29:38] I didn't get a page [14:29:39] heh [14:29:44] Ryan_Lane wins [14:29:47] my settings are probably set to US times [14:29:58] I suppose you could put some of the mw boxes as job runners, then slowly remove them from the rest of the apaches.. Though, I guess it might need to be stopped manually [14:30:13] Leslie puts you on the right gateways after telling her your provider [14:30:13] https://gerrit.wikimedia.org/r/#/c/9342/ [14:30:14] ergh [14:34:15] New patchset: Ottomata; "/var/run has been moved to /run in Ubuntu Precise. Updating generic::mysql::server accordingly." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11296 [14:34:50] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11296 [14:38:52] ottomata: test branch is gone [14:39:04] it still exists, but we don't use it anymore [14:39:16] cool [14:39:20] so you merged it over to prod, right? 
[14:39:24] yes [14:39:26] which is why I had path conflicts with my stuff [14:39:28] but if you had unmerged changes.... [14:39:31] cool, that's fine [14:39:38] got it fixed I think [14:39:43] bogott had some changes that conflicted with mine [14:39:47] think i fixed them [14:39:56] which, maybe you can help me with real quick? [14:39:57] https://gerrit.wikimedia.org/r/#/c/11296/ [14:40:05] asher tried to merge this for me yesterday [14:40:08] but I had path conflicts [14:40:09] they are fixed now [14:40:16] couldyawouldya do a little mergy magic? [14:41:21] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11296 [14:41:24] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11296 [14:41:27] yayyyyyy [14:42:52] mark maaarrrrrrrrrrrrrrk [14:43:09] I'm sure he'll be annoyed by that :) [14:43:42] yes? [14:44:34] https://gerrit.wikimedia.org/r/#/c/11574/ [14:44:34] :) [14:44:45] Ryan_Lane, i see something weird with the merge [14:44:49] mysql.pp [14:45:00] now has 2 versions of the class named [14:45:00] generic::mysql::packages::client [14:45:15] ottomata: with which merge? [14:45:18] the one I just did? [14:45:25] and generic::mysql::server [14:45:25] nono [14:45:31] i think this happend with the test merge [14:45:40] bogott had put some of my prod changes into the test branch [14:45:43] so he could use them in labs [14:45:44] you need to give me more info than that [14:45:46] not sure how he did that [14:45:55] and i think [14:46:03] that when you merged either from prod -> test, or test back into prod [14:46:10] those changes were duplicated and brought back over [14:46:14] i think i can fix manually [14:46:19] so, fix it.. :) [14:46:24] oooook, i will [14:46:24] heheh [14:46:35] but, just wanted to let you know, in case this may have happened other places [14:47:04] \o/ [14:47:07] so, who's our partman expert? 
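Since the duplicated class definitions in mysql.pp "may have happened other places", a quick scan of a manifest's text can flag class names that are defined more than once. This is a rough sketch only, assuming definitions start a line as `class some::name`; the regex is a heuristic rather than a Puppet parser, and `duplicate_classes` plus the sample manifest are hypothetical, not taken from the operations/puppet repository.

```python
import re
from collections import Counter

# Matches "class some::name" at the start of a line. Resource-style
# declarations such as "class { 'name': }" are not matched, because the
# character after "class" is "{" rather than a name.
CLASS_DEF_RE = re.compile(r"^\s*class\s+([a-z0-9_:]+)", re.MULTILINE)


def duplicate_classes(manifest_text: str) -> list:
    """Return the sorted class names defined more than once in manifest_text."""
    counts = Counter(CLASS_DEF_RE.findall(manifest_text))
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical manifest text, for illustration only.
SAMPLE = """
class generic::mysql::packages::client { }
class generic::mysql::server($version = "5.1") { }
class generic::mysql::packages::client { }
class { 'generic::mysql::server': }
"""

if __name__ == "__main__":
    for name in duplicate_classes(SAMPLE):
        print("defined more than once:", name)
```

Running the same function over each manifest touched by the test-to-production merge would show whether other files picked up the same kind of duplication.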
[14:47:10] 2620:0:863:ed1a::/64 [14:47:10] *[BGP/170] 00:00:25, localpref 100, from 208.80.153.192 [14:47:10] AS path: 64601 I [14:47:10] > to 2001:7f8:1::a500:6939:1 via xe-1/1/0.0 [14:47:17] announced by my bgp implementation [14:47:23] mark: \o/ [14:49:53] New patchset: Ottomata; "mysql.pp - removing duplicate class definitions." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12376 [14:50:10] oooook Ryan_Lane, I have fixed it [14:50:16] mergy wergy? [14:50:29] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12376 [14:50:30] aaannnnnnd mark maaarrrrrrrrrrk [14:50:32] https://gerrit.wikimedia.org/r/#/c/11296/ :) [14:50:37] ottomata: it conflicted when I merged it [14:50:37] oop [14:50:47] ottomata: you're really asking to be /ignored [14:50:47] mark, not that one [14:50:48] this one: [14:50:48] https://gerrit.wikimedia.org/r/#/c/11574/ [14:50:58] Ryan_Lane, really? [14:50:59] ok sorry [14:51:07] what should I do to get things merged then? [14:51:17] add people as reviewers [14:51:20] I have done that [14:51:24] wait a bit [14:51:26] we're busy too [14:51:26] i have done that [14:51:35] this one has been in since last week [14:51:49] and hasn't had movement since tues [14:51:55] then poke some. don't badger [14:52:20] i am poking some now, as far as I know mark is the only one who can review this [14:52:25] this is the only overlapping time that we are both online [14:52:38] and he may have missed my request before since we were talking about the mysql stuff [14:53:16] and since he hasn't yet said "yes ok" or "no I'm busy" [14:53:31] I thought I'd ask again (in a hopefully perceived as lighthearted way) [14:54:16] i'd really like to work with everybody in ops so that there is no badgering at all [14:54:35] let's work together to figure out how to make that better, eh? [14:54:40] anywayyyyyy [14:54:54] you said "it conflicted when I merged it" [14:55:08] ottomata: It's funny. 
I wait weeks for merges when I push in MediaWiki changes too. [14:55:13] are you talking about https://gerrit.wikimedia.org/r/#/c/12376/ [14:55:13] ? [14:55:27] New patchset: Hashar; "/etc/wikimedia-realm containing $::realm" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12377 [14:55:33] yeah, that sucks, right? [14:55:37] Gerrit 2.4.1... rebase button! ;) [14:55:42] i mean, i'm not sure if it is the same with mediawiki [14:55:51] it's *way* *way* worse with mediawiki [14:55:54] since mediawiki is open source bla bla, affects tons of organizations [14:55:59] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12377 [14:56:07] do your changes to mediawiki block other work you are trying to do? [14:56:12] yes [14:56:15] aye yeah [14:56:16] suuuucky [14:56:23] do you have to badger people to get reviews? [14:56:40] I usually send an email to wikitech-l if something is waiting for too long [14:57:02] sending an email to ops list will likely be just as effective, and won't interrupt us when we're working on other things [14:58:35] aye ok, [14:58:36] New review: Andrew Bogott; "Right, and I do. Which kinda means that gerrit security is a single point of failure, right? Well,..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11892 [14:58:53] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11574 [14:58:56] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11574 [14:59:10] thank you mark! [14:59:22] let's see what it breaks [14:59:42] hehe,, yup :) [14:59:48] New review: Ryan Lane; "Yes, you'd need to also be able to merge it on sockpuppet. It doesn't happen automatically." 
[operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11892 [14:59:57] yunning puppet on oxygen now [15:00:00] running* [15:00:57] it's not merged on sockpuppet yet [15:01:10] oh bpfft, k :) [15:01:46] now it is [15:02:14] mmk.... [15:03:24] New review: Andrew Bogott; "OK, so as long as people aren't in the habit of making blind merges, that's pretty safe." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11892 [15:05:12] New patchset: Ottomata; "udp2log.pp - found a boo boo. Template is at 'udp2log/logrotate_udp2log.erb'." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12378 [15:05:46] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12378 [15:06:01] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11892 [15:06:04] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11892 [15:07:09] ok, mark, I found BooBoo #1, fixed here: [15:07:09] https://gerrit.wikimedia.org/r/12378 [15:13:36] Ryan_Lane, can you advise me on what I should do to get https://gerrit.wikimedia.org/r/#/c/12376/ merged? I know you are busy, but we have been talking about this now, so it seems appropriate to ask you. Alternatively I can send an email to ops list and wait. Whatcha think? [15:14:33] * hashar does the Boo Boo dance [15:14:45] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12378 [15:14:48] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12378 [15:18:13] thanks! 
lemme know when it is on sockpuppet [15:20:22] New review: Demon; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/5289 [15:20:42] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12376 [15:20:45] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12376 [15:20:53] ottomata: I don't mind being bothered for my broken merges :) [15:21:04] if I broke something I definitely want to fix it asap [15:25:09] !log stopping puppet on brewster [15:25:15] ok, thanks Ryan [15:25:15] Logged the message, notpeter [15:25:22] yw [15:25:31] oh. right. merge on sockpuppet [15:25:31] heh [15:25:51] ottomata: done [15:26:03] New patchset: Hashar; "apache does not know about INSTANCENAME, so remove it" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12380 [15:26:10] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12380 [15:27:05] RECOVERY - Puppet freshness on stat1 is OK: puppet ran at Thu Jun 21 15:26:42 UTC 2012 [15:28:47] I wonder if anyone notices they have overlapping deployment windows today? [15:29:21] New patchset: Ottomata; "site.pp - need to require Class["misc::udp2log"], not Class["udp2log"]" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12381 [15:29:52] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12381 [15:30:53] New patchset: Ottomata; "base.pp - removing trailing / on $run_directory" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12382 [15:31:24] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12382 [15:32:10] mark: want to do per-compute network nodes in labs at some point? 
:) [15:32:17] should be relatively straightforward [15:32:30] also mark: found BooBoo #2: https://gerrit.wikimedia.org/r/12381 [15:32:31] I think we just need to add a public IP to each of the compute nodes [15:32:32] no more boo boooooos! [15:32:37] (hopefully) [15:32:37] and change the routing [15:34:59] New patchset: Ottomata; "statistics.pp - installing sendmail-bin on stat servers so that they can send email." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12384 [15:35:31] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12384 [15:36:02] Ryan_Lane: this would be helpful but is not urgent: [15:36:03] https://gerrit.wikimedia.org/r/#/c/12382/ [15:36:17] the latest merge worked just fine :) [15:36:36] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12382 [15:36:38] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12382 [15:36:45] done [15:36:45] danke! [15:47:24] Ryan_Lane: did we write down somewhere what we discussed like half a year ago? :D [15:47:33] oh [15:47:37] that's for lvs :) [15:47:49] true [15:47:52] per-compute node network nodes is easier :) [15:47:58] ok can do [15:48:10] want to do it monday with me? [15:48:32] I'm moving udpmxircecho.py into the public repo [15:48:51] monday is fine I think [15:48:54] tomorrow i'm in the data center [15:49:19] ok [15:49:41] what changes will we need to do on the irc server to let a labs bot join and talk in a few channels? 
[15:49:45] RECOVERY - Puppet freshness on iron is OK: puppet ran at Thu Jun 21 15:49:38 UTC 2012 [15:49:57] I guess I could look at the irc config [15:50:12] yeah, I wouldn't want to make that change on a friday anyway :) [15:51:10] heh, Mark loves big changes on Fridays [15:51:42] RECOVERY - Puppet freshness on singer is OK: puppet ran at Thu Jun 21 15:51:20 UTC 2012 [15:54:06] RECOVERY - Puppet freshness on hooper is OK: puppet ran at Thu Jun 21 15:53:51 UTC 2012 [15:58:01] mark, i'm running to lunch, if you still have time today, could you merge this one too? Hopefully it will be the last BooBoo and then I can work on my other thing [15:58:01] https://gerrit.wikimedia.org/r/#/c/12381/ [15:58:04] thank you! [15:58:05] be back in a bit [15:58:20] (Ryan_Lane, thanks for your help with this stuff too, btw) [15:58:26] yw [15:59:01] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12381 [15:59:04] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12381 [15:59:28] New patchset: Demon; "Also add jenkins-bot to spammyusers to shut him up on IRC." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12387 [15:59:59] New review: Reedy; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12380 [15:59:59] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12380 [16:00:00] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12387 [16:04:33] Reedy: thanks [16:09:10] mark: hm. I'm not seeing any easy way to limit what a labs bot could do in the irc server... [16:09:36] no prefix support?
[16:09:37] New patchset: Hashar; "Move IRC bot from #mediawiki to #mediawiki-feed" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12388 [16:09:50] the docs aren't exactly good :) [16:09:54] hehe [16:10:10] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12388 [16:10:21] here it is: http://docs.ratbox.org/idx-ircd.shtml [16:10:29] in all of its crappiness [16:11:48] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [16:11:55] New patchset: Ryan Lane; "Move udpmxircecho.py into the public repo" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12389 [16:11:55] where is the ircd.conf in puppet again? [16:11:55] hashar: what about mw-jenkinsbot? [16:12:05] Krenair: it is done manually [16:12:09] ok [16:12:16] it is not really talking anyway [16:12:22] mark: private repo [16:12:22] not since march I noticed [16:12:27] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12389 [16:12:34] Krenair: I think it got moderated somehow :-] [16:12:57] Looks like it was just pushing status updates from jenkins jobs that are now disabled [16:13:37] MediaWiki-sqlite-phpunit and MediaWiki-postgres-phpunit [16:13:48] sqlite one has disappeared, postgres is disabled [16:14:28] Ryan_Lane: could switch to dns based [16:14:35] *.wmnet and *.wikimedia.org, but not wmflabs [16:14:47] well, we want labs to be able to write into it [16:14:52] just specific channels though [16:15:02] ah that's not really possible now [16:15:02] or we want to run a second server [16:15:23] I guess a second server on another port is doable...
[16:15:43] New review: Alex Monk; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/12388 [16:15:49] New review: Hashar; "That is part of an ongoing discussion on wikitech -l" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/12388 [16:16:04] New review: Dereckson; "File protected." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/11977 [16:16:58] hashar, why would you use -codereview for wikibugs? [16:17:11] gerrit-wm maybe [16:17:20] heh. of course, I have no clue how this irc daemon works [16:18:22] Krenair: -codereview is already setup [16:18:22] how was this even installed? [16:18:31] Krenair: and I guess we want all bots in the same channel, don't we ? :-) [16:18:35] compile; make; make install? :) [16:18:43] hashar, yes [16:18:45] -feed [16:18:50] so the channel name is relevant to all bots [16:18:54] not just the code review ones [16:19:02] well if you want [16:19:07] not going to fight for a channel name [16:19:15] it is just that -codereview is already there and properly configured [16:19:33] What is a proper channel configuration in your eyes then? [16:19:39] so I am not sure with Petr invented a second channel, probably he was not aware about -codereview existence [16:19:44] #mediawiki-noise [16:19:45] this documentation really makes me want to strangle the devs of this [16:19:47] AHAHA [16:19:54] #mediawiki-stuff [16:19:55] Ryan_Lane: which doc ? [16:20:01] ircd ratbox [16:20:15] hey, there's a wiki [16:20:20] Ryan_Lane: can't help there sorry :-( [16:20:53] which also has absolutely no docs [16:22:09] mark: any reason we don't use the packaged version in ubuntu? [16:26:15] because it's heavily modified for wikimedia? :) [16:26:19] oh [16:26:21] I am out for bread. Will be back later [16:26:22] good reason [16:26:33] the actual server code is? 
[16:26:36] so normal non-opers can't talk, and stuff :) [16:26:36] yes [16:27:00] is it not possible to set the default for channels to require voice [16:27:04] and not give out voice? [16:27:16] not without services I think [16:27:18] not sure [16:27:23] ah [16:27:24] * Ryan_Lane nods [16:28:03] yeah, seems it needs services [16:28:57] New review: preilly; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/12391 [16:29:07] notpeter: ^^ @ 10 [16:30:06] preilly: kk, will merge and deploy in 30 [16:32:06] mark: do we have the code checked in somewhere? [16:32:20] Ryan_Lane: I think in svn, don't remember where [16:32:23] ok [16:34:05] oh. it's a tiny patch [16:34:54] domas: bad news for you! [16:35:12] 500 Mbps FTTH available here now ;) [16:35:14] :( [16:35:23] I'm looking at comcast pricing now [16:35:34] awwww, comcast != FTTH :( [16:38:55] hah, I just got a report that people are still getting non-list messages from two days ago. go mchenry go! [16:39:42] PROBLEM - LVS HTTP IPv4 on mediawiki-lb.eqiad.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:39:43] New patchset: Ryan Lane; "Adding quilt to package-builder" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12396 [16:39:51] PROBLEM - LVS HTTP IPv4 on wikiversity-lb.eqiad.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:40:09] PROBLEM - LVS HTTPS IPv4 on wikipedia-lb.eqiad.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:40:15] Jeff_Green: 64,000 to go! [16:40:16] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12396 [16:40:18] PROBLEM - LVS HTTPS IPv4 on wikimedia-lb.eqiad.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:40:19] PROBLEM - LVS HTTPS IPv4 on bits-lb.eqiad.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:40:37] ….
[16:40:48] LeslieCarr: does that include the stuff you froze? [16:41:03] RECOVERY - LVS HTTP IPv4 on mediawiki-lb.eqiad.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 61463 bytes in 0.449 seconds [16:41:12] RECOVERY - LVS HTTP IPv4 on wikiversity-lb.eqiad.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 61466 bytes in 4.795 seconds [16:41:21] -_- [16:41:23] yes, though it looks like either someone cleared out mail or it auto expired stuff 5 days old or older [16:41:24] hrm [16:41:25] what happened [16:41:30] RECOVERY - LVS HTTPS IPv4 on wikipedia-lb.eqiad.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 80297 bytes in 7.570 seconds [16:41:31] good question [16:41:32] it auto expires stuff after 4 days [16:41:39] RECOVERY - LVS HTTPS IPv4 on bits-lb.eqiad.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 3847 bytes in 3.360 seconds [16:41:41] ah [16:41:46] (unless you freeze it) [16:41:54] ah so yeah, that cleared a bunch of the queue out…. :-/ [16:42:09] you froze ~30K? [16:42:23] around that, yeah [16:43:03] why is the fatal log empty? [16:43:08] that's pretty annoying [16:43:37] PROBLEM - LVS HTTP IPv6 on wikinews-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:43:39] what the hell is going on again [16:43:59] http://gdash.wikimedia.org/dashboards/reqerror/ [16:44:08] (Javascript loading times are beginning to become slow for me on-wiki) [16:44:22] PROBLEM - LVS HTTPS IPv6 on wikimedia-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:44:54] I also just got a 502 [16:44:58] RECOVERY - LVS HTTPS IPv4 on wikimedia-lb.eqiad.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 80300 bytes in 7.339 seconds [16:45:08] and why eqiad? 
[16:45:22] mctest.php appears to have all the memcaches up, checking the net gear [16:45:56] bits in eqiad seems slightly unhappy [16:46:18] misc eqiad has a weird network graph [16:47:10] bits in eqiad had zero traffic up until 16:10 [16:47:13] RECOVERY - LVS HTTPS IPv6 on wikimedia-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 80300 bytes in 0.491 seconds [16:47:22] PROBLEM - LVS HTTP IPv4 on wikipedia-lb.eqiad.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:47:24] and now it has 1.6gbps [16:47:47] oh wait, we had no monitoring for a while [16:47:49] New patchset: Dereckson; "(bug 37675) ruwiki: enable review on Portal:" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11839 [16:47:55] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11839 [16:47:58] RECOVERY - LVS HTTP IPv6 on wikinews-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 61083 bytes in 0.136 seconds [16:48:10] no packet loss [16:48:43] RECOVERY - LVS HTTP IPv4 on wikipedia-lb.eqiad.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 61463 bytes in 0.134 seconds [16:48:59] New review: Dereckson; "I replaced the syntax [] = {one namespace} by an array listing every namespace, to improve config re..." 
[operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/11839 [16:49:08] http://ganglia.wikimedia.org/latest/?c=Application%20servers%20pmtpa&m=load_one&r=hour&s=by%20name&hc=4&mc=2 [16:49:18] that's the same graph as the other outages [16:49:22] yes [16:50:07] eqiad because the eqiad text squids are active [16:51:21] mw1 has many many requests to DBs that are TIME_WAIT [16:51:26] well we have a power supply down … but that should be unrelated [16:51:33] so, the error rate is still relatively low [16:51:36] it hasn't spiked up [16:51:51] and we only got alerts for LVS [16:51:52] http://gdash.wikimedia.org/dashboards/poolcounter/ [16:52:08] what's poolcounter? [16:52:34] it's our mechanism for avoiding cache stampedes [16:53:46] paravoid: [16:53:46] http://wikitech.wikimedia.org/view/PoolCounter [16:54:11] php cpu usage is very very high on mw hosts [16:54:11] yeah, I was reading that [16:54:23] notpeter: it's been like this for a while [16:54:26] yeah [16:54:48] seems job queue rate isn't higher than normal [16:54:54] I wonder if we had a job-queue flood [16:55:07] is it still high? [16:56:18] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12391 [16:56:20] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12391 [16:56:47] I have no idea what to do [16:56:51] there's been a lot of page moves: http://gdash.wikimedia.org/dashboards/apimethods/ [16:57:07] things are better now though [16:57:09] that's good. 
[16:57:15] we don't know why, that's bad [17:00:22] yeah it does seem like things are settling down some [17:00:23] hm [17:02:43] weird traffic spike on nitrogen then http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=network_report&s=by+name&c=Miscellaneous+eqiad&h=&host_regex=&max_graphs=0&tab=m&vn=&sh=1&z=small&hc=4 [17:03:18] http://ganglia.wikimedia.org/latest/?c=Miscellaneous%20eqiad&h=nitrogen.wikimedia.org&m=network_report&r=hour&s=by%20name&hc=4&mc=2 [17:03:19] yeah [17:03:23] was looking at the same thing [17:03:35] that's 8mbit [17:03:40] nitrogen is the v6relay [17:03:48] uh huh, I saw that [17:04:01] * Ryan_Lane nods [17:04:08] that was 20 minutes ago, exactly when puppet ran [17:04:10] unless you talk about the 0.8% cpu spike [17:04:20] i'm kicking off puppet again to see if that could cause a network spike in the same proportion [17:04:22] or 4% [17:04:26] lots of things have the same network graph, though [17:05:00] http://ganglia.wikimedia.org/latest/?r=day&cs=&ce=&m=&c=Miscellaneous+eqiad&h=nitrogen.wikimedia.org&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS [17:05:10] well the daily doesn't show regular spikes [17:05:47] but it's not much traffic overall, that's for sure [17:05:59] that's 8mbit for 10mins [17:06:22] New patchset: Mark Bergsma; "Fix MP attribute initialization & encoding" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12415 [17:06:23] New patchset: Mark Bergsma; "Remove AttributeSet, use AttributeDict instead" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12416 [17:06:24] New patchset: Mark Bergsma; "Remove NextHopAttribute.any support" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12417 [17:07:19] New patchset: preilly; "new IP ranges we got from Opera" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12418 [17:07:27] mark: we really really need to make a proper page & Debian upload for pybal [17:07:40] you've invested so much time and it's so great 
that it's a real shame noone else uses it [17:07:55] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11051 [17:07:55] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11051 [17:07:56] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12418 [17:08:04] * mark shrugs [17:08:12] all that means is that it'll cost me more time ;-) [17:08:40] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11052 [17:08:44] it's odd. all network traffic dropped during that time, on all clusters [17:08:49] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11052 [17:08:51] but nothing has higher load or cpu usage [17:08:52] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11052 [17:09:04] ok, guess I'm afk again, hopefully it really is "fixed" or gone or whatever... 
[17:09:14] it would be nice to have the damn fatal log working [17:09:29] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12418 [17:09:31] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12418 [17:09:43] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11601 [17:09:45] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11601 [17:10:04] mark: or that others may contribute bugs or patches :) [17:10:29] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11602 [17:10:31] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11602 [17:10:45] which I need to review and thus causes me more work ;-) [17:10:48] I don't disagree [17:10:52] just saying [17:11:49] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11603 [17:11:52] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11603 [17:12:30] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11604 [17:12:32] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11604 [17:13:30] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12008 [17:13:34] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12008 [17:13:36] bah [17:13:37] leaving [17:14:05] dinner, rest, sleep might help with the pain, see you [17:14:51] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12009 [17:14:53] Change merged: 
Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12009 [17:15:12] not well? [17:15:18] get better then! [17:16:05] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12010 [17:16:07] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12010 [17:16:46] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12011 [17:16:48] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12011 [17:17:29] I am git-weeping: error: RPC failed; result=22, HTTP code = 502 [17:17:50] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12012 [17:17:52] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12012 [17:18:37] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12013 [17:18:39] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12013 [17:19:41] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12022 [17:19:43] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12022 [17:20:16] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12026 [17:20:18] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12026 [17:20:55] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12415 [17:20:58] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12415 [17:22:09] New review: Mark Bergsma; "(no comment)" 
[operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12416 [17:22:11] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12416 [17:22:35] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12417 [17:22:38] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12417 [17:22:40] mutante: if there is any activity in that ticket would I be informed? [17:22:50] mutante: I don't have access to RT [17:28:15] petan: we can add you to the cc list of the ticket [17:30:40] I thought I am in that [17:34:43] if you are then you'll get an email every time the ticket is updated [17:39:28] New patchset: Ryan Lane; "Adding quilt to package-builder" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12396 [17:40:02] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12396 [17:40:45] New patchset: Ryan Lane; "Adding quilt to package-builder" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12396 [17:41:12] cmjohnson1: can you give me another server's name/port plugged in on mrjp-b2-sdtpa please ? [17:41:17] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12396 [17:41:17] (near es5 and es6) [17:46:52] thanks [17:51:09] New patchset: Ryan Lane; "Initial checkin of ircd-ratbox package" [operations/debs/ircd-ratbox] (master) - https://gerrit.wikimedia.org/r/12425 [17:51:16] mark: ^^ [17:51:37] !log restarting puppet on brewster [17:51:43] Logged the message, notpeter [17:51:59] specifically: https://gerrit.wikimedia.org/r/#/c/12425/1/debian/patches/novoice_nonops [17:53:29] cmjohnson1: can you tell me another machine plugged in on mrjp-c3-sdtpa and its port ? 
linecard 16 is actually a 10g linecard, so i don't think the machine's in there :) [17:57:59] thanks [17:58:18] ah, it's the second half of linecard 15 :) [17:59:01] New review: Pyoungmeister; "this would give this person shell on every db node in eqiad. I don't think that this is correct. can..." [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/12299 [17:59:55] totally pita [18:00:50] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12389 [18:00:54] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12389 [18:11:59] New review: Uzume; "(no comment)" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/12396 [18:13:55] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12396 [18:13:57] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12396 [18:15:53] hey cmjohnson1 - you have port 12 on asw-c1-pmtpa twice - https://rt.wikimedia.org/Ticket/Display.html?id=3161 and https://rt.wikimedia.org/Ticket/Display.html?id=3175 [18:21:22] thanks [18:22:19] PROBLEM - Puppet freshness on virt1 is CRITICAL: Puppet has not run in the last 10 hours [18:22:56] New patchset: Bhartshorne; "adding new partman config for swift storage nodes with SSDs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12437 [18:23:26] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12437 [18:24:04] New patchset: Bhartshorne; "adding new partman config for swift storage nodes with SSDs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12437 [18:24:35] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12437 [18:27:16] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [18:27:50] thanks cmjohnson1 :) [18:28:07] anyone around who has access to dns-admin@ ? [18:28:19] PROBLEM - Puppet freshness on virt4 is CRITICAL: Puppet has not run in the last 10 hours [18:28:27] what's up jamesofur ? [18:29:14] PROBLEM - Puppet freshness on virt5 is CRITICAL: Puppet has not run in the last 10 hours [18:29:17] LeslieCarr: There should be a request from digicert and shopify to approve a cert for shop.wikimedia.org [18:29:33] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12437 [18:29:39] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12437 [18:29:49] Jamesofur: hehe it went to my spam box [18:29:56] oh fun lol [18:30:12] could you approve? ;) <3 would love you forever! [18:31:10] done, i approved for those specific hostnames [18:31:15] not for all future requests [18:31:19] * Jamesofur nods, thanks [18:35:13] PROBLEM - Puppet freshness on virt2 is CRITICAL: Puppet has not run in the last 10 hours [18:39:16] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [18:41:04] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms [18:41:35] New review: preilly; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/12455 [18:44:13] PROBLEM - Puppet freshness on ms-be5 is CRITICAL: Puppet has not run in the last 10 hours [18:45:16] PROBLEM - Puppet freshness on nfs2 is CRITICAL: Puppet has not run in the last 10 hours [18:46:10] PROBLEM - Puppet freshness on virt3 is CRITICAL: Puppet has not run in the last 10 hours [18:46:34] New patchset: Catrope; "Point $wgVisualEditorParsoidURL to cadmium" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12456 
[18:46:41] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12456 [18:47:10] New review: Catrope; "(no comment)" [operations/mediawiki-config] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12456 [18:47:12] Change merged: Catrope; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12456 [18:48:16] PROBLEM - Puppet freshness on nfs1 is CRITICAL: Puppet has not run in the last 10 hours [18:50:42] !log Started 5 exim queue runners on mchenry with exim -qqff & [18:50:47] Logged the message, Master [18:57:01] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12455 [18:57:04] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12455 [18:58:14] <^demon> jeremyb: You were asking about commentlink regexes. Relevant: http://code.google.com/p/gerrit/issues/detail?id=1451 [19:06:22] New review: Uzume; "So what is the programming threshold for making one feel stupid, as it seems we now need to learn to..." [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/8344 [19:06:31] New patchset: Bhartshorne; "second try. adjusted priorities, made the sum of partition sizes smaller than the total size of the disk." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12458 [19:07:05] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12458 [19:08:10] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12458 [19:08:13] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12458 [19:14:28] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100% [19:20:01] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.19 ms [19:25:07] !log Restarted 3 queue runners as exim -qff & [19:25:13] Logged the message, Master [19:32:55] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100% [19:44:29] cmjohnson1: precise [19:44:40] we can make that the default install by now I think [19:49:52] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 1.07 ms [19:51:46] New patchset: Asher; "new build of squid redirector to support additional projects" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12470 [19:52:15] dbs still need to be lucid [19:52:19] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12470 [19:52:51] New review: Asher; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12470 [19:52:53] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12470 [19:54:47] binasher: well we can specify lucid where we need it [19:54:56] but most new installs will be precise from now on i think [20:05:10] mutante: was there no response or I am not in cc list yet? [20:09:51] maplebed: can you fix the swift cronbomb going off on ms-fe1 ? [20:09:57] been going off for about a day now... [20:13:50] notpeter: yeah, I'll look. [20:14:15] maplebed: thanks! [20:15:16] grumble. this is a change I didn't review carefully enough on ryan's merge [20:16:21] New patchset: Ottomata; "More udp2log refactoring." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/12475 [20:16:53] New review: Reedy; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11843 [20:16:53] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12475 [20:16:54] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11843 [20:17:02] ottomata: is now a good time to set up that udp2log instance for search traffic? [20:17:11] or should I hold off for some more refactoring? [20:17:33] heya, i just committed a wee refactor, but you should be able to use it as it was merged this morning [20:17:41] if you want to review that with me real quick [20:17:44] we can merge this in now [20:17:46] which would be best [20:17:56] but, as long as you don't need to do anything more than [20:18:07] I'll take a look [20:18:09] misc::udp2log::instance { "lucene" … } [20:18:15] then it should work as it is in the production branch [20:18:18] i think [20:18:33] buuut, actually, i haven't yet been able to get a successful run of this on a udp2log server yet [20:18:36] still testing it out [20:18:42] so maybe better to get all the kinks worked out first? [20:18:43] dunno. [20:18:49] (probably yes. :) ) [20:20:05] ottomata: cool cool. 
just lemme know when you think it's good to go [20:20:29] well, if you merge this and it is good, then it is good :) [20:20:53] New review: Reedy; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11082 [20:20:53] (don't just merge though, would appreciate reviewiness) [20:20:56] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11082 [20:21:58] New review: Reedy; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12183 [20:22:00] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12183 [20:24:15] New patchset: Ottomata; "Adding Evan Rosen. RT 3119" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12299 [20:24:18] maplebed: the swift cron spam appears to include a production password in the subject.. doh [20:24:44] ouch. [20:24:50] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12299 [20:25:06] New review: Ottomata; "Doh, good catch! Dunno what I was thinking there. How's patchset 2?" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/12299 [20:25:13] or maybe its a labs pw if its from that giant merge? oh well [20:25:50] oh, good point. I'll have to check. 
[20:29:20] New review: Reedy; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11583 [20:29:22] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11583 [20:29:23] !log deployed new squid mobile redirector, now covers additional projects [20:29:28] Logged the message, Master [20:32:21] ottomata: that all looks legit [20:32:23] I shall merge [20:32:32] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12475 [20:32:34] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12475 [20:33:04] New patchset: Bhartshorne; "bugfix - setting clustername from the command line" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12513 [20:33:32] ottomata: you should have started on this earlier. that way you could have done all of the refactoring instead of me doing it first :) [20:33:35] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12513 [20:33:35] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12513 [20:33:36] New review: Reedy; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11839 [20:33:36] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11839 [20:34:50] hehe :) [20:35:05] ottomata: forcing a puppet runon locke to see if anything is blatantly broken [20:35:14] ok cool, i'm doing so on oxygen [20:35:16] unlikely, but, ya know [20:35:17] want to see it too [20:35:20] yeah [20:36:26] looking good here [20:37:08] well, looks like locke is not listening for multicast traffic... [20:37:19] ah, found one thing! [20:37:22] (its not supposed to, is it?) 
[20:37:30] well, it wasn't before [20:37:32] but [20:37:40] oh, hrm [20:37:46] it might be getting double traffic now [20:38:00] or might not. not quite sure how the daemon listens [20:38:06] but no, that hsould probably be changed back [20:38:31] ? [20:39:13] New patchset: Ottomata; "oxygen's packet-loss.log file is not in the default location" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12517 [20:39:47] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12517 [20:40:39] notpeter, be a dear and merge that one in for me, wouldya? [20:40:54] I'm more of a deer, but sure [20:41:04] re: multicast, afaik oxygen was the only instance that was set up to use the multicast flag [20:41:13] yes [20:41:20] locke now has that flag in it's init script [20:41:25] oh [20:41:27] hmmm [20:41:33] which is either fine (as it'll just listen to the multicast traffic [20:41:37] or it's getting double traffic [20:41:44] naw, let's keep it the same as it was [20:41:45] it shouldn't be there [20:41:48] with the configs I have [20:41:50] lemme see [20:41:50] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12517 [20:41:53] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12517 [20:42:14] ok, merged that minor change [20:42:36] New patchset: Bhartshorne; "removing global read from swift config files" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12519 [20:42:39] danke, i think I know what's up [20:42:43] with multicast, one sec [20:43:09] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12519 [20:43:10] eyah, ok, its a weirdness with how ruby evaluates undef as false [20:43:35] New patchset: Cmjohnson; "adding mc 1-16 to dhcpd.conf" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12520 [20:44:07] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12519 [20:44:07] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12519 [20:44:07] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12520 [20:44:24] New patchset: Ottomata; "Using false instead of undef for default $multicast setting, so ruby can evalutate if multicast how I intend it to." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12521 [20:44:41] okey dokey [20:44:43] that should fix it [20:44:58] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12521 [20:45:35] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/12520 [20:45:48] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12521 [20:45:50] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12521 [20:46:03] merged [20:46:15] k, once it is on sockpuppet i will try it [20:47:23] yeah, on sockpuppet [20:48:06] you runnignon lock? [20:48:08] locke? 
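The undef/false fix above comes down to truthiness. In Ruby only nil and false count as false, so if Puppet hands an unset parameter to the template as some non-nil placeholder (an assumption here; the log does not say what the template actually received), an "if multicast" test passes and the flag ends up in the init script. A minimal Python sketch of the same pitfall follows. UNDEF, render_args, the flag names and the group address are all hypothetical stand-ins, not the real udp2log template.

```python
# Hypothetical stand-in for the placeholder an unset parameter might arrive as.
# Like any non-empty string, it is truthy.
UNDEF = "undef"


def render_args(multicast):
    """Build a daemon argument string the way a naive template check would."""
    args = ["--daemon"]
    if multicast:  # truthiness test: passes for any non-empty placeholder
        args.append("--multicast %s" % multicast)
    return " ".join(args)


# Default of UNDEF: the flag leaks into the command line.
print(render_args(UNDEF))          # --daemon --multicast undef
# Default of False: the flag is omitted, which is the intended behaviour.
print(render_args(False))          # --daemon
# An explicit group address still works (address made up for illustration).
print(render_args("233.58.59.1"))  # --daemon --multicast 233.58.59.1
```

Defaulting to False rather than a placeholder keeps the simple truthiness check in the template correct, which matches the intent stated in the commit message for change 12521.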
[20:48:11] New patchset: Alex Monk; "Enable AbuseFilter auto-block on MWW" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12522 [20:48:18] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12522 [20:48:57] k, running puppet on locke [20:50:44] ottomata: Cron /usr/sbin/ganglia-logtailer --classname PacketLossLogtailer --log_file /var/log/udp2log/packet-loss.log --mode cron [20:50:50] Failed to get lock. Is another instance of ganglia-logtailer running? Exiting. [20:51:21] New patchset: Ryan Lane; "Fix file ownership and permissions of some config files" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12523 [20:51:45] hmm, there are two cron jobs in there for it.... [20:51:52] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12523 [20:51:52] weird, hmm [20:52:06] probably just needs to be stopped [20:52:07] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12523 [20:52:09] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12523 [20:52:20] removing one and running puppet again [20:52:27] kk [20:52:48] let's see if I break labs with that push. heh [20:53:02] yeeeehaawww! [20:53:57] ottomata: all good in the hood? [20:54:29] i teeeeenk so [20:54:36] sweet [20:54:47] ja, only one cronjob now [20:54:59] haha [20:55:10] i love how all 3 of these machines have packet-loss.log in a different directory [20:55:48] New patchset: Ryan Lane; "Add the openstack version to the hosts" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12527 [20:56:07] oh legacy.... [20:56:21] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12527 [20:56:24] gerrit-wm: ffs hurry up [20:56:28] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12527 [20:56:31] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12527 [20:57:35] RECOVERY - Puppet freshness on virt3 is OK: puppet ran at Thu Jun 21 20:57:11 UTC 2012 [20:58:51] yay, ok notpeter [20:58:53] things are good [20:58:59] so yes! lucene log away! [20:59:32] ok. what host, again? [20:59:38] i guess oxygen? [20:59:46] sure? [21:00:47] New patchset: Ryan Lane; "Only load keystone config in essex" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12531 [21:01:16] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." [operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/12531 [21:02:32] RECOVERY - Puppet freshness on virt1 is OK: puppet ran at Thu Jun 21 21:02:30 UTC 2012 [21:02:34] New patchset: Ryan Lane; "Only load keystone config in essex" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12531 [21:03:06] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12531 [21:03:33] ottomata: where should I put the logs? [21:03:34] do you care? 
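The "Failed to get lock" cron mail from locke above is what a run-once guard looks like when two cron entries fire the same job. A rough Python sketch of such a guard with a non-blocking flock follows. Whether ganglia-logtailer takes its lock this way is an assumption, and the lock path and the try_lock name are made up for illustration.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


# Hypothetical lock file; a real job would use a fixed, well-known path.
lock_path = os.path.join(tempfile.gettempdir(), "logtailer-sketch.lock")

first = try_lock(lock_path)    # first cron entry gets the lock
second = try_lock(lock_path)   # duplicate entry is refused
if second is None:
    print("Failed to get lock. Is another instance running? Exiting.")
first.close()                  # closing the file releases the lock
```

With a guard like this the duplicate entry does no harm beyond the error mail, so removing the extra cron job, as was done on locke, is enough to silence it.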
[21:04:12] hmmm [21:04:35] well, i guess the reason they are on /a is because it is a big partition [21:04:47] it'd be nice if there was a log partition mounted in /var/log somewhere for this [21:04:48] buuuut that's ok [21:05:09] notpeter, ottomata: yes oxygen for now [21:05:10] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12531 [21:05:13] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12531 [21:05:18] but we might want to use stat1001 once that box is configured [21:05:19] /a/log/lucene [21:05:20] maybe? [21:05:27] or /a/log/search [21:05:27] sounds reasonable [21:05:33] New patchset: Cmjohnson; "adding mc 1-16 to dhcpd.conf Change-Id: I10300f62408fe35f4f96167dcdfc69b23b297d48" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12520 [21:05:38] /a/log/ maybe? [21:05:44] something like that [21:06:05] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12520 [21:07:38] RECOVERY - Puppet freshness on virt0 is OK: puppet ran at Thu Jun 21 21:07:30 UTC 2012 [21:07:43] cmjohnson1: what kind of disk in those servers? [21:07:55] New patchset: Pyoungmeister; "udp2log instance to listen for lucene logs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12532 [21:08:30] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12532 [21:08:59] RECOVERY - Puppet freshness on virt4 is OK: puppet ran at Thu Jun 21 21:08:32 UTC 2012 [21:09:15] ottomata: like that? [21:11:23] notpeter: what disk profile are you using for the apache rebuild that makes a large / ? [21:11:38] binasher: haven't made one yet [21:11:40] want me to? 
[21:11:45] notpeter, looks great, but i don't think the instance will ensure that the parent dir exists [21:11:53] notpeter: yeah, if you have time [21:11:58] so you might want to also add a file { "/a/log": ensure =>directory } thing [21:12:00] ottomata: I made them by hand, as this is just temp solution [21:12:03] cmjohnson1: then use the new profile that notpeter makes [21:12:07] hm ok [21:12:31] binasher: does fs matter? [21:12:45] oh, and you might want to disable packet-loss monitoring [21:12:57] since you wont' be filtering through Tim's packetloss script [21:13:01] New patchset: Alex Monk; "Try to fix sysop adding autochecked group on MWW." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12533 [21:13:07] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12533 [21:13:11] binasher: I was going to use ext4 as faidon has been saying that it's now good [21:13:11] see the aft instance on emery [21:13:14] ottomata: kk [21:13:46] New patchset: Ryan Lane; "Seems nova, glance, and keystone are nogroup" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12534 [21:14:18] New patchset: Pyoungmeister; "udp2log instance to listen for lucene logs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12532 [21:14:39] ottomata: gonna mergify [21:14:48] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12534 [21:14:49] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12532 [21:14:59] cool [21:14:59] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12534 [21:15:02] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12534 [21:15:09] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12532 [21:15:11] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12532 [21:16:06] damnit. I need a filterz [21:16:46] !log deploying new mobile redirector to esams text squids [21:16:51] Logged the message, Master [21:18:29] New patchset: Pyoungmeister; "this will also reaquire a filter file" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12536 [21:19:00] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12536 [21:19:31] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12536 [21:19:33] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12536 [21:20:05] RECOVERY - Puppet freshness on virt1000 is OK: puppet ran at Thu Jun 21 21:19:41 UTC 2012 [21:20:40] notpeter: for app servers ext4 is fine [21:21:02] ginwhat about a cat? is a cat fine too? [21:22:37] notpeter: is it tranquilized? [21:23:32] nope. [21:23:59] RECOVERY - Puppet freshness on virt5 is OK: puppet ran at Thu Jun 21 21:23:43 UTC 2012 [21:24:06] New patchset: Bhartshorne; "moving options for swift global ganglia stats to a config file." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12538 [21:24:38] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12538 [21:27:08] RECOVERY - Puppet freshness on virt2 is OK: puppet ran at Thu Jun 21 21:27:01 UTC 2012 [21:28:06] !log restarting all lucene instances to direct logs to oxygen [21:28:11] Logged the message, notpeter [21:28:23] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12538 [21:28:30] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12538 [21:29:30] ottomata: drdee ok, logs are flowing in. [21:30:02] once all of the lucene instances are restarted (I'm doing one every 5 minutes for minimal effect) they'll all be there [21:30:24] although we should keep an eye on cpu for the next couple of hours [21:30:39] notpeter: woot woot, yes and also keep an eye on the size of these log files [21:32:18] although, I just looked at your initial request [21:32:28] and this is only about half of the fields you asked for [21:32:38] the rest will require some java codes [21:32:55] would you be willing to throw one of your java experts at it for a little bit? shouldn't take long... [21:33:12] i want to get a led scrolling sign for the office that shows what people are searching for in real time [21:34:38] binasher: I'm actually looking at that right now [21:35:30] binasher: yes that would be really cool [21:36:24] very doable [21:36:48] Google had one [21:37:01] but they got too many queries for their screen to be able to show /every/ query [21:37:08] just saw a query for "nude playboy babe" hope springs eternal that that will yield the desired result, eh? [21:37:12] we had one at a9 / amazon too [21:37:39] we made an atom feed, and the machine the sign was attached to would periodically grab the most recent 200 [21:37:46] I think even we have too many queries to do 100% in real time... 
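The a9 setup described above (publish only the newest queries and let the sign poll for them) sidesteps the too-many-queries problem: the sign never sees every query, only the latest window. A small Python sketch of that idea with a bounded deque follows. The RecentQueries class and the tiny demo window are illustrative only; the default of 200 is taken from the description above, and nothing here reflects how the a9 feed was actually built.

```python
from collections import deque


class RecentQueries:
    """Keep only the newest `limit` search queries for a display to poll."""

    def __init__(self, limit=200):
        # A deque with maxlen silently drops the oldest entry when full.
        self._buffer = deque(maxlen=limit)

    def record(self, query):
        self._buffer.append(query)

    def snapshot(self):
        """Return the current window, oldest first, for the sign to scroll."""
        return list(self._buffer)


recent = RecentQueries(limit=3)  # small window just for the demo
for q in ["puppet", "udp2log", "lucene", "ganglia", "squid"]:
    recent.record(q)

print(recent.snapshot())  # ['lucene', 'ganglia', 'squid']
```

A content filter, as suggested later in the channel, would fit naturally in record() before the append.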
[21:37:49] so definitely not every [21:38:23] maybe we can make it a 20% preilly style project :D (should only take 1 day anyways) [21:39:08] hopefully with a very large filter… i don't want to know what people are actually searching for... [21:40:17] ha ha [21:40:21] no fun [21:41:21] although yeah, the flatscreen that showed random commons images was pretty heavy on dongs and nazis [21:42:10] and right by the door that all visitors had to enter.. welcome! [21:46:30] ewww [21:48:28] been scanning the search entries for a couple of minutes, it has been suprisingly civilized [21:48:54] drdee: yep [21:49:14] seems some people think we are google [21:49:18] that struck me the most [21:52:03] New patchset: Bhartshorne; "moving options for swift global ganglia stats to a config file. round 2" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12540 [21:52:34] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12299 [21:52:34] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12299 [21:52:35] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12540 [21:52:43] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12540 [21:52:45] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12540 [21:53:04] notpeter: we're merging together! [21:53:10] heh [21:53:14] you can pull the trigger [21:53:15] I fetched [21:53:30] pamphlet [21:53:57] we are now one with the live repository! [21:53:59] binasher: I don't wanna know what kinda things you're into that make you think that that was pamphlet-worthy [21:54:10] New review: Hashar; "(no comment)" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/8344 [21:55:24] yay my change works! 
[21:58:07] PROBLEM - udp2log log age for lucene on oxygen is CRITICAL: NRPE: Command check_udp2log_log_age-lucene not defined [21:59:04] yay! [22:00:41] i'm out, laters boys and girls! [22:14:46] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [22:31:30] notpeter / binasher: if you do the LED scrolling, use BLM (Blinkenlights Markup Language). http://blinkenlights.net/project/bml , then it is compatible to all those homebrew LED displays like http://s23.org/wiki/BlinkenLEDsPro_1.1 and it can be played on houses as displays like http://en.wikipedia.org/wiki/Project_Blinkenlights [22:32:04] you can send BLM via UDP to blinkenlights displays [22:33:10] http://s23.org/wiki/Blinkenlights/Movies [22:33:39] New patchset: Bhartshorne; "adding access for john dickinson to the hardware swift test cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12552 [22:34:13] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12552 [22:34:27] http://blinkenmini.schuermans.info/software.en.html [22:35:31] New patchset: Bhartshorne; "adding access for john dickinson to the hardware swift test cluster removing the 'update most recently used UID' because nobody does it." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/12552 [22:36:03] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12552 [22:36:03] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12552 [22:36:59] http://oldwiki.blinkenarea.org/bin/view/Blinkenarea/SimpleBlinkenlightsTransferProtocolEnglish /me disappears again [22:53:49] New patchset: Dereckson; "(bug 37524) Change namespaces configuration - ku.wiktionary" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12556 [22:53:56] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12556 [23:14:34] New patchset: Bhartshorne; "changing the swift username and password for pmtpa-prod" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12558 [23:15:07] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12558 [23:16:25] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12558 [23:16:28] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12558 [23:16:39] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours