[00:00:01] New patchset: Pyoungmeister; "bookkeeping for pmtpa dbs" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60361 [00:00:26] New review: Dzahn; "just don't include it in site.pp until it's fixed" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/60359 [00:00:27] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60359 [00:01:04] !log py synchronized wmf-config/db-pmtpa.php 'bookkeeping for pmtpa slaves' [00:01:11] Logged the message, Master [00:01:42] Change merged: Pyoungmeister; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60361 [00:02:22] !log py synchronized wmf-config/db-pmtpa.php 'bookkeeping for pmtpa slaves (this time for real)' [00:02:29] Logged the message, Master [00:02:47] New patchset: Jdlrobson; "Enable CentralNotice on mobile" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60362 [00:03:54] hey Ryan_Lane -- I'm still not able to add Eric to editor-engagement "Failed to add ebernhardson to editor-engagement. This needs user ebernhardson to have the "loginviashell" right. ". It's been more than half an hour, so I don't think it's waiting on a puppet run, but dunno. [00:04:15] there's no puppet run needed for that [00:04:18] it should be immediate [00:05:01] that must not be his username [00:05:19] uuuuggghhhhhh [00:05:20] EBernhardson (WMF) [00:05:27] I fucking hate (WMF) user names [00:05:34] * Susan smiles. [00:05:37] me too, it's like a cattle brand [00:05:40] yes [00:05:45] Some people should start creating (!WMF) usernames [00:05:48] and when they leave they lose all of their contributions [00:06:11] I may add it to the username blacklist [00:06:22] for new creations [00:06:23] Heh. [00:06:27] we shouldn't be encouraging that [00:06:29] It's Philippe's doing. 
[00:06:38] that's perfectly fine for project wikis [00:06:45] it's total bullshit for developer accounts [00:06:53] Local username blacklists fragment SUL. [00:07:03] we aren't using SUL [00:07:13] * Reedy beats Susan [00:07:13] Okay. [00:07:13] Useless! [00:07:17] Useless! (WMF) [00:07:22] ori-l: anyway, that's your issue :) [00:07:27] when i wanted to vote down a gerrit change for non-technical reasons, i was expecting to be asked to have 2 separate gerrit accounts, one "WMF"-one and one for my private opinion that is never a statement by the foundation :p, hehe [00:07:28] that and a bad error message [00:07:42] i got two turntables and a microphone [00:07:57] hmm Ryan_Lane I seem to be locked out of my wikitech account [00:08:02] * Reedy pets Thehelpfulone [00:08:06] mutante: "I think this is a bad idea". mutante (WMF): "I still think this is a bad idea" [00:08:16] heh [00:08:32] I used a temporary password, I tried to change it, then I get a Incorrect password entered. Please try again. [00:08:32] error :-/ [00:08:50] Reedy: mutante (WMF): "I don't have a strong opinion, please show consensus link" mutante: " i hate this":) [00:08:58] Thehelpfulone: try your old one? [00:09:21] nope, doesn't seem to work [00:10:55] alt-f4 [00:11:06] Thehelpfulone: Have you tried turning it off and on again? [00:11:40] * Thehelpfulone trouts Reedy  [00:12:18] I've even hit a captcha now for incorrectly trying to login too many times :P [00:14:06] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 238 seconds [00:14:09] Thehelpfulone: try sending another temporary? [00:14:37] did it accept your temporary at all? 
[00:14:45] I did, same problem, it allows me to login to the "change your password because you've got a temporary code" screen, but then when you enter a new password it doesn't like it [00:14:47] New patchset: Jdlrobson; "Enable CentralNotice on mobile" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60362 [00:14:49] yes it accepted the temporary one [00:15:02] it wouldn't let you set a permanent? [00:16:23] exactly [00:16:28] hm [00:16:46] PROBLEM - Puppet freshness on gallium is CRITICAL: No successful Puppet run in the last 10 hours [00:17:52] I see the MOD in LDAP [00:18:07] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 23 seconds [00:18:41] TimStarling: ping [00:18:54] hello [00:23:29] New patchset: Dzahn; "remove group.pp from jenkins module to fix duplicate definition. conflicts with systemuser in generic-definitions." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60363 [00:39:03] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 233 seconds [00:49:03] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 233 seconds [00:55:03] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 13 seconds [01:17:14] New review: awjrichards; "(2 comments)" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/60362 [01:19:59] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 218 seconds [01:24:59] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 222 seconds [01:28:00] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 12 seconds [02:06:55] !log LocalisationUpdate completed (1.22wmf2) at Tue Apr 23 02:06:54 UTC 2013 [02:07:02] Logged the message, Master [02:09:16] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [02:09:16] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours 
[02:09:17] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [02:10:33] !log LocalisationUpdate completed (1.22wmf1) at Tue Apr 23 02:10:32 UTC 2013 [02:10:40] Logged the message, Master [02:14:56] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 210 seconds [02:23:09] New review: MZMcBride; "I personally didn't mind 180 days." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54505 [02:23:56] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 225 seconds [02:25:16] PROBLEM - Puppet freshness on ms-be9 is CRITICAL: No successful Puppet run in the last 10 hours [02:25:26] !log LocalisationUpdate ResourceLoader cache refresh completed at Tue Apr 23 02:25:25 UTC 2013 [02:25:33] Logged the message, Master [02:25:56] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 4 seconds [02:26:36] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:27:26] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.202 second response time [02:29:56] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 187 seconds [02:30:56] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 27 seconds [02:31:05] New review: Tim Starling; "Ram, do you still want this?" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/55841 [02:33:36] New review: MZMcBride; "I don't want to file a separate bug for this, but please do annotate the configuration variable in C..." 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54505 [02:33:56] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 208 seconds [02:37:12] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 7 seconds [02:43:16] New patchset: Tim Starling; "Bug: 47293 Further reduce noise in log files" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/59533 [02:43:25] Change merged: Tim Starling; [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/59533 [03:08:12] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 181 seconds [03:10:06] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [03:11:46] PROBLEM - Puppet freshness on virt1005 is CRITICAL: No successful Puppet run in the last 10 hours [03:26:36] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:28:26] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [03:34:06] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 240 seconds [03:35:04] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [03:38:03] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 181 seconds [03:40:03] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [04:13:09] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 181 seconds [04:15:09] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [04:23:09] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 181 seconds [04:29:09] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 12 seconds [04:31:39] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:32:30] RECOVERY - Puppetmaster HTTPS on stafford 
is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [04:38:07] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 181 seconds [04:40:07] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 0 seconds [04:44:08] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 232 seconds [04:45:07] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [05:24:12] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 240 seconds [05:25:12] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [05:38:08] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 181 seconds [05:40:08] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [05:51:41] New review: Hashar; ".Oh yeah puppet is broken on gallium currently because of that. I submitted a similar change at htt..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60363 [05:59:08] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 214 seconds [06:01:48] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:02:38] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.128 second response time [06:05:08] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [06:33:27] New patchset: ArielGlenn; "attempt to shut up gmetric job queue cronspam" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60371 [06:34:20] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60371 [06:52:40] New patchset: ArielGlenn; "Revert "attempt to shut up gmetric job queue cronspam"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60372 [06:53:46] Change merged: ArielGlenn; [operations/puppet] (production) - 
https://gerrit.wikimedia.org/r/60372 [07:20:51] ori-l: have you had time to test it? :) [07:21:08] not yet, but you're right to call me out on it -- i'll do that next [07:21:25] ty :D [07:21:41] give me an hour or so, have to write an e-mail or three. [07:22:02] but i'll have it running tonight, promise. [07:22:17] legoktm: I think you have a legobot job that is cronspamming [07:22:28] if you are legobot at tools-login or something like that [07:22:30] oh? [07:22:31] yeah [07:22:37] thats me [07:22:39] lemme see [07:22:45] so mail from those instances cron goes to all of ops cause.. just cause [07:23:07] so we get nice updates about $HOME/counter [07:23:27] heh [07:23:36] let me deflect the mail... [07:23:39] thanks [07:24:15] also can you pass the word around to other folks to really check their cron jobs? I think it's not general knowledge although I try to let the folks with the noisiest cron jobs know [07:24:33] its probably easiest if you just send a mail to labs-l ? [07:24:57] oh, huh [07:25:02] I'm not on it though [07:25:17] too much mail already... [07:25:59] * * * * * python $HOME/counter.py >> $HOME/counter.log 2>&1 [07:26:21] we shall see :-D [07:27:56] im keeping track of how fast we're creating items, eventually https://www.stathat.com/stats/JDTw will be a nice looking graph [07:28:46] ah hah [07:30:01] legoktm: how do you account for french guys with weird misunderstandings about copyright? :) [07:30:07] in the stats [07:30:15] lol :P [07:30:23] its just incremental [07:30:28] doesnt count for deletes [07:30:40] so why can't you just use dumps then? 
[07:30:54] i mean besides that the dumps are taking forever [07:31:09] this is instantaneous [07:31:21] and it already works :P [08:01:58] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60325 [08:02:15] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60337 [08:02:47] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:03:37] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.141 second response time [08:21:37] New patchset: Nikerabbit; "Optimize even more with optipng" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60375 [08:22:32] New review: Nikerabbit; "There are better alternatives to pngcrush. I use optipng myself. See gerrit 60375." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59638 [08:33:52] legoktm: having a bit of a hard time getting it to work :/ [08:34:02] uhoh whats not working? 
[08:34:49] it looks like it's entering into a busy loop waiting for events before the connection succeeds [08:35:08] i had to struggle a bit to get to that point, i'll make notes either here if you're around or on the patch [08:35:27] here is fine [08:37:12] PROBLEM - Puppet freshness on cp3003 is CRITICAL: No successful Puppet run in the last 10 hours [08:37:13] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [08:40:52] !log maxsem synchronized wmf-config 'Beta changes, no-op for production' [08:41:00] Logged the message, Master [08:41:43] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60375 [08:42:59] New review: MaxSem; ".../bits/DolphinBrowser/image/ios/toolbar-back.png | Bin 523 -> 463 bytes" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60375 [08:46:01] !log maxsem synchronized docroot/ 'https://gerrit.wikimedia.org/r/#/c/60375/' [08:46:08] Logged the message, Master [08:51:51] oh [08:53:14] well, bot.connect() no longer attempts to make a connection [08:53:46] you removed the call to self.server.connect but didn't replace them with anything [08:53:50] ^ legoktm [08:54:13] er, i thought it was called internally [08:54:14] * legoktm checks [08:56:31] New review: Ori.livneh; "(1 comment)" [operations/debs/adminbot] (master) C: -1; - https://gerrit.wikimedia.org/r/60240 [08:57:05] i have to crash, but I think that in order to do this patch set right you need to be able to test [08:57:14] do you want my configuration notes? 
[08:57:46] if you have mediawiki-vagrant you should be able to get this going quite easily [08:57:47] yeah sure [08:57:52] * legoktm does :D [08:57:59] the version on wikitech is the same one that's in ubuntu precise [08:58:10] so you can simply 'sudo apt-get install python-irclib' [08:58:19] and get the exact dependency [08:58:39] oh, another issue [08:59:28] not introduced by you: "if 'confarg' in args" will always be true [08:59:41] the *key* is present in args regardless of whether or not the argument was supplied [09:00:02] i think the intent was to do 'if args.confargs is not None:' [09:00:14] ok [09:00:46] so this means that invoking it without any command line params will simply trigger a crash [09:00:55] but you can still work around it by actually specifying the config file on the command line [09:01:14] i'm invoking it like this: python adminlogbot.py --config=/etc/adminbot/foobarlegoktmbot.py [09:01:24] i'll pastebin the content of that file in a moment [09:01:41] you'll also need to 'pip install mwclient' [09:03:22] PROBLEM - Host ms-be9 is DOWN: PING CRITICAL - Packet loss = 100% [09:03:39] this is the content of my /etc/adminbot/foobarlegoktmbot.py : [09:03:40] http://dpaste.de/AwfJ3/raw/ [09:03:59] lol :P [09:04:23] [04:04:17 AM] -NickServ- foobarlegoktmbot is not registered. :( [09:04:44] yes, fortunately (or unfortunately) the bot is not getting quite that far :P [09:05:11] oh, i had to make one other change, sec.. [09:06:22] i'll note it in the change [09:06:30] ok [09:08:00] New review: Ori.livneh; "(1 comment)" [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/60240 [09:08:19] o.O does gerrit not work on opera? 
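Editor's note on the 08:59-09:00 exchange: a minimal sketch of the argparse pitfall described there, assuming the bot parses its options with argparse. The parser, the dest name `confarg`, and the helper function names are hypothetical and chosen only for illustration; the actual adminlogbot.py code does not appear in this log.

```python
import argparse


def build_parser():
    # Hypothetical parser: the option and dest names are illustrative only
    # and are not taken from the real adminlogbot.py.
    parser = argparse.ArgumentParser(description="illustrative log bot launcher")
    parser.add_argument("--config", dest="confarg", default=None,
                        help="path to a config file")
    return parser


def config_supplied_buggy(args):
    # argparse sets the attribute on the Namespace even when the option is
    # omitted (it just holds the default), so this membership test is
    # true on every invocation.
    return "confarg" in args


def config_supplied_fixed(args):
    # Compare against the default instead of testing for the key.
    return args.confarg is not None


if __name__ == "__main__":
    parser = build_parser()
    without_option = parser.parse_args([])
    with_option = parser.parse_args(["--config=/etc/adminbot/example.py"])
    print(config_supplied_buggy(without_option), config_supplied_fixed(without_option))
    print(config_supplied_buggy(with_option), config_supplied_fixed(with_option))
```

The membership test only tells you the parser defines the option, not that the user passed it, which is why a run without any command line parameters falls through to code that expects a config path.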
[09:08:49] bleh [09:08:51] dunno, i'd imagine MaxS_em would have complained by now [09:09:04] !log [09:09:21] back to firefox :/ [09:09:40] !log restarted all services [09:09:47] Logged the message, Master [09:11:15] you might be able to do something like [09:11:23] def connect(self, *args, **kwargs): [09:11:33] !log [09:11:56] super(logbot, self).connect(*args, **kwargs) [09:12:03] what on earth is 'pp-pdf1'? [09:12:06] !log restarted all services [09:12:12] oy vey. [09:12:14] Logged the message, Master [09:13:52] anyways i'm going to crash, but feel free to ping me at some point tomorrow if you get stuck and i'll take a look at it with you [09:14:07] ok thanks :) [09:15:40] !log [09:16:13] !log restarted all services [09:16:21] Logged the message, Master [09:19:05] o.O [09:28:16] ori-l-zzz: pediapress [09:28:40] PROBLEM - Puppet freshness on ms-fe3001 is CRITICAL: No successful Puppet run in the last 10 hours [09:45:50] RECOVERY - Host ms-be9 is UP: PING OK - Packet loss = 0%, RTA = 26.61 ms [09:52:30] PROBLEM - Host ms-be9 is DOWN: PING CRITICAL - Packet loss = 100% [09:54:21] New patchset: Legoktm; "Convert logbot to use ircbot.SingleServerIRCBot for auto-reconnection." [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/60240 [09:55:54] !log - allow numprocs to be configured via initenv.sh [09:56:02] Logged the message, Master [09:56:27] !log restarted all services [09:56:35] Logged the message, Master [10:00:00] RECOVERY - SSH on ms-be9 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [10:00:10] RECOVERY - Host ms-be9 is UP: PING OK - Packet loss = 0%, RTA = 26.70 ms [10:04:08] New review: Legoktm; "PS3: Fixed the major issue where the bot actually connects to the server by adding a call to ircbot...." 
[operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/60240 [10:04:40] New review: Legoktm; "(2 comments)" [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/60240 [10:09:02] PROBLEM - Puppet freshness on nickel is CRITICAL: No successful Puppet run in the last 10 hours [10:10:08] PROBLEM - Puppet freshness on db44 is CRITICAL: No successful Puppet run in the last 10 hours [10:17:02] PROBLEM - Puppet freshness on gallium is CRITICAL: No successful Puppet run in the last 10 hours [10:28:23] RECOVERY - NTP on ms-be9 is OK: NTP OK: Offset -0.02109622955 secs [10:29:51] New patchset: Legoktm; "Convert logbot to use ircbot.SingleServerIRCBot for auto-reconnection." [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/60240 [10:41:40] New review: Ori.livneh; "(6 comments)" [operations/debs/adminbot] (master) - https://gerrit.wikimedia.org/r/60240 [10:46:55] ori-l-zzz: you don't sleep much :o [10:47:20] the 'z' is for zorro [10:47:38] hahaha [10:47:51] :) [10:50:55] New patchset: Reedy; "Losslessly compress pngs further..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60380 [10:52:01] Reedy, lol [10:52:09] MaxSem: Moment for size changelog [10:52:13] the competition has started! [10:52:19] New review: Reedy; "reedy@ubuntu64-web-esxi:~/git/operations/mediawiki-config$ git pull ssh://reedy@gerrit.wikimedia.org..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60380 [10:52:33] docroot/bits/DolphinBrowser/image/icon.png | Bin 8646 -> 6459 bytes [10:52:33] :D [10:52:56] do you commit from Linux? [10:53:00] Yeah [10:53:18] cba with windows for it [10:53:41] Nikerabbit: ^^ [10:54:15] from command line? TortoiseGit is vastly superior to anything else I've seen [10:57:43] Reedy: ? [10:57:54] Nikerabbit: I beat you [10:58:02] See https://gerrit.wikimedia.org/r/60380 [10:58:27] Reedy: what did you do? 
[10:58:37] Losslessly compressed them [10:59:24] why are we discussing this in #wikimedia-operations anyway? [10:59:24] with what? [11:00:04] Because gerrit-wm reports it in here [11:00:08] pngout [11:02:42] docroot/bits/DolphinBrowser/image/icon.png | Bin 10173 -> 8646 bytes [11:02:42] docroot/bits/DolphinBrowser/image/icon.png | Bin 8646 -> 6459 bytes [11:03:19] ^ Quite a saving [11:03:35] MaxSem: Is there somewhere the DolphinBrowser ones can be upstreamed to? [11:04:16] Reedy, as the matter of fact, the whole directory needs to be upstreamed to /dev/null [11:04:26] Haha. Really? [11:04:37] let's just wait for it to bitrot a bit [11:05:03] it's allegedly still working [11:05:15] you shouldn't touch the files in test/, since unit tests for image handling assert things about the binary content of those files [11:06:06] Jenkins should tell us if tests pass.. [11:06:38] sigh [11:07:05] I just hit Ctrl-X, y, Enter in Notepad:P [11:08:19] should we have a test which checks that images are optimized? :) [11:08:41] ori-l-zzz: Yup, I b0rked the build [11:08:41] gerrit should do it itself [11:18:19] do what? [11:20:51] compress the images [11:23:25] probably not worthwhile [11:23:35] oh, why pretend [11:23:52] the savings are almost certainly imperceptible to the majority of users [11:23:58] ori-l, do you ever sleep, and why?:P [11:24:24] tonight i just can't sleep :/ [11:25:28] it's basically for free, so it's good to do it once in a while and upon committing a lot of images, but i think a casual / occasional approach is sufficient. [11:26:34] if it's really worth doing, it's worth doing it on the edge somehow [11:27:25] * ori-l tries again [11:31:47] he also works on weekends [11:31:51] :) [11:32:39] holidays too [11:40:42] paravoid: hi -:] [11:40:46] hey [11:40:53] my python modules PTS pages have been created by whatever bot \O/ [11:41:50] paravoid: do you get time to fix up puppet on gallium ? 
There is some duplicate Group['jenkins'] definition caused by systemuser {} [11:42:47] isn't this https://gerrit.wikimedia.org/r/#/c/60024/ ? [11:44:17] paravoid: yeah that one [11:44:38] paravoid: I had a jenkins::user and jenkins::group and was using user {} [11:44:52] then switched to system user{} which does the group declare for us [11:45:11] I would drop systemuser {} and revert back to the building user {} [11:45:25] that's fine I think [11:46:02] build-in [11:46:02] grrg [11:52:02] New patchset: Hashar; "Revert "create jenkins user with systemuser"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60385 [11:52:17] Change abandoned: Hashar; "finally we are going to stick to user {}" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60024 [11:52:38] https://gerrit.wikimedia.org/r/60385 would do it :-D [11:52:58] I could also use a deletion of the python-statsd package currently at apt.wm.o (v1.5.8) that is the wrong package. [11:53:25] and get instead the python-statsd v2.0.1 I have packaged (that is a different upstream module though having the same binary package name) [11:56:32] Change abandoned: Hashar; "Will get the git repo deleted. We are using upstream svn repo." [operations/debs/python-statsd] (master) - https://gerrit.wikimedia.org/r/59397 [11:56:45] Change abandoned: Hashar; "Will get the git repo deleted. We are using upstream svn repo." [operations/debs/python-statsd] (master) - https://gerrit.wikimedia.org/r/55028 [11:57:05] Change abandoned: Hashar; "Will get the git repo deleted. We are using upstream svn repo." 
[operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/59605 [12:09:54] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [12:09:54] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [12:09:54] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [12:11:15] RECOVERY - Puppet freshness on ms-be9 is OK: puppet ran at Tue Apr 23 12:11:13 UTC 2013 [12:20:32] hashar: ah right, I forgot about that [12:20:37] apologies [12:21:46] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60385 [12:27:46] RECOVERY - Puppet freshness on gallium is OK: puppet ran at Tue Apr 23 12:27:38 UTC 2013 [12:34:13] !log reprepro python-statsd 2.0.1-1~wmf1 & voluptuous 0.6.1-1~wmf1 for precise-wikimedia [12:34:16] hashar: ^ [12:34:21] Logged the message, Master [12:35:01] paravoid: you are the best :-) [12:35:07] want me to upgrade gallium? [12:35:12] or are you taking care of it? 
[12:35:15] PROBLEM - LVS HTTPS IPv6 on wikimedia-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:35:15] PROBLEM - LVS HTTPS IPv6 on wikiversity-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out [12:35:24] PROBLEM - LVS HTTP IPv6 on wikisource-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out [12:35:24] PROBLEM - LVS HTTPS IPv6 on wikinews-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:35:25] PROBLEM - LVS HTTPS IPv6 on foundation-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:35:25] PROBLEM - LVS HTTP IPv6 on mediawiki-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:35:25] PROBLEM - LVS HTTPS IPv6 on wikipedia-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out [12:35:35] paravoid: I will upgrade gallium [12:35:45] I guess you want to look at the IPv6 connectivity issue at eqiad :-] [12:37:08] PROBLEM - LVS HTTP IPv6 on wikibooks-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:37:18] RECOVERY - LVS HTTPS IPv6 on wikimedia-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 95715 bytes in 3.508 second response time [12:37:18] RECOVERY - LVS HTTPS IPv6 on wikiversity-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 63690 bytes in 6.320 second response time [12:37:28] RECOVERY - LVS HTTP IPv6 on wikibooks-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 63690 bytes in 1.456 second response time [12:38:18] RECOVERY - LVS HTTPS IPv6 on wikinews-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 63690 bytes in 0.087 second response time [12:38:18] RECOVERY - LVS HTTP IPv6 on wikisource-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 63690 bytes in 3.248 second response time [12:38:18] PROBLEM - LVS HTTPS IPv6 on wikibooks-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed 
out [12:38:27] PROBLEM - LVS HTTP IPv6 on wikimedia-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:38:28] PROBLEM - LVS HTTPS IPv6 on wikivoyage-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:40:20] RECOVERY - LVS HTTP IPv6 on mediawiki-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 63690 bytes in 0.027 second response time [12:40:45] paravoid: need to drop out python-voluptuous 0.6.1-4 . That prevents us from installing upstream 0.6.1-1~wmf1 [12:40:48] RECOVERY - LVS HTTPS IPv6 on wikivoyage-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 41720 bytes in 9.402 second response time [12:40:52] paravoid: the statsd one got upgraded [12:43:11] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [12:43:11] PROBLEM - Puppet freshness on ms-be9 is CRITICAL: No successful Puppet run in the last 10 hours [12:43:12] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [12:43:12] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [12:44:48] New patchset: BBlack; "Initial Gerrit push of Work-In-Progress vhtcpd code." 
[operations/software/varnish/vhtcpd] (master) - https://gerrit.wikimedia.org/r/60390 [12:46:51] RECOVERY - Puppet freshness on gallium is OK: puppet ran at Tue Apr 23 12:46:46 UTC 2013 [12:48:31] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 198 seconds [12:50:31] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 12 seconds [12:55:01] PROBLEM - Packetloss_Average on analytics1005 is CRITICAL: CRITICAL: packet_loss_average is 8.27482272727 (gt 8.0) [12:59:01] PROBLEM - Packetloss_Average on analytics1004 is CRITICAL: CRITICAL: packet_loss_average is 8.80145209677 (gt 8.0) [13:08:58] RECOVERY - Packetloss_Average on analytics1004 is OK: OK: packet_loss_average is 1.79159968254 [13:09:00] RECOVERY - Packetloss_Average on analytics1005 is OK: OK: packet_loss_average is 0.0 [13:12:08] PROBLEM - Puppet freshness on virt1005 is CRITICAL: No successful Puppet run in the last 10 hours [13:24:19] PROBLEM - Host mw27 is DOWN: PING CRITICAL - Packet loss = 100% [13:25:19] RECOVERY - Host mw27 is UP: PING OK - Packet loss = 0%, RTA = 26.56 ms [13:27:18] PROBLEM - Apache HTTP on mw27 is CRITICAL: Connection refused [13:42:28] New patchset: Mark Bergsma; "Revert "decom'ing sq33, fixing role/cache.pp puppet and ganglia.pp puppet ot match standards"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60395 [13:43:26] New patchset: Mark Bergsma; "Revert "decom'ing sq33, fixing role/cache.pp puppet and ganglia.pp puppet ot match standards"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60395 [13:45:23] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60395 [13:51:26] hashar: voluptuous is now fixed on gallium btw [13:52:04] New patchset: Mark Bergsma; "Set dysprosium backend weight at 4x" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60397 [13:52:05] paravoid: thx :) [13:52:54] Change merged: Mark Bergsma; [operations/puppet] (production) - 
https://gerrit.wikimedia.org/r/60397 [13:53:31] paravoid: I think I have all needed dependencies now. I can proceed with zuul packaging :-) [13:59:26] PROBLEM - Varnish traffic logger on dysprosium is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [14:00:25] RECOVERY - Varnish traffic logger on dysprosium is OK: PROCS OK: 3 processes with command name varnishncsa [14:14:11] PROBLEM - Varnish traffic logger on cp3009 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [14:16:02] New patchset: Mark Bergsma; "Move proxy URL rewriting into a function" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60399 [14:18:56] apergos: around? [14:19:11] yes [14:19:18] wondering about the swift status [14:20:10] ms-be9 installed, but not pooled. I looked at ganglia a bit earlier today and there is still a fair amount of catching up due to ms-be1 being down [14:20:34] once that settles out a bit I'll start adding ms-be9 and hopefully 2 back in [14:20:59] ms-be2 is apparently not ready, it's on my list to ask about [14:21:12] (in theory it was to be made ready yesterday evening our time) [14:25:30] catching up of what? [14:26:00] data it should have [14:26:05] which one? [14:26:07] ms-be1? [14:26:11] yep [14:26:29] what does that matter? [14:26:39] you're going to eventually replace it anyway [14:26:46] yes, but we want capacity now [14:26:54] no, let's just put the new boxes [14:27:35] wait a sec, I thought the point was, the cluster is borderline and we need to have working capacity asap [14:28:35] ms-be1 would be the fastest fix for that, but it needs [14:28:37] * apergos goes to look at ganglia again and make a guess [14:29:28] we have brand new boxes with sensible controllers sitting idle [14:29:32] maybe a day? 
at that point we could put a couple boxes in at 33 % each (I wouldn't really want to put them in at more than that at once) [14:30:37] I think we should just start ramping up ms-be2/9 [14:32:34] apparently ms-be9 controller card is not seated properly....during install no disks were found [14:32:40] uh [14:32:42] I did the install [14:32:49] when? [14:32:51] today [14:32:56] it was sitting in some weird state [14:33:01] oh..maybe steve fixed it [14:33:06] I cleared the config and set up raid 0 etc [14:33:19] and then it was fine [14:33:33] ok..cool [14:33:38] when I got on the console it was in some installer screen complaining about no disks [14:33:42] but that was all the issue was [14:33:47] so ms-be2? [14:34:11] not set up yet [14:34:15] I see the old one is still powered on, I thought steve was going to get to that ysterday [14:34:21] can he do that today? [14:34:47] i don't think he was in the DC yesterday....i am hoping...ms-be2 (new one) is already racked...it is going in a different location [14:35:08] just needs drac set [14:35:10] ok [14:37:27] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60399 [14:41:31] RECOVERY - Varnish traffic logger on cp3009 is OK: PROCS OK: 3 processes with command name varnishncsa [14:41:35] New patchset: Mark Bergsma; "Rewrite proxy URLs to normal URLs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60408 [14:42:13] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60408 [14:42:38] mark: how come? [14:43:01] i see lots on the mobile cluster [14:43:18] forward proxies maybe? [14:43:22] maybe [14:43:24] on mobile telcos? 
[14:43:30] yeah that's what I thought [14:49:35] New patchset: Dereckson; "(bug 47418) Babel configuration for th.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60409 [15:00:29] New patchset: Mark Bergsma; "Fix proxy rewriting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60415 [15:01:53] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60415 [15:03:24] New patchset: Hashar; "fix pep8 E502: backslash redundant between brackets" [operations/software/redactatron] (master) - https://gerrit.wikimedia.org/r/60416 [15:13:24] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 201 seconds [15:15:24] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 21 seconds [15:24:50] New patchset: Mark Bergsma; "Restrict PURGE lookups to mobile domains" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60417 [15:26:14] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60417 [15:30:36] New patchset: Mark Bergsma; "Fix variable name" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60418 [15:31:05] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60418 [15:38:49] paravoid: jsduck v4.8.0 is out with various bug fixes that I pushed, would be nice to get it deployed in a few days so I can start fixing bugs on our end that depend on it. No rush, but ideally within a week. Want me to file bugzilla or rt? [15:39:07] afaics the package is fully compatible, just needs a rebuild and push. 
[15:39:42] rt or just ping me here later this week or early next week [15:43:50] paravoid: https://rt.wikimedia.org/Ticket/Display.html?id=5000 [15:43:57] nice id :) [16:07:45] RECOVERY - Puppet freshness on ms-be9 is OK: puppet ran at Tue Apr 23 16:07:41 UTC 2013 [16:09:05] RECOVERY - Puppet freshness on nickel is OK: puppet ran at Tue Apr 23 16:09:01 UTC 2013 [16:14:28] New patchset: Mark Bergsma; "Make purging more generic, unbreak purging on mobile cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60423 [16:15:34] New review: Demon; "Something else on fenari that needs an outside connection...noc.wm.o. Granted we could serve that fr..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/56104 [16:16:14] New patchset: Mark Bergsma; "Make purging more generic, unbreak purging on mobile cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60423 [16:16:41] New patchset: Reedy; "Set $wgTorBlockProxy to url-downloader.wikimedia.org:8080" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60424 [16:20:21] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60423 [16:23:10] New patchset: Dzahn; "blog apache config - use SSLCACertificatePath to provide full chain" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60426 [16:23:56] mutante: maybe not the best day to deploy this [16:23:58] (blog is on slashdot's front page) [16:24:17] paravoid: oooh:) thanks for the heads up [16:24:42] hope we're not literally dotted [16:25:04] New patchset: Reedy; "Move tor_exit_node from hume to terbium now proxy config exists" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60427 [16:25:43] New review: Dzahn; "< paravoid> mutante: maybe not the best day to deploy this" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60426 [16:26:31] New patchset: Mark Bergsma; "Add missing }" [operations/puppet] (production) - 
https://gerrit.wikimedia.org/r/60428 [16:26:50] ah, slashdot writes about the move to mariadb,, nice nice [16:27:04] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60428 [16:29:17] New patchset: Reedy; "Move tor_exit_node from hume to terbium now proxy config exists" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60427 [16:30:24] New patchset: Reedy; "Move tor_exit_node from hume to terbium now proxy config exists" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60427 [16:33:03] New patchset: Dzahn; "bugzilla apache config - use SSLCACertificatePath to provide full chain" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60431 [16:33:58] !log reedy synchronized php-1.22wmf1/extensions/TorBlock [16:34:05] Logged the message, Master [16:34:20] hey mark, q about x-cs stuff [16:34:26] !log reedy synchronized php-1.22wmf2/extensions/TorBlock [16:34:28] you there? [16:34:33] Logged the message, Master [16:37:32] grabbing lunch, be back in aibt [16:38:36] New patchset: Mark Bergsma; "Skip purge lookups for upload purges on mobile" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60432 [16:38:49] ottomata: oh sorry [16:39:57] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60432 [16:40:47] New patchset: Dzahn; "delete firewall.wm.org apache conf" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60433 [16:42:52] New patchset: Reedy; "Move all deployment path vars to using /usr/local/apache/common-local" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60434 [16:45:34] New patchset: Hashar; "contint: pin python-voluptuous at 0.6.1-1~wmf1" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60435 [16:46:11] # wikimedia-periodic-update.sh is unpuppetized and not in the scap source [16:46:17] ^ Anyone know what that means? 
[16:46:29] File is at source => "puppet:///files/misc/scripts/wikimedia-periodic-update.sh", [16:46:40] <^demon|away> Oh, did we puppetize that finally? [16:46:58] No idea :D [16:47:29] <^demon|away> Maybe just inconsistency between where puppet places it and where the cron expects it? [16:47:36] * ^demon|away doesn't have the files open [16:47:41] Reedy: https://gerrit.wikimedia.org/r/#/c/37946/ [16:48:24] <^demon|away> Don't we have some weird hack in make-wmf-branch for periodic updates too? [16:48:28] fwiw, that periodic-update.sh thing sounds like stuff that should now be found in puppet/manifests/misc/maintenance.pp [16:48:47] Oh [16:49:00] or not.. [16:49:23] eh, just saying that file has a couple cron jobs running mwscript stuff [16:50:18] Doesn't look like anyone has tried to move it either.. [16:50:37] it's still only listed under hume [16:50:46] ebernhardson: http://meta.wikimedia.org/wiki/IRC/Cloaks [16:51:01] mark, back, s'ok [16:51:14] i just realized I had a buncha meetings starting in 10 mins and had to get lunch first [16:51:22] ok [16:51:24] so no questions now? :) [16:51:27] but, ja so, evan and I are looking into x-cs [16:51:28] haha [16:51:29] nope [16:51:37] there might a few more issues [16:51:41] but the one we are looking at right now is [16:51:53] there are lines that def should be tagged but were not [16:52:10] i can provide some as an example [16:52:16] errors perhaps? [16:52:22] https://github.com/wikimedia/operations-puppet/blame/production/manifests/site.pp - Comment is from Tim [16:52:26] errors? 
[16:52:35] hm, one sec lemme anon the ips and throw up a gist [16:52:42] non HTTP 200 responses [16:52:44] ok [16:53:02] Need to find a new location for the "norotate" logs [16:53:26] https://gist.github.com/ottomata/5445370 [16:53:49] naw the responses are all mostly 200s, both hits and misses [16:53:51] but some errors as well [16:54:07] this lines are all from a single IP [16:54:16] that should ahve been tagged as an indonesia request [16:55:13] so what exactly is missing? [16:55:21] the X-Analytics header? [16:55:23] the last field [16:55:23] yeah [16:55:34] the last field should be tagged with x-cs=... [16:56:03] we noticed that this isn't the case for all providers, some of the providers are tagged just fine all the time [16:56:11] New patchset: Reedy; "List all needed crons on terbium, even if disabled" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60438 [16:56:11] so we narrowed this down to the ones that weren't tagged properly [16:56:21] and started looking at the xl indonesia provider [16:56:26] it only has 6 IP ranges defined [16:56:39] so it was easy to check to make sure all of the IP ranges lined up [16:56:53] it look fine in the vcl, as well as in the udp2log filter on oxygen [16:57:50] btw, i think the change that we deployed last week (tagging hits) worked [16:58:15] the providers that seem to be working have both hits and misses tagged properly now, whereas before they didnt' [16:58:23] but apparently this new thing we are looking at is a different issue [16:58:53] ottomata: so the paste has all requests as .m. request, not .zero. [16:58:59] and the VCL only tags on .zero. 
[16:59:14] hmm, erosen might not nkow that [16:59:52] } else if (client.ip ~ carrier_xl_indonesia) { [16:59:52] set req.http.X-DfltLang = "id"; [16:59:52] if (req.http.X-Subdomain == "ZERO") { [16:59:52] if (req.http.host ~ "(^(id|en|zh|ar|hi|ms|jv|su)\.zero|^zero)\.([a-zA-Z0-9-]+)\.org") { [16:59:52] set req.http.X-Carrier = "XL Axiata"; [16:59:52] set req.http.X-CS = "510-11"; [16:59:52] } [16:59:53] } [17:00:04] so it needs to be a .zero. request, and a specific one at that [17:00:12] iiiinteresting [17:00:20] erosen: [17:00:24] otherwise X-CS doesn't get set, and thus neither does the X-Analytics header [17:00:27] mark [17:00:27] ottomata: so the paste has all requests as .m. request, not .zero. [17:00:37]         if (req.http.X-Subdomain == "ZERO") { [17:00:40] ... [17:00:42] yeah [17:00:54] (pasting that for erosen's sake, he just joined) [17:01:03] thanks [17:01:09] i think i get the gist of it [17:01:22] is that how its supposed to work? or are you expecting to capture all mobile requests? [17:01:30] i'm rerunning the analysis, checking only zero domain requests [17:01:34] i don't think so [17:02:01] amit and kul certainly think of "wikipedia zero" as anything that is "zero-rated" [17:02:30] which includes just zero. or zero. and m. for different providers [17:02:36] the mobile team wrote this change, I don't know anything about the .m. vs .zero. [17:02:46] they wrote this code I mean [17:03:16] ok cool [17:03:20] well good to know! 
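A minimal sketch of the tagging condition in the VCL pasted above, written in Python only to make the logic easy to try out. It is not the deployed config. It assumes the client IP has already matched the carrier_xl_indonesia ACL, and the function name and the "M" subdomain value used for .m. requests are hypothetical; only the host regex, the "ZERO" check and the "510-11" value come from the paste.

```python
import re

# Host pattern copied from the pasted VCL branch for the XL Indonesia carrier.
ZERO_HOST_RE = re.compile(
    r"(^(id|en|zh|ar|hi|ms|jv|su)\.zero|^zero)\.([a-zA-Z0-9-]+)\.org"
)


def x_cs_for_xl_indonesia(host, x_subdomain):
    """Return the X-CS value the pasted branch would set, or None.

    Assumption: the client IP already matched the carrier_xl_indonesia ACL.
    """
    # The VCL only tags when X-Subdomain is exactly "ZERO".
    if x_subdomain != "ZERO":
        return None
    # VCL's ~ operator is an unanchored match, so re.search is the analogue.
    if ZERO_HOST_RE.search(host) is None:
        return None
    return "510-11"


if __name__ == "__main__":
    # "M" as the subdomain value for .m. requests is a hypothetical placeholder.
    samples = [
        ("id.zero.wikipedia.org", "ZERO"),
        ("zero.wikipedia.org", "ZERO"),
        ("id.m.wikipedia.org", "M"),
        ("fr.zero.wikipedia.org", "ZERO"),
    ]
    for host, sub in samples:
        print(host, sub, x_cs_for_xl_indonesia(host, sub))
```

Under these assumptions a .m. request never gets X-CS, and neither does a .zero. request whose language prefix is outside the carrier's list, which lines up with the untagged lines in the gist.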
[17:03:30] but it certainly makes sense why the X-Analytics header doesn't get set in this case [17:03:34] we'll talk with zero and mobile folks and get this sorted out [17:03:35] yeah [17:03:41] https://gist.github.com/embr/012e97997cd2153b8fb8#file-zero-domains-only [17:04:02] looks better [17:04:07] yeah [17:04:09] so the ones where it's still not set for zero requests is probably because the subdomain doesn't match :) [17:04:16] New review: Dzahn; "yep, just list disabled crons for completness, by request" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/60438 [17:04:17] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60438 [17:05:15] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60433 [17:05:18] mark, ottomata: who owns the varnish config code at the moment? [17:05:46] noone "owns" it [17:05:49] hm, dunno if it is really 'owned' [17:05:54] mobile team push in changes sometimes, ops does too [17:06:00] they tend to do the carrier stuff [17:06:05] since we're not much involved with that at all [17:06:11] we do more general maintenance work, performance work, etc [17:06:24] but yeah, certainly talk to the mobile team about this issue [17:27:33] Someone broke the mariadb website [17:28:24] slashdotted? [17:28:49] wfm [17:29:14] percona-dotted maybe ? [17:29:21] New review: Mwalker; "Looks good -- pushing today in the LD slot." 
[operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/60338 [17:30:41] oh yeah, maybe they are having a live hacking sprint at the conference ;) but i see the page just fine [17:31:34] SkySQL merges with Monty Program Ab [17:32:38] http://blog.mariadb.org/wikipedia-adopts-mariadb/ [17:36:04] New patchset: Ottomata; "Removing udp2log instance from locke" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60441 [17:36:27] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60441 [17:39:40] New patchset: Ottomata; "Removing rsyncd from locke" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60442 [17:39:47] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60442 [17:41:43] Seems the mariadb website is only down via IPv6 [17:48:29] !log mlitn synchronized php-1.22wmf1/extensions/ArticleFeedbackv5 'Update ArticleFeedbackv5 to master' [17:48:36] Logged the message, Master [17:48:51] !log mlitn synchronized php-1.22wmf2/extensions/ArticleFeedbackv5 'Update ArticleFeedbackv5 to master' [17:48:58] Logged the message, Master [17:52:07] New patchset: Demon; "General code cleanup" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/60444 [18:02:09] New patchset: Umherirrender; "$wgNamespaceRobotPolicies for dewiki: Add 101 and 829" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60445 [18:08:01] <^demon|away> 2 easy puppet changes: https://gerrit.wikimedia.org/r/#/q/status:open+project:operations/puppet+branch:production+topic:pep8,n,z :) [18:08:25] New patchset: Umherirrender; "NS_IMAGE/NS_IMAGE_TALK => NS_FILE/NS_FILE_TALK" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60447 [18:10:01] New review: Demon; "(1 comment)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60447 [18:11:27] PROBLEM - Recursive DNS on 91.198.174.6 is CRITICAL: CRITICAL - 
Plugin timed out while executing system call [18:13:19] New review: Umherirrender; "(1 comment)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60447 [18:35:54] heya, mutante, can anyone email access-requests@rt.wikimedia.org to create an access request? [18:36:03] aaron halfaker is asking me how he should request an ssh key change [18:37:59] PROBLEM - Puppet freshness on cp3003 is CRITICAL: No successful Puppet run in the last 10 hours [18:37:59] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [18:41:22] <^demon|away> ottomata1: Just push a change to gerrit with the new key? [18:41:51] <^demon|away> (or if it's for gerrit, just change it in the UI?) [18:42:14] ottomata1: the main question will be if the old key needs to be ensured absent everywhere.. like, what is the reason to change it.. either it needs to stay in puppet forever, or people need to manually remove it [18:42:17] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59458 [18:42:34] aye, aaand, weren't you guys all paranoid about trusting keys now? [18:42:54] like, he needs to buy me a beer or something before I believe him? [18:43:35] he can either put the key on his Office wiki user page [18:43:38] or he can use gpg [18:43:53] or he can show you the fingerprint in a Google hangout session next to his face:) [18:43:59] or you can call him .. or.. [18:44:03] !log mlitn synchronized wmf-config/InitialiseSettings.php 'Re-enable AFTv5 on enwiki' [18:44:10] Logged the message, Master [18:44:18] ok, but there doesn't need to be an RT for this then? [18:44:43] there probably should [18:44:59] but he can't create it? [18:45:31] he can mail ops-requests [18:45:54] or ask for a full account.. 
or mail the list [19:15:10] New patchset: Dzahn; "add account and key for ebernhardson and include on stat1" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60457 [19:17:46] New patchset: Dzahn; "add account and key for ebernhardson and include on stat1" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60457 [19:18:44] SiteStatsUpdate::doUpdate: Transaction already in progress (from WikiPage::doDeleteArticleReal), performing implicit commit! [19:18:51] Reedy: any idea how that happens? [19:18:58] * Aaron|home looked at the code a few times [19:19:02] Uhh [19:20:27] * Aaron|home likes how that function does $this->loadPageData( 'forupdate' ); and then $dbw->begin() shortly after ;) [19:24:19] New patchset: Dzahn; "set SSH key for halfak to absent (RT #5004)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60466 [19:28:47] New review: Dzahn; "set this to absent, new key will follow" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/60466 [19:28:48] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60466 [19:29:06] PROBLEM - Puppet freshness on ms-fe3001 is CRITICAL: No successful Puppet run in the last 10 hours [19:36:24] !log anomie synchronized php-1.22wmf2/extensions/Cite 'Update to master for bug 47291' [19:36:31] Logged the message, Master [19:49:26] New patchset: Isarra; "Rerender wikimania, wikiversity, and wikiquote favicons with transparency and high-resolution support." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60472 [19:53:57] Good job, Isarra. [19:57:06] paravoid: let me know when you want to do ceph stuff [19:57:25] Aaron|home: I will [19:57:28] Aaron|home: journal done? [19:57:47] that happened the day after we last talked, it's just running in a loop now [19:57:57] great [20:03:42] mutante: having to do an intermediate commit with ensure => absent seems a bit broken. 
maybe it'd be better to create a template that iterates through the set of enabled keys to generate authorized_keys, and thereby rely on puppet's hash checks to ensure the file is always exactly in sync. [20:05:44] a minute after you merged the change to set the key to absent icinga-wm alerted that ms-fe3001 hasn't had a successful puppet run for a while. isn't there a risk that it will never see the old key removed? [20:06:22] New review: Mwalker; "The wikiquote and wikimania favicons seem to be missing the 16x16px resolutions. Isarra contends tha..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60472 [20:07:19] New patchset: Dr0ptp4kt; "Adding namespace protection for the ZERO namespace on metawiki." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60512 [20:08:55] New review: DarTar; "I don't have permissions to display ticket #5004 but Aaron is still working with us. Can you please ..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60466 [20:09:24] heh [20:09:40] New review: Mwalker; "I'll push this with the other favicons in today's LD window." [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/60472 [20:10:12] PROBLEM - Puppet freshness on db44 is CRITICAL: No successful Puppet run in the last 10 hours [20:12:14] hmm, do we use apt-key with apt.wikimedia.org? [20:12:43] i've got puppet running on a vm that won't install packages because our apt repo isn't signed [20:12:53] ottomata1: i was _just_ looking at faidon's apt module in operations/puppet to check [20:12:59] yeah i looked at that too [20:13:06] doesn't seem to do anyting with key for us [20:13:56] oh you were just looking for that yourself too, eh? [20:13:56] hehe [20:14:20] New review: Ram; "Yes I think it will be useful as a quick reference when the global config changes (instead of having..." 
[operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/55841 [20:14:44] ottomata1: was looking for that for the purpose of helping that patch along [20:15:44] aye [20:15:54] that module is wmf specific though :( [20:16:05] ottomata1: well, if you do "apt-key list" you'll see it [20:16:05] value => 'http://brewster.wikimedia.org:8080', [20:16:05] , etc. [20:16:15] on a cluster machine, I mean [20:16:30] oh hm [20:16:38] so it's getting added _somewhere_. I imagine it might be a part of the OS image you guys use, dunno. [20:16:52] yah probably hmm [20:18:14] anyway, it should be simple to add it, i'm just uncertain about how much to rely on puppet's abstractions for modifying and generating config files vs. just plopping pre-made files in the right places [20:19:11] puppetlabs has an apt module that may be more complete than ours [20:19:25] our apt repo *is* signed though [20:19:27] and we do have a key [20:20:30] on regular setups, it's being installed by the installer [20:20:42] look at files/autoinstall/late_command [20:21:09] no idea what's the case in labs [20:21:26] but having it in puppet wouldn't hurt [20:21:35] paravoid: right; but because the puppet manifests in mediawiki/vagrant provision just one machine (and a dispensable VM at that) i tend to pick small-and-gets-the-job-done vs. reusable / composeable [20:21:40] in labs it's in the image [20:22:23] it was previously injected via cloud-init [20:22:43] ah [20:22:46] thanks :) [20:23:00] yw [20:25:35] ottomata1: if you're also looking at this for the vagrant instance, i'll have a patch in a few [20:26:11] yeah i am doing it right now too! [20:26:12] haha [20:26:13] uh oh! [20:26:18] i was about to submit [20:26:19] ! [20:26:36] gonna submit anyway [20:26:46] oh, sure :) [20:27:49] New review: Brion VIBBER; "Looks good. All have 32x32 versions; wikiversity also has a tweaked 16x16 but the others shouldn't n..." 
[operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/60472 [20:29:29] ori-l: patchset 2: https://gerrit.wikimedia.org/r/#/c/59716/ [20:30:19] New patchset: Dr0ptp4kt; "Adding namespace protection for the ZERO namespace on metawiki." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60512 [20:30:40] ottomata1: reviewing [20:40:44] New patchset: awjrichards; "Enable CentralNotice on mobile everywhere CentralNotice is enabled" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60362 [20:41:45] New patchset: awjrichards; "Enable CentralNotice on mobile everywhere CentralNotice is enabled" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60362 [20:45:11] Change merged: MaxSem; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60362 [20:45:56] http://apt.wikimedia.org/autoinstall/keyring/wikimedia-archive-keyring.gpg <-- not available over https? [20:46:17] ah, nice!!!! i was looking for that [20:46:38] wait, is taht whawt I want? [20:46:50] why would we care about security? :D [20:49:41] ori-l: even if it was, you'd have to establish the trust chain [20:49:46] so you'd be right on square one :) [20:50:21] paravoid: can you give https://gerrit.wikimedia.org/r/#/c/59716/ a second opinion? [20:51:01] modules/misc? [20:51:29] paravoid: that part is simply adapting to my poor puppet layout in that repo [20:51:38] good to call that out, but i'll fix that separately [20:51:45] oh that's vagrant [20:51:47] I was very confused [20:52:04] I was looking at things that weren't familiar at all [20:52:28] ori-l, i got another patch set to use that url [20:52:29] i thikn [20:52:31] one sec [20:52:36] I dunno, it feels very dirty to not use an apt module :) [20:53:05] paravoid: agreed, this is quick and dirty fo sho [20:53:18] ottomata1: i don't think using the URL is better [20:53:22] since it's http [20:53:23] no? [20:53:38] its better because then the key isn't hardcoded? 
[20:53:53] you can put it in a file btw [20:53:55] maybe less super dooper secure but meh for vagrant? [20:53:56] yeah i could [20:54:07] why not curl it though? [20:54:10] i'm optimistic that i'll get this cleaned up before we change keys [20:54:17] with a proper apt module, per paravoid [20:54:21] aye [20:54:38] but i think it's okay as is for now [20:54:48] well, putting it in a file would be cleaner [20:56:05] !log reedy synchronized php-1.22wmf2/extensions/Wikibase [20:56:07] than curling? [20:56:12] Logged the message, Master [20:56:14] https://gerrit.wikimedia.org/r/#/c/59716/3/puppet/modules/misc/manifests/apt/wikimedia.pp [20:56:57] i would have preferred that, but this is fine too. vagrant <1.2 uses ruby's stdlib http libs to retrieve boxes which doesn't very certificates anyway [20:57:11] PROBLEM - SSH on amslvs1 is CRITICAL: Server answer: [20:57:58] i'm going to adapt/import paravoid's apt module and do this over, but this is ok for now. let me jsut test it. [20:58:09] * doesn't verify [20:58:11] RECOVERY - SSH on amslvs1 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [21:00:07] ottomata1: would you mind if we don't include it by default, since you're going to require that someone manually add the umapi class anyhow? [21:01:03] that's cool [21:02:25] ori-l, submitted [21:03:53] New patchset: Dzahn; "add new SSH key for halfak (RT-5004)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60572 [21:04:36] !log maxsem synchronized php-1.22wmf2/extensions/MobileFrontend/ 'Weekly mobile deployment' [21:04:44] Logged the message, Master [21:06:06] !log maxsem synchronized php-1.22wmf2/extensions/ZeroRatedMobileAccess/ 'Weekly mobile deployment' [21:06:13] Logged the message, Master [21:06:28] ottomata1: ok, merged [21:07:41] New review: Dzahn; "new key per " [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60572 [21:07:45] danke thank you! 
[21:07:49] New review: Dzahn; "new key per " [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/60572 [21:07:50] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60572 [21:11:44] !log maxsem synchronized php-1.22wmf1/extensions/ZeroRatedMobileAccess/ 'Weekly mobile deployment' [21:11:52] Logged the message, Master [21:13:21] !log maxsem synchronized php-1.22wmf1/extensions/MobileFrontend/ 'Weekly mobile deployment' [21:13:29] Logged the message, Master [21:14:11] !log aaron synchronized php-1.22wmf2/includes/db/Database.php 'temp logging' [21:14:18] Logged the message, Master [21:15:03] !log maxsem synchronized wmf-config 'Mobile banners everywhere - https://gerrit.wikimedia.org/r/#/c/60362/' [21:15:10] Logged the message, Master [21:17:06] !log reedy synchronized php-1.22wmf1/extensions/Score [21:17:13] Logged the message, Master [21:18:28] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 183 seconds [21:19:20] New patchset: MaxSem; "Revert "Enable CentralNotice on mobile everywhere CentralNotice is enabled"" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60576 [21:20:00] Change merged: MaxSem; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60576 [21:20:28] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 22 seconds [21:22:56] !log maxsem synchronized wmf-config [21:23:03] Logged the message, Master [21:23:04] ori-l: [21:23:04] https://gerrit.wikimedia.org/r/#/c/60577/ [21:23:05] :) [21:23:12] after that I've got the user_metrics thing [21:23:55] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60512 [21:24:29] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 212 seconds [21:24:33] New review: Dzahn; "new key has been added here: https://gerrit.wikimedia.org/r/#/c/60572/" [operations/puppet] (production) - 
https://gerrit.wikimedia.org/r/60466 [21:25:28] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 12 seconds [21:28:29] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 192 seconds [21:29:55] New patchset: Lcarr; "removing halfak's key as it is on stat1 other keys also on stat1, it appears" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60580 [21:30:10] mutante: ^^ fyi [21:30:29] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 22 seconds [21:30:32] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60580 [21:30:38] ottomata: looking [21:30:56] LeslieCarr: did you see my note above about using a template to iterate over keys and generate an authorized_keys file? [21:31:09] ah nope [21:31:15] that sounds like a great idea [21:31:25] :) [21:31:46] 13:03 mutante: having to do an intermediate commit with ensure => absent seems a bit broken. maybe it'd be better to create a template that iterates through the set of enabled keys to generate authorized_keys, and thereby rely on puppet's hash checks to ensure the file is always exactly in sync. [21:31:46] 13:05 a minute after you merged the change to set the key to absent icinga-wm alerted that ms-fe3001 hasn't had a successful puppet run for a while. isn't there a risk that it will never see the old key removed? [21:32:18] i can submit a patch, if you like [21:33:04] ? 
ms-fe3001 doesn't have users [21:33:20] we knew where he had a user, bastion hosts and stat1 [21:33:29] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 202 seconds [21:34:04] yeah, but it's a deeper problem, as LeslieCarr's discovery of 'other keys also on stat1' seems to indicate [21:34:14] and i did the real disable manually anyways by moving authorized_keys away, then letting puppet re-create it from scratch, with the new key and only that [21:34:20] namely that puppet runs are enforcing the presence of keys consistently but their absence only once [21:34:28] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 11 seconds [21:34:40] Leslie: what do you mean by "other keys also on stat1" where did you see it [21:37:30] i'm not sure i see the issue, the file has been rewritten and just had the single new key [21:38:25] .ssh/keys [21:38:33] -rw------- 1 halfak wikidev 1675 Sep 6 2012 umn.rsa.private [21:38:34] -rw------- 1 halfak wikidev 396 Sep 6 2012 umn.rsa.public [21:38:35] -rw------- 1 halfak wikidev 1679 Apr 23 16:36 wmf2.private.rsa [21:38:36] -rw-r--r-- 1 halfak wikidev 397 Apr 23 16:36 wmf2.public.rsa [21:39:24] ah, i turned that into a keys.tar.gz already on stat1 [21:39:53] we were talking about different hosts then [21:40:21] !log maxsem synchronized wmf-config 'https://gerrit.wikimedia.org/r/#/c/60512/' [21:40:29] Logged the message, Master [21:41:38] and this pair was the one we had just added [21:43:52] ugh, yea, gotcha [21:48:26] New review: Ram; "Looks good; couple of comments:" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/60444 [21:56:48] !log reedy synchronized wmf-config/InitialiseSettings.php 'Enable score debug log' [21:56:56] Logged the message, Master [21:57:53] LeslieCarr, Ryan_Lane: would either of you have any clue why https://meta.beta.wmflabs.org redirects to http://meta.beta.wmflabs.org? 
[21:58:10] wait, no; don't answer that [21:58:15] that's not my problem [21:58:39] !log reedy synchronized php-1.22wmf2/extensions/Score/Score.body.php 'debugging' [21:58:50] Logged the message, Master [22:00:00] !log reedy synchronized wmf-config/InitialiseSettings.php 'Score' [22:00:07] Logged the message, Master [22:01:16] * Aaron|home hates all the djvu stuff [22:01:55] http://commons.wikimedia.org/wiki/Category:1877_books [22:02:55] hehe [22:03:18] !log reedy synchronized wmf-config/InitialiseSettings.php 'rv debugging' [22:03:25] Logged the message, Master [22:04:45] !log reedy synchronized php-1.22wmf2/extensions/Score/Score.body.php 'debugging' [22:04:51] Logged the message, Master [22:11:41] !log reedy synchronized php-1.22wmf2/extensions/Score/Score.body.php 'debugging' [22:11:48] Logged the message, Master [22:18:30] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 181 seconds [22:20:30] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 22 seconds [22:20:35] !log reedy synchronized php-1.22wmf2/extensions/Score [22:20:46] Logged the message, Master [22:27:44] !log reedy synchronized php-1.22wmf1/extensions/Score [22:27:52] Logged the message, Master [22:29:29] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 237 seconds [22:30:29] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 22 seconds [22:32:43] New patchset: MaxSem; "Direct mobile banner requests to mobile site" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60597 [22:36:59] New patchset: MaxSem; "Direct mobile banner requests to mobile site" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60597 [22:38:42] New patchset: MaxSem; "Direct mobile banner requests to mobile site" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60597 [22:41:40] New patchset: Dzahn; "add wikiepdia.com and wikiepdia.org ServerAliases to redirect to wikipedia.org since 
wikipedia.org is the default redirect nothing needed besides this" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/60599 [22:43:18] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [22:43:18] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [22:43:18] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [22:49:29] RECOVERY - Disk space on ms-be1008 is OK: DISK OK [22:53:29] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 201 seconds [22:54:28] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [22:56:13] New patchset: MaxSem; "Revert "Revert "Enable CentralNotice on mobile everywhere CentralNotice is enabled""" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60602 [22:57:36] anyone lightning-deploying? [22:58:24] greg-g ^ [22:59:00] MaxSem: other than mwalker doing a favicon change, no [22:59:22] I need to push a couple of config changes [22:59:40] MaxSem: I'll do the favicon pushes after all the mobile stuff settles out [22:59:55] ok, deploying [23:00:25] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60602 [23:02:28] New patchset: Lcarr; "reenabling ahalfak's access" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60605 [23:04:04] New patchset: MaxSem; "Direct mobile banner requests to mobile site" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60597 [23:04:06] Change merged: Dzahn; [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/60599 [23:04:22] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60605 [23:04:41] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60597 [23:05:11] !log sync-apache, add wikepdia.org and .com typo 
domains [23:05:12] New patchset: Lcarr; "really reenabling ahalfak's" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60606 [23:05:18] Logged the message, Master [23:06:17] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60606 [23:06:39] !log maxsem synchronized wmf-config 'Mobile banners again, hopefully fixed' [23:06:46] Logged the message, Master [23:09:49] mwalker, I'm done, thanks [23:10:09] cool [23:10:09] greg-g, Isarra; pushing new favicons [23:10:14] doing an apache-graceful-all for unrelated things [23:10:40] Change merged: Mwalker; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60472 [23:10:41] dzahn is doing a graceful restart of all apaches [23:10:52] mwalker: Whooo! [23:11:03] Change merged: Mwalker; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60338 [23:11:27] !log dzahn gracefulled all apaches [23:11:34] Logged the message, Master [23:12:24] PROBLEM - Puppet freshness on virt1005 is CRITICAL: No successful Puppet run in the last 10 hours [23:12:25] RECOVERY - Apache HTTP on mw27 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 742 bytes in 0.118 second response time [23:12:25] had a one-off . mw27: httpd not running, trying to start [23:12:32] yep, there it is [23:13:34] RECOVERY - Disk space on ms-be1002 is OK: DISK OK [23:14:32] !log mwalker synchronized docroot/bits/favicon/ 'Pushing wikidata, wikimania, wikiquote, and wikiversity favicons from Isarra' [23:14:39] Logged the message, Master [23:14:50] ^ Isarra :) [23:14:53] greg-g: I'm done [23:15:08] mwalker: cool, thanks [23:16:27] Squee. [23:16:57] This will be hella up in... [23:16:59] Uh... [23:17:15] Dammit, I can't speak this thing. 
[23:22:54] New patchset: MaxSem; "fix scope" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60607 [23:27:56] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60607 [23:31:33] !log maxsem synchronized wmf-config/mobile.php [23:31:40] Logged the message, Master [23:32:43] !log disabling old techblog apache site config on singer [23:32:51] Logged the message, Master [23:33:25] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 202 seconds [23:35:25] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 22 seconds [23:37:53] !log cleaning up all unused Apache sites on singer, gzip, move to /root for backup [23:38:00] Logged the message, Master [23:39:31] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 184 seconds [23:40:31] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 22 seconds [23:43:31] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 202 seconds [23:44:32] RECOVERY - Disk space on ms-be1006 is OK: DISK OK [23:45:31] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 7 seconds [23:56:21] New patchset: awjrichards; "Revert "Revert "Revert "Enable CentralNotice on mobile everywhere CentralNotice is enabled"""" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60611 [23:56:48] greg-g: got one last config change to make ^ ok if i push it out real quick? [23:57:04] awjr: what is it? 
[23:57:12] greg-g: https://gerrit.wikimedia.org/r/#/c/60611/ [23:57:16] rvrvrvrvrvrv [23:57:19] heh [23:57:51] we're having issues (not caused by the config change) that we can't deal with immediately, but that change will disable the things that are causing issues so we can fix it up later [23:57:52] hah, once I'm done figuring out the recursion you'll be done deploying [23:57:57] :p [23:57:59] awjr: yeah, go for it [23:58:06] thanks greg-g [23:58:13] New patchset: Dzahn; "delete outdated apache configs for outreachcivi (RT-5002) and ocs.wikimania2009 they still resolve(d) to singer, but they were not enabled apache sites anyways" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/60612 [23:58:45] gah rebase fail