[00:00:36] !log rebooting / upgrading kernel on es1003 first [00:00:41] Logged the message, Master [00:03:39] Platonides: haha. that one took me a while [00:04:13] LeslieCarr: hey, i'm just as far east as he! [00:04:28] so why arne't you out drinking ? :) [00:04:37] oooh, good idea [00:05:10] hmmmmm, should i use Leslie's whiskey? [00:05:42] it's national bourbon day [00:13:40] New patchset: Bsitu; " Add configuration variable to MoodBar" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11581 [00:13:46] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11581 [00:14:46] New review: Bsitu; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11581 [00:14:48] Change merged: Bsitu; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11581 [00:46:06] PROBLEM - Puppet freshness on searchidx2 is CRITICAL: Puppet has not run in the last 10 hours [01:07:44] New patchset: SPQRobin; "Bug 37614 - Change namespace "Wikipidia pamandiran" to "Pamandiran Wikipidia" for bjnwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11583 [01:07:50] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11583 [01:33:30] New review: Jeremyb; "running for the train, will write a comment later" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/11583 [01:40:34] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 232 seconds [01:42:04] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 243 seconds [01:47:46] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 10 seconds [01:48:22] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 623s [01:51:22] RECOVERY - Misc_Db_Lag on storage3 
is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 36s [01:52:07] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 20 seconds [03:51:55] New review: Jeremyb; "* The new wgMetaNamespaceTalk value matches the NS_PROJECT_TALK value in all of master, 1.20wmf[45]...." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/11583 [03:58:38] can anyone tell me the prod gerrit version? [03:58:50] and how is it deployed? manual download? [03:59:07] there seem to be no war files in production puppet repo [04:17:25] New patchset: Jeremyb; "gerrit: fix some overzealous commentlink patterns" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11589 [04:17:53] New review: Jeremyb; "(no comment)" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/11589 [04:17:53] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11589 [04:21:48] New review: Jeremyb; "Maybe there's a way to guarantee there's no match inside a URL or to specify precedence order for co..." 
[operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/11589 [04:22:41] good night ;) [05:56:43] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [05:56:43] PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours [05:59:25] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100% [06:02:16] RECOVERY - Host search32 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms [06:59:43] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours [07:03:47] New review: Dzahn; "per mail to list" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/8339 [07:03:50] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/8339 [08:04:15] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100% [08:20:23] New review: Mark Bergsma; "Comments inline." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11574 [08:25:15] New review: Dzahn; "see inline comment" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/6489 [08:32:44] New patchset: Dzahn; "mwscriptwikiset - do not rely on mwscript being in path (f.e. cron jobs)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/6489 [08:33:09] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/6489 [08:39:08] New review: Dzahn; "just dont feel like merging out of fear of breaking Nagios and causing notification bomb because of ..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/3291 [08:53:52] New review: Dzahn; "a good time to merge this would be ..ehm... a general outage :p" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/3291 [08:54:49] ?? [08:54:56] for an nrpe init file? 
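[The replication-lag pages earlier in this log (db1025 at 232 s, storage3 at 623 s, with recoveries at 10 s and 20 s) come from checks that compare MySQL's Seconds_Behind_Master against warning/critical thresholds. The decision logic reduces to something like the sketch below — the warn/crit values here are illustrative, not the production check's configuration:]

```python
# Nagios-style verdict from MySQL's Seconds_Behind_Master.
# warn/crit thresholds are invented for illustration; the production
# checks quoted above use their own values.
def lag_status(seconds_behind, warn=60, crit=180):
    if seconds_behind is None:   # replication thread stopped:
        return "CRITICAL"        # SHOW SLAVE STATUS reports NULL
    if seconds_behind >= crit:
        return "CRITICAL"
    if seconds_behind >= warn:
        return "WARNING"
    return "OK"

print(lag_status(232))  # CRITICAL, as paged for db1025
print(lag_status(10))   # OK, as in the recovery message
```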
[08:55:03] i'd merge that in my sleep [08:55:09] because in the past when we had those Nagios bombs [08:55:21] puppet tried restarting it but failed [08:55:25] on all boxes [08:55:33] i don't even think those are set as critical [08:55:39] please do:) [08:55:44] why is the init script needed? [08:55:50] i broke it before, so ..hrmm [08:55:59] just cause it wasnt puppetized [08:56:07] and the file itself is already merged in repo [08:56:08] no but presumably it's in the package? [08:56:11] just not the puppet definition [08:56:34] we added changes in the past [08:56:39] trying to fix other issues [08:56:45] what other issues? [08:57:18] something about failed restarts, adding a "sleep" afair [08:57:22] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/3291 [08:57:32] perhaps time to convert that into a sane upstart job then [08:57:37] that was to prevent some weird bug when it tried to restart to early [08:57:54] sounds right @ upstart [09:56:00] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [10:44:58] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5492 [10:45:01] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5492 [10:47:32] PROBLEM - Puppet freshness on searchidx2 is CRITICAL: Puppet has not run in the last 10 hours [10:54:44] New review: Ryan Lane; "I'll be honest. I now have absolutely no idea what this code is doing." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/8120 [10:57:56] +INFILE="<%= ircecho_logs.map {|k,v| v = v.map {|c| c.sub(/^#?/,'#') }.join(","); "#{k}:#{v}" }.join(";") %>" [10:58:22] haha [11:01:31] New review: Ryan Lane; "Please try to avoid using crazy lambdas when possible. Yes, they are exquisite, but I don't like fee..." 
[operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/8344 [11:25:51] New patchset: Mark Bergsma; "Prepare NaiveBGPPeering for multi-protocol operation" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11601 [11:25:52] New patchset: Mark Bergsma; "Attempt to make NaiveBGPPeering handle other address families" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11602 [11:25:53] New patchset: Mark Bergsma; "Clean up handling of Attribute constructors with their ancestors" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11603 [11:25:53] New patchset: Mark Bergsma; "Reimplement move of advertisements/withdrawals to MP attributes" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/11604 [11:36:19] New patchset: Hashar; "'gs' package renamed 'ghostscript' in Precise" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11605 [11:36:40] paravoid: gs to ghostscript rename for production ^^^^^ [11:36:43] and hello :-] [11:36:47] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11605 [11:36:54] oh great [11:37:07] you're one step ahead :-) [11:37:17] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11605 [11:37:19] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11605 [11:37:30] the package exist in Lucid, so we might as well migrate production right now :-] [11:37:42] thx! 
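[The ircecho ERB one-liner quoted above — `INFILE="<%= ircecho_logs.map {...} %>"` — collapses a hash of log files to IRC channels into a single `file:#chan,#chan;file:#chan` string, normalizing each channel name to exactly one leading `#` via `sub(/^#?/,'#')`. Re-expressed in Python for readability; the data below is invented, only the transformation mirrors the template:]

```python
def build_infile(ircecho_logs):
    """Mimic the ERB lambda: {logfile: [channels]} -> "f:#a,#b;g:#c"."""
    parts = []
    for path, chans in ircecho_logs.items():
        # like sub(/^#?/, '#'): guarantee exactly one leading '#'
        chans = ["#" + c.lstrip("#") for c in chans]
        parts.append(path + ":" + ",".join(chans))
    return ";".join(parts)

example = {"/var/log/demo.log": ["wikimedia-dev", "#wikimedia-operations"]}
print(build_infile(example))
# /var/log/demo.log:#wikimedia-dev,#wikimedia-operations
```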
[11:42:09] PROBLEM - Host cp3002 is DOWN: PING CRITICAL - Packet loss = 100% [11:42:09] PROBLEM - Host cp3001 is DOWN: PING CRITICAL - Packet loss = 100% [11:42:21] erm [11:42:36] PROBLEM - Host knsq21 is DOWN: CRITICAL - Host Unreachable (91.198.174.31) [11:42:36] PROBLEM - Host knsq23 is DOWN: PING CRITICAL - Packet loss = 100% [11:42:36] PROBLEM - Host knsq27 is DOWN: PING CRITICAL - Packet loss = 100% [11:42:36] PROBLEM - Host knsq20 is DOWN: CRITICAL - Host Unreachable (91.198.174.30) [11:42:45] PROBLEM - Host knsq26 is DOWN: PING CRITICAL - Packet loss = 100% [11:42:45] PROBLEM - Host knsq24 is DOWN: PING CRITICAL - Packet loss = 100% [11:42:45] PROBLEM - Host knsq19 is DOWN: PING CRITICAL - Packet loss = 100% [11:42:54] PROBLEM - Host knsq17 is DOWN: CRITICAL - Host Unreachable (91.198.174.27) [11:42:54] PROBLEM - Host knsq18 is DOWN: PING CRITICAL - Packet loss = 100% [11:42:54] PROBLEM - Host knsq22 is DOWN: PING CRITICAL - Packet loss = 100% [11:43:03] PROBLEM - Host knsq16 is DOWN: PING CRITICAL - Packet loss = 100% [11:43:15] looking [11:44:06] PROBLEM - LVS HTTPS IPv6 on wiktionary-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:06] PROBLEM - LVS HTTPS IPv4 on wikiversity-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:06] PROBLEM - LVS HTTPS IPv4 on wikimedia-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:15] PROBLEM - LVS HTTPS IPv4 on wikibooks-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:15] PROBLEM - LVS HTTPS IPv6 on mediawiki-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:33] PROBLEM - LVS HTTPS IPv6 on wikimedia-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:33] PROBLEM - LVS HTTPS IPv4 on foundation-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:33] PROBLEM - LVS HTTPS IPv6 
on wikibooks-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:42] PROBLEM - LVS HTTPS IPv6 on wikiversity-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:42] PROBLEM - LVS HTTPS IPv4 on wikisource-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:42] PROBLEM - LVS HTTPS IPv6 on foundation-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:51] PROBLEM - LVS HTTPS IPv4 on wikipedia-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:51] PROBLEM - LVS HTTP IPv4 on bits.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:00] PROBLEM - LVS HTTPS IPv4 on bits.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:00] PROBLEM - LVS HTTPS IPv6 on wikisource-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:00] PROBLEM - LVS HTTPS IPv4 on wikiquote-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:09] PROBLEM - LVS HTTPS IPv6 on wikiquote-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:10] PROBLEM - LVS HTTPS IPv4 on wikinews-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:10] PROBLEM - LVS HTTPS IPv4 on wiktionary-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:10] PROBLEM - LVS HTTPS IPv6 on wikipedia-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:10] PROBLEM - LVS HTTPS IPv4 on mediawiki-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:54] PROBLEM - LVS HTTPS IPv4 on upload.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:46:33] !log csw1-esams.wikimedia.org line card 2 in trouble, power cycled it [11:46:38] Logged the message, Master [11:47:04] it 
sensed that replacement hardware has arrived at the data center I think [11:47:06] RECOVERY - LVS HTTPS IPv4 on wikibooks-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 44110 bytes in 0.666 seconds [11:47:06] RECOVERY - LVS HTTPS IPv6 on mediawiki-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60057 bytes in 0.665 seconds [11:47:06] RECOVERY - LVS HTTPS IPv6 on wiktionary-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60059 bytes in 9.657 seconds [11:47:06] RECOVERY - LVS HTTPS IPv4 on wikimedia-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 80875 bytes in 9.771 seconds [11:47:07] just hours ago [11:47:15] RECOVERY - LVS HTTPS IPv4 on upload.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 637 bytes in 0.442 seconds [11:47:24] RECOVERY - LVS HTTPS IPv6 on wikimedia-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 80875 bytes in 0.770 seconds [11:47:24] RECOVERY - LVS HTTPS IPv4 on foundation-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 39111 bytes in 1.073 seconds [11:47:24] RECOVERY - LVS HTTPS IPv6 on wikibooks-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60058 bytes in 0.663 seconds [11:47:24] RECOVERY - Host knsq19 is UP: PING OK - Packet loss = 0%, RTA = 108.91 ms [11:47:24] RECOVERY - Host knsq20 is UP: PING OK - Packet loss = 0%, RTA = 108.83 ms [11:47:24] RECOVERY - Host knsq21 is UP: PING OK - Packet loss = 0%, RTA = 108.95 ms [11:47:25] RECOVERY - Host cp3001 is UP: PING OK - Packet loss = 0%, RTA = 121.32 ms [11:47:25] RECOVERY - Host knsq27 is UP: PING OK - Packet loss = 0%, RTA = 108.73 ms [11:47:26] RECOVERY - Host knsq24 is UP: PING OK - Packet loss = 0%, RTA = 108.65 ms [11:47:26] RECOVERY - Host knsq23 is UP: PING OK - Packet loss = 0%, RTA = 108.75 ms [11:47:27] RECOVERY - Host knsq26 is UP: PING OK - Packet loss = 0%, RTA = 109.52 ms [11:47:33] RECOVERY - LVS HTTPS IPv6 on wikiversity-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60057 bytes in 0.665 seconds [11:47:33] 
RECOVERY - LVS HTTPS IPv6 on foundation-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60058 bytes in 0.667 seconds [11:47:33] RECOVERY - LVS HTTPS IPv4 on wikisource-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 43917 bytes in 0.664 seconds [11:47:42] RECOVERY - LVS HTTPS IPv4 on wikipedia-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60053 bytes in 0.772 seconds [11:47:42] RECOVERY - Host knsq17 is UP: PING OK - Packet loss = 0%, RTA = 108.82 ms [11:47:42] RECOVERY - Host cp3002 is UP: PING OK - Packet loss = 0%, RTA = 109.27 ms [11:47:51] RECOVERY - LVS HTTPS IPv6 on wikisource-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60060 bytes in 0.664 seconds [11:47:51] RECOVERY - LVS HTTPS IPv4 on wikiquote-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 52459 bytes in 0.664 seconds [11:47:51] RECOVERY - Host knsq18 is UP: PING OK - Packet loss = 0%, RTA = 108.81 ms [11:47:51] RECOVERY - Host knsq16 is UP: PING OK - Packet loss = 0%, RTA = 108.59 ms [11:47:51] RECOVERY - Host knsq22 is UP: PING OK - Packet loss = 0%, RTA = 108.71 ms [11:48:00] RECOVERY - LVS HTTPS IPv4 on mediawiki-lb.esams.wikimedia.org is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.445 second response time [11:48:00] RECOVERY - LVS HTTPS IPv6 on wikiquote-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60056 bytes in 0.669 seconds [11:48:00] RECOVERY - LVS HTTPS IPv4 on wikinews-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 69663 bytes in 0.774 seconds [11:48:00] RECOVERY - LVS HTTPS IPv6 on wikipedia-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60057 bytes in 0.882 seconds [11:48:00] RECOVERY - LVS HTTPS IPv4 on bits.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 3837 bytes in 9.919 seconds [11:48:00] RECOVERY - LVS HTTPS IPv4 on wiktionary-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 60453 bytes in 0.979 seconds [11:48:28] RECOVERY - LVS HTTPS IPv4 on wikiversity-lb.esams.wikimedia.org is OK: HTTP OK 
HTTP/1.1 200 OK - 48808 bytes in 0.904 seconds [11:51:49] paravoid: while you are around I have an other occurrence of Lucid -> Precise packages migration. That one is nastier though [11:51:51] https://gerrit.wikimedia.org/r/#/c/11358/ [11:51:54] PROBLEM - Varnish HTTP bits on cp3002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:52:09] some packages providing fonts have been deleted, so we will have to explicitly distribute the fonts we need :-/ [11:55:03] PROBLEM - Varnish HTTP bits on cp3001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:55:50] top - 11:55:48 up 77 days, 22:08, 1 user, load average: 12923.16, 9550.86, 4640.65 [11:57:18] ouch [11:57:26] never seen something that high [11:57:45] RECOVERY - Varnish HTTP bits on cp3002 is OK: HTTP OK HTTP/1.1 200 OK - 634 bytes in 2.567 seconds [11:58:30] PROBLEM - LVS HTTPS IPv4 on bits.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:59:42] RECOVERY - LVS HTTP IPv4 on bits.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 3873 bytes in 0.612 seconds [12:00:54] RECOVERY - Varnish HTTP bits on cp3001 is OK: HTTP OK HTTP/1.1 200 OK - 634 bytes in 0.219 seconds [12:01:21] RECOVERY - LVS HTTPS IPv4 on bits.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 3890 bytes in 0.439 seconds [12:02:24] PROBLEM - Varnish HTTP bits on cp3002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:05:15] RECOVERY - Varnish HTTP bits on cp3002 is OK: HTTP OK HTTP/1.1 200 OK - 634 bytes in 0.218 seconds [12:10:23] I didn't get paged [12:10:29] dammit [12:14:17] mark: did you get paged for the above? [12:14:33] yes [12:15:31] sigh [12:16:09] New patchset: Hashar; "redirects (301) /w/ to /w/index.php" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/11606 [12:19:40] paravoid: would you mind looking at another Lucid to Precise packages migration? 
:D https://gerrit.wikimedia.org/r/#/c/11358/ :-] [12:22:05] I didn't forget you :) [12:22:06] New patchset: Faidon; "Add faidon to SMS Nagios group (doh!)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11607 [12:22:07] sec. [12:22:12] that ^^^ is more important :-) [12:22:17] totally :-] [12:22:30] just making sure I am still somewhere in the queue !! [12:22:33] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11607 [12:22:38] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11607 [12:22:41] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11607 [12:23:39] now we just need another outage to be able to test this :> [12:23:52] paravoid: someone called me? :D [12:24:06] so which system you want to break [12:24:07] :) [12:25:15] New review: Siebrand; "Interesting discussion, by the way. Who's the owner of operations/mediawiki-config and who should/ca..." [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/11082 [12:25:38] hashar: so [12:25:57] what did you meant when you said that we'll have to ship the fonts ourselves? [12:27:15] paravoid: I meant, we need to explicitly list the fonts [12:27:22] instead of using the meta packages that regroups fonts per language [12:27:31] probably need to be reworded [12:27:43] aha [12:27:59] I wonder what they replaced language-support-fonts-* with [12:30:37] so, [12:30:38] could not find any package replacement [12:30:40] if we have to do this anyway [12:30:45] then why not list the fonts for lucid too? 
[12:31:06] I used "apt-cache depends" to get list of fonts and rdepends on precise to try to find out some meta package [12:31:22] right, I just did that too :) [12:31:47] I did not want to mess too much with Lucid :-] [12:31:52] so I just copy pasted the existing packages [12:32:01] well, if it's doing the same… [12:32:02] but now we have two classes :/ [12:32:06] sure [12:33:04] mark: btw, what did you to powercycle the linecard? [12:33:14] not that I have access to that but wondering :) [12:33:17] power-off [12:37:50] paravoid: fonts packages : https://gerrit.wikimedia.org/r/#/c/11358/ ||| https://gerrit.wikimedia.org/r/#/c/11358/3/manifests/imagescaler.pp,unified [12:38:26] grrr [12:38:36] I should move the other fonts in that imagescaler::packages::fonts [12:39:02] liberation and libertine you mean? [12:39:06] yeah [12:39:18] sure [12:39:25] love the nice commit messages too :-) [12:39:32] though that commit make it clear that we made a transition from language-support-fonts to explicit fonts definition [12:40:21] New review: Siebrand; "There's probably a reason why you approved but didn't merge. Does that need any additional informati..." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/10707 [12:42:01] New review: Nikerabbit; "I only merge stuff to config when I am going to deploy it immediately after, or I know someone else ..." 
[operations/mediawiki-config] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10707 [12:44:49] damn river [12:49:41] Patchset 4 moves fonts related packages from imagescaler::packages to imagescaler::packages::fonts [12:50:16] paravoid: ok got it rebased with all fonts in the same package https://gerrit.wikimedia.org/r/#/c/11358/ [13:05:56] :) [13:10:03] danke trying out on test [13:16:47] seems to work [13:22:34] New patchset: Hashar; "job runner now supports being run on a specific job type" [operations/debs/wikimedia-job-runner] (master) - https://gerrit.wikimedia.org/r/11610 [13:25:06] New review: Hashar; "This kind of follow up https://gerrit.wikimedia.org/r/#/c/11041/ which made jobs-loop.sh to recogniz..." [operations/debs/wikimedia-job-runner] (master) C: 0; - https://gerrit.wikimedia.org/r/11610 [13:39:25] !log adding labs-ns0 and labs-ns1 dns entries [13:39:31] Logged the message, Master [13:41:36] why does wikipedia not resolve for me? [13:44:58] i had that happen to me once [13:45:09] but it was because I had edited my hosts file and pointed wikipedia at localhost [13:45:18] the I went and told this room that wikipedia didn't work [13:45:21] then* [13:47:04] New review: Hashar; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11145 [13:47:06] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11145 [13:48:06] New review: Hashar; "Merging test related stuff, not going to kill site :-]" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11146 [13:48:09] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11146 [13:48:17] so many stuff to do [13:57:28] New review: Demon; "Test comment." 
[operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/11314 [14:08:53] hey mark, i'm improving the sysctl define and using it [14:08:54] q [14:09:03] the number prefixes in sysctl.d files [14:09:14] should I just use 60- for all of our custom prefixes? [14:09:20] should I make that configurable? [14:09:22] does it matter? [14:09:32] do those just specify priority load order? [14:10:42] ottomata: as long as they don't overlap it doesn't really matter [14:11:03] as in there can't be 2 files with the same number (there currently are) [14:11:22] or as long as they don't overlap with numbers and values? [14:11:32] the latter [14:11:36] ah ok cool [14:11:48] cool, I will just leave them at 60 then [14:15:46] @infobot-ignore+ log [14:15:47] petan: Unknown identifier (log) [14:15:47] Item log is already in list [14:16:21] why do we have those bots in here? [14:20:56] New patchset: Demon; "Fix spammy IRC output once and for all." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11615 [14:21:24] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11615 [14:21:46] New patchset: Demon; "Fix spammy IRC output once and for all." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11615 [14:22:15] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11615 [14:23:41] Change abandoned: Demon; "Dropping in favor of I9c96d565." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11314 [14:26:05] <^demon> Ryan_Lane: Mind taking a look at https://gerrit.wikimedia.org/r/#/c/11615/? The logic's much better and it actually does what I intended originally. 
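[On the sysctl.d prefix question above: the numeric prefix only fixes lexical load order, and for a key set in two fragments the file applied later wins, which is why the constraint is "no two files with the same number and the same key" rather than any particular prefix value. A toy model of that ordering rule, with invented filenames and values:]

```python
# sysctl.d fragments are read in sorted filename order; when two
# fragments set the same key, the later (higher-prefixed) one wins.
def effective_sysctl(fragments):
    settings = {}
    for name in sorted(fragments):   # lexical order, like the loader
        settings.update(fragments[name])
    return settings

fragments = {
    "60-wikimedia.conf": {"net.core.rmem_max": "4194304"},
    "70-override.conf":  {"net.core.rmem_max": "8388608"},
}
print(effective_sysctl(fragments)["net.core.rmem_max"])
# 8388608 -- the 70- file overrides the 60- file
```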
[14:27:20] I swear if this breaks the hooks, I'll stab you :) [14:27:39] gimme a min, that's a lot of files [14:27:52] ^demon: https://gerrit.wikimedia.org/r/#/c/11615/2/files/gerrit/hooks/hookhelper.py,unified [14:27:56] <^demon> It's a bunch of 2 line changes. And I tested it. [14:27:57] you are doing a strict equality there [14:28:00] \o/ [14:28:03] if user in hookconfig.spammyusers [14:28:30] <^demon> I'm using the regex'd version that you see on IRC. [14:28:33] <^demon> "Demon" [14:28:40] <^demon> "L10n-bot" [14:28:40] <^demon> etc [14:28:53] * mark does a bunch of 2-line changes on ^demon [14:29:04] ^demon: oh yeah:) [14:29:47] ^demon: you could just do that regex at the log_to_file() level though [14:30:05] to factor out the code like: user = re.sub(' \(.*', "", something ) [14:30:06] <^demon> Meh, each of the messages are a bit different. [14:30:23] <^demon> I'm tired of playing with it. I just want it to fucking work. [14:30:41] <^demon> If this works I'm going to wash my hands of it and not touch it again. [14:31:09] New patchset: Hashar; "(bug 34866) Change wgLanguageCode of several wikis to be renamed" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/10707 [14:31:16] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/10707 [14:31:40] New review: Hashar; "Patchset 2 is a rebase" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10707 [14:31:43] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/10707 [14:32:48] ^demon: well git blame know about you! 
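[The factoring suggested above — running `re.sub(' \(.*', "", ...)` once in `log_to_file()` instead of per message — amounts to normalizing the Gerrit user string before the spammy-user membership test. A sketch of that idea; the user list and inputs are examples, not the real hookhelper.py code:]

```python
import re

SPAMMY_USERS = {"L10n-bot", "Demon"}   # illustrative list

def normalize_user(raw):
    """Strip the '(ident@host)' suffix: 'Demon (chad@...)' -> 'Demon'."""
    return re.sub(r' \(.*', '', raw)

def is_spammy(raw):
    return normalize_user(raw) in SPAMMY_USERS

print(is_spammy("L10n-bot (l10n-bot@example.org)"))  # True
print(is_spammy("Hashar (hashar@example.org)"))      # False
```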
[14:32:53] you will not escape :-] [14:33:04] hume: rsync: write failed on "/apache/common-local/wmf-config/InitialiseSettings.php": No space left on device (28) [14:33:14] !log hume is out of disk space [14:33:19] Logged the message, Master [14:34:20] !log hume: 5.0G 5.0G 68K 100% /usr/local/apache [14:34:25] Logged the message, Master [14:35:09] hashar: slooooow [14:35:14] It was out of disk space on monday [14:35:25] we need to have that partition resized [14:35:44] <^demon> We've got wmf[2-5] all on it [14:35:52] need notpeter to look at it probably [14:38:35] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11615 [14:38:38] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11615 [14:39:02] I killed the old l10n cache dirs we dont need around [14:39:11] saved quite a bit [14:39:32] Reedy: so we are killing data now instead of removing? :D [14:39:59] yeah [14:40:01] DIE DIE DIE [14:40:15] New review: Ryan Lane; "test" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11615 [14:40:20] thank Reedy :-) [14:40:28] ^demon: ^^ done [14:40:46] <^demon> Thanks. Hopefully that did it. [14:40:59] <^demon> At the very least, it'll be easier to just change the one big of config if I got the name wrong [14:41:00] hopefully, yeah :) [14:41:05] /dev/mapper/tank-archive 1.5T 271G 1.3T 19% /archive [14:41:12] I think there's some free space there ;) [14:41:14] <^demon> *bit [14:42:33] I get like 2TB myself at home [14:42:35] mostly empty [14:42:45] that is just for my laptops backups :-D [14:45:11] Reedy: willing to review a small patch for me please ? https://gerrit.wikimedia.org/r/#/c/7298/ [14:45:23] to override wfHostname() return [14:45:41] need that on labs to get ride of the non human friendly instances names such as I-0000000D0E [14:46:01] thinking of needing reviews... 
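[hume's failure above (`No space left on device (28)` during the rsync of wmf-config) is the sort of thing a pre-flight free-space check catches before the sync starts. A minimal guard, with the path and threshold purely illustrative:]

```python
import shutil

def has_headroom(path, min_free_bytes):
    """True if the filesystem holding `path` has at least min_free_bytes free."""
    return shutil.disk_usage(path).free >= min_free_bytes

# e.g. a sync wrapper could bail out early instead of half-writing files:
#   if not has_headroom("/usr/local/apache", 500 * 1024 * 1024): abort
print(has_headroom("/tmp", 0))  # True: free space is always >= 0
```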
[14:46:16] I still have extension changes that need reviews ;) [14:46:28] https://gerrit.wikimedia.org/r/#/c/11169/ [14:46:35] https://gerrit.wikimedia.org/r/#/c/11175/ [14:46:45] I can do them probably [14:46:53] Ryan_Lane: I got https://gerrit.wikimedia.org/r/#/c/7083/ for ya [14:46:58] Ryan_Lane: which is a bad patch btw :-( [14:48:18] hashar: bad patch? [14:50:28] the idea was to replace instance id with instance id + instance name [14:50:38] but only did that at one place :/ [14:52:13] ah [14:52:35] Yeah, it would be good for all of the messages to show what was happening on what instance [14:52:44] Many of the messages are really generic [14:55:07] <^demon> Ryan_Lane: 11175 reviewed. [14:55:36] I don't see a review [14:56:05] <^demon> I meant reviewed + merged. Not inline review. [14:56:06] oh you meant 11169 :) [14:56:11] <^demon> Oh, yeah [14:56:12] <^demon> whoops [14:56:14] heh [14:56:18] thanks [14:56:52] I really would have though we would have a global message for "you need to be logged in" [14:56:54] I was so, so wrong [14:57:02] that's duplicated in a billion places [14:57:08] <^demon> Other one too :) [14:57:17] sweet. thanks [14:57:22] now I can update labsconsole :) [14:57:24] <^demon> There probably is something in output page. [14:57:33] I need to cherry-pick in the core change first, though [14:57:41] <^demon> outputpage->loggedInError(), or throw new LoggedInError or somesuch [14:57:46] ha [14:57:48] err [14:57:49] ah [14:57:55] that is what i am looking for :( [14:57:55] https://gerrit.wikimedia.org/r/#/c/11169/ [14:57:58] can't find any [14:58:06] <^demon> I honestly don't know what's going on in mediawiki these days :p [14:58:19] <^demon> Too much gerrit and python. [14:58:24] well, this is from ages ago [14:59:00] we could just add a simple NotLoggedIn exception [14:59:20] In MediaWiki? Isn't there one already? AssertEdit extension? [14:59:24] Or something. 
[15:00:14] bah chad merged it [15:02:16] yeah [15:02:18] assetedit [15:04:26] I have opened https://bugzilla.wikimedia.org/show_bug.cgi?id=37627 [15:08:34] Ryan_Lane: if you are in a review mood, I have made some changes to the apache configurations https://gerrit.wikimedia.org/r/#/q/status:open+project:operations/apache-config,n,z [15:08:43] mostly housework [15:08:53] though one is about making /w/ to 301 to /w/index.php [15:13:23] * Ryan_Lane twitches [15:13:30] I'm not messing with apache config on a friday [15:14:30] apache configs are sorta like dns changes [15:14:33] seriously [15:14:43] one space somewhere, one bad redirect, entire squid cache is polluted [15:14:50] yes [15:14:58] I'd prefer to not have two outages this week [15:16:19] I fullly agree [15:16:39] I was barely making you aware of the existences of such changes :-D [15:16:46] heh [15:17:05] I am not even going to deploy them though I might be able to do so :D [15:50:16] !log updating dns for new cisco machines [15:50:21] Logged the message, RobH [15:50:41] Ryan_Lane: ^ so today we are setting the mgmt up, the systems are wired and racked, and leslie has the info for setting up the network when she comes online later [15:50:52] yay! [15:52:38] YAY! [15:53:44] Ryan_Lane: I have not allocated DNS or any of the mac info for installs, that I will leave to you once they are accessible. [15:53:59] New patchset: Ottomata; "Refactoring udp2log classes and defines." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11574 [15:54:29] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11574 [15:54:54] :( [15:57:43] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [15:57:43] PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours [15:58:21] New patchset: Ottomata; "Refactoring udp2log classes and defines." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/11574 [15:58:49] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11574 [16:00:12] New review: Ottomata; "(no comment)" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/11574 [16:00:42] mmmk, mark, i just made the changes you suggested (i like) for my udp2log puppet refactor [16:00:49] whenever you get a sec, would love another look over [16:15:18] drdee: you there? [16:15:25] sure am [16:15:38] oh, you should get in on this too, ottomata [16:15:49] so, I'm going to start throwing search logs at... something [16:15:57] sweet [16:16:04] should I throw them at the udp2log instances? (can probably only do one right now) [16:16:17] that sounds reasonable, ottomata what do you think? [16:16:28] I'm not sure if the filters on udp2log will accept them, is the thing [16:17:00] I guess what I'm asking is "where shall I point the hose?" [16:17:20] maybe stat1001? [16:17:24] ottomata, what do you think? [16:18:25] also, this already exists to catch the packets: https://svn.wikimedia.org/viewvc/mediawiki/branches/lucene-search-2.1/udplogger/udplogger.py?view=markup&pathrev=51097 [16:18:55] (whether or not it works or can handle the traffic, I can't say) [16:19:07] hmmm, interesting [16:19:15] probably can [16:19:20] worth trying at least [16:19:24] so you are having lucene send traffic to 8192? [16:20:16] each lucene instance can send packets of the form " ah ok [16:20:27] over udp? [16:20:31] you might as well just send to udp2log instance then [16:20:32] yeah [16:20:38] instead of having this middle man [16:20:38] so [16:20:44] udp2log itself will not drop any log lines [16:20:46] ottomata: it's a different format, I believe [16:20:49] oh, ok [16:20:53] so you just need a filter [16:20:55] it is only the filters that drop it [16:20:56] which yeah, is kinda weird [16:20:57] yes [16:20:57] and then it'll go...
somewhere [16:20:59] cool [16:21:00] so, hmmm [16:21:08] it might be better to start up a different udp2log instance for this [16:21:13] +1 [16:21:16] that way you don't mess with other people's logs [16:21:24] sure [16:21:24] i've got a change in that should make that really easy [16:21:24] waiting for review now [16:21:28] that's easy to do :) [16:21:32] https://gerrit.wikimedia.org/r/#/c/11574/ [16:22:22] also [16:22:28] how about! [16:22:28] oh oh oh [16:22:28] so [16:22:28] notpeter [16:22:38] was talking with Robla yesterday, and with dieds the day before [16:22:44] we all want to try out scribe for this stuff [16:22:53] do you need to do sampling or filtering of these logs? [16:22:59] or are you basically just trying to use a remote log server? [16:24:59] ottomata: that change looks legit to me, but I would like for mark to rubber stamp it. [16:25:18] ottomata: I am just doing what drdee asked :) [16:25:27] ah [16:25:31] you should ask him what he desires to do with them [16:25:39] ok [16:25:43] I make the firehose, not the results :) [16:25:44] ask me otto, ask me! [16:25:50] sosososo! [16:25:51] drdee [16:25:57] whatcha think about using scribe for this? [16:26:00] i was talking with robla yesterday [16:26:06] and he wants to do what you do, find a use case and start testing scribe out [16:26:09] notpeter: thanks for setting up the firehose! [16:26:21] we were talking about doing it for the nginx ssl logs [16:26:23] but this would be a safer thing to try it with [16:26:23] since it is new [16:26:38] we can probably also do both [16:26:43] i am perfectly fine with that, we would kill two birds with a single stone [16:26:44] das true [16:27:23] drdee, do you need to do sampling or filtering on these logs? [16:27:27] like, if you set up a logging instance, I'll point traffic at it, then we're getting data. can also set up scribe to do this, see how it performs [16:27:28] or do you just want to log them remotely?
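The filter model described above (udp2log itself never drops lines; each attached filter reads every log line on stdin and prints the ones it wants kept) can be sketched as a small pipe program. This is a hedged illustration only, not the linked udplogger.py or any production filter; the sampling rate and "lucene" match string here are invented for the example:

```python
#!/usr/bin/env python
# Sketch of a udp2log-style filter (illustrative, not production code):
# udp2log pipes every received log line to the filter's stdin, and
# whatever the filter writes to stdout is what actually gets logged.
import sys

def filter_lines(lines, sample_every=1, match=None):
    """Yield every Nth line that contains `match` (or every Nth line
    when no match string is given), mimicking a sampled udp2log pipe."""
    kept = 0
    for line in lines:
        if match is not None and match not in line:
            continue
        if kept % sample_every == 0:
            yield line
        kept += 1

if __name__ == "__main__" and not sys.stdin.isatty():
    # e.g. keep 1 in 1000 of the lines mentioning "lucene"
    for line in filter_lines(sys.stdin, sample_every=1000, match="lucene"):
        sys.stdout.write(line)
```

A filter like this would be wired in via the instance's config (something like `pipe 1000 /usr/local/bin/lucene-filter`; verify the exact syntax against the real udp2log config format).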
[16:27:36] i rather not sample [16:27:45] I think keeping it all will be reasonable [16:27:50] it's not super huge volumes [16:27:56] right [16:28:19] cool, perfect [16:28:24] then scribe is the perfect use case for that [16:28:25] so cool [16:28:30] New patchset: Jdlrobson; "mirror DeviceDetection.php (bug 33649)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11619 [16:28:39] ok, so i will bump up getting the scribe packages in our apt repo then [16:28:49] for now, notpeter, i guess we can spawn up a different udp2log instance [16:28:52] on oxygen I guess? [16:28:57] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11619 [16:29:00] whenever my change gets merged in [16:29:03] you should be able to do [16:29:15] ottomata: cool, sounds good! [16:29:42] udp2log::instance { 'lucene': log_directory => '/a/log/lucene' } (or wherever is appropriate) [16:29:43] notpeter, if you have some more spare cycles today, could you take a look at https://rt.wikimedia.org/Ticket/Display.html?id=3113 [16:29:51] oh and monitor => false [16:30:10] oh, weird, yeah. I can do that [16:30:27] drdee: unless there's some crazy backstory, but yeah [16:30:41] no crazy backstory :D [16:32:29] New patchset: Pyoungmeister; "send search logs to oxygen:51234" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11620 [16:32:56] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11620 [16:37:08] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11620 [16:37:10] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11620 [16:43:23] New patchset: Pyoungmeister; "some cleanup of search code" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11621 [16:43:50] New review: gerrit2; "Lint check passed."
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11621 [16:45:28] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11621 [16:45:30] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11621 [16:46:06] peter, there's no udp2log instance on 51234 oxygen yet, right? [16:46:16] ottomata: correct [16:47:33] is that gonna be a prob? [16:47:42] the packets will just be dropped [16:48:24] plus, I need to restart lucene on each host for the change to take [16:49:09] ok cool [16:49:13] juuuust checkin [16:50:41] yep. legit [16:52:49] drdee: you can't log into stat1001 because it's not on [16:52:54] and probably hasn't been installed yet [16:53:02] so, there's a lot more going on there :) [16:53:06] you mean it is physically turned off? [16:53:29] when I log into its management console, I see nothing [16:54:15] New patchset: Ottomata; "/var/run has been moved to /run in Ubuntu Precise. Updating generic::mysql::server accordingly." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11296 [16:54:15] well, no, I see a blinky cursor [16:54:19] so it might be turned on [16:54:22] but it's doing nothing [16:54:43] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11296 [16:55:53] New review: Ottomata; "OK great. Done in realm.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11296 [17:00:26] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours [17:39:02] RobH: good morning [17:57:55] notpeter: so....what should happen to stat1001? [17:58:32] i was actually just about to start building it, unless notpeter was going to get it [17:58:32] I have no idea. [17:58:34] what is it for? [17:58:53] LeslieCarr: you seem to have some idea of what's going on, so go for it :) [17:59:31] Ryan_Lane: ping? 
[17:59:35] don't want to ruin your fun ;) [17:59:35] \ [17:59:48] ? [17:59:55] paravoid: ? [18:00:00] Ryan_Lane: did you manage to change wmflabs.org NS? [18:00:06] to what? [18:00:16] labs-ns0/1? [18:00:19] no [18:00:25] we're waiting till we have a secondary up [18:00:47] didn't you and mark decide to get rid of the NS CNAME? [18:00:54] yes. we did [18:00:56] it's not there anymore [18:01:04] it's an A record now [18:01:07] has been for most of the day [18:01:11] ah, okay [18:01:17] many resolvers are giving back proper results [18:01:17] how can I fix the SOA too? [18:01:30] awesome, except for our office resolvers [18:01:31] of course [18:01:47] andrew_wmf: ping? :) [18:01:56] paravoid: the SOA is broken? :( [18:02:01] hi [18:02:02] yes, I told you yesterday about it [18:02:14] oh you mean the email address? [18:02:16] hostmaster\@wikimedia.org. 20120612192602. 1800 3600 86400 7200 3600 [18:02:21] yes [18:02:26] \@ -> . [18:02:27] I'm not sure why there's a \ there [18:02:42] you need to make it into a dot [18:02:45] hostmaster.wikimedia.org [18:02:51] ah [18:02:52] ok [18:02:54] er [18:02:55] lemme fix that [18:02:55] and [18:02:57] wait [18:03:02] wtf [18:03:04] ? [18:03:16] you're lacking a field there too [18:03:32] whaaaaa [18:03:37] it should be [18:03:52] virt0.wikimedia.org hostmaster.wikimedia.org $serial ... [18:04:01] see the dot after the "serial"? [18:04:05] our serial currently is 1800 ... [18:04:55] New patchset: Lcarr; "putting in stat1001 as a precise machine" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11642 [18:05:19] here's what's in LDAP: sOARecord: hostmaster@wikimedia.org 20120612192602 1800 3600 86400 7200 [18:05:22] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11642 [18:05:30] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11642 [18:05:31] that's wrong [18:05:33] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11642 [18:06:37] how so? SOA is defined as (sn ref ret ex min) [18:06:40] that's what we have... [18:07:09] or does the ldap schema define this differently? [18:07:14] it's [18:07:27] origin contact sn ref ret ex min [18:07:33] you're missing origin [18:07:47] primary hostmaster serial refresh retry expire default_ttl [18:07:51] so, now it's origin = hostmaster@wikimedia.org, contact = 20120612192602. (see the dot) [18:07:55] serial = 1800 [18:07:58] etc. [18:08:05] yes. [18:08:13] where's the "primary"? [18:08:16] I see [18:08:29] well, that's a bitch [18:08:39] I need to fix that in the code [18:08:45] 21:03 < paravoid> it should be [18:08:45] 21:03 < paravoid> virt0.wikimedia.org hostmaster.wikimedia.org $serial ... [18:08:52] or labs-ns0 or whatever you like [18:08:53] doesn't matter [18:09:02] * Ryan_Lane nods [18:09:04] also, why is the serial that big? [18:09:19] that's the normal way to define one? [18:09:35] using a date string is always a good idea [18:09:47] it's usually YYYYMMDDNN [18:09:51] the date strings I've used in the past end in the day + two digits. [18:09:51] where NN is a number from 00-99 [18:10:12] we have six digits [18:11:16] hah [18:11:17] motherfucker [18:11:29] or did you put the time there? 
[18:11:47] it does have the time [18:11:52] that won't actually break anything, though [18:12:02] the soarecord is actually right for some things [18:12:14] yes it will [18:12:17] serial is 32-bit [18:12:24] the number you have there is much bigger than 2^32 [18:12:26] ah [18:13:55] LeslieCarr: Heyas [18:13:57] the last part of SOA is nowadays negative TTL [18:14:01] but used to be minimum TTL [18:14:05] So we have not joined row C yet [18:14:13] cool [18:14:15] been waiting on you, the fiber is run, and the stacking cable is run [18:14:16] I wonder if not defining that has something to do with the excessive TTLs we were seeing [18:14:26] just have not pulled the d1 to d3 stacking cable yet [18:14:28] could be [18:14:31] well. no [18:14:37] are you fixing the SOA? [18:14:38] the ttl for the wmflabs domains are low [18:14:45] yes, I need to do so in a few places, though [18:14:52] great [18:15:00] then I should wait from asking Andrew to purge the office's cache [18:15:07] yes [18:15:18] RobH: cool [18:15:23] LeslieCarr: So when would you have some time for us to work on that? [18:16:21] i'll get started now [18:16:27] to fix up the switches [18:16:38] ok, the serials are working, and the row c are in a stack as well [18:16:40] Ymd + NN? [18:16:45] paravoid: ^^ [18:16:47] correct format? [18:16:52] so just lemme know what ya need when ya need it [18:16:54] YYYYMMDDNN [18:17:06] yep [18:17:09] that's the same [18:17:09] great. [18:17:20] NN is likely difficult [18:17:24] because I don't read the old record [18:17:39] another form that's being used when 100 updates are not enough [18:17:44] is epoch [18:17:47] just the epoch [18:17:57] it's less intuitive when troubleshooting [18:18:03] hm. epoch may be easier [18:18:07] but much easier to code for and has a bigger accuracy [18:18:47] seconds since the epoch? 
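The two problems being untangled here — an SOA record stored without its mname/origin field, which shifts every remaining value one place to the left, and a YYYYMMDDHHMMSS timestamp that overflows the 32-bit serial — can be demonstrated with a toy parser. This is a sketch for illustration only, not pdns's actual LDAP record handling:

```python
# Toy SOA parser illustrating the two bugs discussed above. Field names
# follow RFC 1035: mname rname serial refresh retry expire minimum.
SOA_FIELDS = ["mname", "rname", "serial", "refresh", "retry", "expire", "minimum"]

def parse_soa(rdata):
    """Split whitespace-separated SOA RDATA into named fields."""
    parts = rdata.split()
    if len(parts) != len(SOA_FIELDS):
        raise ValueError("expected %d SOA fields, got %d" % (len(SOA_FIELDS), len(parts)))
    return dict(zip(SOA_FIELDS, parts))

# The record in LDAP was missing mname entirely, so every value shifted
# by one (the contact was parsed as the serial, etc.) -- and it has only
# six fields where a full SOA needs seven:
broken = "hostmaster@wikimedia.org 20120612192602 1800 3600 86400 7200"

# The corrected record, with both FQDNs terminated by dots:
fixed = "virt0.wikimedia.org. hostmaster.wikimedia.org. 1339784844 1800 3600 86400 7200"

# Serial formats: a YYYYMMDDHHMMSS timestamp overflows the 32-bit serial
# field, while YYYYMMDDNN and a Unix epoch both fit with room to spare.
assert 20120612192602 >= 2**32   # broken: 14-digit timestamp
assert 2012061500 < 2**32        # fine: YYYYMMDDNN
assert 1339784844 < 2**32        # fine: seconds since the epoch
```

The epoch form trades human readability for never running out of per-day updates, which is the tradeoff paravoid describes above.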
[18:19:23] paravoid: hrm, do you have an idea why labsconsole.wikimedia.org would be showing properly in one office resolver and not in the other, when i had andrew restart pdns-recursor at the same time on both ? [18:19:42] they cached them at different times? [18:19:43] oh [18:19:52] they pulled from different resolvers [18:20:12] LeslieCarr: you mean you restarted both of them now? or some time in the past? [18:20:33] both of them about an hour ago [18:20:39] maybe 30 minutes ago [18:20:56] Ryan_Lane: So the mgmt is now up for the virt6-virt15, that being said, its production network isnt connected yet [18:21:03] thats what leslie is working on with us now [18:21:12] RobH: thanks [18:21:30] but if you wanted to connect to the mgmt's to pull mac NICS you can [18:22:09] paravoid: hm. can't use epoch now, can we? [18:22:15] why not? [18:22:18] serial is 1800 now. [18:22:19] the values currently used are higher? [18:22:20] ah [18:22:21] right [18:22:23] heh [18:22:25] :) [18:22:31] what Leslie said worries me [18:22:34] well, that's actually helpful for once [18:22:38] an hour ago things were supposed to be fixed [18:22:54] no. because we fixed the cname thing today [18:23:22] yes, more than an hour ago [18:23:47] could it be that some resolvers always cache NS for one day? [18:24:03] NS are fine? [18:24:41] wait. lemme check something [18:25:22] LeslieCarr: could you do a round of digs for labsconsole.wikimedia.org A & virt0.wikimedia.org for both of them? [18:25:27] New review: preilly; "(no comment)" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/11619 [18:25:39] sorry for the trouble, it's hard to guess... :/ [18:26:08] New patchset: Jdlrobson; "mirror DeviceDetection.php (bug 33649)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11619 [18:26:41] New review: gerrit2; "Lint check passed."
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11619 [18:26:44] ns0, ns1, and ns2 all return the proper result for virt0 and labsconsole [18:26:52] just wanted to do a sanity check :) [18:27:13] ok. fixed the code [18:27:19] now to update the SOAs in LDAP [18:28:12] great, now both are broken [18:28:46] now that makes no sense at all [18:28:49] New review: preilly; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/11619 [18:28:59] notpeter: can you deploy this https://gerrit.wikimedia.org/r/#/c/11619/ [18:28:59] paravoid: http://pastebin.com/R7gH0SP6 [18:29:26] soarecord: virt0.wikimedia.org hostmaster.wikimedia.org 1339784844 1800 3600 86400 7200 [18:29:35] paravoid: ^^ look correct? [18:29:41] no [18:29:57] you're missing final dots [18:30:05] virt0.wikimedia.org. hostmaster.wikimedia.org. [18:30:16] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11619 [18:30:17] notice the dots at the end of the fqdns [18:30:18] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11619 [18:30:22] !log rt 3113 complains stat1001 dns is bad and is why ssh doesnt work. dns is fine, so invalid reasoning, but system isnt ssh or ping responsive. [18:30:26] yes… I know [18:30:33] !log pretty sure it was never installed in the first place, rebooting to watch post and see [18:30:42] .......is morebots not about.... [18:30:45] preilly: do you need me to force a puppet run on the mobile varnish boxes? [18:30:46] so… the strange thing is the powerdns config guide for ldap leaves them out too [18:30:48] damn it. [18:30:54] labsconsole.wikimedia.org. 85967 IN A 208.80.153.135 [18:30:56] WTF?!?!?! [18:31:01] how? [18:31:08] seriously. wtf? [18:31:18] ok.
I'm switching the ns records to use labs-ns0 and labs-ns1 [18:31:21] notpeter: yes please [18:31:23] something truly fucked up is going on [18:31:28] it can't be it [18:31:32] that's the old IP too [18:31:41] and I want to just get rid of the other entries [18:32:08] well, no. there's still a problem, then [18:32:19] because people won't be able to get to labsconsole [18:32:22] LeslieCarr, andrew_wmf: can I get "grep -v ^# /etc/powerdns/recursor.conf |grep -v ^$" from the two boxes? [18:32:41] Ryan_Lane: maybe until we sort this out we should readd the old IP to virt0? [18:32:48] LeslieCarr: ? [18:32:53] I mean, things that we don't yet understand are happening and cause outages [18:32:54] that seems like a good idea [18:32:59] it's in a different vlan, though [18:33:07] just a sec [18:33:11] paravoid: don't have direct access [18:33:14] LeslieCarr: can you add the system into both vlans? [18:33:20] we should sort this out without having the pressure of an outage [18:33:23] yes [18:33:26] agreed [18:33:31] it's gone on long enough [18:33:33] !log morebots, dont leave me again! [18:33:37] andrew_wmf: thanks a lot and sorry for involving you into something that's our problem [18:33:38] Logged the message, RobH [18:33:41] ok, can you set up virt0 to accept tagging ? [18:33:55] paravoid: mind handling that? [18:34:04] !log stat1001 isnt responsive to ssh, not sure it was ever installed, rebooting [18:34:11] RobH: it wasn't installed [18:34:23] then why the hell are folks complaining [18:34:25] RobH: about to install [18:34:27] i don't know [18:34:28] ottomata: ;P [18:34:32] LeslieCarr: we have eth1 [18:34:33] woosters: ^ [18:34:37] which isn't being used on virt0 [18:34:45] no reason to mess with tags at this point [18:34:47] RobH: i just set up the dns and stuff [18:34:51] yea, robH [18:35:02] sigh [18:35:13] LeslieCarr: so, can you just map the old VLAN to virt0's eth1?
[18:35:22] ok, folks are tasking me on dns stuff when you are already working on it [18:35:45] sure i can do that [18:37:09] preilly: done [18:37:14] notpeter: thanks! [18:37:19] no prob [18:37:31] paravoid: soarecord: virt0.wikimedia.org. hostmaster.wikimedia.org. 20120218233655 1800 3600 86400 7200 [18:37:33] correct now? [18:37:38] LeslieCarr: remind me of netmask/gateway [18:37:44] excluding that it isn't using the epoch [18:37:51] heh, I was about to say that :) [18:38:11] fix the serial and it's fine [18:38:16] ok [18:38:25] it's a /26 and 208.80.153.129 [18:39:53] so, RobH/cmjohnson1 can you cable port 28 to eth1 on virt0 ? [18:40:01] and do you want me to make a ticket for that request ? [18:40:28] wmflabs.org. 3600 IN SOA virt0.wikimedia.org. hostmaster.wikimedia.org. 1339784844 1800 3600 86400 7200 [18:40:50] so the system needs two network connections? [18:41:02] RobH: I still can't ssh into stat1001 [18:41:07] it should already have a network connection [18:41:09] drdee: its not installed [18:41:18] but its not dns, dns is fine. [18:41:22] it needs the OS. [18:41:45] Leslie was working on it I think she said, but I dunno how seems shes working on a dozen things right now [18:42:15] I think we should handle the outage, rather than stat1001 [18:42:20] drdee: it's not installed yet [18:42:30] drdee: so of course we can't ssh into it [18:42:39] Ryan_Lane: btw, andrew_wmf just pasted me pdns's config in private and it's all fine [18:42:45] RobH: yes needs 2 net connections [18:42:48] yeah. I imagined it was [18:42:51] sigh [18:42:52] LeslieCarr: understood [18:42:53] well, racktables is wrong [18:43:04] or the server is [18:43:06] virt0 should have two connections. does it not? [18:43:10] Ryan_Lane: was virt1 mobile1?
[18:43:15] ah [18:43:16] right [18:43:17] yes [18:43:23] maybe it never got the second one added [18:43:26] ok, its labeled wrong in the datacenter, but racktables is fine [18:43:32] ok [18:43:47] i can add it, it will be a few minutes, i have to go get label and such [18:44:27] ok, so virt1's 2nd ethernet port is now in the old vlan as soon as it gets configured [18:46:30] LeslieCarr: done [18:47:19] paravoid/RobH I don't see an ipv4 addy on eth1 on virt0 ? [18:47:27] look again [18:47:28] i mean /Ryan_Lane [18:47:31] yay [18:47:41] so, [18:47:43] it doesn't work [18:47:45] and I know why [18:47:48] and I'm not sure if you can help it :) [18:48:01] so, the box is currently multihomed [18:48:09] it can only have one gateway though [18:48:24] that means that currently, the replies with the old IP are going to the gateway on eth0 [18:48:34] the router is probably dropping them, because of uRPF [18:48:37] or something similar to that [18:48:43] can we disable that temporarily? [18:49:23] yeah [18:50:43] !log doing a git pull for OpenStackManager on virt0 [18:51:04] hm. no bot [18:51:24] deactivated on both now .... [18:51:25] deactivated on both now …. [18:52:29] hrm [18:53:20] morebots: welcome back [18:53:22] !log doing a git pull for OpenStackManager on virt0 [18:53:27] Logged the message, Master [18:53:33] !log deactivated rpf-filter on cr1-sdtpa and cr2-pmtpa temporarily for virt0 [18:53:38] Logged the message, Mistress of the network gear.
[18:54:15] notpeter: created a new ticket for stat1001: https://rt.wikimedia.org/Ticket/Display.html?id=3120 [18:54:20] paravoid: look how amazingly stupid this is: https://gerrit.wikimedia.org/r/#/c/11662/1/OpenStackNovaDomain.php,unified [18:54:22] !log updating dns for educacao redirect [18:54:26] Logged the message, RobH [18:54:36] when I generate the SOA it's correct [18:54:41] when I update the SOA it's not [18:54:48] fail [18:55:20] well, I should say, when I make the initial SOA it's correct, and updates are not [18:55:27] which is why some were correct and others weren't [18:57:18] LeslieCarr: sure? [18:57:23] who feels like approving https://gerrit.wikimedia.org/r/#/c/11574/ ? [18:57:30] i'm sure it was deactivated [18:57:30] LeslieCarr: still does not work [18:58:20] I see packets leaving eth0 to the gateway [18:58:58] same routers btw? [18:59:58] yeah, it's going to cr1-sdtpa and cr2-pmtpa [19:01:18] any ideas? [19:01:27] I can always do policy routing on the host but it's nasty… [19:02:03] hrm [19:02:06] yeah [19:02:19] so i am definitely seeing it try to send out replies on eth0 [19:02:23] but failing [19:03:08] let me put a firewall filter to count packets coming out from the interface [19:08:37] paravoid: are you also a JunOS addict? 
:-D [19:09:01] hashar: I do know my way around ios junos and some other networking stuff, yes [19:09:17] LeslieCarr: I'll just do policy routing, no reason to spend more time with it [19:09:22] after the frenchcabal-l we need a networknerds-l mailing list :D [19:09:42] hrm [19:09:44] that's strange [19:09:53] the firewall filter i put on isn't logging any packets [19:09:55] :-/ [19:10:10] it should be logging anything with a source or dest going via the default gateway of eth1 [19:10:16] i mean going via eth1's address [19:10:23] but on eth0's subnet [19:11:39] LeslieCarr: done [19:11:56] well yay [19:12:00] ip route add default via 208.80.153.129 table 200 [19:12:00] ip route add 208.80.153.128/26 dev eth1 table 200 [19:12:00] ip rule add from 208.80.153.128/26 lookup 200 [19:12:03] jfwiw [19:12:08] cool [19:12:27] now I "just" have to make dns, apache and whatnot listen on that IP too :/ [19:12:32] yeah [19:12:36] that should be fun [19:12:43] Ryan_Lane: I'm stopping puppet [19:12:48] * Ryan_Lane nods [19:13:08] ok, we need to make dns listen on eth1 as well [19:13:22] yes, see above :) [19:13:25] not just dns either... 
[19:13:46] dns done [19:13:55] dns and apache should be sufficient [19:14:30] apache listens on inaddr_any and has no special vhost statements [19:14:31] should work [19:14:49] drdee: it looks like there won't be any stat1001 until robh gets back to eqiad --- the powercycling is failing [19:14:52] every other service is internal [19:15:16] !log adding pre-renumbering virt0's IP back on eth1; doing policy routing to work out multihoming [19:15:21] Logged the message, Master [19:15:37] !log virt0: modify pdns.conf to listen on the old IP; temporarily disable puppet [19:15:42] Logged the message, Master [19:15:53] dns is working on old ip [19:16:04] so, we should be good till we can figure out what the hell is wrong [19:16:13] hopefully [19:16:37] lemme send an update to labs-l [19:16:53] wmflabs may still have negative caches for wmflabs though [19:17:04] those are 1 hour [19:17:05] andrew_wmf: can you do me one last favor and restart both pdns recursors? [19:17:33] reportcard.wmflabs.org. 3600 IN A 208.80.153.208 [19:17:38] seems to work for me now [19:17:48] woosters: great, thanks [19:17:51] i can get to labsconsole using office dns [19:18:07] good. hopefully with the changes we've made, this will correct itself [19:18:15] we won't really know for a while [19:19:03] hm [19:19:14] I wonder if one of our auth servers is serving stale records on one of its threads [19:19:18] and that's causing everything [19:19:40] we could restart each server one by one [19:20:48] that said, it should be easy enough to spot, or things wouldn't be this fucked up [19:21:41] I just made a ton of requests for labsconsole.wikimedia.org on each, and they all return correctly [19:22:03] so, we're going to go do dim sum... [19:22:13] yea, just had erik test them as well [19:22:16] wow [19:22:19] wtf [19:22:19] Ryan_Lane: me too [19:22:22] what? [19:22:22] ;; ADDITIONAL SECTION: [19:22:22] virt0.wikimedia.org. 64402 IN A 208.80.153.135 [19:22:22] labsconsole.wikimedia.org.
86399 IN A 208.80.153.135 [19:22:27] what? [19:22:28] where?! [19:22:32] on which resolver ? [19:22:49] that's from a resolver my virtual machine is hitting [19:22:55] I don't understand that at all [19:22:58] on which query? [19:23:07] dig wmflabs.org ns [19:23:13] fucking glue records [19:23:18] I said that yesterday, didn't I? [19:23:25] ARGH [19:23:36] I said it yesterday then discarded it [19:24:16] AAAAAARGH. [19:24:29] fcking afilias [19:24:34] so, [19:24:43] !log updating dns for ms-be12 mgmt [19:24:48] Logged the message, RobH [19:24:51] did you get access to the domain registrar? [19:24:55] yes [19:25:02] okay, go in their panel and fix this [19:25:05] want to change to labs-ns0 and labs-ns1? [19:25:08] no [19:25:14] I just want to fix glue records [19:25:29] which shouldn't be there [19:25:51] that explains everything [19:26:40] well, markmonitor is slow as hell for me right now [19:29:07] so, what am I looking for again? [19:30:03] paravoid: ? [19:30:05] if it doesn't have a section called "glue records" or "A records" or something like that [19:30:09] no [19:30:11] then just change the nameservers [19:30:16] and re-add labsconsole and virt0 [19:30:23] or labs-ns0/1, whichever you prefer [19:30:37] I'll keep the same for now [19:30:55] I'll switch the order [19:31:09] okay [19:31:14] I'll reply to the incident report [19:31:18] on ops [19:31:27] do you want me to do labs-l too or are you handling this? [19:32:03] hm. that didn't work [19:32:10] it kept the order [19:32:16] shitty web tool [19:32:30] you can update labs-l too [19:32:56] k [19:33:11] ok. removed one, then added it back in [19:33:14] that should update it :) [19:35:18] seems it actually did update it, but it always displays them in alphabetical order [19:35:27] which is slightly full of fail [19:35:41] LeslieCarr: btw, no reason to keep urpf off [19:38:05] what DNS needs is a "hey, I've fucked up" broadcast feature of some sort [19:40:25] can I mail both lists with one mail? 
[19:40:37] or should we not leaking the ops list into labs? [19:40:51] *be leaking [19:40:54] bcc? [19:41:28] gah, I'm going to break threading [19:41:41] heh [19:43:01] is the dns issue better now? [19:43:04] I'll just send two mails [19:43:11] andrew_wmf: yes, I just Cc'ed you on the updated mail [19:43:26] Ryan_Lane: btw, I still see the old entries in my dig [19:43:30] "org" NS may take a while to update [19:43:35] * Ryan_Lane nods [19:43:41] usually a day [19:44:41] no, more often for sure [19:44:55] DAMN IT. I said "glue records" yesterday, didn't I? [19:44:57] argh. [19:45:12] I heard nothing of the such [19:45:14] ; [19:45:15] ;) [19:45:56] well [19:46:07] a plus is that we fixed all the other problems while we were at it [19:46:36] none of which were likely causing the issue, but it's nice that it's correct now [19:46:52] okay [19:46:56] and monday we'll make a secondary ;) [19:46:56] I have to leave now [19:46:59] same [19:47:28] friends are waitin for me for an hour and a half :( [19:47:37] have fun [19:47:40] oh [19:47:40] wow [19:47:41] sorry [19:47:56] heh, no reason to be sorry for [19:47:59] I'm glad we fixed this [19:48:08] same [19:49:44] as I also said, the CNAME as a NS as the root of the problem was not convincing me :) [19:49:47] heh [19:49:53] dammit, I should have been more persistent [19:50:04] :-) [19:54:29] hey maplebed, you around? [19:54:41] I have a puppet/monitoring Q and I think you might know some stuff [19:55:32] or someone else maybe knows [19:55:32] so [19:55:46] diederik has a cron job that runs once a day on stat1 [19:55:49] generating gerrit stats [19:55:59] i want to monitor that it is working [19:56:06] i just need to check a file once a day [19:56:17] and ensure that there is an entry at the bottom of the file for the previous day [19:56:23] each entry has a date [19:56:25] so that is pretty easy [19:56:33] but, I am unfamiliar with nagios/ganglia [19:56:38] not sure which I want to use (both?)
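The check ottomata describes here — confirm the once-a-day gerrit-stats file gained an entry for the previous day — fits the standard Nagios plugin contract: print one status line, exit 0 for OK and 2 for CRITICAL. A sketch under assumptions: the default path and the leading ISO-date line format are guesses, not the real gerrit-stats output.

```python
#!/usr/bin/env python
# Sketch of the custom Nagios/NRPE check discussed here: verify that the
# last line of a daily stats file is dated yesterday. The file path and
# the leading YYYY-MM-DD date format are assumptions about the real file.
import datetime
import sys

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes

def check_freshness(path, today=None):
    """Return an (exit_code, message) pair for the given stats file."""
    today = today or datetime.date.today()
    yesterday = today - datetime.timedelta(days=1)
    try:
        with open(path) as f:
            lines = [line.strip() for line in f if line.strip()]
        last = lines[-1]
    except (IOError, IndexError):
        return CRITICAL, "CRITICAL: %s is missing or empty" % path
    if last.startswith(yesterday.isoformat()):
        return OK, "OK: found entry for %s" % yesterday
    return CRITICAL, "CRITICAL: no entry for %s (last line: %r)" % (yesterday, last[:60])

if __name__ == "__main__" and len(sys.argv) > 1:
    code, message = check_freshness(sys.argv[1])
    print(message)
    sys.exit(code)
```

Run under NRPE with the file path as its only fixed argument; since nothing variable is passed over NRPE, this also sidesteps the dont_blame_nrpe argument-passing concern raised in the channel.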
[19:56:43] ottomata: make a custom nagios check if a check doesn't already exist [19:56:46] or how to use them [19:56:56] nagios check, ok checking about nagios checks [19:57:10] see the checks that are defined [19:57:13] it's in a nagios template [19:57:23] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [19:59:23] RobH: do you mind opening an account for me too, in case something happens during the weekend? [19:59:35] RobH: markmonitor account that is [19:59:40] !log moving asw and msw-d3-sdtpa from single to dual power [19:59:45] Logged the message, RobH [19:59:50] paravoid: i rather we just page either myself, mark, or ryan in that case [20:00:27] only cuz its a single one off account mgmt via an account manager [20:00:36] its not an access list we control directly [20:00:58] uhh okay [20:01:34] we arent changing anything on markmonitor for outage resolution is my understanding [20:01:43] yes we do [20:01:44] we are going to only change it once more once we add the new nameservers on monday [20:02:01] well, Ryan did change it already, but I'm not sure if it worked [20:02:04] we are pointing it to a different nameserver? [20:02:06] there's a lag between cause and effect [20:02:13] right, but changing it again wont help... [20:02:15] no, but they are caching glue records (= IPs) as well [20:02:49] this change hasn't yet got effect, since org updates take a while [20:03:02] so, I'm wondering if it's actually fixed [20:03:09] it may not be [20:03:19] but we won't know until tomorrow [20:03:33] right [20:03:49] or earlier, but still, not now [20:03:52] glue records definitely seem correct, though. based on the behavior...
[20:03:56] they make sense [20:04:12] paravoid: if ya think ya need it, then you can have it, but ct has to approve it, since it affects a lot of things [20:04:19] ct had previously approved ryan's access [20:04:23] (on godaddy) [20:04:36] RobH: I don't need it as long as you or Ryan are able to drop by tomorrow and have a look [20:04:41] to make sure everything's okay [20:04:42] I'll be around [20:04:44] i assumed Ryan_Lane was on this [20:04:48] great [20:04:50] im going on a boat [20:04:55] ^_^ [20:05:00] (will have air and mifi) [20:05:00] RobH: with flippy floppies? [20:05:04] I didn't know the rules, and I knew I'd be here tomorrow, so that's why I asked [20:05:20] Ryan_Lane, I don't see anything helpful yet [20:05:36] re checkcommands [20:05:48] then you'll need to add a new one [20:05:51] i'm a nagios noob, ok [20:05:56] should I be using nrpe? [20:05:59] leaving now [20:06:03] good night [20:06:05] or evening [20:06:16] as long as you don't pass nrpe args, it's fine [20:06:18] paravoid: night [20:06:22] Ryan_Lane: i got my swim trunks [20:06:42] * RobH listens to song now [20:06:49] I think you'll have to use nrpe [20:07:14] so is the outage resolved? [20:07:21] can i resume harassing LeslieCarr about row C? [20:07:57] !log moving asw and msw-d3-sdtpa from single to dual power again, got sidetracked [20:08:02] Logged the message, RobH [20:08:47] RobH: yeah [20:08:49] Ryan_Lane: I just saw the serial of .org getting bumped [20:08:52] but no changes [20:08:56] :( [20:09:01] $ dig +norec +short @b2.org.afilias-nst.org org SOA [20:09:01] a0.org.afilias-nst.info. noc.afilias-nst.info.
2010100976 1800 900 604800 86400 [20:09:09] it was 972 before [20:09:24] anyway, really leaving :) [20:09:30] see ya [20:10:53] PROBLEM - Host ps1-d3-sdtpa is DOWN: PING CRITICAL - Packet loss = 100% [20:11:38] RECOVERY - Host ps1-d3-sdtpa is UP: PING OK - Packet loss = 0%, RTA = 2.64 ms [20:41:00] !log updating dns for mc1-mc16 mgmt [20:41:06] Logged the message, Master [20:42:00] RobHalsell: hey [20:42:35] ? [20:42:49] LeslieCarr: sup? [20:43:14] party time [20:43:26] so, i want to actually add the switches one at a time [20:43:30] according to the documentation [20:45:59] iirc asw-c3 is connecting via a stacking cable to asw-d3 ? [20:47:43] heh i got that backwards [20:47:47] c1-d1 [20:48:36] RobHalsell: I also want to catch you before you pumpkin today, if any time remains after switching. [20:48:45] would you ping me if there is? [20:48:54] PROBLEM - Puppet freshness on searchidx2 is CRITICAL: Puppet has not run in the last 10 hours [20:48:58] so how about we disconnect c1 completely, then remove the cable between d1 and d3 [20:49:39] RobHalsell: FYI I am going by this procedure - http://www.juniper.net/techpubs/en_US/junos/topics/task/installation/virtual-chassis-ex4200-member-adding-cli.html [20:50:47] cool [20:51:29] so first let's remove the stacking cable between d1 and d3, next let's unplug c1's power AND stacking cables, then we'll cable c1 to d1, then we turn on c1 [20:51:50] i want to make sure that c1 is not connected to either power or any other switches [20:53:34] yay [20:54:10] cool, we're single homed right now :) [20:54:22] let's connect c1 to d1 via the stacking cable [20:55:07] let's turn on c1 switch [20:55:26] * AaronSchulz braces for an outage [20:56:11] hehe [20:56:16] * Reedy boots AaronSchulz [20:56:19] well i'm monitoring chassisd and messages [20:56:28] !log attaching asw-c1-pmtpa to asw-d-pmtpa ring [20:56:33] Logged the message, Mistress of the network gear. 
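The SOA check paravoid runs above (`dig +norec +short @b2.org.afilias-nst.org org SOA`, serial 2010100976 where it was ...972 before) can be turned into a small propagation test: query each authoritative server and see whether they all agree on the serial yet. A sketch, with the comparison separated out so it needs no network; the server names are just the ones quoted in the log:

```shell
# Sketch: has a zone update propagated to all authoritative servers?
# The answer is "yes" once every server reports the same SOA serial.

all_serials_match() {
    # Reads one serial per line on stdin; succeeds only if all are equal.
    [ "$(sort -u | wc -l)" -eq 1 ]
}

soa_serial() {
    # Field 3 of "dig +short ... SOA" output is the serial.
    dig +norec +short @"$1" "$2" SOA | awk '{print $3}'
}

# Usage (requires network):
#   for ns in a0.org.afilias-nst.info b2.org.afilias-nst.org; do
#       soa_serial "$ns" org
#   done | all_serials_match && echo propagated || echo "still lagging"
```

Note this only shows the delegation data is consistent; as discussed above, resolvers may go on serving cached glue records (the nameserver IPs) until their TTLs expire, so a matching serial does not mean every client sees the change yet.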
[20:56:42] logged just in case ;) [20:57:32] * AaronSchulz chains Reedy to the desk and forces him to do cr [20:58:25] so the switch has come up [20:58:41] however not joining the VC [20:59:05] hrm, i wonder/think it may be due to software mismatches [20:59:14] cmjohnson1: have a thumb drive ? [20:59:57] we have a couple here [21:00:30] maplebed: whats up? its packing up time here. [21:00:37] its friday man! [21:00:50] RobHalsell: yeah, I figured when i didn't ping you this morning I was up a creek. [21:01:05] early next week then. swift storage nodes and SSDs. [21:01:21] yep [21:04:42] so right now the software is mismatched, i think that's preventing the virtual chassis from forming [21:04:46] i'm uploading new software [21:04:55] seeing if it's connected enough to install it on the chassis [21:07:47] RobHalsell/cmjohnson1 in my home directory on bast1001 is a file called jinstall-ex-4200-10.4R3.4-domestic-signed.tgz -- can you download it to the usb stick and plug that into the ex4200 ? [21:09:56] thanks [21:17:01] cmjohnson1: woot, slowly slowly trying to install :) [21:25:30] cmjohnson1: woot asw-c1 is rebooting [21:28:14] let's see what happens with this :) [21:28:17] still rebooting... [21:28:21] they take quite a while [21:28:31] oh doh, it's 10.4r6 not 10.4r3 [21:28:37] let's see if it works anyways (same major version) [21:30:00] hah, rebooting again to complete reinstallation for real this time [21:30:10] can you pull the usb stick ? [21:30:48] seriously, i am going to introduce a top of rack switch that reboots in 60 seconds and make billions [21:31:33] i'd have shrines in every dc ? [21:32:00] oh [21:32:08] ooo it's seeing the interfaces [21:32:08] :) [21:32:12] can you put the stick in c2 ? [21:33:36] is asw-c2 powered on ? 
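The problem being debugged here is the one the Juniper procedure linked above warns about: an EX4200 will not join a virtual chassis if its Junos release doesn't match the existing members, so each new member has to be upgraded before its stacking cables are attached. On the switch CLI that diagnosis and fix look roughly like the following (a sketch from Juniper's documented EX-series commands; the image path assumes the file was copied off the USB stick to /var/tmp):

```
show version                 # confirm the Junos release on this member
show virtual-chassis         # list members and their VC status
request system software add /var/tmp/jinstall-ex-4200-10.4R3.4-domestic-signed.tgz reboot
```

The trailing `reboot` is why each switch disappears for several minutes after the install kicks off, matching the long "still rebooting..." waits in the log.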
[21:34:25] ok, so i don't want the stacking cables connected to asw-c2, but want the switch powered on [21:34:29] so i can upgrade, then we attach it [21:34:33] then same thing with asw-c3 [21:34:42] then we can do the c3-d3 connection [21:34:48] hrm, serial plugged in and everything ? [21:35:38] yeah, why don't you plug in the usb stick to c3 so i can start it installing while we troubleshoot ? [21:37:01] oh grrrr [21:37:05] can you copy that file on again ? [21:37:13] for some dumb reason junos decided to delete the copy [21:37:17] (don't ask me why) [21:39:13] the jinstall file ? [21:39:16] i only see the jloader file [21:40:37] yeah, the switch removed the jinstall file after copying it over… because junos is stupid :( [21:40:46] do you have an account on fenari ? [21:40:55] let me put it in your home dir [21:42:21] ok, it's in your home directory now [21:42:31] on fenari [21:44:00] cool, can you put it on the usb disk ? [22:04:04] thanks [22:05:20] cmjohnson1: i don't see it yet ? [22:05:23] maybe reseat it ? [22:08:56] hrm [22:09:10] one last reseat ? [22:09:21] i sort of saw it get inserted but is erroring... [22:12:06] rebooting that switch [22:18:11] woot, installing on c3 finally [22:18:18] the reboot fixed it [22:18:49] hehe [22:18:55] have you tried turning it off and on ? [22:19:20] hehe [22:19:41] grrr, so it removed the file from the usb stick again, can you recopy it to the stick and then plug it in to c2 ? [22:20:11] hehe [22:23:39] ok, yay, so close :) [22:25:30] ok, pull the usb stick [22:26:03] :) when it's done with the install upon reboot, i'll shut it down, ask you to pull power, and then connect it via a stacking cable to c1 [22:29:47] still installing.... 
[22:31:58] hey asher, are you aware of this technique "Hardened stateless session cookies", see http://t.co/PnABtktT [22:32:08] i meant binasher ^^ [22:32:24] for a second i read that as hardened stainless steel cookies [22:32:31] which seem like a not very delicious idea [22:33:01] lol [22:34:41] still installing... [22:34:45] it's sort of sneaky now [22:34:59] it seems like it installs quicker… because half of the installation is after rebooting [22:36:16] cmjohnson1: yay [22:36:32] can you unplug asw-c2, and then plug its stacking cable into asw-c1 please ? [22:37:39] yep [22:37:52] oh yes, i do [22:37:53] thank you [22:38:04] just want to add c2 [22:38:21] power on c2 now [22:41:13] see the powering :) [22:42:12] awesome, it's trying to join the ring now [22:43:50] taking a while... [22:44:52] hrm [22:44:57] something else slow may be starting .... [22:45:32] i think we need to reinstall c1 [22:47:22] on the good side, i can do that via the cli with no usb stick :) [22:56:55] cmjohnson1: have you touched the c1 to d1 connection ? [22:57:18] oh nm [22:57:19] it's up now [22:57:21] just slooooow [22:58:36] ok, now rebooting c2 [22:58:42] which hopefully this time will attach via c1 [22:58:50] and then we can attach c3 and then reconnect the ring [23:02:01] anyone around that has access to markmonitor? :) [23:02:19] did RobH leave already? [23:02:23] dammit [23:02:38] Ryan_Lane: ping? [23:02:47] Ryan_Lane: the NS haven't changed and the serial has been bumped a few times [23:03:31] New review: Reedy; "Blame Pediapress..." [operations/mediawiki-config] (master); V: 0 C: -1; - https://gerrit.wikimedia.org/r/11161 [23:05:52] cmjohnson1: yay [23:06:01] c2 is happy [23:06:11] attach c2 to c3 ? [23:06:26] and unplug c3 [23:07:32] yay [23:07:36] power on c3 please [23:08:26] ok, now let's wait for another hour [23:08:33] i'm sure this is how you wanted to spend your friday night ;) [23:12:51] yay, c3 is almost up... 
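For context on the "Hardened stateless session cookies" link above: the underlying idea is a session cookie that carries its own authenticated state, so the server needs no session table. The hardened scheme in the paper additionally folds in an iterated hash of a per-user secret so that a leaked server database still doesn't let an attacker forge cookies; that extra step is omitted in the plain HMAC half sketched below (key and payload are made-up examples, not anything from Wikimedia's setup):

```shell
# Illustrative sketch of plain HMAC-signed stateless session cookies:
#   cookie = payload "." HMAC-SHA256(server_key, payload)
# The server verifies the MAC instead of looking the session up in a DB.
KEY="server-side-secret"   # example only; a real key comes from secure config

sign_cookie() {
    payload="$1"
    mac=$(printf '%s' "$payload" | openssl dgst -sha256 -hmac "$KEY" | awk '{print $NF}')
    printf '%s.%s' "$payload" "$mac"
}

verify_cookie() {
    cookie="$1"
    payload="${cookie%.*}"   # everything before the last dot
    mac="${cookie##*.}"      # everything after the last dot
    expected=$(printf '%s' "$payload" | openssl dgst -sha256 -hmac "$KEY" | awk '{print $NF}')
    # NB: production code must use a constant-time comparison here.
    [ "$mac" = "$expected" ]
}
```

A shell sketch is obviously not how you'd ship this; it just shows the data flow that the paper then hardens against server-side key/database compromise.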
[23:12:53] getting there [23:15:50] can you reseat the cable between c2 and c3 ? [23:16:47] when you're done i'll reboot [23:17:19] rebooting now [23:21:43] yay it's adding its interfaces now [23:22:03] sweet, now we're almost ready to plug in c3 to d3 [23:22:54] ok, plug in c3 to d3 please [23:26:06] hrm [23:26:34] so i see it coming up yet don't see it actually put into the ring [23:26:45] Hey ops folks, could someone check the nginx error logs on the SSL termination proxies? Someone over in -tech says they got a 500 from nginx [23:27:02] Or well, that someone is here actually :) it's Thehelpfulone [23:27:16] replace {{/user|3834619|MacMed }} with {{/user|3834681|MacMed }} [23:27:43] RoanKattouw: ^ try that, I don't know if that could actually cause an issue, just that edit [23:27:49] changing the number and adding a space [23:28:45] Thehelpfulone: which cluster are you hitting? [23:28:57] i.e. do an en.wikipedia.org [23:29:05] this is on meta-wiki, the error message is 500 Internal Server Error nginx/0.7.65 [23:29:24] okay, do a "host meta.wikimedia.org" please [23:29:32] paravoid: this is windowz [23:29:35] cmjohnson1: ahha [23:29:44] it's not showing as up on 1/2 on c3 [23:29:44] "nslookup meta.wikimedia.org" then [23:29:53] in cmd? ok [23:30:12] yes [23:30:17] cmjohnson1: is it in the first or 2nd slot ? [23:30:25] cmjohnson1: if it's in the first, can you move it to the second slot ? [23:31:11] paravoid: http://pastebin.com/YAL9ppdn [23:32:54] yay!!!! [23:32:55] it's all up [23:33:03] we have a working, redundant switch ring!!! [23:33:07] * LeslieCarr pops the champagne! [23:34:52] you're free to enjoy your weekend :) [23:35:26] thank you [23:35:40] :) [23:36:06] paravoid: did you find what you were looking for from that? 
[23:36:52] I did, thanks [23:37:14] sure [23:52:23] so yeah the edit works on http:// but not https:// [23:56:57] "2012/06/15 23:25:12 [crit] 14310#0: *228018368 pwrite() "/var/lib/nginx/body/0000085479" failed (28: No space left on device), client: 0.0.0.0, server: *.wikimedia.org, request: "POST /w/index.php?title=Identification_noticeboard&action=submit HTTP/1.1", host: "meta.wikimedia.org", referrer: "https://meta.wikimedia.org/w/index.php?title=Identification_noticeboard&action=edit" [23:57:38] anyway, SSLs are running out of space [23:58:32] ssl3001 is 100% full, ssl3002 & ssl3003 are 97% [23:59:28] how *crazy* is our partitioning? [23:59:43] 9.2GB out of 250GB disks? wtf?
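The failure mode at the end of the log is worth spelling out: nginx spools large POST bodies to disk (under /var/lib/nginx/body here) before proxying them, so when that filesystem fills up, `pwrite()` fails with ENOSPC and the client gets a 500 even though the backends are healthy. A sketch of a check that would have caught ssl3001 before it hit 100% (the path and 90% threshold are illustrative; the comparison is factored out so it can be tested without `df`):

```shell
# Sketch: warn when the filesystem holding nginx's client-body buffer
# directory is nearly full, before POSTs start failing with ENOSPC.

pct_used_exceeds() {
    # $1 = usage as printed by df, e.g. "97%"; $2 = integer threshold.
    [ "${1%\%}" -gt "$2" ]
}

check_body_dir() {
    dir="${1:-/var/lib/nginx}"
    used=$(df -P "$dir" | awk 'NR==2 {print $5}')
    if pct_used_exceeds "$used" 90; then
        echo "CRITICAL - $dir filesystem at $used"; return 2
    fi
    echo "OK - $dir filesystem at $used"; return 0
}
```

Dropped into the Nagios/NRPE plugin shape discussed earlier in the log, this turns "ssl3001 is 100% full" from a post-mortem discovery into an alert.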