[00:01:54] RECOVERY - swift-object-auditor on ms-be6 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [00:02:21] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.036 seconds [00:02:39] RECOVERY - swift-object-auditor on ms-be8 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [00:02:48] RECOVERY - swift-object-auditor on ms-be7 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [00:05:30] New patchset: Asher; "fix pc host regex" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14863 [00:06:06] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14863 [00:07:49] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14863 [00:10:38] New patchset: Tim Starling; "Updates for git and l10nupdate ownership" [operations/mediawiki-multiversion] (master) - https://gerrit.wikimedia.org/r/14864 [00:34:45] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:43:36] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.021 seconds [00:56:44] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [01:17:08] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:22:44] PROBLEM - Host mw1070 is DOWN: PING CRITICAL - Packet loss = 100% [01:23:11] New patchset: Tim Starling; "Updates for git and l10nupdate ownership" [operations/mediawiki-multiversion] (master) - https://gerrit.wikimedia.org/r/14864 [01:24:18] New review: Tim Starling; "PS2: fixed many bugs. Tested on server." 
[operations/mediawiki-multiversion] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/14864 [01:24:20] Change merged: Tim Starling; [operations/mediawiki-multiversion] (master) - https://gerrit.wikimedia.org/r/14864 [01:25:59] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.211 seconds [01:39:47] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [01:41:08] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 247 seconds [01:42:47] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 279 seconds [01:49:14] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 665s [01:50:17] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 8 seconds [01:53:08] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 1 seconds [01:53:35] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 29s [02:00:20] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:09:11] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.970 seconds [02:18:41] PROBLEM - NTP on manganese is CRITICAL: NTP CRITICAL: Offset unknown [02:23:20] RECOVERY - NTP on manganese is OK: NTP OK: Offset -0.0004053115845 secs [02:27:41] PROBLEM - Host virt1003 is DOWN: PING CRITICAL - Packet loss = 100% [02:28:36] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [02:28:36] PROBLEM - Host virt1001 is DOWN: PING CRITICAL - Packet loss = 100% [02:30:32] RECOVERY - Host virt1001 is UP: PING OK - Packet loss = 0%, RTA = 31.45 ms [02:30:41] RECOVERY - Host virt1003 is UP: PING OK - Packet loss = 0%, RTA = 31.02 ms [02:31:26] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 31.52 ms [02:41:41] New patchset: Asher; "moving pc1 from 
coredb (for comparitive benchmarks) to pcachedb conf" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14866 [02:42:14] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14866 [02:48:59] PROBLEM - Host mw1073 is DOWN: PING CRITICAL - Packet loss = 100% [03:11:28] PROBLEM - Host srv278 is DOWN: PING CRITICAL - Packet loss = 100% [03:12:13] RECOVERY - Host srv278 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms [03:16:16] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused [03:29:01] PROBLEM - Puppet freshness on nfs2 is CRITICAL: Puppet has not run in the last 10 hours [03:33:04] PROBLEM - Puppet freshness on nfs1 is CRITICAL: Puppet has not run in the last 10 hours [03:38:28] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.083 second response time [04:29:18] PROBLEM - Puppet freshness on db1029 is CRITICAL: Puppet has not run in the last 10 hours [05:34:46] anyone on? could use an RT lookup [05:43:06] hey woosters, have a min? [05:43:34] wassup? [05:44:19] woosters: there's an RT from today from chmee2 for wikipedie.cz. can you give me the #? [05:45:15] it is https://rt.wikimedia.org/Ticket/Display.html?id=3244 [05:45:23] danke! [05:45:37] and https://rt.wikimedia.org/Ticket/Display.html?id=3245 [05:45:40] he created 2 [05:45:46] dupes? [05:45:49] should prolly merge them [05:45:55] k [06:02:07] New patchset: Tim Starling; "Avoid copying unnecessary LU files out to apaches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14868 [06:02:40] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14868 [06:03:07] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14868 [06:04:05] New patchset: Jeremyb; "wikipedie.cz/experti_na_prirodu: short URL redir" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/14869 [06:04:59] New review: Jeremyb; "This is somewhat time sensitive but idk how much." [operations/apache-config] (master) C: 0; - https://gerrit.wikimedia.org/r/14869 [06:45:08] PROBLEM - Puppet freshness on mw1102 is CRITICAL: Puppet has not run in the last 10 hours [06:59:22] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [07:25:56] New patchset: Tim Starling; "Set $wgLocalisationUpdateDirectory per I8bcd2817" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14871 [07:25:56] New patchset: Tim Starling; "In robots.php cleaned up comments and whitespace" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14872 [07:59:37] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours [08:31:42] New patchset: Matthias Mullie; "lower AFTv4 odds to display AFTv5 at 3% (inverse odds)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14873 [08:34:07] PROBLEM - swift-object-auditor on ms-be7 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [08:34:25] PROBLEM - swift-object-auditor on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [08:34:25] PROBLEM - swift-object-auditor on ms-be8 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [08:38:46] RECOVERY - swift-object-auditor on ms-be6 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [08:55:46] RECOVERY - swift-object-auditor on ms-be8 is 
OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [08:59:22] RECOVERY - swift-object-auditor on ms-be7 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [09:01:35] New patchset: MaxSem; "Move FeaturedFeedsWMF.php from extension directory" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14874 [09:10:46] PROBLEM - Puppet freshness on ms3 is CRITICAL: Puppet has not run in the last 10 hours [09:21:43] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [09:32:40] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours [09:46:26] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [09:57:12] http://i.imgur.com/RbI19.jpg lol [09:58:02] heh [10:01:21] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14746 [10:02:22] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14747 [10:03:02] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14678 [10:06:31] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13285 [10:07:01] New review: Reedy; "docroot/commons/apple-touch-icon.png | Bin 2225 -> 1620 bytes" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13285 [10:58:07] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [11:24:40] PROBLEM - Host mw1008 is DOWN: PING CRITICAL - Packet loss = 100% [11:41:10] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [12:46:16] Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14871 [12:46:50] New patchset: Tim Starling; "In robots.php cleaned up comments and whitespace" [operations/mediawiki-config] (master) - 
https://gerrit.wikimedia.org/r/14872 [12:47:03] Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14872 [12:48:08] New patchset: Tim Starling; "Script for streaming favicon.ico" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14879 [12:49:09] Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14879 [13:22:12] New patchset: Tim Starling; "favicon caching tweak" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14881 [13:22:12] New patchset: Tim Starling; "Fix for bug 38274: no oldid" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14882 [13:23:12] Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14882 [13:23:13] Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14881 [13:30:06] PROBLEM - Puppet freshness on nfs2 is CRITICAL: Puppet has not run in the last 10 hours [13:34:09] PROBLEM - Puppet freshness on nfs1 is CRITICAL: Puppet has not run in the last 10 hours [13:48:07] It seems there is a handful of apaches that don't have tidy installed [13:48:29] both srv### and mw### [14:11:36] or not, ignore me [14:20:45] PROBLEM - Puppet freshness on pc1 is CRITICAL: Puppet has not run in the last 10 hours [14:27:38] hey ryan [14:27:51] howdy [14:28:01] my labs instance pybal-precise has been rather weird since you migrated it [14:28:04] heh [14:28:20] you aren't on labs-l, are you? :) [14:28:23] no [14:28:26] so..... 
[14:28:38] kvm block migration is broken in lucid [14:28:42] ok [14:28:48] it corrupts the images as they go across [14:28:54] just wanted to make sure you are aware [14:28:59] i'll kill that instance, no problem [14:29:02] and of course, it deletes the image when it's done migrating [14:29:07] hehe [14:29:07] * Ryan_Lane nods [14:29:09] thanks [14:29:11] it sort of still works [14:29:14] but has weird issues ;) [14:29:54] i'll subscribe to labs-l [14:30:02] but won't promise i'll actually follow it ;) [14:30:48] PROBLEM - Puppet freshness on db1029 is CRITICAL: Puppet has not run in the last 10 hours [14:32:52] * Ryan_Lane1 mumbles [14:32:57] wifi is fucked up [14:33:12] mark: so, I transferred like 5 instances and checked them [14:33:53] mark: then I transferred 30 instances [14:33:57] \o/ [14:34:04] so, yeah 30 corrupted instances [14:36:36] now i'm happy git does hash checking [14:37:04] yeah [14:40:06] New patchset: Mark Bergsma; "Use IPv4 DNS servers only, small fixes" [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14888 [14:40:07] New patchset: Mark Bergsma; "Small fixes" [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14889 [14:40:08] New patchset: Mark Bergsma; "Fix attribute" [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14890 [14:40:09] New patchset: Mark Bergsma; "Support CNAMEs" [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14891 [14:40:55] Change merged: Mark Bergsma; [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14888 [14:41:28] Change merged: Mark Bergsma; [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14889 [14:41:52] Change merged: Mark Bergsma; [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14890 [14:42:15] Change merged: Mark Bergsma; [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14891 [14:43:30] New patchset: Mark Bergsma; "pybal (1.04) precise; urgency=high" [operations/debs/pybal] (master) - 
https://gerrit.wikimedia.org/r/14892 [14:43:31] New patchset: Mark Bergsma; "Merge branch 'malus/master'" [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14893 [14:43:31] New patchset: Mark Bergsma; "pybal (1.05) precise; urgency=low" [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14894 [14:44:41] Change merged: Mark Bergsma; [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14894 [14:44:42] Change merged: Mark Bergsma; [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14893 [14:44:43] Change merged: Mark Bergsma; [operations/debs/pybal] (master) - https://gerrit.wikimedia.org/r/14892 [14:46:01] New patchset: Mark Bergsma; "Merge branch 'malus/master' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14895 [14:46:02] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14896 [14:46:02] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14897 [14:46:03] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14898 [14:46:03] New patchset: Mark Bergsma; "Merge remote-tracking branch 'origin/master' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14899 [14:46:04] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14900 [14:46:05] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14901 [14:46:06] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - 
https://gerrit.wikimedia.org/r/14902 [14:46:06] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14903 [14:46:07] New patchset: Mark Bergsma; "Merge branch 'malus/master' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14904 [14:46:08] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14905 [14:46:08] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14906 [14:46:09] New patchset: Mark Bergsma; "Merge branch 'malus/monitors/dns' into monitors/dns" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14907 [14:46:12] meh [14:47:56] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14895 [14:48:08] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14896 [14:48:20] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14897 [14:48:31] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14898 [14:48:40] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14899 [14:48:50] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14900 [14:49:04] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14901 [14:49:14] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14902 [14:49:24] Change 
abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14903 [14:49:33] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14904 [14:49:42] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14905 [14:49:52] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14906 [14:50:05] Change abandoned: Mark Bergsma; "(no reason)" [operations/debs/pybal] (monitors/dns) - https://gerrit.wikimedia.org/r/14907 [14:56:48] hm [14:56:59] Ryan_Lane: how feasible would it be to offer a configurable option for instances to not use NFS home? [14:57:06] sometimes you don't really need that [14:59:52] mark, did you ever work with ops volunteer 'Marumari'? [14:59:54] New patchset: Ottomata; "base.pp,site.pp - parameterizing contact_group for base::monitoring::host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14911 [15:00:21] no [15:00:29] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14911 [15:00:37] Hm. So when she said 'a long time ago' she must've meant a /really/ long time ago. [15:00:58] doesn't ring a bell at all [15:02:18] just curious. I've known her for a long time, only just learned that she used to hack on the mediawiki servers. [15:02:47] could have been before 2004 or so [15:03:16] in which case brion or tim might know [15:03:32] Yeah, Brion was the only real name she remembers. [15:03:43] New patchset: Ottomata; "base.pp,site.pp - parameterizing contact_group for base::monitoring::host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14911 [15:04:16] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14911 [15:05:28] New review: Ryan Lane; "reviews inline." [operations/apache-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/14869 [15:06:16] mark: hm [15:06:21] mark: could do that, yeah [15:06:38] soon I'll be removing the script that creates home directories [15:06:45] and switch to pam_homedir [15:06:50] or whatever the hell it's called [15:07:04] since I switched the ssh keys to a separate location it's possible [15:07:39] mark: what's the problem with the nfs homedirs, though? [15:07:42] nice [15:07:45] not much of a problem [15:07:50] but I think it's the main source of slowness now [15:07:52] ah [15:07:54] probably [15:07:55] yes [15:08:02] so if you don't need nfs, it would be nice to not have it [15:08:02] I hate the nfs homedirs [15:08:11] yeah [15:08:17] it's nice for most uses [15:08:21] agreed [15:08:36] when we do multi-node network node and project storage for homedirs it should be better [15:12:33] fortunately i'm already doing most building in /var/tmp [15:12:37] pbuilder etc [15:12:40] but git repos are on /home [15:49:56] RECOVERY - Host mw1008 is UP: PING OK - Packet loss = 0%, RTA = 31.60 ms [15:55:11] RECOVERY - Host mw1070 is UP: PING OK - Packet loss = 0%, RTA = 30.96 ms [15:55:56] RECOVERY - Host mw1073 is UP: PING OK - Packet loss = 0%, RTA = 30.92 ms [15:58:38] mutante: can you do a labs acct creation? https://www.mediawiki.org/w/index.php?title=Developer_access&diff=560500&oldid=560498 [15:58:49] (btw, are you coming? [15:58:50] ) [16:00:16] jeremyb: ok, on it. and no ..no wikimania unfortunately [16:01:00] New patchset: Ottomata; "statistics.pp - gerrit-stats now uses cli options for data file location" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14922 [16:01:33] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14922 [16:01:47] jeremyb: i'm not sure i use the right scheme to decode the obfuscated email address :) [16:02:24] haha [16:02:25] jeremyb: ok, i am :) [16:03:38] mutante: echo 'a2ltQGJydW5pbmcueHM0YWxsLm5sCg==' | base64 -d ;) [16:04:14] jeremyb: as long as it's not kim dot com:) [16:04:54] jeremyb: oops, technical problem. There was either an authentication database error or you are not allowed to update your external account. [16:05:04] uhuh [16:12:34] mutante: still broke i guess? heading off for lunch [16:12:46] jeremyb: yeah [16:13:04] ;( [16:13:53] jeremyb: wait, "previous svn access" ...ah [16:15:06] jeremyb: should be done [16:15:52] jeremyb: if already has svn account, then can't create via labsconsole, gotta use add-labs-user on formey then, and did it [16:28:21] !log powercycling niobium [16:28:28] Logged the message, Master [16:32:24] RECOVERY - Host niobium is UP: PING OK - Packet loss = 0%, RTA = 30.93 ms [16:33:04] !log package upgrades and kernel on niobium [16:33:11] Logged the message, Master [16:34:51] New patchset: Ottomata; "statistics.pp - gerrit-stats now uses cli options for data file location and pushes generated stats to gerrit-stats/data" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14922 [16:35:29] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14922 [16:38:51] PROBLEM - Host niobium is DOWN: PING CRITICAL - Packet loss = 100% [16:39:33] ? [16:39:54] RECOVERY - Host niobium is UP: PING OK - Packet loss = 0%, RTA = 30.88 ms [16:43:08] via serial or ssh? [16:43:14] which one of those is working? [16:43:40] ok, so is the network IP allocated for the mgmt interface? 
[16:44:20] setup would be: connect via serial, set the ip info for the mgmt interface, ping the mgmt interface, then ssh into and web access into the mgmt interface [16:45:54] PROBLEM - Puppet freshness on mw1102 is CRITICAL: Puppet has not run in the last 10 hours [16:51:30] LeslieCarr, thanks in advance for reviewing those [16:51:40] lemme know how they go here and when they get merged so I can run puppet and check [16:52:03] ottomata: yw [16:52:07] are you at wikimania now ? [16:52:25] naw, in brooklyn [16:52:33] guess you're not either? [16:52:41] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14495 [16:52:56] i am [16:52:59] at the hackathon [16:53:04] aye cool [16:53:05] if i drop off it's cuz the network here is shit [16:53:08] ha, k [16:53:29] LeslieCarr: You mean you didn't sort the network for it? ;) [16:53:56] hehhee [16:54:01] man, i don't touch wireless [16:54:17] give me a network with mpls-te, multicast, and 500 routers … okay [16:54:25] an office of 20 people who want wireless? nuh uh [16:55:05] LeslieCarr: you mean you didn't bring a 100m cat5e/6 cable with you? :D [16:55:25] lol [16:55:26] hehehe [16:57:05] New review: Dzahn; "it's a Sun. it's been declared useless. it's down. bye gilman :)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/14314 [16:57:26] ottomata: sorry, so many people and noise, makes it slooow [16:57:30] yay bye gilman!! [16:57:33] UDP filters in sockpuppet [16:57:36] merge? [16:57:42] yes please [16:57:44] yep [16:57:50] oh mutante you need to rebase [16:58:01] oh, i see [16:58:07] also can you take gilman out of site.pp ? [16:58:19] sure [16:58:46] well, the UDP stuff is merged on sockpuppet [16:58:47] and brb [16:58:48] ottomata: i'm going to move physically to a smaller room [16:58:52] bbiab [16:58:54] mk [16:59:00] mutante, merged? 
[16:59:02] want to run puppet and check it [16:59:11] templates/udp2log/filters.oxygen.erb [16:59:13] I* ^ [16:59:14] cool danke [17:00:00] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [17:00:11] ottomata: stat1? [17:00:16] oxygen [17:00:21] i'm running it [17:01:09] info: Applying configuration version '1341939445' [17:01:11] on oxygen [17:01:47] ottomata: it finished running. puppet run ok. it did change iptables rules [17:03:09] it always says that about iptables [17:03:16] yep, just Misc::Udp2log::Iptables_drops/Iptables_add_service[udp2log_drop_udp]/Iptables_add_rule[udp2log_drop_udp_udp] [17:03:24] ack, looks normal [17:03:30] yeah [17:03:36] and my filter is now running, so cool [17:03:36] danke [17:03:41] yay [17:03:41] bitte [17:03:54] perfect, data coming in, not too fast [17:03:54] real nice [17:04:34] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14497 [17:07:46] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14510 [17:08:36] ottomata: hey, so the last change is a big one, that one will take a bit longer to review :) [17:09:32] which one, gerrit stats or base::monitoring::host [17:09:32] ? [17:10:33] New patchset: Dzahn; "RT #2841: decommission gilman and remove from site.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14314 [17:10:48] PROBLEM - swift-object-auditor on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [17:11:08] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14314 [17:11:25] base::monitoring [17:11:31] aye yeah [17:11:38] i might get notpeter to help me with that one this afternoon [17:11:47] i'm trying to find where ssh currently tends to get monitored :) [17:11:55] we are going to review my lucene/scribe changes too, and he advised me to make that change, so ja [17:11:58] cool [17:12:01] buuut, if you think you can review and it's cool [17:12:03] that's fine too :) [17:12:09] hehe [17:12:19] though first thing i'm gonna say is "whitespace!!!!!" [17:12:22] https://gerrit.wikimedia.org/r/#/c/14911/2/manifests/base.pp [17:12:34] ahh just those two little guys [17:12:37] ok...... [17:12:37] New review: Dzahn; "sad, but time to go for real." [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/14314 [17:12:38] maplebed: have you done any rebalancing for the new be nodes? [17:12:39] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14314 [17:12:52] (for objects) [17:12:56] AaronSchulz: I moved about 7% of objects to the new nodes. [17:13:28] oh there is more than those little two [17:13:31] hehe, ok fixing [17:15:00] maplebed: when might that finish? [17:15:15] I'm not sure... why do you ask? [17:15:48] I'm just curious how rebalancing and migration will affect latency [17:15:51] New patchset: Ottomata; "base.pp,site.pp - parameterizing contact_group for base::monitoring::host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14911 [17:16:07] it marginally affects object latency. You probably won't notice. [17:16:24] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14911 [17:16:39] RECOVERY - swift-object-auditor on ms-be6 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [17:17:51] AaronSchulz: other things (such as the delete job that was running for the last 1.5wks) affect latency much more. [17:21:10] LeslieCarr, fixed whitespace [17:21:20] yep, saw, reviewing now :) [17:21:30] ah i see one more [17:22:20] New patchset: Ottomata; "base.pp,site.pp - parameterizing contact_group for base::monitoring::host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14911 [17:22:55] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14911 [17:24:50] ottomata: guess you'll have to rebase as well, site.pp changed in between [17:25:57] ok, i'm still doing a lot of comments on patchset 2 [17:25:57] Logged the message, Master [17:26:22] k [17:27:13] PROBLEM - swift-object-auditor on ms-be8 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [17:29:10] ottomata: inline comments on patchset 2 [17:29:29] RECOVERY - Host srv266 is UP: PING OK - Packet loss = 0%, RTA = 0.43 ms [17:30:54] New patchset: Mark Bergsma; "tempcommit: works for single lines" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14928 [17:30:55] New patchset: Mark Bergsma; "the somewhat correct parts" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14929 [17:30:56] New patchset: Mark Bergsma; "cleanup" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14930 [17:30:56] New patchset: Mark Bergsma; "works" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14931 [17:30:57] New patchset: Mark Bergsma; "works" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14932 [17:30:58] New 
patchset: Mark Bergsma; "remove extra linefeed" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14933 [17:30:58] New patchset: Mark Bergsma; "remove compile errors with -Werror" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14934 [17:30:59] New patchset: Mark Bergsma; "Reverting 6 commits seems like more confusion. This is varnishncsa.c from d18336e59ed66512f5af2f2d26bb0a7261d012c8." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14935 [17:31:00] New patchset: Mark Bergsma; "Fix poll waiter, so that we don't terminate the search for poll'ed fd's early in the case of a timeout." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14936 [17:31:01] New patchset: Mark Bergsma; "A quick style polish" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14937 [17:31:01] New patchset: Mark Bergsma; "Give VRT_re_match a sess* arg and report VRE errors using SLT_VCL_Error instead of asserting." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14938 [17:31:02] New patchset: Mark Bergsma; "Make it possible to set limits for VRE matching." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14939 [17:31:03] New patchset: Mark Bergsma; "Bump minimum number of threads for RPMs" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14940 [17:31:03] New patchset: Mark Bergsma; "Add a "wait-running" primitive to varnish instances, so we can avoid fixed sleeps waiting for the child process to start." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14941 [17:31:04] New patchset: Mark Bergsma; "Try to firm up v00010 even more, by explicitly waiting for the crashing child to do so, and then explicitly start it again." 
[operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14942 [17:31:05] New patchset: Mark Bergsma; "Elminiate pthread poisoning of manager process." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14943 [17:31:05] New patchset: Mark Bergsma; "WRK_Flush was renamed to WRW_Flush a long time ago; update some ifdefed code and comments to match" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14944 [17:31:06] New patchset: Mark Bergsma; "Use = rather than := in Makefile to make solaris make stop whining on distcheck" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14945 [17:31:07] New patchset: Mark Bergsma; "Quit early if setting blocking mode fails." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14946 [17:31:08] New patchset: Mark Bergsma; "Drop explicit dependeny on Makefile for default_vcl.h to avoid spurious rebuilds" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14947 [17:31:08] New patchset: Mark Bergsma; "Attempt to document the different options for invalidating cached content" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14948 [17:31:09] argh [17:31:09] New patchset: Mark Bergsma; "Make EXP_NukeOne() make do with a struct worker arg instead of sess." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14949 [17:31:10] New patchset: Mark Bergsma; "Merge new material from reference/purging_banning, and do a little house keeping" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14950 [17:31:10] New patchset: Mark Bergsma; "Update to 3.0 syntax" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14951 [17:31:11] New patchset: Mark Bergsma; "Also snapshot the worker thread workspace around esi:include processing." 
[operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14952 [17:31:12] New patchset: Mark Bergsma; "Ignore SIGPIPE, it causes the test-executing subprocess to bail out before all diagnostics have been gathered." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14953 [17:31:12] New patchset: Mark Bergsma; "Make it possible to mark http stuff "non-fatal"." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14954 [17:31:13] New patchset: Mark Bergsma; "Register buffer allocation failuers on vgz's and make failure to clean those up non-fatal." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14955 [17:31:14] New patchset: Mark Bergsma; "Use a better way of checking we have pkg-config" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14956 [17:31:14] That must be a little spammy on hilights.... [17:31:14] New patchset: Mark Bergsma; "Don't abbreviate panic message output." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14957 [17:31:15] New patchset: Mark Bergsma; "Typographical corrections, spelling" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14958 [17:31:15] whaa [17:31:16] New patchset: Mark Bergsma; "Add new format %{VCL_Log:foo}x which output key:value from std.log() in VCL" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14959 [17:31:17] New patchset: Mark Bergsma; "Support for \t\n in varnishncsa format strings" [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14960 [17:31:17] New patchset: Mark Bergsma; "Make the VFP calls symmetric and pair-wise visible, allow ->end() to fail, and act accordingly." 
[operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14961 [17:31:18] New patchset: Mark Bergsma; "Overhaul the detection and reporting of fetch errors, to properly catch trouble that materializes only when we destroy the VGZ instance." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14962 [17:31:19] New patchset: Mark Bergsma; "More cleanup and simplification of FetchError reporting." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14963 [17:31:19] New patchset: Mark Bergsma; "Polish the chunked body fetcher. It still irks me that it does one byte reads for the hex-length headers." [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14964 [17:31:25] mark: spammer [17:31:27] Need a squash? [17:31:39] noone needs squash [17:31:42] heh [17:32:02] * maplebed needs zucchini [17:32:19] PROBLEM - swift-object-auditor on ms-be7 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [17:33:04] PROBLEM - Apache HTTP on srv266 is CRITICAL: Connection refused [17:33:54] mark: http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/62347 [17:34:02] you may want to read that :) [17:35:10] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14928 [17:35:23] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14929 [17:35:27] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14930 [17:35:32] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14931 [17:35:36] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14932 [17:35:40] you must be using the command-line for that. 
heh [17:35:41] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14933 [17:35:45] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14934 [17:35:47] LeslieCarr, re: alpha order [17:35:50] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14935 [17:35:55] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14936 [17:35:59] you mean the emery locke oxygen nodes? [17:36:00] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14937 [17:36:04] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14938 [17:36:06] gerrit is so slow [17:36:09] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14939 [17:36:13] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14940 [17:36:18] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14941 [17:36:22] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14942 [17:36:25] database servers in eqiad ;) [17:36:27] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14943 [17:36:31] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14944 [17:36:35] that's all I want. 
heh [17:36:36] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14945 [17:36:40] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14946 [17:36:45] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14947 [17:36:49] nice for loop taking care of it [17:36:50] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14948 [17:36:55] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14949 [17:36:56] reviewed by bash [17:36:57] heh [17:36:59] :D [17:37:02] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14950 [17:37:06] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14951 [17:37:11] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14952 [17:37:15] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14953 [17:37:20] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14954 [17:37:24] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14955 [17:37:25] RECOVERY - swift-object-auditor on ms-be8 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [17:37:29] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14956 [17:37:33] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14957 [17:37:38] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - 
https://gerrit.wikimedia.org/r/14958 [17:37:42] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14959 [17:37:47] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14960 [17:37:51] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14961 [17:37:56] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14962 [17:38:00] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14963 [17:38:05] Change merged: Mark Bergsma; [operations/debs/varnish] (upstream-persistent) - https://gerrit.wikimedia.org/r/14964 [17:38:31] lemme do the same with the mediawiki queue [17:38:49] Lets hope he doesn't need to revert that... [17:38:52] mutante: oh, sorry i thought you got that. that's why i asked you instead of doing it myself at all [17:40:26] ok lesliecarr [17:40:32] New patchset: Mark Bergsma; "Fix exit code from reload-vcl (Closes: #664857)" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14965 [17:40:32] New patchset: Mark Bergsma; "reorder build dependencies" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14966 [17:40:33] you sure I can change the value of a fully qualified var like that? 
[17:40:33] New patchset: Mark Bergsma; "Remove /etc/varnish/secret on purge (Closes: #656220)" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14967 [17:40:34] New patchset: Mark Bergsma; "Do not run build tests by default (Closes: #663667)" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14968 [17:40:34] New patchset: Mark Bergsma; "Bump standards-version (no changes)" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14969 [17:40:35] New patchset: Mark Bergsma; "Add systemd services" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14970 [17:40:36] New patchset: Mark Bergsma; "Add service files" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14971 [17:40:37] New patchset: Mark Bergsma; "Remove vcs_version.h patch" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14972 [17:40:37] New patchset: Mark Bergsma; "Use debhelper compat level 9 (Closes: #663064)" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14973 [17:40:38] New patchset: Mark Bergsma; "Merge branch 'bug/663667-ftbfs-on-buildd' into next" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14974 [17:40:39] New patchset: Mark Bergsma; "Merge branch 'misc/standards-version-3.9.3' into next" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14975 [17:40:39] New patchset: Mark Bergsma; "Merge branch 'misc/remove-vcs-version-patch' into next" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14976 [17:40:40] New patchset: Mark Bergsma; "Merge branch 'feature/systemd' into next" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14977 [17:40:40] haha, not over! 
[17:40:41] New patchset: Mark Bergsma; "Merge branch 'feature/debhelper-9' into next" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14978 [17:40:42] New patchset: Mark Bergsma; "Update changelog" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14979 [17:40:42] New patchset: Mark Bergsma; "Remove unused lintian override" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14980 [17:40:43] going back to pm [17:40:43] New patchset: Mark Bergsma; "Do not explicitly create directories" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14981 [17:40:44] New patchset: Mark Bergsma; "Add Jan Wagner to the team (and sort the uploader list)" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14982 [17:40:44] New patchset: Mark Bergsma; "Add DAEMON_OPTS example to /etc/default/varnishncsa" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14983 [17:40:45] New patchset: Mark Bergsma; "varnish (3.0.3~rc1-persistent-wm1~1.gbp73f4f8) UNRELEASED; urgency=low" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14984 [17:40:46] New patchset: Mark Bergsma; "Merge remote branch 'debian/master'" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14985 [17:40:46] New patchset: Mark Bergsma; "Remove a bit of stale debugging." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14986 [17:40:47] New patchset: Mark Bergsma; "Avoid infinite loop in regsuball" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14987 [17:40:47] doh!!! [17:40:48] New patchset: Mark Bergsma; "Use scalbn(3) rather than exp2(3), it should be faster and more portable." 
[operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14988 [17:40:49] New patchset: Mark Bergsma; "Add a missing case: ESI parent document gunzip'ed but included document gzip'ed." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14989 [17:40:49] New patchset: Mark Bergsma; "document the compression behaviour in Varnish 3.0" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14990 [17:40:50] New patchset: Mark Bergsma; "Add health control of backends from CLI" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14991 [17:40:51] New patchset: Mark Bergsma; "Expose VSL_Name2Tag in libvarnishapi" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14992 [17:40:51] New patchset: Mark Bergsma; "FlexeLint spotted a memory leak in Tollefs "admin-health" code, so I started fixing that, and then I fixed another couple of nits and polished the naming a bit and ..." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14993 [17:40:52] New patchset: Mark Bergsma; "Add some backend.list commands to exercise that code also" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14994 [17:40:53] New patchset: Mark Bergsma; "Test more of the backend matching code." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14995 [17:40:53] New patchset: Mark Bergsma; "One more backend.list test to get all of the code." 
[operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14996 [17:40:54] New patchset: Mark Bergsma; "varnishstat: Add json output and continous mode" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14997 [17:40:55] New patchset: Mark Bergsma; "Simplify json callback" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14998 [17:40:56] New patchset: Mark Bergsma; "A minor preemptive cleanup before the ban-lurker gets remodelled." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/14999 [17:40:56] New patchset: Mark Bergsma; "Rewrite the ban-lurker." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15000 [17:40:57] New patchset: Mark Bergsma; "Add a new stats counter for "gone" marked bans" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15001 [17:40:58] New patchset: Mark Bergsma; "Don't start the ban-lurker until -spersistent is done loading." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15002 [17:40:58] New patchset: Mark Bergsma; "Add a missing WS_Release()" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15003 [17:40:59] New patchset: Mark Bergsma; "Explicitly document that concatenation is only supported for the builtins." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15004 [17:41:00] New patchset: Mark Bergsma; "Silence a Flexelint warning" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15005 [17:41:00] New patchset: Mark Bergsma; "Implement VRE options with hard linkage to PCRE options instead of maintaining magic hex-bit values that must match." 
[operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15006 [17:41:01] New patchset: Mark Bergsma; "Use PCRE_NOTEMPTY rather than NOTEMPTY_ATSTART, it suffices for us" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15007 [17:41:02] New patchset: Mark Bergsma; "fix a block of text rendering as part of source code" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15008 [17:41:03] New patchset: Mark Bergsma; "Update varnishtest(1) documentation somewhat" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15009 [17:41:03] * mark grins [17:41:03] New patchset: Mark Bergsma; "Make it possible to test for the non-definition of a http header." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15010 [17:41:04] New patchset: Mark Bergsma; "Be more precise about default behaviour in the presence of cookie headers" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15011 [17:41:05] New patchset: Mark Bergsma; "Clean up examples a bit" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15013 [17:41:06] New patchset: Mark Bergsma; "Clean up examples a bit (#2)" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15014 [17:41:07] New patchset: Mark Bergsma; "remove minor typos" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15015 [17:41:07] New patchset: Mark Bergsma; "s/reponse/response/" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15016 [17:41:08] New patchset: Mark Bergsma; "Correct ACL syntax" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15017 [17:41:09] New patchset: Mark Bergsma; "Some clarification on how 'now' works I'm not sure if we should go into more detail about how the whole thing works" [operations/debs/varnish] (testing/persistent) 
- https://gerrit.wikimedia.org/r/15018 [17:41:10] New patchset: Mark Bergsma; "add a label so we can ref this in the user guide / tutorial" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15019 [17:41:10] New patchset: Mark Bergsma; "Add a label, fix a typo and remove -o (now default)" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15020 [17:41:11] New patchset: Mark Bergsma; "remove -i and -I, add -m and describe it. Simplify the docs. The user only needs to care about -m, really" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15021 [17:41:12] New patchset: Mark Bergsma; "updated syntax, thanks to Chris Handy for spotting it" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15022 [17:41:12] New patchset: Mark Bergsma; "Remove the ref to -o for varnishlog Add a link to the varnishlog man page and the logging chapter in the tutorial" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15023 [17:41:13] New patchset: Mark Bergsma; "add a ref to man varnishlog" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15024 [17:41:14] New patchset: Mark Bergsma; "clean up refs" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15025 [17:41:14] New patchset: Mark Bergsma; "Fix syntax for 3.0" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15026 [17:41:15] New patchset: Mark Bergsma; "Curse WG14" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15027 [17:41:16] New patchset: Mark Bergsma; "typos" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15028 [17:41:16] New patchset: Mark Bergsma; "Another typo" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15029 [17:41:17] New patchset: Mark Bergsma; "Misleading use of key= instead of key: which 
is correct." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15030 [17:41:18] New patchset: Mark Bergsma; "Fix inconsistant man page, as reported by Chris Adams" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15031 [17:41:19] New patchset: Mark Bergsma; "All relevant BSD's have a testable kqueue now." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15032 [17:41:19] RECOVERY - swift-object-auditor on ms-be7 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [17:41:19] New patchset: Mark Bergsma; "Fix typo" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15033 [17:41:20] New patchset: Mark Bergsma; "Remove outdated comment" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15034 [17:41:21] New patchset: Mark Bergsma; "Add explanation for Varnish "hashing" in light of advisories." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15035 [17:41:21] New patchset: Mark Bergsma; "Added an introduction. Brushed up the chapter on varnish in a virtualized environment" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15036 [17:41:22] New patchset: Mark Bergsma; "typo. 
Thanks scoof" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15037 [17:41:23] New patchset: Mark Bergsma; "Handle the case of sub being NULL" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15038 [17:41:24] New patchset: Mark Bergsma; "Update .gitignore" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15039 [17:41:24] New patchset: Mark Bergsma; "Actually use a default (80) if no port is specified" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15040 [17:41:25] New patchset: Mark Bergsma; "Update syntax for 3.0" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15041 [17:41:26] New patchset: Mark Bergsma; "By Slink, via Geoff:" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15042 [17:41:26] New patchset: Mark Bergsma; "Add support for the %{format}t construct to varnishncsa" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15043 [17:41:27] New patchset: Mark Bergsma; "speling" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15044 [17:41:28] New patchset: Mark Bergsma; "Fix a test description" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15045 [17:41:28] New patchset: Mark Bergsma; "Make this test case more robust (and faster) by using the CLI to control backend health, rather than rely on probes." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15046 [17:41:29] New patchset: Mark Bergsma; "cleaned out some cruft. 
Added links to the actual docs" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15047 [17:41:30] New patchset: Mark Bergsma; "add link" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15048 [17:41:31] New patchset: Mark Bergsma; "Literal block colons in fully minimized form" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15049 [17:41:31] New patchset: Mark Bergsma; "Rest of the literal block colons in fully minimized form" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15050 [17:41:32] New patchset: Mark Bergsma; "Make the invalid domains FQDN to save time if you have a long search-list." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15051 [17:41:33] New patchset: Mark Bergsma; "Wait a bit before trying to start/use varnishd again" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15052 [17:41:33] New patchset: Mark Bergsma; "Honor remove-flag also when processing comments in ESI parsing." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15053 [17:41:34] New patchset: Mark Bergsma; "Make sure we have debug.sizeof in at least one test-case." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15054 [17:41:35] New patchset: Mark Bergsma; "Documentation fixes from Federico G. Schwindt " [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15055 [17:41:35] New patchset: Mark Bergsma; "Drop struct workreq, we don't use it." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15056 [17:41:36] New patchset: Mark Bergsma; "Add proper handling of TCP timeouts on client side" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15057 [17:41:37] New patchset: Mark Bergsma; "Make it possible to limit the total transfer time." 
[operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15058 [17:41:37] New patchset: Mark Bergsma; "Call SES_Delete on sessions found to be closed during vca_return_session." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15059 [17:41:38] New patchset: Mark Bergsma; "Set next state to STP_DONE also when an overflow error is encountered while parsing the HTTP request." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15060 [17:41:39] New patchset: Mark Bergsma; "Add missing + in docs" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15061 [17:41:40] New patchset: Mark Bergsma; "remove the experimental flag from http_range_support" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15062 [17:41:40] New patchset: Mark Bergsma; "Force lookup of kss resources and fix cached creation of objects in Plone" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15063 [17:41:41] New patchset: Mark Bergsma; "Don't try to stream when there is no body." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15064 [17:41:42] New patchset: Mark Bergsma; "typo" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15065 [17:41:42] New patchset: Mark Bergsma; "Allocate HTTP.status from http's designated workspace." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15066 [17:41:43] New patchset: Mark Bergsma; "Most of these variables are not available in vcl_deliver" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15067 [17:41:44] New patchset: Mark Bergsma; "Detect client crashing during startup" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15068 [17:41:44] New patchset: Mark Bergsma; "Add more info about changes from 2.1 to 3.0. 
Thanks to xcir.net for inspiration." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15069 [17:41:45] New patchset: Mark Bergsma; "Fix syntax" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15070 [17:41:46] New patchset: Mark Bergsma; "Grammar" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15071 [17:41:47] New patchset: Mark Bergsma; "Add short example on how to get Websockets to work" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15072 [17:41:47] New patchset: Mark Bergsma; "Add short example on how to get Websockets to work (#2)" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15073 [17:41:48] New patchset: Mark Bergsma; "Correct value for 2.1" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15074 [17:41:49] New patchset: Mark Bergsma; "Drop the body of hit-for-pass objects once we have delivered them to the original requester." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15075 [17:41:49] New patchset: Mark Bergsma; "Avoid taking the saintmode lock if the list empty." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15076 [17:41:50] New patchset: Mark Bergsma; "Use the hash digest as identification instead of the neutered objhead pointer, in order to not have dependency between trouble entry and objhead lifetime." 
[operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15077 [17:41:51] New patchset: Mark Bergsma; "backend cookies" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15078 [17:41:52] New patchset: Mark Bergsma; "3.0 syntax" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15079 [17:41:52] New patchset: Mark Bergsma; "Really change to 3.0 syntax" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15080 [17:41:53] New patchset: Mark Bergsma; "Grammar in varnishncsa man page" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15081 [17:41:54] New patchset: Mark Bergsma; "Add section on device detection" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15082 [17:41:54] New patchset: Mark Bergsma; "Fixing a typo in the documentation." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15083 [17:41:55] New patchset: Mark Bergsma; "typos, change example3 to allow for easier purging" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15084 [17:41:56] New patchset: Mark Bergsma; "Remove old Date: header before adding our new one." 
[operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15085 [17:41:56] New patchset: Mark Bergsma; "Correct function name" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15086 [17:41:57] New patchset: Mark Bergsma; "Small doc fixes" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15087 [17:41:58] New patchset: Mark Bergsma; "Include device class when hashing" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15088 [17:41:59] New patchset: Mark Bergsma; "Fix for #1109" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15089 [17:41:59] ryan said "I don't know how he does it" in a meeting [17:42:01] New patchset: Mark Bergsma; "Fix libedit detection on *BSD OS's" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15090 [17:42:02] New patchset: Mark Bergsma; "Stopgap fix to get FreeBSD 10-current compiling again." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15091 [17:42:02] New patchset: Mark Bergsma; "Missing errorchecks incompilation of regsub()" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15092 [17:42:03] New patchset: Mark Bergsma; "Fix #1126 by properly setting the mask token to the IP number token." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15093 [17:42:04] New patchset: Mark Bergsma; "Better fix for #1126" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15094 [17:42:04] New patchset: Mark Bergsma; ":: need to be on a separate line in RST." 
[operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15095 [17:42:05] New patchset: Mark Bergsma; "escape \0" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15096 [17:42:06] New patchset: Mark Bergsma; "Fix escaping in regsub docs" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15097 [17:42:06] New patchset: Mark Bergsma; "Expose resp.body for inspection and testing in varnishtest" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15098 [17:42:07] New patchset: Mark Bergsma; "Stopped using macros for make and install, according to Fedora's packaging guidelines" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15099 [17:42:08] New patchset: Mark Bergsma; "Removed source install instructions from the built package, as requested by rpmlint" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15100 [17:42:09] New patchset: Mark Bergsma; "Stopped using macros for make and install, according to Fedora's packaging guidelines" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15101 [17:42:09] this is how [17:42:09] New patchset: Mark Bergsma; "Removed source installation instructions INSTALL, as requested by rpmlint" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15102 [17:42:10] this [17:42:10] New patchset: Mark Bergsma; "No need to keep the sphinx doc =build dir. If a user wants them, she can recreate them." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15103 [17:42:11] New patchset: Mark Bergsma; "Added missing changelog entries from fedora" [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15104 [17:42:11] New patchset: Mark Bergsma; "Add an explicit macro_undef() function so we don't pass a NULL argument to a printflike function." 
[operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15105 [17:42:12] New patchset: Mark Bergsma; "Also reflect the VCC exit code through if -C is specified." [operations/debs/varnish] (testing/persistent) - https://gerrit.wikimedia.org/r/15106 [17:42:13] New patchset: Mark Bergsma; "Clean up docs about esi:remove and < LeslieCarr> though first think i'm gonna say is "whitespace!!!!!" [18:07:28] jeremyb: that's a good idea as a precommit hook ? [18:07:35] mark: https://gerrit.wikimedia.org/r/#/c/12425/ [18:07:44] added you as a reviewer a while back [18:07:55] It's being used in labs and works properly [18:08:09] ok will review [18:08:17] LeslieCarr: sure, but then you'd have to get everyone to install it? or as an extra check during lint [18:08:45] people can be convinced to install locally if they keep getting rejected by the linter ;) [18:09:27] i think extra lint check on gerrit ? [18:10:11] yeah. not an extra review, all in the same big comment from the bot [18:10:19] LeslieCarr [18:12:04] !log srv266 shutdown for chris [18:12:11] Logged the message, RobH [18:14:05] New patchset: Asher; "pcache on pc tuning based on sysbench testing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15153 [18:14:43] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/15153 [18:14:55] PROBLEM - Host srv266 is DOWN: PING CRITICAL - Packet loss = 100% [18:15:14] hey binasher: should i file an RT ticket for the read-only mysql account for global dev or is that something you can create on the fly? [18:16:14] i guess the cname thing that we discussed last week is more involved, but i don't think i am the right person to file the RT ticket for that, could you do that? 
[18:16:31] drdee_: oh yeah, thanks for the reminder [18:17:06] its already in rt [18:17:40] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15153 [18:17:53] what is the ticket number? [18:21:05] the awesome powers of search report 3224 [18:29:52] Ryan_Lane: i amended in case you missed it [18:30:56] jeremyb: did you test the regex? :) [18:31:08] do I owe you, or do you owe me? heh [18:31:49] Change merged: Ryan Lane; [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/14869 [18:32:01] hashar: how does the apache stuff work? [18:32:06] is it still two repos? [18:32:15] yup still [18:32:19] Tim talked about it earlier [18:32:27] so, how do I deploy from it? [18:32:58] have you seen my email from June 27 in ops ? [18:33:06] Ryan_Lane: i need to test still [18:33:17] basically deploy as usual, edit the files in /home/wikipedia/conf/httpd [18:33:24] svn commit [18:33:33] apache-sync [18:33:46] hashar: the change is already in gerrit [18:34:07] then git commit in local repo and git push origin master:refs/for/master to publish your change publicly [18:34:09] ohh [18:34:12] so well hmm [18:34:16] git pull the merged change :-D [18:34:19] heh [18:34:20] svn commit it in the local repo [18:34:25] apache-sync [18:34:26] this is a nightmare ;) [18:34:29] profit [18:34:51] we only published in the public git repository the files which were already on noc.wikimedia.org [18:34:54] hashar: you should be here so we can berate in person ;) [18:34:56] the rest is left in the local svn [18:35:10] we could probably publish everything, but there must be some nasty stuff we want to keep hidden [18:35:18] another way would be to setup a second private git repo [18:35:26] jeremyb: :-] [18:35:40] so anyway, I guess it should be discussed on the ops list [18:35:51] i thought you thought it could all be public but you just didn't want to make the decision? 
[18:36:00] and we triple check we are not shooting our own feet by publishing some nasty apache conf [18:36:18] hashar: error: Your local changes to 'redirects.conf' would be overwritten by merge. Aborting. [18:36:42] jeremyb: indeed. I can't take the responsibility to publish some potentially harmful conf. I am probably just paranoid anyway [18:36:51] Ryan_Lane: so there is a local change [18:37:00] doesn't look like it [18:37:03] here the nightmare :-] Keeping two repos in sync [18:37:39] git status gives three modified files: main.conf redirects.conf www.wikipedia.conf [18:37:47] too many repos ! [18:38:27] jeremyb: is your change origin/sandbox/jeremyb/rt2919 ? [18:38:51] !g I0f244484a5a638799588017bd8eb1fe076cb84a1 [18:38:51] https://gerrit.wikimedia.org/r/#q,I0f244484a5a638799588017bd8eb1fe076cb84a1,n,z [18:39:13] bah that does not make any sense [18:39:41] ohh no [18:39:42] hmm [18:39:45] 10 17:42:43 <+gerrit-wm> New patchset: Jeremyb; "wikipedie.cz/experti_na_prirodu: short URL redir" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/14869 [18:39:47] it is already in origin/master [18:39:52] yes [18:39:59] both are [18:40:00] so we need to commit the local changes I guess [18:40:04] pull [18:40:12] pull --rebase [18:41:30] Ryan_Lane: I will clean it up [18:42:40] hashar: thanks [18:42:48] hashar: can you do the deploy too? :) [18:43:29] I dont do Apache deploys :-D [18:43:35] my insurance does not cover me against that ;-) [18:45:38] manually importing svn commits to git [18:45:49] New patchset: Lcarr; "with no gammu on this, no need for gammu group" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15157 [18:46:22] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/15157 [18:46:32] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15157 [18:48:22] maplebed: ooo pretty graphs going down and to the right [18:48:52] maplebed: also, yay for the actual major performance gains :) [18:51:02] PROBLEM - Host srv278 is DOWN: PING CRITICAL - Packet loss = 100% [18:51:59] woosters, could you please approve https://rt.wikimedia.org/Ticket/Display.html?id=3219 [18:52:05] RECOVERY - Host srv278 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms [18:52:56] drdee_ - only to stat1? [18:53:22] yes only stat1 [18:54:04] Ryan_Lane: were you working on a redirects for the CZ chapter? [18:54:12] yes [18:54:19] that's the change that needs to be deployed [18:54:30] ok so that is pending in subversion [18:54:40] can you deploy it? :) [18:55:16] hop I can't :/ [18:55:23] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused [18:55:24] I must be missing root somewhere [18:55:47] ah [18:56:08] I don't think I did any apache conf since I came back in 2010 [18:56:28] it is too easy to break the cluster [18:56:41] iirc we don't even check if the conf are valid before reload [18:56:55] New patchset: Lcarr; "commenting out a few files for new install and then including packages" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15160 [18:56:59] pushed it [18:57:09] now so graceful [18:57:10] Ryan_Lane: I know Jeff_Green has a test suite to verify the conf [18:57:20] he does? 
[18:57:23] i do [18:57:27] he does [18:57:28] \O/ [18:57:33] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15160 [18:57:39] I just ssh into a mw box, then do apache2ctl -t :) [18:57:59] this does A:B comparison between boxes running different confs [18:58:00] !log reworked the git repository in /home/wikipedia/conf/httpd , manually synced changes from svn to the git repo [18:58:08] Logged the message, Master [18:58:26] New patchset: Hashar; "add labs->labsconsole redirect per gerrt 14509 / RT-2402" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/15161 [18:58:27] New patchset: Hashar; "favicon.php test" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/15162 [18:58:28] it's very good at finding all the cases where graceful fails to restart things ever [18:58:48] Ryan_Lane: 15161 and 15162 are my cleanup. Commits credited to mutante and Tim (according to svn) [18:59:44] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.030 second response time [19:00:31] jeremyb: I guess you can delete your origin/sandbox/jeremyb/rt2919 sandbox. commit is 96ed1ee. It landed in master apparently as 76ad8cc [19:00:55] jeremyb: I am talking about operations/apache-config [19:01:07] hashar: sure [19:01:14] hashar: not hurting anything [19:01:28] the funny thing whith sandbox is that everyone knows about them since they are branches ;-D [19:01:28] New patchset: Lcarr; "removing service (temporarily)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15163 [19:02:03] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15163 [19:02:03] well, that short url isn't working [19:02:22] jeremyb: not working [19:02:51] it's supposed to be this, right? 
http://cs.wikipedia.org/experti_na_prirodu [19:02:59] no [19:03:05] wikipedie.cz or something [19:03:35] sorry, miscopy [19:03:36] it's in the first line of the commit msg [19:03:52] wikipedie.cz/experti_na_prirodu [19:03:56] !log chown l10nupdate:wikidev /home/wikipedia/common/php-1.20wmf7/cache/l10n/ [19:03:56] that doesn't work [19:04:03] Logged the message, Mr. Obvious [19:04:06] file not found [19:04:57] http://wikipedie.cz/ that one does redirect [19:05:02] !log chmod g+w /home/wikipedia/common/php-1.20wmf7/cache/l10n/ [19:05:14] Logged the message, Mr. Obvious [19:11:00] Ryan_Lane: looking some more [19:12:11] PROBLEM - Puppet freshness on ms3 is CRITICAL: Puppet has not run in the last 10 hours [19:13:50] RECOVERY - MySQL disk space on neon is OK: DISK OK [19:15:36] Logged the message, Master [19:19:14] RECOVERY - Host sq36 is UP: PING OK - Packet loss = 0%, RTA = 0.90 ms [19:22:16] LeslieCarr, [19:22:38] oop, checking comment on patchset 5... [19:23:08] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [19:24:02] ok re node inheritance vs. a role [19:24:03] hmm [19:24:10] well, this isn't the analytics role, fo sho [19:24:36] but maybe it is a logging role? 
[19:24:36] New patchset: Ryan Lane; "Initial commit of the new deployment system" [operations/deployment] (master) - https://gerrit.wikimedia.org/r/8732 [19:24:39] roles/logging.pp [19:24:41] i'm fine with that [19:24:56] RECOVERY - SSH on sq36 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [19:25:57] um, hm [19:26:04] oh i know [19:26:14] you can't override variables like that in other classes [19:26:17] they have to be done from the node [19:26:29] this is the only part that matters for that though: [19:26:29] $nagios_contact_group = "admins,analytics" [19:26:35] LeslieCarr ^ [19:27:29] i guess I could create a role::logging class, but put the $nagios_contact_group declaration in each node [19:27:36] ahha [19:27:49] kind of annoying though, i was trying to abstract out as much common stuff as possible [19:27:55] yeah .... [19:27:59] but true, looks like you guys aren't using node inheritance anywhere else [19:28:27] mentally weighing the new precedent versus old standard [19:28:41] with a "will mark kill me for saying yes to this" running in my head ;) [19:28:45] no reason why the base node can't go in role::logging [19:28:50] and include role::logging [19:28:53] sorry [19:29:00] base node def in roles/logging.pp [19:29:10] haha [19:33:11] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours [19:34:15] ah one more gotcha [19:34:24] setting the $nagios_contact_group variable [19:34:40] has to be done in the same node that includes base (which includes base::monitoring::host) [19:34:55] ah, and class standard includes base [19:35:11] so either I have to set the variable and include standard in each node [19:35:17] or I have to use a base node that does it there [19:36:38] PROBLEM - Auth DNS on ns2.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [19:37:50] i like the idea of base node in role::logging [19:37:54] eep, checking out ns2 [19:38:53] !log restarted frozen pdns on ns2 
[19:39:01] Logged the message, Mistress of the network gear. [19:39:29] RECOVERY - Auth DNS on ns2.wikimedia.org is OK: DNS OK: 0.134 seconds response time. www.wikipedia.org returns 208.80.154.225 [19:39:42] ok, lemme try that and see what you think [19:42:40] New patchset: Lcarr; "updated icinga config files to latest format" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15218 [19:43:15] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/15218 [19:44:51] mutante: ever going to merge https://gerrit.wikimedia.org/r/#/c/3291/1 ? [19:45:48] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15218 [19:46:18] drdee: just closed that rt ticket, you should have been cc'd [19:47:31] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [19:48:25] New patchset: Ottomata; "base.pp,site.pp - parameterizing contact_group for base::monitoring::host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14911 [19:48:25] RECOVERY - Puppet freshness on pc1 is OK: puppet ran at Tue Jul 10 19:48:05 UTC 2012 [19:48:46] binasher: thanks so much!! [19:49:00] New patchset: Andrew Bogott; "Specify a full path for git::clone: srv/mediawiki" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15221 [19:49:32] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14911 [19:49:33] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/15221 [19:49:33] New patchset: Andrew Bogott; "Specify a full path for git::clone: srv/mediawiki" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15221 [19:50:08] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/15221 [19:50:34] Change merged: Andrew Bogott; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15221 [19:52:19] RECOVERY - Host srv266 is UP: PING OK - Packet loss = 0%, RTA = 0.36 ms [19:53:50] ottomata: looks good except whitespace [19:54:58] drdee: you around ? [19:55:10] New patchset: Ottomata; "base.pp,site.pp - parameterizing contact_group for base::monitoring::host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14911 [19:55:13] ayay LeslieCarr [19:55:27] drdee: https://rt.wikimedia.org/Ticket/Display.html?id=2046 [19:55:43] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14911 [19:55:55] yeah, close it as WONTFIX [19:56:13] PROBLEM - Apache HTTP on srv266 is CRITICAL: Connection refused [19:56:32] ok LeslieCarr [19:56:32] https://gerrit.wikimedia.org/r/#/c/14911/ [19:56:33] how's that? [19:56:38] it's a much bigger discussion about what data should be accessible in labs, but until that discussion happens this ticket is not going anywhere [19:58:32] ottomata: doh, one last thing i missed, but looking pretty good…. [20:00:18] LeslieCarr: hey, some people want a puppet tutorial, are you up for doing one? [20:00:38] Ryan_Lane: um, sure, as long as you warn them i'm not a good teacher [20:00:55] heh [20:00:59] I can do it with you [20:01:03] Ryan_Lane: I'll walk over to the break room (where the food is!) now --- tell them to look for me [20:01:20] well, we should try to get a timeslot and have it announced [20:01:47] oo, i want to do one! [20:02:19] Puppet is weird, it's rather easy to understand then you try something like 'figure out correct gateway for ip range' and smash your head against a wall for a day [20:03:58] Ryan_Lane: hehe, i'm over at the break room now anyways and there's bagels… who's the person to talk to about talks and timeslots ?
[20:04:30] Damianz: what exactly are you trying to do? you want to set manage static ip routes with puppet? [20:04:38] s/set// [20:06:35] jesusaurus: Set the interface tag, netmask and route correctly based on ip. Not figured out the best way, keep meaning to pick up ruby somemore and use a hash or w/e ruby calls a dict, but me+maths+ruby = sad [20:09:07] PROBLEM - Host sq36 is DOWN: PING CRITICAL - Packet loss = 100% [20:13:10] PROBLEM - NTP on srv266 is CRITICAL: NTP CRITICAL: Offset unknown [20:17:12] im not quite sure what you are trying to do, but in general, dont try to put a lot of ruby in your puppet [20:17:40] RECOVERY - Host sq36 is UP: PING OK - Packet loss = 0%, RTA = 0.37 ms [20:18:02] At the moment I've got an inline template with some horrid looking ruby :( Quickly learned that was a bad idea lol [20:18:07] RECOVERY - Apache HTTP on srv266 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.396 second response time [20:19:44] mgmt or os? [20:20:22] whatcha need? [20:20:28] its responsive [20:21:30] its not responsive to ping [20:21:35] it has a link light though? [20:22:10] ok, lemme see if this is in any odd pool before i shut it down [20:23:49] RECOVERY - Backend Squid HTTP on sq36 is OK: HTTP OK HTTP/1.0 200 OK - 27400 bytes in 0.024 seconds [20:24:11] New patchset: Ottomata; "base.pp,site.pp - parameterizing contact_group for base::monitoring::host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14911 [20:24:25] RECOVERY - Frontend Squid HTTP on sq36 is OK: HTTP OK HTTP/1.0 200 OK - 27546 bytes in 0.007 seconds [20:24:45] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14911 [20:25:01] site.pp is not in proper order, srv is listed alphabetically before sq..... [20:25:06] annoys me every single time i look at it. 
[20:25:24] ok, this appears to be a normal text squid, checking for the rest of the cluster its in [20:26:04] cmjohnson1: ok, there are not a ton of text squids, so we need to make sure to bring this one fully online when done [20:26:08] LeslieCarr, patched. [20:26:11] one shouldnt kill us. [20:26:13] I think that is what you meant. [20:26:21] https://gerrit.wikimedia.org/r/#/c/14911/ [20:26:36] !log shutting down sq36 for hardware troubleshooting [20:26:42] grrrr out of alphabetical order ! [20:26:44] Logged the message, RobH [20:27:10] cmjohnson1: its going down [20:27:17] LeslieCarr: I know right! [20:29:22] PROBLEM - Host sq36 is DOWN: PING CRITICAL - Packet loss = 100% [20:29:56] ottomata: grrr, one last whitespace and then should be perfect [20:30:05] ah! [20:30:37] we *really* need pre-commit hooks for this :( [20:30:45] yes, yes we do [20:31:12] isn't that a great wikimania hackathon project [20:31:19] New patchset: Bhartshorne; "adding new ms-be hosts to the swift backend cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15226 [20:31:20] super easy, great for beginners [20:31:53] New patchset: Ottomata; "statistics.pp - gerrit-stats now uses cli options for data file location and pushes generated stats to gerrit-stats/data" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14922 [20:32:25] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/15226 [20:32:25] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15226 [20:32:26] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14922 [20:32:27] New patchset: Ottomata; "base.pp,site.pp - parameterizing contact_group for base::monitoring::host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/14911 [20:32:45] mmmmmmmmk LeslieCarr, how's that? 
[20:33:00] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/14911 [20:33:07] drdee_: oooo [20:33:45] drdee_: i guess, im just going to do it next site.pp commit i do [20:33:55] cuz otherwise its a lot of review, that particular file is the heart of the system [20:34:18] ie: if i reviewed it for a volunteer i would compare every single stanza [20:34:40] not a bad thing [20:36:54] added another task… http://wikimania2012.wikimedia.org/wiki/Hackathon/Tasks#Tasks [20:37:09] drdee_: point people at that :) [20:37:13] and make them do it... [20:37:14] hehe [20:37:17] LeslieCarr: so, maybe tomorrow from 304 [20:37:19] 3-4 [20:37:31] is earlier available ? [20:37:35] nope :( [20:37:44] then i guess that will have to do [20:37:45] that's the only slot [20:38:11] were you planning on skipping out during that time? [20:38:39] cause I might be able to do it [20:38:53] I was planning on doing part of it (labs + puppetmaster self) [20:39:02] i was going to try to skip out around 4:30 so i could get changed pre-party i think [20:39:11] so i was hoping to have time to talk to people aferwards [20:39:23] * Ryan_Lane nods [20:40:47] LeslieCarr: so, earlier is possible [20:40:50] 1-2 [20:40:56] cool [20:59:06] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [21:05:25] eeep [21:11:04] Ryan_Lane is 1 to 2 confirmed ? [21:12:15] yep [21:13:02] OrenBo: https://labsconsole.wikimedia.org/wiki/Git#Checking_out_the_repositories ( git clone ssh://gerrit.wikimedia.org:29418/operations/puppet.git ) [21:13:12] RECOVERY - Host ps1-a4-sdtpa is UP: PING OK - Packet loss = 0%, RTA = 7.01 ms [21:13:12] RECOVERY - Host ps1-a3-sdtpa is UP: PING OK - Packet loss = 0%, RTA = 8.92 ms [21:13:21] RECOVERY - Host ps1-a5-sdtpa is UP: PING OK - Packet loss = 0%, RTA = 3.15 ms [21:13:56] grrr [21:14:02] are there any spare switches ? 
[21:14:06] binasher: you're comparing ssd on raid0 with 15K SAS with raid10, evil :) [21:14:06] onsite ? [21:14:07] yah [21:14:25] awesome, swap it out ? [21:14:42] if you have to run today, swapping it out tomorrow would be okay, i can email about it [21:15:17] binasher: (price-wise that is) [21:16:46] New patchset: Jgreen; "shell access for Matt Walker" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15234 [21:17:23] !log rebooting new ms-be6-8 to change BIOS setting to boot from disk [21:17:30] Logged the message, Master [21:17:51] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/15234 [21:18:18] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15234 [21:20:27] hrmf, what's up with gerrit? [21:22:35] It's slow to the point of not loading for me atm =/ [21:23:07] heh, chad just left :( [21:24:09] PROBLEM - Host ms-be9 is DOWN: PING CRITICAL - Packet loss = 100% [21:24:39] it was totally screwed for a few minutes there, tons of git procs running, some defunct, loadavg ~27 [21:24:49] seems to be coming back down though [21:25:48] PROBLEM - swift-container-server on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:25:57] PROBLEM - swift-container-replicator on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:25:57] RECOVERY - Host ms-be9 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms [21:26:06] PROBLEM - swift-object-replicator on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:26:06] PROBLEM - swift-account-replicator on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:26:15] PROBLEM - swift-object-server on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[21:26:15] PROBLEM - SSH on ms-be6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:26:16] PROBLEM - swift-container-updater on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:26:24] PROBLEM - swift-object-updater on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:26:24] PROBLEM - swift-account-auditor on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:26:42] PROBLEM - swift-account-server on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:26:42] PROBLEM - swift-container-auditor on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:26:42] PROBLEM - swift-object-auditor on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:26:51] PROBLEM - swift-account-reaper on ms-be6 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:27:18] RECOVERY - swift-container-replicator on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [21:27:27] RECOVERY - swift-object-replicator on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [21:27:27] RECOVERY - swift-account-replicator on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [21:27:36] RECOVERY - swift-object-server on ms-be6 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [21:27:36] RECOVERY - SSH on ms-be6 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [21:27:36] RECOVERY - swift-container-updater on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [21:27:45] RECOVERY - swift-object-updater on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater [21:27:45] RECOVERY - swift-account-auditor on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python 
/usr/bin/swift-account-auditor [21:28:03] RECOVERY - swift-account-server on ms-be6 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [21:28:03] RECOVERY - swift-object-auditor on ms-be6 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [21:28:03] RECOVERY - swift-container-auditor on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [21:28:12] RECOVERY - swift-account-reaper on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [21:28:30] RECOVERY - swift-container-server on ms-be6 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [21:29:43] New patchset: Alex Monk; "(bug 38287) Install LST extension on itwikibooks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/15236 [21:32:42] PROBLEM - Host ms-be7 is DOWN: PING CRITICAL - Packet loss = 100% [21:37:21] RECOVERY - Host ms-be7 is UP: PING OK - Packet loss = 0%, RTA = 0.39 ms [21:42:09] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [21:42:47] AaronSchulz: There's a concern (I'm not sure how valid it is) that the speed at which SHA1s are being populated is slowing down Toolserver replag. Have you heard anything about that? [21:47:13] I think it's true [21:47:23] But I don't see how it's making TS get relatively so lagged up [21:47:34] it's not as if it's generating gigs of new data [21:48:35] I loaded a TS page just now that showed 17 days or something replag [21:48:42] uh, 17 hours* [21:48:49] Yes, this is it: "Caution: Replication lag is high, changes newer than 17 hours, 48 minutes, 29 seconds may not be shown." 
[21:49:30] Krenair: s1-rr-a: 17h 49m 12s [+0.18 s/s]; s1-user: 2d 1h 22m 28s [+0.24 s/s] [21:49:38] lols [21:59:07] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/14506 [21:59:30] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/15236 [22:05:20] !log put swift backends ms-be9-12 in rotation for containers (weight 30) and objects (weight 5) [22:05:28] Logged the message, Master [22:06:56] New patchset: Jeremyb; "wikipedie.cz/experti_na_prirodu followup" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/15237 [22:07:13] anyone care to deploy that? [22:08:00] hey LeslieCarr! [22:08:04] hey :) [22:08:15] feel like a deploy? ;-) [22:08:28] haha [22:08:32] not particularly right now [22:08:42] k [22:08:46] i was hoping to find peeps for food plans :) [22:14:05] New review: Jeremyb; "didn't work; followup in Iaf3a68d5001d0aa" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/14869 [22:28:22] New patchset: Bhartshorne; "adding dhcp entries for swift frontends ms-fe3 and 4" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15238 [22:28:56] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15238 [23:08:56] New patchset: Bhartshorne; "adding ms-fe4" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15239 [23:09:30] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15239 [23:31:06] PROBLEM - Puppet freshness on nfs2 is CRITICAL: Puppet has not run in the last 10 hours [23:35:09] PROBLEM - Puppet freshness on nfs1 is CRITICAL: Puppet has not run in the last 10 hours [23:35:49] New patchset: Bhartshorne; "adding ms-fe3 and 4 (new frontend servers) into the memcache rotation for swift production cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15241 [23:36:22] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/15241 [23:37:00] New patchset: Pyoungmeister; "some cleanup of redundant stuff, whitespace, etc from apaches.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15242 [23:37:33] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15241 [23:37:34] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/15242 [23:47:11] !log put two new swift front ends into rotation ms-fe3 and ms-fe4 [23:47:18] Logged the message, Master [23:52:37] AaronSchulz: did you just start another run? [23:53:19] still doing c-f [23:54:07] ok, that's wrapping up [23:54:36] would you mind waiting 5-10 minutes before starting another? [23:54:47] I would like to see things settle down to the expected baseline. [23:55:19] I think I'm done for today [23:55:34] ok. [23:56:18] beautiful rain in DC!