[00:06:26] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours
[00:37:48] New patchset: preilly; "add dolphin browser assets to bits docroot" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11973
[00:38:00] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11973
[00:40:31] New review: Tim Starling; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11973
[00:40:35] Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11973
[01:08:32] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours
[01:41:32] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 260 seconds
[01:43:20] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 279 seconds
[01:47:23] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 30 seconds
[01:49:29] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 648s
[01:50:59] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 37s
[01:52:02] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 2 seconds
[02:06:08] PROBLEM - Host mw1148 is DOWN: PING CRITICAL - Packet loss = 100%
[02:28:29] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours
[03:01:29] RECOVERY - Puppet freshness on virt0 is OK: puppet ran at Tue Jun 19 03:00:57 UTC 2012
[04:05:32] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[04:56:32] PROBLEM - Puppet freshness on searchidx2 is CRITICAL: Puppet has not run in the last 10 hours
[07:14:33] !log reboot snapshot2, package and kernel updates
[07:14:39] Logged the message, Master
[07:15:40] PROBLEM - Host snapshot2 is DOWN: PING CRITICAL - Packet loss = 100%
[07:17:55] RECOVERY - Host snapshot2 is UP: PING OK - Packet loss = 0%, RTA = 0.39 ms
[07:27:43] !log reboot snapshot1, package and kernel updates
[07:27:48] Logged the message, Master
[08:06:09] New patchset: Dereckson; "(bug 37699) Change logo on uz.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11977
[08:06:16] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11977
[08:17:36] New review: Dereckson; "Waiting local consensus URL (shellpolicy)" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/11977
[08:24:00] New patchset: Catrope; "Move pgehres from restricted to mortals" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11979
[08:24:32] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11979
[08:27:27] New review: Dereckson; "The configuration is fine." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/11943
[08:55:57] New review: Hashar; "''rephrasing my previous comment''" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/9130
[09:02:11] New review: Nikerabbit; "(no comment)" [operations/mediawiki-config] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11833
[09:02:13] Change merged: Nikerabbit; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11833
[09:03:29] New patchset: ArielGlenn; "run as root; ConnectTimeout opt; correct basedir check" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/11981
[09:04:28] New review: ArielGlenn; "(no comment)" [operations/dumps] (ariel); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11981
[09:04:30] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/11981
[09:26:11] New patchset: Petrb; "(Bug 37700) - Change logo for stewardwiki to http://commons.wikimedia.org/wiki/File:Steward_wiki_logo_3.svg and favicon to meta one." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11943
[09:26:18] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11943
[09:31:21] New review: Dereckson; "Please read my review comments in the previous changeset, as they are also valid for your changeset ..." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/11943
[09:36:58] New review: Dereckson; "(no comment)" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/11150
[09:41:17] !log updating TranslationNotifications extension with NikeRabbit
[09:41:22] Logged the message, Master
[09:59:55] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours
[10:05:31] !log TranslationNotifications extension updated by Nikerabbit!
[10:05:36] Logged the message, Master
[10:07:52] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours
[10:27:59] New review: Hashar; "(no comment)" [operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11943
[10:30:52] paravoid: so, everything is switched over
[10:31:04] paravoid: I'm going to give it a little while before I remove the 135 address
[10:44:45] New patchset: Ryan Lane; "Add second master for labs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11986
[10:45:16] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11986
[10:45:22] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11986
[10:45:24] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11986
[11:03:13] !log adding IPs for virt6-8
[11:03:18] Logged the message, Master
[11:09:49] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours
[11:15:09] Ryan_Lane: great
[11:29:46] New patchset: Petrb; "(Bug 37700) - Change logo for stewardwiki to http://commons.wikimedia.org/wiki/File:Steward_wiki_logo_3.svg and favicon to meta one." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11943
[11:29:52] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11943
[11:31:28] New patchset: Petrb; "(Bug 37700) - Change logo for stewardwiki to http://commons.wikimedia.org/wiki/File:Steward_wiki_logo_3.svg and favicon to meta one." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11943
[11:31:34] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11943
[11:31:54] New review: Petrb; "I changed the format everywhere, if there is a standard for these comments it should be effective fo..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11943
[11:50:45] Ryan_Lane: did you see the https thread on wikitech?
[11:51:35] mark: ^ too
[11:51:52] mozilla contcted us; they want to switch the wikipedia search box to https by default
[12:00:49] paravoid: which https thread?
[12:00:50] oh
[12:00:51] that one
[12:01:01] so, I doubt we'll see much traffic from it
[12:01:08] but it sends a bad precedent.
[12:01:13] *sets
[12:01:33] we aren't doing https for anons. If we change search, it's sending anons to https.
[12:01:50] I'd have less issue if we had https on every server
[12:03:09] we're currently getting about 20MB/s across all of the nodes in all datacenters for https
[12:03:48] the servers are pretty bored, but that's not much traffic
[12:03:49] New review: Hashar; "Just forget about the bug format in comment, it clutter the 'git blame' history making it hard to fi..." [operations/mediawiki-config] (master); V: 0 C: -2; - https://gerrit.wikimedia.org/r/11943
[12:04:21] if we send 30x the traffic to them we'll saturate their links
[12:04:53] well, that's esams anyay
[12:04:59] I guess more like 50x total
[12:05:40] that's with no hardware dead, and not needing to failover between sites. so, realistically, 20x more traffic.
[12:05:46] * Ryan_Lane shrugs
[12:06:59] New review: Mark Bergsma; "Almost there. :)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11574
[12:07:38] what do you mean by "we're not doing https for anons"?
[12:10:46] mark: meaning, we're not sending anons to https
[12:11:12] if they manually go to https, then they hit it, but otherwise they don't
[12:11:43] I'd really like to see the load for https with logged in users using it by default
[12:11:59] then decide if it would be a huge hit to put it on all squid/varnish frontends
[12:17:06] mark: btw, I'm nearly positive you have the most gerrit commits out of everyone
[12:17:11] you have over 1000
[12:17:49] hehe
[12:17:56] not so many recently
[12:18:02] it kind of depends on what i'm working on, that varies ;)
[12:18:31] you cheat, though
[12:18:56] you push entire branches in as separate changes
[12:21:17] no, you're an idiot for not doing that ;)
[12:21:21] :D
[12:21:31] I do feature branches
[12:21:38] so do I
[12:21:41] but I dislike squashing
[12:21:48] think it defeats the point of git
[12:21:57] * Ryan_Lane nods
[12:22:29] the change itself has history, if you do patchsets
[12:26:29] I agree wholeheartedly on that
[12:28:38] well, you can do a merge commit, but then you lose review capability, which sucks
[12:29:06] I wish the merge commits worked like normal commits, and you could review what the merge would be
[12:29:18] if you need to review an entire branch in gerrit as one commit, you've already lost
[12:29:30] that's how github works
[12:29:36] no
[12:29:47] you review the merge as a single pull request
[12:30:01] yes, but you keep the individual commits
[12:30:13] hello
[12:30:18] hashar: hi :)
[12:30:22] Ryan_Lane: can we submit merges in gerrit ?
[12:30:24] but technically, you review the entire branch as one commit ;)
[12:30:32] hashar: yes, but then you can't code review
[12:30:47] if just that one thing was fixed in gerrit, I'm sure it would make a ton of people happy
[12:31:12] I wanted to create a fork of operations/production , cherry-pick several changes from tests then submit that branch for merge in production
[12:31:16] of course it needs to be a no-ff merge to work
[12:31:35] Ryan_Lane: no. you review the *pull*, but you can review individual commits and you keep those when you merge
[12:31:37] I was thinking of spending today merging the branches
[12:31:42] yeahhh
[12:32:08] I talked about merging test /production a while back with Daniel Zahn
[12:32:16] it's not an easy task
[12:32:19] Ryan_Lane: also, I'm not sure if it's gerrit's or our fault, but the way our puppet git repo is, is driving me crazy
[12:32:24] one way would have been to have each person to cherry pick its changes
[12:32:26] every second commit is a merge
[12:32:27] paravoid: what do you mean?
[12:32:28] starting from the older ones
[12:32:45] paravoid: that's because we're cherry-picking
[12:32:48] paravoid: git log --no-merges
[12:33:03] we are cherry-picking and allow gerrit to merge changes
[12:33:16] I wonder if I have always-merge set in our repo
[12:33:20] lemme check
[12:33:25] I have set my integration repo to be ff only
[12:33:28] I don't understand why cherry-picking results in merges.
[12:33:42] nope. "Merge-if-necessary"
[12:33:45] cause the parent of the commit is not the latest master
[12:33:55] indeed. so it needs to merge
[12:34:11] why can't if fast forward?
[12:34:12] if I changed to "Fast-forward-only" we wouldn't have that
[12:34:15] the only thing that is puzzling me is why it does not rebase first
[12:34:19] right
[12:34:20] but we'd have to rebase way more often
[12:34:32] with Gerrit 2.4 (I think) we will have a [Rebase] button :-]
[12:34:43] if we used git-review it would rebase for us
[12:34:54] https://gerrit.wikimedia.org/r/#/admin/projects/integration/jenkins <-- fast forward only :-]]°
[12:34:56] I think most ops people aren't using git-review
[12:35:31] I guess most ops don't care about review
[12:35:32] I don't understand why gerrit is not rebasing it instead of us doing it on the client
[12:35:50] no clue
[12:36:12] maybe it thinks that would be a bad assumption to make for the client?
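
[editor's note] The "why can't it fast forward?" exchange above comes down to one ancestry check: a branch can only fast-forward to a change if the current branch tip is an ancestor of that change. Below is a minimal Python sketch of that rule on a toy commit graph. The commit names, the `submit` helper and the two strategy strings are assumptions made up for illustration; this is a simplified model, not Gerrit's actual implementation.

```python
# Toy commit graph: each commit maps to the list of its parents.
# All commit names here are hypothetical.

def ancestors(graph, commit):
    """Return `commit` plus every commit reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen


def can_fast_forward(graph, branch_tip, change):
    """Fast-forward is possible only if the tip is an ancestor of the change."""
    return branch_tip in ancestors(graph, change)


def submit(graph, branch_tip, change, strategy):
    """Return the new branch tip under a simplified model of two submit types."""
    if can_fast_forward(graph, branch_tip, change):
        return change  # the branch pointer simply moves forward
    if strategy == "fast-forward-only":
        raise ValueError("change must be rebased onto the branch tip first")
    # "merge-if-necessary": record a merge commit with two parents
    merge = f"merge({branch_tip},{change})"
    graph[merge] = [branch_tip, change]
    return merge


# Branch history is A <- B <- C. Change X was written on top of A
# (a stale parent), change Y on top of the current tip C.
graph = {"A": [], "B": ["A"], "C": ["B"], "X": ["A"], "Y": ["C"]}

print(can_fast_forward(graph, "C", "Y"))  # Y sits on the tip
print(can_fast_forward(graph, "C", "X"))  # X's parent is not the tip
print(submit(dict(graph), "C", "X", "merge-if-necessary"))
```

In this model a change whose parent is not the latest tip always produces a merge commit under "merge-if-necessary", which matches the "every second commit is a merge" complaint; under "fast-forward-only" the same change is rejected until it is rebased.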
[12:36:17] hashar: speaking just for myself, I find review is important but we're using puppet for doing pretty much everything and you can't expect two people to sign off every little change you make
[12:36:28] yep
[12:36:30] it's like reviewing my shell
[12:36:41] we need that for my DNS actions
[12:36:44] oh I don't use shell any more
[12:36:51] exactly
[12:36:57] too cumbersome to wait for whoever to validate my command when I ever I press enter
[12:37:03] I think reviews totally make sense for larger changes
[12:37:13] I was just ranting anyway
[12:37:13] yes. we ask for review on those
[12:37:32] I often ask for reviews when people are around, too
[12:37:50] I don't care about ops self merging their change. Ops know what they are doing and if they screw something they are responsible for it :)
[12:37:55] we really need git annotations
[12:37:56] we somehow do the same for mediawiki/core
[12:38:02] git notes? what the hell are they called?
[12:38:10] done via gerrit, so we can do review after merge
[12:38:18] http://www.kernel.org/pub/software/scm/git/docs/git-notes.html
[12:38:21] it's on gerrit's roadmap
[12:38:39] OpenStack people do export notes from Gerrit to their Git repo I think
[12:38:44] they do
[12:38:50] this came up at the last design summit
[12:38:52] I suggested it
[12:39:11] I usually just git show , then copy paste the change id in Gerrit interface
[12:40:01] paravoid: speaking of review, I have a debian package change for you to review one day ;-) https://gerrit.wikimedia.org/r/#/c/11610/
[12:40:40] New patchset: Ryan Lane; "Add dhcp entries for virt6-8" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11990
[12:40:51] Ryan_Lane: on the topic of production/test merging, m.ark did an attempt and reverted it. So the history is a bit screwed ;--D
[12:40:54] paravoid: I'm not sure if Leslie did the networking for the new virt hosts
[12:41:10] paravoid: we may want to tell them to hold off on everything except for virt6-8
[12:41:16] since we want to replace 1-4
[12:41:17] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11990
[12:41:44] hashar: yeah. it's going to take me all day to successfully do the merge
[12:41:57] Ryan_Lane: ?
[12:42:12] Ryan_Lane: also, wasn't I supposed to do virt6-8 setup? :)
[12:42:16] Ryan_Lane: I believe that should be some kind of a team job.
[12:42:17] paravoid: well, right now they are set up at virt9-12 or something like that
[12:42:25] paravoid: yeah. I just did dns and dhcp
[12:42:28] I am pretty sure the ganglia stuff can be easily merged by Sara as an exemple
[12:43:15] hashar: how can I see a cherry-pick difference between them?
[12:43:38] what are the reasons again why we can't have a branch per labs project?
[12:43:44] because of course this will happen again
[12:43:54] there is: git cherry -v production test
[12:44:01] New review: Faidon; "LGTM. I think there's a spurious start on init's gracefula action but this was before this change an..." [operations/debs/wikimedia-job-runner] (master); V: 0 C: 1; - https://gerrit.wikimedia.org/r/11610
[12:44:44] also git log production...test
[12:44:50] I am not sure what the triple dot does though
[12:45:19] seems to be a list of all changes in production and test based on a common ancestor
[12:46:19] I'll probably just do a git merge production, from test
[12:46:21] and figure it out
[12:46:30] then push in a merge commit
[12:47:05] then merge the opposite way and do it again
[12:47:23] then you got to figure out all the planet, openswift, ganglia stuff (and more ) http://dpaste.org/Df4wo/
[12:47:35] that's fine
[12:47:40] most of that I did anyway
[12:47:43] just for ganglia, you have legit changes in bots prod and test :-(
[12:47:50] bots -> boths
[12:47:56] well, not the swift stuff
[12:48:27] sometime I feel each project could get its own puppet repo ;-D
[12:48:29] going from test to production is going to be much harder
[12:48:34] just like mediawiki extensions
[12:48:43] not that we have self:puppet, I want to get rid of the test branch
[12:48:48] and run labs from production
[12:48:58] yesss
[12:49:03] ;)
[12:49:09] that's why I was going to take the time to merge the branches
[12:49:20] just throw test away ;-)
[12:49:21] * mark ducks
[12:49:27] can't do that :)
[12:49:38] it would break labs horribly
[12:49:45] that's what labs is for
[12:49:48] I've tried to always cherry-pick across needed changes
[12:49:54] mark: I can't have constant outages ;)
[12:50:03] true you can't
[12:50:06] but I still have credit! ;-)
[12:50:10] hahahaha
[12:52:21] New patchset: Hashar; "(bug 37644) Enable subpages on be.wikimedia.org for NS 0 and 4" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11723
[12:52:34] I am going to break the cluster configuration this afternoon
[12:52:42] I am proceeding mediawiki-config queue
[12:53:07] wtf gerrit?
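
[editor's note] On the earlier question about what the triple dot does: `git log production...test` lists the commits reachable from either branch but not from both (the symmetric difference), while `git cherry -v production test` lists the commits on `test` that are missing from `production` and marks with `-` those whose patch already has an equivalent upstream, and with `+` those that do not. The Python sketch below models both on a toy history. The commit names and patch labels are invented for illustration, and real git compares patch IDs computed from diffs rather than hand-written labels.

```python
def ancestors(parents, tip):
    """Return `tip` plus every commit reachable through parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def symmetric_difference(parents, a, b):
    """Model of `git log a...b`: commits reachable from exactly one side."""
    return ancestors(parents, a) ^ ancestors(parents, b)


def cherry(parents, patch_of, upstream, head):
    """Model of `git cherry upstream head`.

    Maps each commit that is only on `head` to "-" when an equivalent
    patch exists on the upstream-only side, and to "+" otherwise.
    """
    up = ancestors(parents, upstream)
    hd = ancestors(parents, head)
    upstream_patches = {patch_of[c] for c in up - hd}
    return {c: ("-" if patch_of[c] in upstream_patches else "+")
            for c in hd - up}


# Hypothetical history: both branches fork from "base".
# production: base <- p1 <- p2      test: base <- t1 <- t2
# t2 is a cherry-pick of p1, so the two carry the same patch label.
parents = {"base": [], "p1": ["base"], "p2": ["p1"],
           "t1": ["base"], "t2": ["t1"]}
patch_of = {"base": "init", "p1": "fix-dhcp", "p2": "add-virt",
            "t1": "labs-only", "t2": "fix-dhcp"}

print(sorted(symmetric_difference(parents, "p2", "t2")))
print(cherry(parents, patch_of, "p2", "t2"))
```

In this toy history the cherry-picked commit shows up in the triple-dot listing on both sides (as `p1` and `t2`), while the cherry model flags `t2` as already applied upstream and leaves only `t1` as genuinely missing from production.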
[12:53:09] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11723
[12:53:19] it's taking forever to merge my change :(
[12:53:25] we really need dbs in eqiad
[12:53:25] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11990
[12:53:26] just took some long time too
[12:53:45] New review: Hashar; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11723
[12:53:52] or move Gerrit to pmtpa ? ;-]
[12:53:52] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11990
[12:53:56] * Ryan_Lane sighs
[12:54:04] I really don't want to move the gerrit server
[12:54:15] it's a pain in the ass
[12:54:17] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11723
[12:54:26] so move the DB! ;-D
[12:54:35] we don't have misc db servers in eqiad
[12:54:42] I have a procurement rt in
[12:55:06] is that because you are lacking the hardware to do so ?
[12:55:06] I think delta-compression is making gerrit slower for reviews and merges and such
[12:55:15] yes. there's no misc db hardware in eqiad
[12:55:25] hence having a procurement ticket in ;)
[12:55:35] I have no idea what a procurement ticket is :(
[12:55:50] ah. heh
[12:55:57] is that something like "pliizzz gimme $$$ 4 my project?"
[12:56:09] no. it's "please buy this hardware"
[12:58:09] New patchset: Hashar; "(bug 37661) Change Vietnamese Wikibooks logo" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11747
[12:58:23] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11747
[12:58:39] New patchset: Ryan Lane; "Fix dhcp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11992
[12:59:06] Ryan_Lane: anyway there must be another issue on manganese
[12:59:08] New review: Hashar; "Logo protected http://vi.wikibooks.org/wiki/file:Wiki.png?uselang=en so we are safe to go :-)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11747
[12:59:08] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11747
[12:59:09] ok. what's going on with the gerrit server?
[12:59:09] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11992
[12:59:09] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11992
[13:00:09] nothing for the last 4 hours on manganese http://ganglia.wikimedia.org/latest/?r=4hr&cs=&ce=&m=&c=Miscellaneous+eqiad&h=manganese.wikimedia.org&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS
[13:00:19] I'm wondering if its the dbs
[13:00:33] 1 st blame the network and DNS
[13:00:41] 2nd blame the intern software developer
[13:00:45] 3rd db :-]
[13:04:13] New review: Hashar; "The logo at http://commons.wikimedia.org/wiki/File:Wikipedia-logo-chr.png is not protected against e..." [operations/mediawiki-config] (master); V: 0 C: -2; - https://gerrit.wikimedia.org/r/11741
[13:04:39] Ryan_Lane: bugzilla is slow too
[13:04:45] so might be db9
[13:04:55] or whatever db is for misc
[13:08:54] New review: Hashar; "I am afraid this is going to allow import from any french or italian project." [operations/mediawiki-config] (master); V: 0 C: -1; - https://gerrit.wikimedia.org/r/11746
[13:09:44] yeah. is likely db9
[13:10:20] I don't see any long running queries
[13:10:36] mysql seems quite busy, though
[13:11:27] observium seems to be doing schema updates
[13:11:43] for its eventlog
[13:13:25] New patchset: Hashar; "(bug 37457) viwikibooks can import from fr/it wikibooks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11746
[13:13:32] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11746
[13:14:33] New review: Hashar; "Patchset 3:" [operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11746
[13:17:59] New review: Hashar; "Please rewrite your commit message to something shorter :) See: http://www.mediawiki.org/wiki/Git/Co..." [operations/mediawiki-config] (master); V: 0 C: -2; - https://gerrit.wikimedia.org/r/11748
[13:19:27] well, so far production -> test isn't horrible
[13:19:41] New review: Hashar; "The file at http://commons.wikimedia.org/wiki/File:Wikipedia-logo-v2-uz.png needs to be protected fi..." [operations/mediawiki-config] (master); V: 0 C: -2; - https://gerrit.wikimedia.org/r/11977
[13:20:17] I have no clue about the swift stuff, though
[13:20:37] you will want to merge that with Ben assistance I guess
[13:20:41] cmjohnson1: sure. lemme shut it down
[13:21:09] well, I think the stuff in test is newer
[13:21:29] it is
[13:21:46] cmjohnson1: should be down
[13:23:04] so here is my bug request : https://zh.wikipedia.org/wiki/Wikipedia:VPM#.E6.9C.89.E9.97.9C.E9.98.B2.E6.BF.AB.E7.94.A8.E9.81.8E.E6.BF.BE.E5.99.A8
[13:23:07] ;(
[13:23:28] PROBLEM - Host search23 is DOWN: PING CRITICAL - Packet loss = 100%
[13:28:01] New review: Hashar; "This right is not available yet on the wmf cluster. We need to wait for the next deployment :-D" [operations/mediawiki-config] (master); V: 0 C: -2; - https://gerrit.wikimedia.org/r/11848
[13:29:40] New patchset: Hashar; "(bug 37679) zhwiki let rollbacker abusefilter-log-private" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11848
[13:29:47] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11848
[13:30:19] New patchset: Hashar; "(bug 37679) zhwiki let rollbacker abusefilter-log-private" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11848
[13:30:25] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11848
[13:30:53] New review: Hashar; "I am deploying the configuration nonetheless. Will have to wait for the new AbuseFilter extension to..." [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11848
[13:31:02] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11848
[13:32:40] hashar: were these imagescaler changes tested?
[13:33:09] obvious for the merge from production -> test I'm keeping what's in test, but it would be good to know the changes work :)
[13:33:50] which one ?
[13:34:00] I did send some changes in test for Precise
[13:34:06] that is currently running on labs :-D
[13:34:42] does it work for the lucid instances too?
[13:34:55] they should
[13:34:57] untested though
[13:34:58] heh
[13:35:00] ok
[13:35:07] but reviewed by paravoid so we are safe somehow
[13:35:09] if I break everything I'll just blame it on you
[13:35:16] yeah sure :-]
[13:35:29] make sure to write a blog post about collaboratively breaking the site :-]]]
[13:35:34] will be glad to share the frontage with you
[13:35:36] *grin*
[13:35:41] seriously, :
[13:35:42] * b17da88 - (HEAD, gerrit/test, test) cronspam sprint - srv222 - use -ignore_readdir_race with find to avoid errors for missing files (
[13:35:44] * 885267c - RT #3117 phase out wikimedia-fonts package (4 days ago)
[13:35:45] * 3a6d7bb - Explicitly define fonts package for Precise (4 days ago)
[13:35:46] * 9722c0d - 'gs' package renamed 'ghostscript' in Precise (5 days ago)
[13:35:47] * 2d04bd2 - enable imagescaler classes on Precise hosts. (5 days ago)
[13:35:53] all of them should work on Lucid fine
[13:35:55] :)
[13:35:59] I have checked the packages manually on a lucid box
[13:36:05] * Ryan_Lane nods
[13:36:14] I had to do those changes for Precise cause ubuntu renamed several packages
[13:36:14] this mwfatallog stuff....
[13:36:22] * 610f4b1 - Ensure that /a exists on imagescalers (4 weeks ago)
[13:36:23]
[13:36:30] that one I am not sure we want it in production though
[13:36:37] that probably need to be something else
[13:37:55] I wonder if I'm about to push in a million changes since I didn't do --no-ff
[13:38:23] New review: Hashar; "Hmm that patch was adding NS_MAIN, NS_FILE, NS_TEMPLATE" [operations/mediawiki-config] (master); V: 0 C: -1; - https://gerrit.wikimedia.org/r/11839
[13:39:00] New patchset: Hashar; "(bug 37675) ruwiki: enable review on Portal:" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11839
[13:39:06] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11839
[13:40:09] New review: Hashar; "Patchset 2:" [operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11839
[13:40:33] what's this gsbmonitoring stuff?
[13:41:28] RECOVERY - Host search23 is UP: PING OK - Packet loss = 0%, RTA = 0.50 ms
[13:42:50] cmjohnson1: okie dokie
[13:44:02] New patchset: Hashar; "(Bug 35712) Add an alias to Help namespace in ml.wikisource" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11840
[13:44:08] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11840
[13:44:16] New review: Hashar; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11840
[13:44:18] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11840
[13:45:04] hasher: is there an issue with the indenting of my change?
[13:45:26] looks weird on Gerrit, but it is correct on my text editor
[13:46:47] New review: Hashar; "If the community is unsure if they want this change, you might want to abandon it and reopen it when..." [operations/mediawiki-config] (master); V: 0 C: -2; - https://gerrit.wikimedia.org/r/11150
[13:47:10] *ping wrongly, hashar
[13:47:11] notpeter: any idea what this gsbmonitoring stuff in the test branch is?
[13:48:40] Hydriz: pong
[13:48:40] I'm going to assume that the gsbmonitoring stuff was removed from head
[13:48:48] see ^
[13:48:51] from production, that is
[13:49:01] 1-2 lines above
[13:49:13] Hydriz: which change ?
[13:49:22] 11840
[13:49:31] Hydriz: the initialisesettings.php files have a weird indenting anyway
[13:49:34] it is a mix of tabs and spaces :-(
[13:49:35] because it was targetting spence
[13:49:46] heh
[13:49:55] but we should be using tabs, isn't it?
[13:50:08] the vim configuration says to use spaces
[13:50:16] lol
[13:50:20] I end up using whatever indent is used around the lines I edit
[13:50:51] I'm going to be really annoyed if I do this merge and have to redo it because I forgot --no-ff
[13:51:00] anyway its merged :P
[13:51:59] New patchset: Hashar; "(Bug 36895) Restrict media upload to autopatrolled users on sv.wikisource" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11837
[13:52:05] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11837
[13:52:17] New review: Hashar; "Patchset 3 is just a rebase" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11837
[13:52:20] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11837
[13:52:49] - Introduce AFTv5 lottery
[13:52:50] yeahhh
[13:53:17] lets merge all those changes
[13:54:07] \o/
[13:56:58] New patchset: Hashar; "(bug 36972) activate the patroller group on nn.wiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11150
[13:57:04] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11150
[13:57:22] New review: Hashar; "Patchset 2 is just a rebase." [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/11150
[13:57:28] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11150
[13:58:09] lets blank page!
[14:00:14] mark, paravoid: http://pastebin.com/cr8qnDFg
[14:00:23] I know the geoip stuff was added and should be kepy
[14:00:25] kept
[14:00:38] what about the purge decommissioned hosts exec?
[14:00:53] looking
[14:01:13] never seen this before :)
[14:01:18] not sure if it was added to test or removed from production
[14:02:21] that should still be there
[14:02:46] it isn't
[14:02:49] in production
[14:02:54] Ryan_Lane: no idea
[14:03:03] why the hell not
[14:03:11] perhaps elsewhere in a different class?
[14:03:18] lemme see
[14:03:24] not
[14:03:30] it's not there at all in prod
[14:03:33] grr
[14:03:33] only in test
[14:03:56] trying to see where it was removed
[14:06:49] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[14:08:15] ahhhh
[14:08:20] it was turned into a script
[14:08:28] git show 1db825bdd703fa94227521c1ef7ab1cd7b0e9661
[14:12:12] I never understand how site.pp ever conflicts
[14:15:18] this manifests/swift.pp change makes no sense to me
[14:21:04] New patchset: Ottomata; "Refactoring udp2log classes and defines." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11574
[14:21:37] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11574
[14:23:36] New review: Ottomata; "Ok, done. " [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11574
[14:24:13] New review: Ottomata; "Ha! 'git' has replaced the word 'get' in my vocabulary." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/11574
[14:27:20] New review: Ottomata; "Questions from Ben H:" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/11898
[14:35:23] New patchset: RobH; "added a few old dbs to decom list" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12005
[14:35:58] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12005
[14:36:57] New review: RobH; "simple decom additions" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12005
[14:37:01] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12005
[14:40:24] !log db14 is out of rotation, shutting down to make room for new es servers in rack
[14:40:29] Logged the message, RobH
[14:43:57] New patchset: Ottomata; "generic-definitions.pp - fixing generic::pythonpip so that pip can be found in /usr/local/bin (where it seems to be in Precise)." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12007
[14:44:25] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12007
[14:46:49] New patchset: Mark Bergsma; "Use consistent (AFI, SAFI) tuples for address families" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12008
[14:46:57] New patchset: Mark Bergsma; "Simplify NaiveBGPPeering, setAdvertisements takes a flat set again" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12009
[14:47:04] New patchset: Mark Bergsma; "Implement AttributeDict and FrozenAttributeDict" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12010
[14:47:09] New patchset: Mark Bergsma; "Split Attribute classes in immutable and mutable" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12011
[14:47:13] New patchset: Mark Bergsma; "Convert all usage of AttributeSet to AttributeDict" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12012
[14:47:14] New patchset: Mark Bergsma; "Fix runtime errors to make inet6 MP announcements work" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12013
[14:47:22] New patchset: Ottomata; "/var/run has been moved to /run in Ubuntu Precise. Updating generic::mysql::server accordingly." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11296
[14:47:52] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11296
[14:48:11] New review: Ottomata; "Ok, done. Not sure if you wanted this global in base.pp, or in a class. I left it global, so that ..." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/11296
[14:50:44] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/11296
[14:50:47] !log updating dns
[14:50:51] Logged the message, RobH
[14:51:10] thanks mark, who else should look at that one?
[14:51:19] asher [14:51:46] ok, thanks, I will ping him when he's online later [14:52:01] this is a simple one, need it to fix a puppet rpoblem on stat1 since upgrade: [14:52:01] https://gerrit.wikimedia.org/r/#/c/12007/ [14:53:54] New review: Mark Bergsma; "I believe parameter defaults have some weird properties like applying for the entire manifest, or so..." [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/12007 [14:56:03] New patchset: Ottomata; "generic-definitions.pp - fixing generic::pythonpip so that pip can be found in /usr/local/bin (where it seems to be in Precise)." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12007 [14:56:27] mmmk, done that mark [14:56:34] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12007 [14:57:49] PROBLEM - Puppet freshness on searchidx2 is CRITICAL: Puppet has not run in the last 10 hours [15:03:24] robla: would you want to know the percentage of self reviewed gerrit patchsets? [15:04:35] ha, i would! [15:06:52] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12007 [15:06:55] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12007 [15:07:23] drdee: ooo, that would be nice! [15:07:33] gotta reboot now [15:07:46] roblaL you got it (assuming that gerrit allows this query :D) [15:10:26] New patchset: Jgreen; "remove jpostlethwait from aluminium/grosley" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12020 [15:10:56] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12020 [15:18:41] New review: Jgreen; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12020 [15:18:44] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12020 [15:20:53] New patchset: Platonides; "(bug 37700) - Change logo for stewardwiki to http://commons.wikimedia.org/wiki/File:Steward_wiki_logo_3.svg and favicon to meta one." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11943 [15:21:04] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11943 [15:42:07] heya mark, when you got a sec, could you give me a re-re-review of this one? [15:42:07] https://gerrit.wikimedia.org/r/#/c/11574/ [16:08:18] New patchset: Ryan Lane; "Merge remote-tracking branch 'origin/test' into production" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12021 [16:08:22] mark, paravoid: ^^ [16:08:40] I'm not sure if I should review that :P [16:08:43] ben needs to review that too [16:08:53] New review: Ryan Lane; "Things to be sure to review in this merge:" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/12021 [16:08:53] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12021 [16:08:54] at least mark and ben do [16:09:18] errr [16:09:20] mutante: also, are your planet changes ready to be put into the production branch? [16:09:22] There's no changes in that revision [16:09:26] there are [16:09:29] Reedy: it's a merge [16:09:39] you need to fetch it and do a diff locally [16:09:45] ohh [16:10:14] i'm not gonna review that [16:10:24] why? it's fairly small [16:10:29] New review: Ryan Lane; "If you'd like something removed, it needs to be changed in test, then I can re-do the merge." 
[operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/12021 [16:10:56] hmm ok [16:11:06] I've reviewed things, and just asked for reviews on specific parts [16:11:40] for instance, you can totally skip the openstack stuff ;) [16:12:50] basically we just need to ensure it isn't going to break things. if it's just bad code that isn't being included anywhere, we can merge it in, then fix it [16:13:26] New patchset: Mark Bergsma; "Fix MP attribute encoding, Attribute.fromTuple loop" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12022 [16:18:47] alright [16:32:41] New review: Alex Monk; "(no comment)" [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/11746 [16:32:50] cough cough [16:52:06] New patchset: Alex Monk; "Change some autopatrol-related MWW userrights" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11748 [16:52:16] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11748 [16:52:31] I hate these stupid commit guidelines [16:53:36] New patchset: Alex Monk; "Change some autopatrol-related MWW userrights" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11748 [16:53:43] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/11748 [16:54:22] New patchset: Mark Bergsma; "Handle multiple unknown attributes in a single UPDATE" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/12026 [17:11:46] maplebed: please review for swift changes: https://gerrit.wikimedia.org/r/12021 [17:11:53] you'll need to fetch the change and do a diff [17:12:01] since it's a merge [17:12:14] this is merging test into production [17:12:16] Ryan_Lane: can I get back to you in an hour or so? [17:12:21] I'm working with robh on hardware at the moment. 
[17:12:37] if your swift changes aren't ready to be merged, then remove them from the test branch [17:12:48] (and put them into a local repo using the puppetmaster::self stuff) [17:12:50] I think they are, but I can't review it to make sure right now. [17:12:56] that's fine [17:13:07] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100% [17:13:16] no huge rush, but I'd like to merge this fairly soon (like tomorrow or the next day) if possible [17:13:27] np. [17:24:39] New patchset: Pyoungmeister; "removing hardy-specific stuff for apaches, adding precise" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12027 [17:25:11] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12027 [17:25:20] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12027 [17:25:23] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12027 [17:27:22] RECOVERY - Puppet freshness on searchidx2 is OK: puppet ran at Tue Jun 19 17:27:14 UTC 2012 [17:30:16] so , can anyone think of a reason not to say "yes" to mozilla switchig to https search ? [17:30:23] yes [17:30:25] Increased load? [17:30:30] because it'll send anons to the site [17:30:37] we need to consider this before we say yes [17:30:49] *it'll send anons to the site over https, which we don't do by default [17:30:54] yeah, there will be increased load [17:31:06] Ryan_Lane: is that a bad thing ? or is that just also increased load [17:31:08] it'll likely be small, but it's not a good precedent to set [17:31:24] why not ? https-everywhere already sets it in a way... 
[17:31:26] I'd prefer we test load with logged in users before we test it with anything else [17:31:33] we can't stop that [17:32:21] anons get http, unless they take measures to not [17:32:38] let's keep it that way for now [17:32:46] we can always have them change it later, when we're ready [17:33:11] ok, do you want to reply to the thread on wikitech-l ? [17:33:18] yeah, I'll reply [17:46:07] RECOVERY - Host db1047 is UP: PING WARNING - Packet loss = 93%, RTA = 26.41 ms [17:46:18] New patchset: Catrope; "Move pgehres from restricted to mortals" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11979 [17:46:54] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/11979 [17:50:01] woosters: the masters that need upgrading are all suns that are getting physically decommissioned form db service and replaced with the new servers that just came in, and all have over a month of uptime. nothing is going to happen with them outside of that. but good job following up on stuff :) [17:50:21] binasher: speaking of that, your 5 new dbs are ready for you [17:50:30] and we are racking the other 10 today (racked arleady, chris is wiring) [17:50:44] RobH: and pc1 is racked too, right? [17:51:26] yep, those are ciscos, been ready since friday for ya =] [17:51:30] well, friday late evening [17:51:40] PROBLEM - MySQL Slave Delay on db1047 is CRITICAL: CRIT replication delay 77040 seconds [17:52:43] PROBLEM - MySQL Replication Heartbeat on db1047 is CRITICAL: CRIT replication delay 76988 seconds [17:52:43] RobH: does it need adding to dns, pxe, etc? [17:53:15] yep [17:53:26] it is just setup and racked with mgmt, no vlan allocation yet [17:53:34] as I had no idea what ip and such you wanted, i assume internal [17:53:36] but not certain [17:53:47] so the network ports need vlan set, dns for ips, dhcp, etc. [17:54:07] make pretend s/pc/db [17:59:12] RobH: should i open a ticket for the vlan / ip alloc for pc1? 
[18:02:01] binasher: Nah, either myself or chris will by end of today, promise [18:02:06] we have to get the port # for ya [18:02:17] binasher: internal vlan, internal ip right? [18:02:20] like the databases? [18:06:26] !log failed out ms-be5 after failed ssd test [18:06:31] Logged the message, Master [18:07:29] New patchset: Bhartshorne; "pushing new swift ring files after failing out ms-be5" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12033 [18:08:33] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12033 [18:08:47] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12033 [18:09:23] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12033 [18:31:28] Once a request is in RT, how do you get it prioritized? [18:31:34] I currently have 2 requests in RT, one is important, one isn't, but they are both set to priority 50. [18:32:45] um, i think asking is the best way if you don't have an rt account [18:32:56] bug CT? [18:32:57] ;) [18:33:27] The priority is not really addressed [18:33:37] kaldari: indeed, ask CT and he will triage it [18:33:47] or ask in here and hope someone finds it interesting ;] [18:34:02] heh, OK [18:34:19] or super quick [18:38:40] James A was hoping to get https://rt.wikimedia.org/Ticket/Display.html?id=3086 resolved soon so he could implement the WikimediaShopLink extension with it. It's not critical, but since he's hoping to push that out in the next couple weeks, it would be timely to do now. Otherwise, he'll have to implement it with a '$( document ).ready( function() {', which will be annoying as it will move most of the sidebar links after they have rendered. 
[18:39:29] and much community complaining will ensue [18:39:35] ah [18:40:28] it's kind of hard to explain how that ticket is related, but it's a long dependancy chain starting with that ticket [18:40:35] hehe ok [18:40:52] i think i can do that, pending review from binasher (as the person who actually knows varnish well) [18:42:17] it sounds like it wouldn't require too much mucking, hopefully [18:43:19] this will also allow us to get rid of the annoying 'banner bump' from CentralNotice [18:43:35] Or bribe [18:43:40] Bribes work well I find [18:43:49] I'm not above bribes! [18:44:03] just let me know what the going rate is :) [18:44:17] New patchset: Lcarr; "Allowing bits.wikimedia.org/geoiplookup to work" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12038 [18:44:45] binasher: can you look at https://gerrit.wikimedia.org/r/12038 to tell me if it actually solves the issues in https://rt.wikimedia.org/Ticket/Display.html?id=3086 [18:44:53] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12038 [18:46:54] LeslieCarr: Thanks! Hope that works [18:52:19] LeslieCarr: not quite, i'll show you what it should be when done w/terry [18:53:16] cool thanks [19:00:36] hey binasher [19:00:38] could you review this for me? 
[19:00:39] https://gerrit.wikimedia.org/r/#/c/11296/ [19:00:45] mark already checked it, and he said you should look at it too [19:06:01] Change abandoned: Matthias Mullie; "other solution" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/11866 [19:09:52] New patchset: MaxSem; "Fix foundationwiki's mobile logo" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12041 [19:09:58] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12041 [19:10:34] New patchset: Matthias Mullie; "enable AFTv4 on testwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12042 [19:10:40] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12042 [19:11:06] New review: Catrope; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12042 [19:11:08] Change merged: Catrope; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12042 [19:12:27] New patchset: MaxSem; "$wgMobileResourceVersion does not exist anymore" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12043 [19:12:33] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12043 [19:16:37] ottomata: i'll be done with meetings around 1:30, will review then [19:19:00] ok thanks [19:20:31] New patchset: Lcarr; "Allowing bits.wikimedia.org/geoiplookup to work" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12038 [19:20:39] binasher: that look better ? [19:21:06] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12038 [19:24:04] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.17 ms [19:27:04] PROBLEM - swift-object-auditor on ms-be5 is CRITICAL: Connection refused by host [19:27:04] PROBLEM - swift-account-auditor on ms-be5 is CRITICAL: Connection refused by host [19:27:13] PROBLEM - swift-container-auditor on ms-be5 is CRITICAL: Connection refused by host [19:27:22] PROBLEM - swift-object-updater on ms-be5 is CRITICAL: Connection refused by host [19:27:31] PROBLEM - swift-container-replicator on ms-be5 is CRITICAL: Connection refused by host [19:27:31] PROBLEM - swift-container-server on ms-be5 is CRITICAL: Connection refused by host [19:27:31] PROBLEM - swift-container-updater on ms-be5 is CRITICAL: Connection refused by host [19:27:50] PROBLEM - SSH on ms-be5 is CRITICAL: Connection refused [19:27:50] PROBLEM - swift-account-reaper on ms-be5 is CRITICAL: Connection refused by host [19:28:07] PROBLEM - swift-object-server on ms-be5 is CRITICAL: Connection refused by host [19:28:07] PROBLEM - swift-object-replicator on ms-be5 is CRITICAL: Connection refused by host [19:28:07] PROBLEM - swift-account-replicator on ms-be5 is CRITICAL: Connection refused by host [19:28:16] PROBLEM - swift-account-server on ms-be5 is CRITICAL: Connection refused by host [19:46:07] PROBLEM - NTP on ms-be5 is CRITICAL: NTP CRITICAL: No response from NTP server [19:55:30] so... [19:55:34] adding a fact to all hosts [19:55:38] anyone see a problem with that? :) [19:56:17] mutante: around? [19:56:31] probly not [19:56:38] I'm sleepy myself [20:00:49] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:08:55] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [20:09:56] what fact? (out of curiosity) [20:10:15] (and yeah I'm not really here but as long as I was peeking in...) 
[20:11:42] a custom one for package updates [20:12:09] what will it hold? [20:13:51] an XML of the apt state of package update [20:15:35] generated from polling dpkg or something? [20:16:58] apt [20:18:08] seems legit to me [20:19:06] cool [20:38:48] paravoid: http://www.openstack.org/blog/2012/06/openstack-summit-coming-october-15th-19th-to-san-diego-ca/?awesm=awe.sm_fmwJ [20:38:59] again?! [20:39:07] every 6 months [20:39:14] wow [20:39:20] it's a design summit for the next release [20:39:24] so, every release has one [20:39:38] well, considering how I'll be in SF at mid-September [20:39:47] * Ryan_Lane nods [20:39:59] and how I don't want to spend a month there again (and even if I wanted, I don't think the foundation would pay for that) [20:40:11] andrew is really the one that needs to attend the most [20:40:29] and the fact that I also don't want to crossover the atlantic for the 4th time in 6 months [20:40:33] heh [20:40:35] yeah [20:40:46] so [20:40:50] no worries [20:40:56] ssh -L 8000:localhost:8000 sockpuppet [20:41:05] then fire up your browser on http://localhost:8000/ [20:41:15] andrew and I can hold it down [20:42:34] hm [20:42:38] hm? [20:42:41] socks proxy didn't work [20:42:55] no need for socks, just do a simple port redirection [20:43:11] but I have foxyproxy :D [20:45:25] why 8000? [20:45:33] what's running on 80/443? [20:45:44] nothing, I haven't setup Apache yet [20:45:47] ah [20:45:48] heh [20:45:51] since I'm doing code modifications still [20:45:56] I've done something like 30 commits so far [20:46:00] * Ryan_Lane nods [20:46:07] well, I guess I'm done [20:46:15] I need to deploy the fact, as to have package updates [20:46:24] (and hope that puppetmaster/db9 won't melt) [20:46:24] * Ryan_Lane nods [20:46:44] and the cronjob for parsing those [20:46:55] binasher: does this look correct ? 
https://gerrit.wikimedia.org/r/#/c/12038/ [20:47:19] I should write a script to populate labsconsole's server pages with facter info [20:47:43] what do you think? [20:47:52] the package stuff are not there yet obviously [20:48:29] Ryan_Lane: hey, so i emailed you but i don't think you'll get it any time soon… basically the labs-ns0 monitoring attempts are killing spence and neon (spence by not allowing it to puppet update, neon by the script inputting the bad data and therefore it is not successful in its attempt to start up icinga). I believe this is due to the fact that there are two instances with the same name but different ip sets, which chokes up nagios [20:48:35] (oh btw, one of the things that I added is support for flagging security updates, that I began while at SF per your request) [20:48:59] LeslieCarr: two instances? [20:49:22] paravoid: looks good [20:49:38] so in manifest/role/dns.pp -- there's two things named labs-ns0.wikimedia.org [20:49:53] the fact query page is really interesting [20:49:56] and in manifest/dns.pp line 36, it converts that into a host [20:50:10] e.g. you can pick all hosts and kernrelease, uptime_days [20:50:16] get the output and sort by uptime_days [20:50:20] to check for the uptime bug [20:50:39] also, the inventory thing is nice imho [20:51:57] LeslieCarr: oh, there's something else wrong with it [20:51:58] paravoid: huh, interesting. the SOA name for each of the production nameservers is different [20:52:05] soa_name [20:52:12] LeslieCarr: the request won't get passed like 9 [20:52:16] *line [20:52:29] Ryan_Lane: ? [20:52:52] looks the same here [20:52:58] in puppet [20:53:18] but that's default-soa-name isn't it? [20:53:21] LeslieCarr: also, you'd want "req.url = "/geoiplookup" [20:53:22] yeah [20:53:22] if we move line 9 and 10 down in puppet will they be evaluated later ? 
[20:53:26] which is only read when the backend doesn't supply it [20:53:31] * Ryan_Lane nods [20:53:32] which in this case it does [20:53:57] it's breaking the monitoring. I'll use the hostname rather than the soa_name for the monitor check [20:54:39] cool [20:54:41] ugh [20:54:46] it's using it for monitor_service too [20:55:02] I guess I can parameterize the role [20:55:55] oh. wait. no. I can just change the stupid soa_name for eqiad [20:56:07] durr [20:56:55] New patchset: Ryan Lane; "Change soa_name for labs eqiad NS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12106 [20:57:28] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12106 [20:58:12] New patchset: Lcarr; "Allowing bits.wikimedia.org/geoiplookup to work" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12038 [20:58:16] Ryan_Lane: can you make a repo for me? [20:58:26] preilly: one of the devs can [20:58:26] binasher: ^^ [20:58:28] it's 11pm [20:58:35] Ryan_Lane: it's for operations/debs/squid [20:58:43] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12038 [20:58:46] ok. gimme a sec [20:59:01] LeslieCarr: I thought you said you commented out the check [20:59:03] err [20:59:05] monitor [20:59:09] you mean locally? [20:59:27] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12038 [20:59:30] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12038 [20:59:31] Ryan_Lane: I was going to but hadn't yet [21:00:23] I don't see my change... [21:00:28] oh [21:00:29] didn't merge it [21:00:38] wtf did I just merge? [21:00:49] crap [21:00:54] I just merged your geoip change [21:01:14] asking asher to check out that change [21:01:29] to see if we need to revert or not [21:01:46] LeslieCarr: was that change ready to be merged? 
[21:01:50] not yet [21:02:09] LeslieCarr: was that change ready to be merged? [21:02:09] Ryan_Lane: welcome back! [21:02:10] not yet ready to be merged [21:02:18] lemme revert, then [21:02:28] New patchset: Ryan Lane; "Revert "Allowing bits.wikimedia.org/geoiplookup to work"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12107 [21:02:57] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12107 [21:02:57] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12107 [21:02:58] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12106 [21:03:16] LeslieCarr: for line 11, i would prefer: if (req.url == "/geoiplookup") if there will never be arguments, the regex eval is ever so slightly more expensive and bits gets 50k reqs/sec but other than, it looks good [21:03:17] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12106 [21:03:32] ok, i'll change that [21:03:43] if there will ever be arguments passed to geoiplookup, it has to be like it is now [21:04:50] New patchset: Lcarr; "Changing varnish config to allow bits.wikimedia.org/geoiplookup to work" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12109 [21:05:13] but as of now since we're not doing that, exact match is preferred ? [21:05:20] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12109 [21:05:45] New patchset: Ryan Lane; "Change soa_name for labs eqiad NS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12106 [21:05:46] really wish I had that rebase button in gerrit right about now [21:06:13] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12106 [21:06:13] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12106 [21:06:27] LeslieCarr: ok. added fix for moniroting [21:06:29] monitoring [21:06:37] thanks Ryan_Lane [21:06:58] binasher: can you double check https://gerrit.wikimedia.org/r/#/c/12109/ ? [21:07:20] !log deployed a hacked up exim conf on sodium to block a mail ddos, puppet disabled there too [21:07:25] Logged the message, Master [21:07:48] Jeff_Green: don't forget to find and turn off the cron job that occasionally restarts puppet [21:07:53] thanks, i did [21:08:44] cool [21:10:52] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours [21:29:59] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/12109 [21:31:25] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12109 [21:31:29] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12109 [21:36:22] RECOVERY - Puppet freshness on spence is OK: puppet ran at Tue Jun 19 21:36:15 UTC 2012 [21:36:46] binasher: https://gerrit.wikimedia.org/r/#/c/11919/ [21:39:26] binasher: btw, $wgDefaultExternalStore is an array that can have several stores [21:39:47] ES will randomly pick one and save it there, unless that fails, and then it will try another [21:40:15] I recall a time ages ago when there were always 2 items in there [21:40:51] since mail is broken, i give you RT ticket . . . 
[21:40:59] #3150: redo sodium's mail configuration not to be a giant incoming spam vector [21:43:41] kaldari: http://bits.wikimedia.org/geoiplookup [21:43:53] woo-hoo!!! [21:44:00] thanks!! [21:44:35] yw [21:45:07] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/11919 [21:45:11] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/11919 [21:46:01] AaronSchulz: thanks for checking out the ES write balancing.. i should request a second new cluster for writes [21:47:24] New patchset: Sara; "Apply many ganglia changes from test branch to production branch: apply all gmond changes and add (but do not yet reference) new files and templates used for gmetad and the webfrontend." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12111 [21:47:56] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12111 [21:48:19] New review: Lcarr; "I would prefer for this change to use $::lsbdistrelease >= 10.04 (we have a few old hardy machines ..." [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/11299 [22:16:16] New review: Sara; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12111 [22:16:19] Change merged: Sara; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12111 [22:19:49] New patchset: Faidon; "Add apt2xml & apt fact to handle package updates" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12114 [22:20:22] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12114 [22:20:57] New patchset: Lcarr; "fixing monitoring for wikinews-lb.esams.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12115 [22:21:30] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12114 [22:21:30] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12114 [22:21:31] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12115 [22:21:35] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12115 [22:21:37] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12115 [22:21:59] LeslieCarr: should I merge? [22:22:08] yes please [22:23:02] . [22:23:38] I hope I didn't just kill puppet :-) [22:24:14] I added a fact that tends to be big (well, depending on how many updates we have, but considering we have hosts with > 100 updates pending, "big") [22:24:35] if you see anything weird, revert and ensure => absent apt.rb [22:24:53] (I'll be around for the next half-hour or so anyway, just saying) [22:24:53] ok [22:24:55] :) [22:25:12] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/11296 [22:30:53] heh. sara is merging in her changes from test to production for ganglia [22:30:58] that makes that part of the merge easier [22:31:43] Ryan_Lane: I was planning on doing it this week anyway. [22:31:50] cool [22:31:57] today I merged production into test [22:32:13] soon I'll be merging test into production and killing off the test branch [22:32:20] I should write a labs-l post [22:33:28] you should. ideally, i'd prefer to have production and test branches. but only if they're (nearly) in sync. having several months of drift is not very useful. 
[22:33:47] yes [22:33:49] it's horrible [22:33:52] and now we have this: https://labsconsole.wikimedia.org/wiki/Help:SelfHostedPuppet [22:34:07] so, it's not necessary to have separate branches [22:37:39] i'd still like the ability to have checked in configs (and packages) that get deployed to some hosts but not all. i guess i can accomplish some of that with the puppet manifest logic. [22:38:06] you can do that with puppetmaster::self in a project [22:38:47] you can also use a remote branch, and have the instances in a project use that branch [22:39:36] oh [22:39:43] we can do per-project debs now too! [22:39:56] maplebed: do you mind sending a post to labs-l about that? [22:40:05] awesome! [22:40:16] ssmollett: https://labsconsole.wikimedia.org/wiki/Help:Using_debs_in_labs#When_installing_a_non-standard_package [22:41:35] New review: preilly; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12041 [22:41:37] Change merged: preilly; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12041 [22:44:58] New review: Ryan Lane; "Here's the diff:" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/12021 [22:45:38] bah [22:45:58] New review: Ryan Lane; "https://gerrit.wikimedia.org/r/gitweb?p=operations/puppet.git;a=commitdiff;h=a7f05f30b54e60205e6445c..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/12021 [22:46:13] seriously gerrit? you're fucking up my link that badly? [22:46:15] New patchset: Lcarr; "trying to remove incorrectwikinews-lb.wikimedia.org monitors" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12120 [22:46:47] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12120 [22:46:47] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12120 [22:47:02] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12120 [22:47:15] New review: Ryan Lane; "[https://gerrit.wikimedia.org/r/gitweb?p=operations/puppet.git;a=commitdiff;h=a7f05f30b54e60205e6445..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/12021 [22:47:29] even the fucking markdown syntax doesn't work? [22:49:49] RECOVERY - MySQL Slave Delay on db1047 is OK: OK replication delay 1 seconds [22:49:58] RECOVERY - MySQL Replication Heartbeat on db1047 is OK: OK replication delay 0 seconds [22:58:09] yay! [22:58:14] things are filling up in servermon [22:58:15] how cool :) [22:58:31] also runs with apache on port 80 [22:58:36] anyone else want to have a look at a new tool? [22:59:00] \o/ [23:02:35] paravoid: heh. security updates and packages lead to the same url [23:02:40] I know [23:03:02] I'm not really sure if I should make it separate pages tbh [23:03:24] this is really awesome, though [23:03:24] try clicking on hooft [23:03:33] and look for the padlock [23:03:45] I don't see hooft [23:03:59] on the host list? [23:04:02] btw, is it possible to make this app not take /? [23:04:12] certainly is [23:04:29] there's always the possibility we'd want another app on here :) [23:04:31] well, *probably* is :) [23:04:35] heh [23:05:02] did you find hooft? 
[23:05:07] New patchset: MaxSem; "Revert "Fix foundationwiki's mobile logo": we need plan B:)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12121 [23:05:14] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/12121 [23:05:53] yeah [23:05:55] New review: preilly; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12121 [23:05:57] Change merged: preilly; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12121 [23:20:46] ssmollett: are the ganglia graphs looking weird to you? [23:20:59] it looks like a bunch of gmonds were just restarted. [23:21:22] (except for the application servers pmtpa, which just look horrible) [23:21:23] maplebed: i think that must have been on the last puppet run when i merged changes. [23:22:03] hm. something happened last thurs or fri that broke the app servers cluster reporting. [23:22:13] ssmollett: cool. nice to have a reason. [23:22:22] Ryan_Lane: yeah, I'll send an email about the packaging stuff. [23:22:38] well, lists is down right now [23:22:42] might want to wait a little bit [23:22:46] it won't just queue? [23:22:56] ok. I need to update the docs a little anyways. [23:23:08] no. it's actively rejecting right now [23:23:43] bummer. [23:24:07] * maplebed redacts inappropriate comment about a certain internal list. [23:34:55] New patchset: Faidon; "exim4: add a defer_domains list" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12125 [23:35:26] New review: Faidon; "Tested on sodium, works." [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12125 [23:35:27] New patchset: preilly; "small fixes for redirector.c" [operations/debs/squid] (master) - https://gerrit.wikimedia.org/r/12126 [23:35:27] New review: gerrit2; "Lint check passed."
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12125 [23:36:00] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12125 [23:37:52] !log temporarily adding wikimedia.org, wikipedia.org etc. to sodium's /etc/exim4/defer_domains [23:37:57] Logged the message, Master [23:39:27] the app server cluster looks like there are multiple data sources configured in gmetad.conf and some of them are reporting the wrong thing [23:39:46] you get an oscillation as it rotates through the various sources [23:39:52] New patchset: Ryan Lane; "Ensure jobs are run on labsconsole." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12127 [23:39:52] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/12127 [23:39:56] New patchset: preilly; "small fixes for redirector.c" [operations/debs/squid] (master) - https://gerrit.wikimedia.org/r/12126 [23:40:26] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/12127 [23:40:28] New review: preilly; "(no comment)" [operations/debs/squid] (master) C: 1; - https://gerrit.wikimedia.org/r/12126 [23:40:29] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12127 [23:41:48] ssmollett: ^^ [23:42:53] Ryan_Lane: go to bed [23:43:03] soon enough :) [23:43:48] TimStarling: Do you mean Application servers pmtpa? [23:44:51] yes [23:47:52] do you know when this started? [23:48:13] (or what should i be looking at to see what you're seeing?) 
[23:48:42] New patchset: preilly; "Add a .gitreview file" [operations/debs/squid] (master) - https://gerrit.wikimedia.org/r/12129 [23:49:01] New review: preilly; "(no comment)" [operations/debs/squid] (master) C: 1; - https://gerrit.wikimedia.org/r/12129 [23:49:02] I don't know if it's the problem, it's just that I have seen such problems in ganglia before and the graphs looked the same [23:49:55] you can see when it started on the yearly graph [23:50:33] there was a drop in CPU count in mid-to-late May