[00:51:35] New patchset: Bhartshorne; "grrr... missing definition" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2421 [00:51:55] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2421 [00:51:55] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2421 [00:58:09] grumblegrumblegrumble. I killed ganglia swift stats for about 90min. :( [01:19:22] well, at least the graph now shows 304s in addition to the other response codes. [01:19:43] I really want to see 404s drop and 200s rise. [01:19:45] ::sigh:: [01:37:44] New patchset: Bhartshorne; "taking owa out of the prod swift cluster until we need them." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2422 [01:38:03] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2422 [01:38:04] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2422 [02:02:58] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 607s [02:44:12] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 20s [03:32:32] RECOVERY - Puppet freshness on srv226 is OK: puppet ran at Thu Feb 9 03:32:22 UTC 2012 [07:57:14] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours [07:57:14] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours [09:29:30] !log Reactivated term selected-paths in policy-statement BGP_transit_in on cr2-eqiad, making path 14907 3257 1299 43821 active again [09:29:32] Logged the message, Master [10:02:26] New patchset: Mark Bergsma; "Generate varnish node entries from a function" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2423 [10:02:46] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2423 [10:02:53] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2423 [10:02:53] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2423 [10:09:28] New patchset: Mark Bergsma; "Formatting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2424 [10:09:49] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2424 [10:09:50] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2424 [10:09:50] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2424 [10:12:00] New patchset: Mark Bergsma; "Formatting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2425 [10:12:19] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2425 [10:12:19] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2425 [10:12:20] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2425 [10:16:59] New patchset: Mark Bergsma; "Try Array.inspect" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2426 [10:17:19] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2426 [10:17:30] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2426 [10:17:31] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2426 [10:21:01] New patchset: Mark Bergsma; "Formatting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2427 [10:21:20] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2427 [10:21:20] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2427 [10:21:21] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2427 [10:27:12] New patchset: Mark Bergsma; "Formatting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2428 [10:27:32] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2428 [10:27:43] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2428 [10:27:44] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2428 [10:29:26] New patchset: Mark Bergsma; "Formatting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2429 [10:29:45] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2429 [10:29:46] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2429 [10:29:46] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2429 [10:35:27] New patchset: Mark Bergsma; "Formatting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2430 [10:35:47] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2430 [10:35:47] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2430 [10:35:53] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2430 [10:35:54] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2430 [10:37:49] New patchset: Mark Bergsma; "Formatting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2431 [10:38:09] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2431 [10:38:09] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2431 [10:38:09] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2431 [10:39:16] New patchset: Mark Bergsma; "Formatting" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2432 [10:39:36] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2432 [10:39:36] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2432 [11:01:36] New patchset: Mark Bergsma; "Generate all varnish nodes in varnish.xml from a function" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2433 [11:01:55] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." 
[operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/2433 [11:03:38] New patchset: Mark Bergsma; "Generate all varnish nodes in varnish.xml from a function" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2433 [11:03:57] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2433 [11:04:07] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2433 [11:04:08] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2433 [11:08:37] New patchset: Mark Bergsma; "Corrections" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2434 [11:08:57] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2434 [11:09:05] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2434 [11:09:06] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2434 [11:21:12] New patchset: Mark Bergsma; "Make squid.xml generated by a template as well" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2435 [11:21:32] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2435 [11:21:51] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2435 [11:21:52] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2435 [11:29:11] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours [11:34:01] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [11:36:01] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours [11:50:23] !log Changed topology of eqiad text squids to request from pmtpa (similar to esams) [11:50:25] Logged the message, Master [11:56:01] RECOVERY - Host cp1017 is UP: PING OK - Packet loss = 0%, RTA = 26.42 ms [11:56:18] !log Power cycled cp1017 [11:56:20] Logged the message, Master [12:03:13] New patchset: Mark Bergsma; "Make sure Squid doesn't automatically start at boot" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2436 [12:03:42] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2436 [12:03:42] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2436 [12:07:30] New patchset: Mark Bergsma; "Using the 'enable' service resource type parameter is a better way" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2437 [12:07:51] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2437 [12:07:57] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2437 [12:07:58] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2437 [12:30:27] New patchset: Mark Bergsma; "Automatically generate cachemgr.conf content from active_nodes Puppet list" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2438 [12:30:49] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2438 [12:30:58] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2438 [12:30:59] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2438 [12:35:35] New patchset: Mark Bergsma; "Sort lists" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2439 [12:35:55] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2439 [12:36:05] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2439 [12:36:05] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2439 [13:15:43] PROBLEM - Host cp1017 is DOWN: PING CRITICAL - Packet loss = 100% [13:28:48] !log Redirecting mediawiki-lb.pmtpa traffic to mediawiki-lb.eqiad (geodns) [13:28:50] Logged the message, Master [13:35:32] wow, mediawiki.org servers 30 Mbps of traffic apparently ;) [13:38:26] :O [13:38:55] What's the aggregate traffic of all of our domains? That's measures in Gbps, right? 
:) [13:38:58] *measured [13:44:16] RoanKattouw: around 12 Gbps [13:45:53] !log Redirecting wikimedia-lb.pmtpa traffic to wikimedia-lb.eqiad (geodns) [13:45:54] Logged the message, Master [13:45:57] there goes commons [13:46:40] This is LVS-eqiad --> Squid-eqiad --> Apache-pmtpa? Or does it go to the pmpta Squids? [13:46:54] for the next few days, squid-eqiad -> squid-pmtpa [13:46:57] just to fill the cache [13:46:59] Ah, of course [14:02:59] PROBLEM - Auth DNS on ns1.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [14:11:26] !log Redirecting {foundation,wikibooks,wikinews}-lb.pmtpa traffic to .eqiad (geodns) [14:11:27] Logged the message, Master [14:14:45] New patchset: Mark Bergsma; "Update creator-info" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2440 [14:15:05] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2440 [14:15:09] RECOVERY - Auth DNS on ns1.wikimedia.org is OK: DNS OK: 0.028 seconds response time. www.wikipedia.org returns 208.80.152.201 [14:15:49] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2440 [14:15:49] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2440 [14:23:24] !log Redirecting {wikiquote,wikisource,wikiversity,wiktionary}-lb.pmtpa traffic to .eqiad (geodns) [14:23:26] Logged the message, Master [14:39:37] mark: what's good to pick up in amsterdam? [14:39:39] New patchset: Mark Bergsma; "Improve total requests graph" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2442 [14:39:51] and you are missing from a channel. heh [14:40:00] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2442 [14:40:05] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2442 [14:40:05] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2442 [14:40:43] pot? [14:40:50] ummmm [14:40:51] no [14:40:59] stroopwafels? gouda cheese? [14:41:01] I dunno ;) [14:41:04] also, this is the -ops channel :D [14:41:09] so? [14:41:23] heh [14:45:14] New patchset: Mark Bergsma; "Apparently AREA didn't work, despite torrus docs suggestion" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2443 [14:45:50] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2443 [14:45:50] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2443 [14:51:45] where in the world is Ryan_Lane? :-) [14:52:22] (you know, when people say "there goes commons" it usually makes me really nervous, mark :-P) [14:52:45] apergos: heh. I'm in amsterdam right now. [14:53:12] how is it? I am guessing: cold [14:53:46] Not as cold as it was a few days ago probably [14:53:51] pretty cold. 
not as cold as a few days ago [14:54:04] I actually had to delayer a little [14:54:07] When I drove down there for my visa interview (Tuesday 6am) it was -13.5 C when I left the house [14:54:28] Today it's supposed to get up to zero [14:54:34] woo hoo [14:54:39] time to break out the shorts [14:54:50] haha [14:55:11] Well hold your horses, the forecast for Saturday night/morning is -11 [14:55:22] * Ryan_Lane groans [14:55:33] so, I see that some of the time I am at 100% iutilization of my lvm volume on dataset1001 [14:55:45] wondering what I can do about it [14:55:45] New patchset: Mark Bergsma; "Change eqiad colors for clarity" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2444 [14:56:06] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2444 [14:56:17] there goes apergos [14:56:27] that make ne even *more* nervous :-P [14:56:45] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2444 [14:56:46] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2444 [15:07:27] mutante: I think you can clause RT 2059 'deploy testswarm on gallium' https://rt.wikimedia.org/Ticket/Display.html?id=2059 :) [15:08:24] hashar: thanks, done [15:13:01] !log Sending Canadian wikipedia traffic to wikipedia-lb.eqiad [15:13:02] Logged the message, Master [15:22:15] New patchset: Hashar; "rt: force HTTPS protocol" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2446 [15:22:19] git review for the win! [15:22:34] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2446 [15:26:44] New review: Dzahn; "aah, i'd love to see that work, but please read through RT 714 for possible problems and why this ha..." 
[operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2446 [15:33:21] New patchset: Mark Bergsma; "Add aggregate stats for mobile caches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2447 [15:33:41] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2447 [15:34:50] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2447 [15:34:51] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2447 [15:49:14] !log Sending some Asian wikipedia traffic to wikipedia-lb.eqiad [15:49:15] Logged the message, Master [16:03:27] mark: were we able to send specific sites different places before we made the -lb addresses? [16:03:46] not specific projects, no [16:03:57] so now I first did mediawiki.org [16:04:04] then wikimedia.org (commons and stuff) [16:04:16] then the non-wikipedia projects (very little traffic) [16:04:24] and now specific countries wikipedia traffic [16:04:27] and soon all wikipedia traffic [16:04:47] do we have apaches running in eqiad yet? [16:04:48] the ratios between projects are easy to see on the lvs servers [16:04:50] no [16:05:03] so the squids there are talking to apaches in tampa? [16:05:14] to squids in tampa [16:05:23] ah, just like esams [16:05:26] yes [16:05:31] ok [16:05:31] in a few days they can talk to apaches directly [16:05:35] grea [16:05:36] first need to seed the caches [16:05:36] t [16:05:41] right [16:06:03] apaches in eqiad... will write to masters in tampa? 
[16:06:11] no [16:06:28] they will be inactive if the master is in tampa [16:06:44] ok [16:08:06] eqiad text squid hit ratio is at 74% now [16:09:05] so only a quarter of requests goes on to the pmtpa squids already [16:09:20] morning ct [16:09:32] hi morning [16:09:53] i see traffic thru squid@eqiad :-) [16:09:56] yes [16:10:15] slowly ramping up traffic [16:10:17] from exam? [16:10:21] no [16:10:22] esam [16:10:30] from asia, canada [16:10:39] and all non-wikipedia projects are served in full [16:10:55] but it's clients -> eqiad -> pmtpa squids -> pmtpa apaches [16:10:56] nice [16:11:04] in a few days they can talk to pmtpa apaches directly [16:11:38] they're doing ~ 4000 req/s now [16:11:47] 75% hit rate [16:12:36] hardly breaking into a sweat … ;-) [16:13:19] yeah not at all [16:13:58] I fixed up http://torrus.wikimedia.org/torrus/CDN?path=%2FTotals%2FAll_client_requests again [16:14:02] should be mostly correct now [16:14:39] i like it very much [16:14:51] it needs a new color scheme now [16:14:53] but i'll fix that later ;) [16:15:02] hopefully with manutius, torrus will not break so often [16:16:59] probably will [16:17:03] the software isn't fixed [16:17:10] !log Added Brazil traffic to eqiad text squids [16:17:12] Logged the message, Master [16:17:22] nearly 200 million people [16:18:49] toad went up a sliver [16:18:55] just wait a minute [16:19:13] s/t/l [16:19:18] it takes 10-15 minutes for all traffic to shift [16:21:45] i want to have most big languages somewhat cached by eqiad before I put all traffic on it [16:21:55] english, japanese, spanish, etc [16:22:51] makes sense [16:26:05] hit ratio gradually climbing [16:26:10] it's looking good [16:30:48] I think it's time to put all traffic on [16:30:58] yea, can see the traffic going up on ganglia [16:31:46] !log Sending ALL non-european wikipedia traffic to eqiad text squids [16:31:48] Logged the message, Master [16:31:50] there we go [16:34:10] aww, i cannot see any lights flashing on the camera feeds 
[16:34:26] i can see the lcds for the cp servers, but nothin else [16:35:03] soon, wikipedia should be pretty fast for you ;) [16:35:15] perhaps next week i'll do varnish for upload as well [16:35:27] i am a logged in user ;] [16:36:14] join the Anonymous [16:36:18] they are legion! :-D [16:45:23] PROBLEM - check_nginx on payments4 is CRITICAL: PROCS CRITICAL: 0 processes with command name nginx [16:50:23] PROBLEM - check_nginx on payments4 is CRITICAL: PROCS CRITICAL: 0 processes with command name nginx [16:55:23] RECOVERY - check_nginx on payments4 is OK: PROCS OK: 50 processes with command name nginx [16:57:41] New patchset: Andre Engels; "More on my own version of the pipelines (simplepipelines.py), new class MultiVariable, new file selectors.py with standard selectors of which log lines to include. apireturn.py, containing some code that I have used for analyzing the return MIME types of " [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2452 [17:06:49] New patchset: Mark Bergsma; "Change RRD RRA sizes for torrus usage" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2453 [17:07:11] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2453 [17:07:21] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2453 [17:07:21] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2453 [17:15:25] I'm going to bump swift traffic to 100% for thumbnails in 15 minutes. [17:15:30] nice [17:16:28] apergos: were you around for the squid torrus graphs binasher linked last night? 
[17:16:37] hmm I don't remember [17:16:38] http://torrus.wikimedia.org/torrus/CDN?path=%2FSquids%2Fpmtpa%2Fupload%2FTotals%2FService_times [17:16:43] http://torrus.wikimedia.org/torrus/CDN?path=%2FSquids%2Fpmtpa%2Fupload%2Fsq82.wikimedia.org%2Fbackend%2FService_times%2FHTTP_service_times [17:16:46] http://torrus.wikimedia.org/torrus/CDN?path=%2FSquids%2Fpmtpa%2Fupload%2Fsq82.wikimedia.org%2Fbackend%2FPerformance%2FHit_ratios [17:17:01] my mind is full of a haze of misbehaving rsyncs and misbehaving politicians from then [17:17:15] yeah I noticed that [17:17:17] what happens when you rsync a misbehaving politician? [17:17:20] I think not, I didn't clock through to any torrus graphs [17:17:37] you get a lot of packet loss :-P [17:17:45] and eventually rsync gives up! [17:18:10] I have one guess about why it's so spikey on misses. [17:18:30] most of the requests to swift are 404s so it has to go back to ms5 and write out the thumbnail [17:18:53] and the ms servers and ms-be are pretty disproportionately loaded [17:18:59] ah [17:19:07] and swift waits for two confirmed writes before it returns its success on a put [17:19:26] hm [17:19:44] sadly I don't have latency numbers for any of teh backends, only load and cpu/iowait etc. [17:20:04] but it's a theory. [17:20:23] New patchset: Andre Engels; "Typos in my previous commit." [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2456 [17:20:27] the other theory says that when we have fewer 404s and more 200s, the overall squid stats will look nicer even if the 404s still look shitty. [17:30:21] deploying 100% to sq86 to test [17:31:03] tests pass [17:31:22] !log deployed squid config to upload squids sending 100% of all thumbnail traffic to swift [17:31:23] Logged the message, Master [17:31:28] \o/ [17:31:37] w00t! 
[17:34:13] PROBLEM - Mobile WAP site on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:44:43] We're deploying things fast these days aren't we [17:44:45] I believe most of the load left on ms5 is actually NFS ;) [17:45:25] what, it took a full 4 days just to enable swift! that's fast? [17:45:46] I mean we're doing most of the Swift switchover and the eqiad Squid switchover on the same day [17:46:02] i'll setup upload varnish in eqiad next week [17:46:03] And most likely we'll have 1.19 running on at least some wikis next week [17:46:06] could do it tomorrow, but... [17:46:21] That sounds like a fast pace to me, and I like it :) [17:51:47] just checked a random upload squid: not a single request to ms5 [17:54:31] \o/ [17:55:01] we don't have ntop installed anywhere, do we? [17:55:11] it'd be nice to ask ms5 for a list of its top 10 network endpoints; [17:55:18] they should be ms-fe1 and 2 and the image scaling cluster [17:55:55] we have had in the past but I dunno any more [17:58:37] maplebed: no, but easy to check with a few accounting iptables rules :) [17:59:01] or a 1m tcpdump and wireshark's analysis. [18:00:21] or that [18:00:26] we used sflow/netflow in the past [18:00:31] but that's not setup anymore indeed [18:08:00] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours [18:08:00] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours [18:11:44] New patchset: Mark Bergsma; "Attempt to unbreak LVS in eqiad" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2459 [18:12:06] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2459 [18:12:12] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2459 [18:12:12] does anyone here have mad nginx skills? 
[18:12:12] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2459 [18:15:18] New patchset: Mark Bergsma; "Unbreak LVS by allocating eqiad swift svc ip" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2460 [18:15:38] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2460 [18:20:38] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2460 [18:20:38] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2460 [18:21:50] RECOVERY - Puppet freshness on lvs1006 is OK: puppet ran at Thu Feb 9 18:21:39 UTC 2012 [18:22:42] Jeff_Green: replied to your RT ticket anyway [18:23:54] oh interesting [18:24:11] according to nginx docs max-age=0 prevents caching [18:24:19] that's odd [18:24:20] RECOVERY - Puppet freshness on lvs1003 is OK: puppet ran at Thu Feb 9 18:23:50 UTC 2012 [18:24:27] squid should cache it anyway, that's how our main infra works [18:24:45] yes--that probably explains why squid does cache at least [18:25:33] max-age=0 should not prevent caching if s-maxage is set [18:25:48] Doing so would be a blatant violation of the HTTP spec, surely nginx isn't that broken [18:25:50] that's what I said in the rt ticket, yes ;) [18:27:09] hrm. so the observed nginx behavior was that it refuses to cache until i ignore Cache-Control [18:29:47] i suspect it's a bug actually . . . [18:33:25] Changes with nginx 0.8.20 "Bugfix: nginx did not treat a comma as separator in the "Cache-Control" backend response header line. [18:33:29] sigh. it is that broken. [18:34:06] haha [18:36:11] hell all arrows point to switching to a newer version--I'd also get the directives that would allow cache control based on a regex match in the query string [18:44:19] * mark food [18:49:56] now I want a tshirt that says "lolumad, Oligarch?" [18:50:12] best taunt ever. 
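[annotation] On the Cache-Control exchange above (18:24 to 18:33): the point that max-age=0 should not prevent caching when s-maxage is set can be illustrated with a minimal sketch of how a shared cache might pick a freshness lifetime. This is not nginx's or squid's actual code. The function names and the sample header values are hypothetical, and only the comma-separated directive syntax and the rule that s-maxage overrides max-age for shared caches are taken from the HTTP spec.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a dict of lowercase directives.

    Directives are comma-separated; a parser that does not split on commas
    would misread a combined value such as "s-maxage=3600, max-age=0".
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def shared_cache_lifetime(value):
    """Return the freshness lifetime in seconds as seen by a shared cache.

    Returns 0 when the response must not be served from a shared cache
    without revalidation, and None when the header gives no explicit lifetime.
    """
    directives = parse_cache_control(value)
    if "no-store" in directives or "private" in directives:
        return 0
    # For a shared cache, s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        arg = directives.get(name, "")
        if arg.isdigit():
            return int(arg)
    return None


if __name__ == "__main__":
    # Hypothetical header values, not copied from the channel or the RT ticket.
    print(shared_cache_lifetime("s-maxage=3600, max-age=0"))
    print(shared_cache_lifetime("max-age=0"))
```

With a header like the first sample, a browser (private cache) would revalidate on every request because of max-age=0, while a shared cache in front of the backend could still keep the object for the s-maxage period.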
[19:00:00] RECOVERY - Mobile WAP site on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 1642 bytes in 9.380 seconds [19:00:30] notpeter: I expect to see you wearing one of those next time you visit. [19:01:46] maplebed: I did get the "my marxist feminist dialectic brings all the boys to the yard" shirt recently [19:02:10] so, link me to a "lolumad, Oligarch?" shirt and I'd be game [19:02:42] er, I mean, I'm going to screen print my own on a shirt that I made from cotton that I grew on my roof. no buying. [19:03:12] * maplebed hands notpeter a cafepress account [19:30:03] PROBLEM - HTTP on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:33:53] PROBLEM - Mobile WAP site on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:38:36] maplebed / notpeter / binasher: can one of you copy some of the sampled1000 log files on emery to the analytics virtual instance in labs? [19:39:18] have one specifically in mind? [19:39:26] also what's the name / ip address of the labs instance? [19:40:24] 1 sec [19:40:31] 10.4.0.63 [19:40:49] pref about 30 files (1 month of data) [20:05:23] RECOVERY - HTTP on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 8.531 seconds [20:06:31] maplebed: when will the files be transferred? [20:07:23] sorry, I got torn away. [20:07:41] I won't be able to get to it till this afternoon. 
[20:09:23] okay :( [20:11:48] if you open an RT ticket it's more likely some other opsen will be able to do it (rather than pinging me directly) [20:13:17] don't know about that :) i've got a pretty long list of RT tickets [20:14:40] PROBLEM - MySQL Replication Heartbeat on db1043 is CRITICAL: CRIT replication delay 1253 seconds [20:16:20] PROBLEM - MySQL Replication Heartbeat on db1047 is CRITICAL: CRIT replication delay 1353 seconds [20:20:40] PROBLEM - MySQL Slave Delay on db1017 is CRITICAL: CRIT replication delay 1613 seconds [20:21:14] New patchset: Jgreen; "added community-analytics vhost for aluminium/grosley" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2464 [20:21:37] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2464 [20:22:00] PROBLEM - MySQL Replication Heartbeat on db1033 is CRITICAL: CRIT replication delay 1693 seconds [20:22:10] PROBLEM - MySQL Replication Heartbeat on db42 is CRITICAL: CRIT replication delay 1703 seconds [20:22:11] New review: Jgreen; "add, but don't yet enable apache virtual server definition" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2464 [20:22:11] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2464 [20:29:36] puppet oh puppet why are you evil [20:30:00] PROBLEM - MySQL Replication Heartbeat on db1017 is CRITICAL: CRIT replication delay 2173 seconds [20:31:17] New patchset: Jgreen; "removing references to puppet:///files/nagios/nrpe_local.fundraising.cfg which was removed" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2465 [20:31:38] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2465 [20:31:48] New review: Jgreen; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2465 [20:31:48] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2465 [20:35:00] New patchset: Demon; "Add option for skipping non-matches since writing extra rules is annoying" [operations/software] (master) - https://gerrit.wikimedia.org/r/2466 [20:35:01] New review: gerrit2; "Lint check passed." [operations/software] (master); V: 1 - https://gerrit.wikimedia.org/r/2466 [20:44:30] PROBLEM - HTTP on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:44:52] New review: Catrope; "(no comment)" [operations/software] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2466 [20:44:59] New review: Diederik; "(no comment)" [analytics/udp-filters] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2233 [20:45:21] New review: Diederik; "Ok." [analytics/udp-filters] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2182 [20:45:21] New review: Demon; "(no comment)" [operations/software] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2331 [20:45:21] Change merged: Diederik; [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/2233 [20:45:21] Change merged: Diederik; [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/2182 [20:45:22] Change merged: Demon; [operations/software] (master) - https://gerrit.wikimedia.org/r/2331 [20:45:41] New review: Diederik; "Ok." [analytics/udp-filters] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2232 [20:45:41] Change merged: Diederik; [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/2232 [20:46:08] New review: Diederik; "Ok." 
[analytics/udp-filters] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2234 [20:46:08] Change merged: Diederik; [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/2234 [20:46:25] New review: Diederik; "Ok." [analytics/udp-filters] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2235 [20:46:25] Change merged: Diederik; [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/2235 [20:49:29] New review: Demon; "(no comment)" [operations/software] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2332 [20:49:29] Change merged: Catrope; [operations/software] (master) - https://gerrit.wikimedia.org/r/2466 [20:49:29] Change merged: Demon; [operations/software] (master) - https://gerrit.wikimedia.org/r/2332 [20:56:00] RECOVERY - MySQL Slave Delay on db1017 is OK: OK replication delay 0 seconds [20:59:30] PROBLEM - MySQL Slave Delay on db1033 is CRITICAL: CRIT replication delay 701 seconds [21:01:20] PROBLEM - MySQL Slave Delay on db42 is CRITICAL: CRIT replication delay 1790 seconds [21:02:30] PROBLEM - MySQL Slave Delay on db1043 is CRITICAL: CRIT replication delay 880 seconds [21:04:40] RECOVERY - MySQL Replication Heartbeat on db1017 is OK: OK replication delay 0 seconds [21:04:50] PROBLEM - MySQL Slave Delay on db1047 is CRITICAL: CRIT replication delay 2000 seconds [21:18:30] PROBLEM - MySQL Replication Heartbeat on db1018 is CRITICAL: CRIT replication delay 691 seconds [21:22:40] PROBLEM - MySQL Replication Heartbeat on db1034 is CRITICAL: CRIT replication delay 941 seconds [21:23:12] mutante: hey, still around ? 
[21:24:00] PROBLEM - MySQL Slave Delay on db1034 is CRITICAL: CRIT replication delay 1021 seconds [21:24:10] PROBLEM - MySQL Replication Heartbeat on db1002 is CRITICAL: CRIT replication delay 1031 seconds [21:24:30] RECOVERY - MySQL Replication Heartbeat on db1043 is OK: OK replication delay 0 seconds [21:29:05] RECOVERY - MySQL Slave Delay on db1043 is OK: OK replication delay 0 seconds [21:29:45] between nagios and gerrit, this channel is useless. [21:37:15] RECOVERY - MySQL Slave Delay on db1033 is OK: OK replication delay 0 seconds [21:40:25] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours [21:40:35] PROBLEM - MySQL Slave Delay on db1002 is CRITICAL: CRIT replication delay 929 seconds [21:42:03] New patchset: Ottomata; "Renaming the concept of variables to 'traits'. Allowing trait_sets to be specified so that we don't record HUGE amounts of data." [analytics/reportcard] (otto/pipeline) - https://gerrit.wikimedia.org/r/2467 [21:42:04] New patchset: Ottomata; "Adding loader.py - first hacky loader, just so we can get some data into mysql to work with." 
[analytics/reportcard] (otto/pipeline) - https://gerrit.wikimedia.org/r/2468 [21:45:25] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [21:46:35] RECOVERY - MySQL Replication Heartbeat on db1033 is OK: OK replication delay 0 seconds [21:47:10] hrm, so owa1 had run puppet 6 minutes before this page [21:47:15] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours [21:51:17] New patchset: Ottomata; "device_pipeline.py - comments about hackyness" [analytics/reportcard] (otto/pipeline) - https://gerrit.wikimedia.org/r/2469 [21:55:10] ah interesting, so it looks like the issues keep bumping up right after nagios thinks it has stale data, tries to "nudge" the check and it's nudged for too long [21:55:55] PROBLEM - MySQL Slave Delay on db1018 is CRITICAL: CRIT replication delay 774 seconds [22:02:51] New patchset: Lcarr; "Upping concurrent nagios service checks This should avoid false alarms Also removing bonding.conf which is irrelevant" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2470 [22:03:55] RECOVERY - MySQL Replication Heartbeat on db1047 is OK: OK replication delay 0 seconds [22:04:15] any comments on the increase to concurrent nagios service checks ? [22:05:10] * maplebed bets it'll fail. [22:05:25] (I think nagios is running behind regardless of concurrency) [22:05:33] but hey, probably can't hurt either, so rock on. [22:05:55] RECOVERY - MySQL Slave Delay on db1047 is OK: OK replication delay 0 seconds [22:15:15] RECOVERY - MySQL Slave Delay on db1002 is OK: OK replication delay 0 seconds [22:18:25] RECOVERY - Mobile WAP site on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 1642 bytes in 9.907 seconds [22:19:15] RECOVERY - MySQL Slave Delay on db1018 is OK: OK replication delay 0 seconds [22:22:01] !log adding community-analytics.wikimedia.org to DNS [22:22:03] Logged the message, Master [22:22:55] . . . and there goes ns3 faceplanting [22:23:08] oh nm. ha. 
[22:24:12] the fact that we have a hostname for ns3 pointing to a dead IP always confuses me [22:36:17] New patchset: Jgreen; "enabled community-analytics virtual server" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2471 [22:36:38] New review: Jgreen; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2471 [22:36:38] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2471 [22:43:49] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2470 [22:43:50] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2470 [22:47:15] RECOVERY - HTTP on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.130 seconds [22:56:57] New patchset: Ottomata; "Movied wurfl.py file" [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2472 [22:56:58] New patchset: Ottomata; "More documenation, added tests for AccessLogPipeline methods." [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2473 [22:56:59] New patchset: Ottomata; "device_pipeline.py - getting rid of debug() function, using main()." [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2474 [22:57:00] New patchset: Ottomata; "Hacky first work on loader classes." [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2475 [22:57:02] New patchset: Ottomata; "base.py - adding schema in comments. Got lots of work to do to make this prettier" [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2476 [22:57:03] New patchset: Ottomata; "Renaming the concept of variables to 'traits'. Allowing trait_sets to be specified so that we don't record HUGE amounts of data." [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2477 [22:57:04] New patchset: Ottomata; "Adding loader.py - first hacky loader, just so we can get some data into mysql to work with." 
[analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2478 [22:57:05] New patchset: Ottomata; "device_pipeline.py - comments about hackyness" [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2479 [22:57:29] hmm, that was a merge to master [23:00:20] New patchset: Ottomata; "pipeline/user_agent.py - adding comment that this file should not be used" [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2480 [23:12:43] RECOVERY - MySQL Replication Heartbeat on db42 is OK: OK replication delay 0 seconds [23:12:53] PROBLEM - MySQL Slave Delay on db1018 is CRITICAL: CRIT replication delay 1661 seconds [23:13:26] New patchset: Diederik; "Adding dependency list for virtualenv" [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2481 [23:14:13] RECOVERY - MySQL Slave Delay on db42 is OK: OK replication delay 0 seconds [23:14:28] New review: Ottomata; "(no comment)" [analytics/reportcard] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2481 [23:16:23] PROBLEM - MySQL Slave Delay on db1002 is CRITICAL: CRIT replication delay 1871 seconds [23:17:34] New patchset: Lcarr; "Increasing the max concurrent checks on nagios to 128" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2482 [23:19:03] PROBLEM - Mobile WAP site on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:19:36] New review: Lcarr; "Since 96 has no real impact on the box memory and cpu-wise, trying 128" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2482 [23:19:36] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2482 [23:21:05] New review: Diederik; "Ok" [analytics/reportcard] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/2481 [23:21:05] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2481 [23:21:38] New review: Diederik; "Ok." 
[analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2478 [23:23:00] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2479 [23:23:25] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2477 [23:23:50] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2480 [23:24:06] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2476 [23:24:18] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2475 [23:24:31] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2474 [23:24:59] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2473 [23:25:23] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2472 [23:25:23] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2474 [23:25:23] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2473 [23:25:24] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2472 [23:25:38] New review: Diederik; "Ok." [analytics/reportcard] (otto/pipeline); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2469 [23:25:52] New review: Diederik; "Ok." [analytics/reportcard] (otto/pipeline); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2468 [23:26:28] New review: Diederik; "Ok." 
[analytics/reportcard] (otto/pipeline); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2467 [23:26:29] Change merged: Diederik; [analytics/reportcard] (otto/pipeline) - https://gerrit.wikimedia.org/r/2469 [23:26:29] Change merged: Diederik; [analytics/reportcard] (otto/pipeline) - https://gerrit.wikimedia.org/r/2468 [23:26:29] Change merged: Diederik; [analytics/reportcard] (otto/pipeline) - https://gerrit.wikimedia.org/r/2467 [23:28:13] PROBLEM - Disk space on ms1002 is CRITICAL: DISK CRITICAL - free space: /export/upload 762855 MB (3% inode=98%): [23:34:53] RECOVERY - Mobile WAP site on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 1642 bytes in 8.363 seconds [23:43:50] PROBLEM - HTTP on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:57:37] New patchset: Lcarr; "More nagios tweaking upped parallel service checks to 192 and now will process them every 9 seconds instead of every 10" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2483 [23:59:40] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2483 [23:59:40] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2483
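Note on the Nagios tuning in changes 2470, 2482 and 2483: the log relays only the commit messages, not the diffs, so the fragment below is a sketch of what such a change might look like in nagios.cfg, not the content of the actual patches. `max_concurrent_checks` and `check_result_reaper_frequency` are standard Nagios 3 main-config directives. That these are the directives the patches touched is an assumption. The values 192 and 9 come from the commit message of change 2483.

```
# nagios.cfg fragment (illustrative; not taken from the actual patch)

# Upper bound on service checks that may run in parallel (0 = no limit).
# The log mentions 96 in the review of change 2482, then 128, then 192.
max_concurrent_checks=192

# Seconds between passes that process the results of finished checks.
# Assumed to be the setting behind "every 9 seconds instead of every 10".
check_result_reaper_frequency=9
```

As maplebed suggested at 22:05:25, raising concurrency may not help if the scheduler is running behind for other reasons, so whether these values stop the stale-check false alarms would have to be confirmed from later alerts.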