[00:02:03] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Server Error - 1703 bytes in 6.668 second response time
[00:14:22] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 133887 bytes in 7.264 second response time
[01:03:23] PROBLEM - RAID on analytics1010 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:05:13] RECOVERY - RAID on analytics1010 is OK: OK: Active: 6, Working: 6, Failed: 0, Spare: 0
[01:06:03] PROBLEM - RAID on analytics1009 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:06:53] RECOVERY - RAID on analytics1009 is OK: OK: Active: 6, Working: 6, Failed: 0, Spare: 0
[01:32:57] !log sugar down for move to labs
[01:33:04] Logged the message, Master
[01:51:23] PROBLEM - Puppet freshness on lvs3001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[01:51:23] PROBLEM - Puppet freshness on lvs3002 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[01:51:23] PROBLEM - Puppet freshness on lvs3003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[01:51:23] PROBLEM - Puppet freshness on lvs3004 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[02:12:58] !log LocalisationUpdate completed (1.23wmf20) at 2014-04-06 02:12:57+00:00
[02:13:06] Logged the message, Master
[02:18:43] !log LocalisationUpdate completed (1.23wmf21) at 2014-04-06 02:18:43+00:00
[02:18:48] Logged the message, Master
[02:35:07] (Abandoned) Ori.livneh: applicationserver: retab [operations/puppet] - https://gerrit.wikimedia.org/r/96746 (owner: Ori.livneh)
[02:35:44] (Abandoned) Ori.livneh: (WIP) rewrite mwprof in Go [operations/software/mwprof] - https://gerrit.wikimedia.org/r/98602 (owner: Ori.livneh)
[02:36:39] (PS4) Ori.livneh: Add Icinga checks for important sysctl params [operations/puppet] - https://gerrit.wikimedia.org/r/111163
[02:39:39] (Abandoned) Ori.livneh: Geo-cookie: enable on one production text Varnish (cp1066) [operations/puppet] - https://gerrit.wikimedia.org/r/119098 (owner: Ori.livneh)
[02:39:53] (Abandoned) Ori.livneh: No-op commit to test grrrit-wm [operations/puppet] - https://gerrit.wikimedia.org/r/119233 (owner: Ori.livneh)
[02:41:22] (Abandoned) Ori.livneh: Set AuthName for graphite/icinga/ishmael to 'Wikimedia Labs' [operations/puppet] - https://gerrit.wikimedia.org/r/81630 (owner: Ori.livneh)
[02:42:12] (Abandoned) Ori.livneh: single-threaded, no locks, fully async [operations/software/mwprof] - https://gerrit.wikimedia.org/r/104345 (owner: Ori.livneh)
[02:53:16] !log LocalisationUpdate ResourceLoader cache refresh completed at Sun Apr 6 02:53:13 UTC 2014 (duration 53m 12s)
[02:53:22] Logged the message, Master
[03:56:43] PROBLEM - Packetloss_Average on erbium is CRITICAL: packet_loss_average CRITICAL: 15.406772449
[04:00:43] RECOVERY - Packetloss_Average on erbium is OK: packet_loss_average OKAY: 3.961286875
[04:52:23] PROBLEM - Puppet freshness on lvs3001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[04:52:23] PROBLEM - Puppet freshness on lvs3002 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[04:52:23] PROBLEM - Puppet freshness on lvs3003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[04:52:23] PROBLEM - Puppet freshness on lvs3004 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[04:58:23] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[04:59:13] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[05:21:04] (PS1) 01tonythomas: Styled the alias field value differently [wikimedia/bugzilla/modifications] - https://gerrit.wikimedia.org/r/124140
[05:46:43] PROBLEM - Packetloss_Average on emery is CRITICAL: packet_loss_average CRITICAL: 10.6596448437
[05:50:43] RECOVERY - Packetloss_Average on emery is OK: packet_loss_average OKAY: -0.0586478125
[07:19:16] (PS6) Ori.livneh: Gzip SVGs on front & back upload varnishes [operations/puppet] - https://gerrit.wikimedia.org/r/108484
[07:20:41] (CR) Ori.livneh: "PS6: Apply VCL to backend Varnishes only. BBlack, what sort of testing do we need to do? Can I help?" [operations/puppet] - https://gerrit.wikimedia.org/r/108484 (owner: Ori.livneh)
[07:30:00] (CR) Giuseppe Lavagetto: "It needed one more gitblit restart however... I'd say we wait a week or so before we call it off" [operations/puppet] - https://gerrit.wikimedia.org/r/123848 (owner: Dzahn)
[07:53:23] PROBLEM - Puppet freshness on lvs3001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[07:53:23] PROBLEM - Puppet freshness on lvs3002 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[07:53:23] PROBLEM - Puppet freshness on lvs3003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[07:53:23] PROBLEM - Puppet freshness on lvs3004 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[08:02:33] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[08:11:06] (Abandoned) Ori.livneh: MySQL gmond module: handle server restarts [operations/puppet] - https://gerrit.wikimedia.org/r/98068 (owner: Ori.livneh)
[09:41:33] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[10:54:23] PROBLEM - Puppet freshness on lvs3001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[10:54:23] PROBLEM - Puppet freshness on lvs3002 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[10:54:23] PROBLEM - Puppet freshness on lvs3003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[10:54:23] PROBLEM - Puppet freshness on lvs3004 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[11:56:43] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[12:15:51] (PS7) Ori.livneh: Gzip SVGs on back upload varnishes [operations/puppet] - https://gerrit.wikimedia.org/r/108484
[13:55:23] PROBLEM - Puppet freshness on lvs3001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[13:55:23] PROBLEM - Puppet freshness on lvs3002 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[13:55:23] PROBLEM - Puppet freshness on lvs3004 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[13:55:23] PROBLEM - Puppet freshness on lvs3003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[14:08:25] hmm I suspect that's not true, icinga-wm
[14:08:48] but thanks for the effort to break silence
[15:47:12] (PS2) Tim Landscheidt: Tools: Alias tools.wmflabs.org to internal webproxy [operations/puppet] - https://gerrit.wikimedia.org/r/123149
[16:06:19] https://en.wikipedia.org/w/index.php?diff=602800100
[16:06:21] what the...
[16:56:23] PROBLEM - Puppet freshness on lvs3001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[16:56:23] PROBLEM - Puppet freshness on lvs3002 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[16:56:23] PROBLEM - Puppet freshness on lvs3003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[16:56:23] PROBLEM - Puppet freshness on lvs3004 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[17:17:45] <_joe|away> ori: your plan is to put varnish in front of gitblit, right? That is very nice and may help, but the problem we had last friday was not due to high load
[17:18:28] <_joe|away> gitblit just got out of free memory due to a memory leak (i've seen thousands of Nullpointer Exceptions in the logs)
[17:19:22] <_joe|away> since most of the heap becomes non-freeable by GC, GC cycles start to be more frequent and last longer - rendering eventually the application unusable
[17:19:41] <_joe|away> but that happened independently of any request load.
[19:52:23] (PS1) Odder: Create a FeaturedFeed for the Tech News bulletin [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272
[19:57:14] (CR) Brian Wolff: "This also needs to modify 'wmgUseFeaturedFeeds' in InitialiseSettings.php, and possibly wmgFeaturedFeedsOverrides to make it weekly and po" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272 (owner: Odder)
[19:57:23] PROBLEM - Puppet freshness on lvs3001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[19:57:23] PROBLEM - Puppet freshness on lvs3002 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[19:57:23] PROBLEM - Puppet freshness on lvs3003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[19:57:23] PROBLEM - Puppet freshness on lvs3004 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[19:57:32] (CR) Nemo bis: "Were all those system messages created already on wiki?" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272 (owner: Odder)
[19:57:42] (PS2) Odder: Create a FeaturedFeed for the Tech News bulletin [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272
[20:05:43] (PS3) Odder: Create a FeaturedFeed for the Tech News bulletin [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272
[20:05:52] (CR) jenkins-bot: [V: -1] Create a FeaturedFeed for the Tech News bulletin [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272 (owner: Odder)
[20:06:16] (CR) Frédéric Wang: "Moritz: so I guess this change is now abandoned?" [operations/puppet] - https://gerrit.wikimedia.org/r/61767 (owner: Physikerwelt)
[20:08:01] (PS4) Odder: Create a FeaturedFeed for the Tech News bulletin [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272
[20:09:30] (CR) Frédéric Wang: "Moritz: What is the status here?" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110240 (owner: Physikerwelt)
[20:37:15] (PS1) Odder: Add New York Public Library to wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124273
[20:39:55] (CR) MaxSem: [C: -1] Create a FeaturedFeed for the Tech News bulletin (2 comments) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272 (owner: Odder)
[20:48:16] (CR) Odder: Create a FeaturedFeed for the Tech News bulletin (2 comments) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272 (owner: Odder)
[21:05:22] (CR) Brian Wolff: Create a FeaturedFeed for the Tech News bulletin (1 comment) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272 (owner: Odder)
[21:07:30] (PS5) Odder: Create a FeaturedFeed for the Tech News bulletin [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272
[21:16:56] (CR) Odder: "Relevant messages are already available on Meta." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/124272 (owner: Odder)
[22:12:30] (Abandoned) Physikerwelt: Initial version of puppet script for LaTeXML [operations/puppet] - https://gerrit.wikimedia.org/r/61767 (owner: Physikerwelt)
[22:46:33] PROBLEM - MySQL InnoDB on db1047 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[22:47:23] RECOVERY - MySQL InnoDB on db1047 is OK: OK longest blocking idle transaction sleeps for 0 seconds
[22:58:23] PROBLEM - Puppet freshness on lvs3001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[22:58:23] PROBLEM - Puppet freshness on lvs3002 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[22:58:23] PROBLEM - Puppet freshness on lvs3003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[22:58:23] PROBLEM - Puppet freshness on lvs3004 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[23:30:20] varnish errors on wmflabs
[23:30:45] Request: GET http://en.wikipedia.beta.wmflabs.org/wiki/, from 82.75.221.191 via deployment-cache-text02 frontend ([10.68.16.16]:80), Varnish XID 1539887359
[23:30:48] Forwarded for: 82.75.221.191
[23:30:50] Error: 503, Service Unavailable at Sun, 06 Apr 2014 23:29:55 GMT
[23:36:34] thedj: works for me (slow, though)
[23:36:45] seems like I lost root on beta?!
[23:37:14] hoo: lost or misplaced? :p
[23:37:33] sudo rm -rf JohnLewis #trolling
[23:38:34] hoo: /mode #wikidata +b *!*@wikipedia/Hoo-man? :p
[23:39:01] Guess I could unban myself :P
[23:39:10] Was about to write unban mysql... wtf brain
[23:39:32] hoo: Not in #wikidata >:D
[23:40:21] mh
[23:40:36] So, shall we call it even?
[23:40:37] we should fix that, but I don't care enough :P
[23:44:15] night hoo :)
[23:44:24] good night ;)