[00:07:33] nighty!
[00:15:49] happy new year
[00:38:24] RECOVERY - Lucene on search6 is OK: TCP OK - 9.005 second response time on port 8123
[00:49:34] !log removed some logs on search6 to fix /a disk space exhaustion
[00:49:44] Logged the message, Master
[00:54:03] Hi, I am not able to get the Wikipedia logo using class="wikilogo"
[00:55:38] could you please see http://bit.ly/uRbvjf ?
[00:58:14] RECOVERY - LVS Lucene on search-pool2.svc.pmtpa.wmnet is OK: TCP OK - 0.007 second response time on port 8123
[01:00:23] why would that work?
[01:37:31] !log increased FD limit on search6 and restarted lsearchd
[01:37:33] Logged the message, Master
[01:39:16] !log adjusted FD limit in /etc/init.d/lsearchd on all search servers with sed
[01:39:17] Logged the message, Master
[01:55:27] PROBLEM - DPKG on search11 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[01:59:46] PROBLEM - DPKG on search7 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[02:04:30] !log LocalisationUpdate completed (1.18) at Sun Jan 1 02:04:30 UTC 2012
[02:04:32] Logged the message, Master
[03:24:54] Betacommand: are you related to Garnig?
[03:25:40] jeremyb: yeah
[03:26:03] jeremyb: it's my toolserver irssi screen
[03:27:14] jeremyb: any other questions?
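The !log entry at 01:39 mentions adjusting the FD limit in the init script with sed, but the command itself isn't shown. A minimal sketch of that kind of in-place edit, assuming the script raises the limit with a `ulimit -n` line (the file name, original value, and new value here are all illustrative, not taken from the real /etc/init.d/lsearchd):

```shell
# Illustrative stand-in for /etc/init.d/lsearchd -- the real script and
# its FD limit line are assumptions, not taken from the log.
printf 'ulimit -n 8192\n' > lsearchd.init.example

# Raise the per-process file-descriptor limit in place, as the !log
# entry describes doing across all search servers with sed.
sed -i 's/ulimit -n 8192/ulimit -n 32768/' lsearchd.init.example

grep 'ulimit -n' lsearchd.init.example
```

After an edit like this the daemon still has to be restarted (as the 01:37 !log notes for search6) for the new limit to take effect.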
[03:27:45] Betacommand: just looking at ppl in another channel to try to find ppl that don't belong :)
[03:27:55] Betacommand: i thought maybe you grew an impersonator
[03:28:01] jeremyb: Nope
[03:57:57] RECOVERY - DPKG on search7 is OK: All packages OK
[03:59:05] !log fixed broken package on search7 and search11
[03:59:07] Logged the message, Master
[04:01:17] RECOVERY - DPKG on search11 is OK: All packages OK
[04:37:47] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No
[05:44:46] RECOVERY - Disk space on search6 is OK: DISK OK
[06:38:17] PROBLEM - Puppet freshness on es1002 is CRITICAL: Puppet has not run in the last 10 hours
[07:02:37] PROBLEM - Disk space on search6 is CRITICAL: DISK CRITICAL - free space: /a 5206 MB (3% inode=99%):
[07:35:47] PROBLEM - LVS Lucene on search-pool1.svc.pmtpa.wmnet is CRITICAL: Connection timed out
[08:01:08] PROBLEM - Lucene on search3 is CRITICAL: Connection timed out
[08:04:08] PROBLEM - Lucene on search4 is CRITICAL: Connection timed out
[08:04:59] PROBLEM - Lucene on search9 is CRITICAL: Connection timed out
[08:05:59] PROBLEM - Lucene on search1 is CRITICAL: Connection timed out
[08:13:58] PROBLEM - Disk space on search6 is CRITICAL: DISK CRITICAL - free space: /a 5121 MB (3% inode=99%):
[08:35:38] RECOVERY - LVS Lucene on search-pool1.svc.pmtpa.wmnet is OK: TCP OK - 8.994 second response time on port 8123
[09:08:18] PROBLEM - LVS Lucene on search-pool1.svc.pmtpa.wmnet is CRITICAL: Connection timed out
[09:35:09] PROBLEM - Disk space on search6 is CRITICAL: DISK CRITICAL - free space: /a 5056 MB (3% inode=99%):
[09:41:59] RECOVERY - Lucene on search3 is OK: TCP OK - 0.002 second response time on port 8123
[09:45:24] RECOVERY - Disk space on search6 is OK: DISK OK
[09:48:04] RECOVERY - Lucene on search4 is OK: TCP OK - 0.001 second response time on port 8123
[09:48:04] RECOVERY - Lucene on search9 is OK: TCP OK - 0.003 second response time on port 8123
[09:49:44] RECOVERY - LVS Lucene on search-pool1.svc.pmtpa.wmnet is OK: TCP OK - 0.001 second response time on port 8123
[09:49:45] RECOVERY - Lucene on search1 is OK: TCP OK - 0.000 second response time on port 8123
[09:51:44] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 453075 MB (3% inode=99%):
[09:52:44] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 448042 MB (3% inode=99%):
[10:00:34] RECOVERY - MySQL slave status on es1004 is OK: OK:
[10:14:35] hi. hypothetical question: can I rebuild a fully functional local mirror of Wikipedia if I a) download the mediawiki software, b) download the database XML dumps at dumps.wikimedia.org, c) download the required uploaded images?
[10:14:59] yes
[10:15:35] it may take a while to get all the article revisions in and for the rebuild of all the link tables, but if you have the patience you can certainly do it
[10:22:16] apergos: oh ok. what exactly are link tables? The "What articles/pages use this image?" metadata?
[10:22:55] so there are a pile of mysql tables which contain, among other things, lists of links to images, links to other pages, links to external sites, etc etc
[10:23:09] anyways there are scripts you run to rebuild these after an initial import of the data
[10:23:22] the scripts take a long time
[10:23:34] alternatively we provide dumps of the tables along with the dumps of the data
[10:24:16] Got it.
[10:31:14] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[11:06:10] http://en.planet.wikimedia.org/ seems to be redirecting to https://contacts.wikimedia.org/
[11:07:42] again? We just fixed that so it didn't redirect to the blog
[11:07:55] and by "we" I mean other people
[12:55:13] Um, is there an alternative entry URL for http://en.planet.wikimedia.org?
[12:57:22] (I assume it's not just me forgetting the correct URL, but that it's actually broken at the moment.)
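The mirror workflow apergos describes above (import the XML dump into MediaWiki, then rebuild the link tables with maintenance scripts) might be sketched roughly like this. importDump.php and rebuildall.php are standard MediaWiki maintenance scripts, but the dump filename is illustrative and the commands assume an already configured MediaWiki checkout; the `run` guard only echoes each step unless RUN=1 is set, since a real import is very slow:

```shell
# Guard: print each step instead of executing it unless RUN=1,
# because a real import/rebuild needs a configured wiki and takes ages.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "would run: $*"; fi; }

DUMP=enwiki-latest-pages-articles.xml.bz2   # illustrative dump name

# b) Fetch the XML dump from dumps.wikimedia.org
#    (a) the MediaWiki software is assumed installed and configured).
run wget "https://dumps.wikimedia.org/enwiki/latest/$DUMP"

# Import all article revisions -- this is the slow part.
run sh -c "bzcat $DUMP | php maintenance/importDump.php"

# Rebuild the link tables (pagelinks, imagelinks, externallinks, ...),
# or alternatively load the SQL table dumps published alongside the XML.
run php maintenance/rebuildall.php
```

As noted in the conversation, loading the published table dumps instead of running rebuildall.php avoids the long rebuild.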
[13:08:23] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused
[13:11:22] I don't even know who handles DNS issues though
[13:27:53] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.034 second response time
[15:48:58] PROBLEM - Auth DNS on ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[16:01:38] RECOVERY - Auth DNS on ns0.wikimedia.org is OK: DNS OK: 7.887 seconds response time. www.wikipedia.org returns 208.80.152.201
[16:17:16] any devs that can run a script around?
[16:41:33] wikipedia-lb.esams.wikimedia.org (91.198.174.225) doesn't like to serve me a de.wikipedia page :(
[16:41:44] anybody else with problems?
[16:45:48] works again
[16:53:01] PROBLEM - Puppet freshness on es1002 is CRITICAL: Puppet has not run in the last 10 hours
[19:52:28] zzz
[20:26:44] Nikerabbit: are you around? I'm having problems with Special:Translate
[20:27:08] https://meta.wikimedia.org/wiki/Translations:Stewards/Elections_2012/Guidelines/3/es
[20:27:17] gives me PHP fatal error in /usr/local/apache/common-local/php-1.18/extensions/Translate/TranslateEditAddons.php line 37:
[20:27:19] Call to a member function initCollection() on a non-object
[20:30:42] Thehelpfulone, lemme look a bit, he's busy now
[20:30:57] thanks Nemo_bis
[20:31:09] and now it's back up.. :S
[20:31:29] he mentioned how it has to go in the job queue first - perhaps that caused the bug?
[20:32:45] did you see the error just opening it?
[20:33:00] I clicked almost immediately after you linked it and didn't see any error
[20:34:43] it was about 2 minutes later
[20:41:18] PROBLEM - Auth DNS on ns2.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[20:45:09] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[20:58:19] Ryan_Lane: i think ns2 wants a kick
[20:58:30] I already kicked it
[20:58:35] I think it's nagios misreporting
[20:58:38] it answers fine
[20:58:48] okay :)
[21:01:06] Nemo_bis: Translate shows up in the RC feed, but it doesn't respect the flood flag
[21:01:10] https://meta.wikimedia.org/wiki/Special:RecentChanges
[21:01:29] I added flood to myself at 20:39
[21:01:53] is there any way to fix this - perhaps Ryan_Lane too ^
[21:02:28] Thehelpfulone, do you mean edits to Stewards/Elections 2012/Guidelines/es ?
[21:02:33] yes
[21:03:57] it's the same with https://meta.wikimedia.org/w/index.php?namespace=&tagfilter=&translations=only&title=Special%3ARecentChanges
[21:04:53] Thehelpfulone: fix what?
[21:05:18] doesn't look like something for Ryan :)
[21:05:44] Log a bug
[21:06:01] the page shows up, data is there. seems like it's working from my perspective
[21:06:17] Thehelpfulone, just go on, not a big problem
[21:06:34] Ryan_Lane: the flood flag means that your contributions *don't* show up in recent changes - but with translations they seem to
[21:06:37] people will just use enhanced RC ;)
[21:06:45] RECOVERY - Auth DNS on ns2.wikimedia.org is OK: DNS OK: 9.243 seconds response time. www.wikipedia.org returns 208.80.152.201
[21:06:54] yeah, not something I can solve
[21:06:56] I don't do dev
[21:07:02] okay
[21:07:05] well, not for the sites anyway
[21:24:57] PROBLEM - Recursive DNS on 208.80.152.131 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[22:20:47] PROBLEM - Recursive DNS on 91.198.174.6 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[22:27:27] RECOVERY - Recursive DNS on 208.80.152.131 is OK: DNS OK: 7.516 seconds response time. www.wikipedia.org returns 208.80.152.201
[22:32:58] RECOVERY - Recursive DNS on 91.198.174.6 is OK: DNS OK: 8.879 seconds response time. www.wikipedia.org returns 91.198.174.225
[23:03:06] PROBLEM - check_job_queue on spence is CRITICAL: (Service Check Timed Out)
[23:19:19] PROBLEM - Recursive DNS on 91.198.174.6 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[23:21:32] PROBLEM - Auth DNS on ns1.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[23:27:10] PROBLEM - Auth DNS on ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[23:37:42] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours
[23:38:03] RECOVERY - check_job_queue on spence is OK: JOBQUEUE OK - all job queues below 10,000
[23:46:19] PROBLEM - Auth DNS on ns2.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[23:47:16] Reedy: if you are online: can you please check whether nagios is lying again or if there really are problems with the DNS servers?
[23:48:17] DaBPunkt, Ryan_Lane checked earlier and it seemed it was nagios lying
[23:48:55] Reedy: I saw that, but it was only ns2 at that time
[23:49:17] but I get no reports about "unreachables", so I guess it is lying
[23:51:11] Looks responsive to me
[23:52:01] PROBLEM - Recursive DNS on 208.80.152.131 is CRITICAL: CRITICAL - Plugin timed out while executing system call