[00:04:16] mediawiki, you mean?
[00:05:09] sorry mediawiki, apologies
[01:01:53] New patchset: Lcarr; "Adding in ganglia apache file temporarily using nickel.wikimedia.org in file for testing purposes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1787
[01:09:17] New patchset: Lcarr; "Adding in ganglia apache file temporarily using nickel.wikimedia.org in file for testing purposes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1787
[01:09:32] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1787
[01:09:33] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1787
[01:27:03] New patchset: Lcarr; "adding in rrdtool to ganglia::web" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1789
[01:29:48] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1789
[01:29:48] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1789
[01:37:07] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1143s
[01:43:17] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1492s
[02:05:04] !log LocalisationUpdate completed (1.18) at Thu Jan 5 02:05:03 UTC 2012
[02:05:12] Logged the message, Master
[02:05:16] PROBLEM - ps1-a5-sdtpa-infeed-load-tower-A-phase-Y on ps1-a5-sdtpa is CRITICAL: ps1-a5-sdtpa-infeed-load-tower-A-phase-Y CRITICAL - *2650*
[02:08:27] New patchset: Ryan Lane; "We aren't using nova-volume right now, and it throws errors since it's not configured. Remove it." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1791
[02:08:44] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1791
[02:08:45] New patchset: Ryan Lane; "Adding requires to all services." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1792
[02:09:02] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1791
[02:09:02] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1791
[02:09:15] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1792
[02:09:16] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1792
[02:20:46] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s
[02:26:06] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s
[02:54:36] RECOVERY - ps1-a5-sdtpa-infeed-load-tower-A-phase-Y on ps1-a5-sdtpa is OK: ps1-a5-sdtpa-infeed-load-tower-A-phase-Y OK - 2388
[03:01:42] New patchset: Ryan Lane; "Make a ganglia cluster for the virt cluster." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1793
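
The Misc_Db_Lag and MySQL replication alerts above fire on Seconds_Behind_Master. A minimal sketch of that kind of check, assuming a host, credentials handled elsewhere, and threshold values chosen here purely for illustration (the real Nagios plugin and its limits are not shown in this log):

    #!/bin/bash
    # Minimal replication-lag check in the spirit of the storage3 alerts above.
    # Host and the 600s/1200s thresholds are assumptions, not the production values.
    LAG=$(mysql -h storage3 -e 'SHOW SLAVE STATUS\G' | awk '/Seconds_Behind_Master/ {print $2}')
    if [ -z "$LAG" ] || [ "$LAG" = "NULL" ]; then
        echo "CRITICAL - replication not running"; exit 2
    elif [ "$LAG" -ge 1200 ]; then
        echo "CRITICAL - Seconds_Behind_Master : ${LAG}s"; exit 2
    elif [ "$LAG" -ge 600 ]; then
        echo "WARNING - Seconds_Behind_Master : ${LAG}s"; exit 1
    fi
    echo "OK - Seconds_Behind_Master : ${LAG}s"; exit 0
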
[03:02:03] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1793
[03:02:04] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1793
[04:19:22] RECOVERY - Disk space on es1004 is OK: DISK OK
[04:20:02] RECOVERY - MySQL disk space on es1004 is OK: DISK OK
[04:28:32] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[04:37:02] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No
[06:01:57] PROBLEM - Puppet freshness on es1002 is CRITICAL: Puppet has not run in the last 10 hours
[09:54:01] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 430286 MB (3% inode=99%):
[09:57:41] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 416314 MB (3% inode=99%):
[10:02:11] RECOVERY - MySQL slave status on es1004 is OK: OK:
[12:22:41] apergos: here?
[12:23:02] yes, but about to get food
[12:23:04] what's up?
[12:23:27] yesterday the machine we use for deployment.wmflabs crashed for some reason and I am trying to resume the import
[12:23:35] oh boy
[12:23:41] :o
[12:23:49] were you running importDump?
[12:23:57] but it always eats 1.8 GB of RAM and crashes again
[12:24:02] yes
[12:24:07] on what project?
[12:24:15] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: Puppet has not run in the last 10 hours
[12:24:25] http://deployment.wmflabs.org/simple_wiki/wiki/ is the target, the source is the full simple wikipedia dump
[12:24:45] I don't know why the VM crashed yesterday... it rebooted
[12:25:08] history file?
[12:25:17] I tried --debug but I get no output from the import
[12:25:41] history for what?
[12:26:04] you're importing the meta-history file or some other one?
[12:26:18] yes
[12:26:41] I started php w/maintenance/importDump.php --report 1 --debug /tmp/simplewiki-latest-pages-meta-history.xml.bz2
[12:27:03] it produced no output and crashed after 2 minutes
[12:27:47] I have no clue if it's a bug in importDump, it just eats too much RAM for some reason
[12:28:05] I think it's stuck in a loop allocating more memory
[12:28:23] it's MediaWiki 1.19 so it's possible there is a bug
[12:28:29] sure is
[12:28:34] well I would use mwdumper or
[12:28:41] oh you're on 1.19
[12:28:43] mm
[12:28:46] mwdumper needs more space than we have on labs right now
[12:28:57] Ryan ordered new hardware, so maybe later
[12:29:14] if you write the output to a bz2 compressed file it will be just fine
[12:29:16] for mwdumper I would need to extract the dump and then convert it to sql
[12:29:28] that is what mwdumper does
[12:29:34] you mean read input from bz2 and write the output to bz2 too?
[12:29:36] but there's a perl script you can use, does the same thing
[12:29:43] aha
[12:29:45] you might need to tweak it for 1.19
[12:29:50] where is it, svn?
[12:29:57] bzcat the input, bzip2 the output
[12:30:02] lemme look, (no it's on meta)
[12:30:12] yes, that would work
[12:30:15] we don't maintain it but I used it the last time I needed to do an import and it was great
[12:30:22] but I have only 20 GB of space on the machine I run the import on
[12:30:31] 80 GB on the sql server
[12:30:34] uh huh
[12:30:53] the machine I run the import on is only an apache server
[12:30:56] http://meta.wikimedia.org/w/index.php?title=Data_dumps/mwimport&action=history
[12:31:04] this script should import stuff from 1.18 fine
[12:31:24] I could create a new vm for that with more space, but I hope it wouldn't eat too much space because we are on the edge now
[12:31:37] then you'll need to do things like populate the sha1 hashes etc. to make it 1.19 compliant
[12:31:40] ok I will try it, I guess I have to remove the current db?
[12:31:52] yeah, just drop all the ... well actually
[12:31:57] maybe it would be best to import it into 1.18 then run update
[12:32:05] look at the output first, I think it might drop and create all the tables anyhow
[12:32:11] sure
[12:32:12] ok
[12:32:39] I will create a new wiki for this so in the worst case I drop back to what we have now; hexmode wanted to have it available asap
[12:32:49] ok
[12:33:07] this wiki on deployment has 200+ content pages so it's better than nothing
[12:33:09] otherwise you can try building a copy of mwdumper and see if you can get it working
[12:33:16] ok
[12:34:06] 3 007 183 revisions
[12:34:12] that's a chunk of data
[12:34:21] I need to talk to Ryan and find out why machines are randomly getting restarted... that sucks
[12:35:05] well it's still very beta you know
[12:35:16] yes I know, that's why he should know about it
[12:35:26] yup
[12:35:37] we don't have a bug tracker yet, maybe I could use bz for now
[12:35:51] is there a labs component?
[12:35:55] if so yeah I would use it
[12:36:13] yes there is, but I think there was an idea to merge RT from wikitech with labs
[12:36:57] 76,644 content pages
[12:37:00] that's plenty
[12:37:05] yes I know
[12:37:06] 231,545 overall
[12:37:19] no wonder though that importDump is working so hard, 3 million revs is a lot
[12:37:22] poor thing ;-)
[14:37:10] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[14:39:00] RECOVERY - Memcached on srv290 is OK: TCP OK - 0.004 second response time on port 11000
[14:39:09] why does an upload import end up with a wikimedia error page or 504 bad gateway?
[14:49:10] RECOVERY - Memcached on srv193 is OK: TCP OK - 0.002 second response time on port 11000
[15:01:40] RECOVERY - Memcached on magnesium is OK: TCP OK - 0.031 second response time on port 11211
[15:16:50] RECOVERY - Puppet freshness on lvs1004 is OK: puppet ran at Thu Jan 5 15:16:37 UTC 2012
[15:19:53] (SQL query hidden) from the function "LocalFileDeleteBatch::doDBDeletes". The database reported the error "1213: Deadlock found when trying to get lock; try restarting transaction (10.0.6.32)".
[15:20:03] can't delete file versions at commons...
[15:20:18] https://commons.wikimedia.org/w/index.php?title=File:Nokia_7650_I..JPG&action=delete tried to delete the whole file - but the file versions stay
[15:21:20] now they are gone..?!
[15:21:27] DB lag?
[15:25:22] there are quite a lot of transient errors like that
[15:25:58] when I tried to delete the remaining file versions I got some error (don't remember)
[15:26:11] apparently there are indeed lock / lag problems
[15:40:36] upload.esams.wikimedia.org is still not listening on its ipv6 address. Was at least the dns record deleted?
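
For reference, the streaming approach discussed in the import conversation above ("bzcat the input, bzip2 the output", or converting to SQL with the mwimport perl script from meta) could look roughly like the following sketch; paths, database name and credentials are assumptions, not taken from the log:

    # Feed importDump.php straight from the compressed dump; no uncompressed copy on disk
    bzcat /tmp/simplewiki-latest-pages-meta-history.xml.bz2 | php maintenance/importDump.php --report 1000

    # Or convert the XML dump to SQL with mwimport and load it directly,
    # assuming mwimport reads XML on stdin and writes SQL statements on stdout
    bzcat /tmp/simplewiki-latest-pages-meta-history.xml.bz2 | perl mwimport | mysql -u wikiadmin -p simplewiki
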
[16:10:57] PROBLEM - Puppet freshness on es1002 is CRITICAL: Puppet has not run in the last 10 hours
[16:34:10] ACKNOWLEDGEMENT - MySQL master status on es1001 is CRITICAL: CRITICAL: Read only: expected OFF, got ON daniel_zahn RFC in RT #2216
[16:35:40] ACKNOWLEDGEMENT - Disk space on es1002 is CRITICAL: Connection refused by host daniel_zahn please update RT 2216
[16:49:10] ACKNOWLEDGEMENT - Host db19 is DOWN: PING CRITICAL - Packet loss = 100% daniel_zahn request for comment in !rt 2217
[16:55:40] PROBLEM - Host dataset1 is DOWN: CRITICAL - Host Unreachable (208.80.152.166)
[17:09:26] apergos: is there a megapixel limit for image scaling?
[17:09:44] any workaround, like letting sysops whitelist certain big images?
[17:09:53] there is a point where you run out of memory
[17:10:01] we will cut your head off at that point
[17:10:13] well, we cut the process's head off
[17:10:27] I forget what the ulimits are
[17:11:15] well presumably there's more than one going at once?
[17:12:11] the ulimits are per process
[17:12:17] I don't mean general oom
[17:12:40] right. I'm just thinking it can be a bigger img if it's the only thing running on the box
[17:14:01] so you could e.g. have some way to push onto a job queue (maybe by sysop?) and then whenever the scaling load is low enough one scaler can do one from the queue, say every min or every 2 mins, and while that's running no other scalings are going. in between it's a normal scaler
[17:14:55] apergos: (this is to fix some of https://commons.wikimedia.org/wiki/Special:ListFiles/Dominic btw :) )
[17:15:57] that sounds too complicated
[17:16:18] just have a separate scaler that takes requests one at a time; you have something large, you can send it there
[17:16:22] but be prepared to wait
[17:16:35] yeah, sure, be prepared to wait
[17:16:54] the other scalers, let's say this particular second there's one with less load
[17:17:01] the next second there might be a spike in requests
[17:17:07] (happens pretty often actually)
[17:17:52] but the idea of adding to a job queue and having some dedicated box do them isn't so bad
[17:17:54] I meant more like "ignore the big queue if we just killed some cached thumbs or had a scaling outage and we're still recovering"
[17:18:41] the main thing was I expected the queue to not always be full and wanted the box to not be idle when it could be scaling small stuff
[17:22:40] I see
[17:49:14] hi
[17:49:23] I need a mailing list admin here
[17:49:33] so a sysadmin with access to the WMF mailing lists
[17:49:57] somebody accidentally sent a private mail to a mailing list and it needs to get removed
[17:51:47] root-80686: and it's not already in public archives run by other ppl? (not wmf)
[17:51:55] e.g. gmane and others
[17:52:11] this list shouldn't be listed there
[17:52:16] but I don't know
[17:52:17] ok
[17:52:31] (also hi and HNY)
[17:52:31] I've got the request from the user and I would like to forward it
[17:52:39] jeremyb: thanks, same to you ;-)
[17:52:44] * jeremyb forwards it
[17:53:08] !log hashar: gallium: cleaned /tmp . Our test suites leak a large amount of files :D
[17:53:09] Logged the message, Master
[17:53:37] jeremyb: thanks. I know this is annoying...
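
The scaling discussion above ([17:09] to [17:22]) is about per-process ulimits on the image scalers. A rough sketch of that kind of cap, with illustrative limit values and filenames rather than the production settings (the real scalers wrap the resize command in their own limit script, which this log does not show):

    # Run a single resize under per-process resource limits; the limits die with the subshell
    (
        ulimit -v 1048576      # cap virtual memory at ~1 GB (value is an example)
        ulimit -t 60           # cap CPU time at 60 seconds (value is an example)
        convert Big_input.tif -resize '2048x2048>' /tmp/thumb.png
    )
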
[18:48:36] !log preilly synchronized php-1.18/extensions/MobileFrontend/javascripts/application.js
[18:48:37] Logged the message, Master
[18:48:56] !log preilly synchronized php-1.18/extensions/MobileFrontend/ApplicationTemplate.php
[18:48:57] Logged the message, Master
[18:48:59] !log pushing fix for js error on production
[18:49:00] Logged the message, Master
[18:58:44] win 25
[19:03:35] what allowed formats are there for lossless audio upload?
[19:03:43] apergos: ^ ?
[19:04:03] 05 19:02:09 < jeremyb> i see nothing relevant at $wgFileExtensions in CommonSettings.php ?
[19:04:18] ogg
[19:04:22] mid
[19:04:27] ogg allows lossless?
[19:04:39] mid of course, but that's not for recordings
[19:04:42] Heh
[19:04:56] I was just looking at what the old upload form on commons said was allowed
[19:05:11] * jeremyb consults [[ogg]]
[19:05:48] oggflac
[19:05:59] yeah, i see that. but does the player support it?
[19:06:28] Maybe
[19:06:33] File:Maqam Rast.flac.ogg
[19:07:57] hrmmm. playback fails for me
[19:08:04] in Firefox
[19:09:41] haha, a PNG as an "other version" for an oggflac file
[19:09:54] mindspillage: ^
[19:14:16] I have no idea if flac is supposed to be supported generally with the in-browser player but it doesn't work for me. (The file is fine though; I can download and play it.)
[19:15:07] * jeremyb didn't actually try downloading it yet
[19:34:46] New patchset: Catrope; "WIP for breaking out puppet-specific hooks to puppet.py" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1794
[19:35:14] Change abandoned: Catrope; "WIP, shouldn't actually be merged" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1794
[20:05:48] !log reedy synchronized php-1.18/extensions/ShortUrl 'Pushing ShortUrl files out'
[20:05:49] Logged the message, Master
[20:07:48] !log reedy synchronized wmf-config/InitialiseSettings.php 'wmgUseShortUrl'
[20:07:48] Logged the message, Master
[20:08:14] !log reedy synchronized wmf-config/CommonSettings.php 'wmgUseShortUrl'
[20:08:15] Logged the message, Master
[20:10:18] !log reedy synchronizing Wikimedia installation... : Update extensionmessages
[20:10:20] Logged the message, Master
[20:11:12] Reedy: are the apache changes live too?
[20:11:36] No
[20:11:39] It's only enabled on test
[20:11:47] !log Created ShortUrl tables on testwiki
[20:11:48] Logged the message, Master
[20:12:22] sync done.
[20:12:22] * Reedy waits for scap to finish updating message stuff
[20:12:33] * jeremyb wonders why it would need a table
[20:12:58] map short name to full name
[20:13:07] 1 -> Wikipedia:What we do on this wiki
[20:13:55] huh
[20:14:22] > ShortUrl is a special page extension that helps create shortened URLs for wiki pages, using their base36 encoded IDs
[20:14:26] https://www.mediawiki.org/wiki/Extension:ShortUrl
[20:14:43]
[20:25:46] apergos: You around?
[20:27:38] How easy/difficult would it be to get a full copy of Commons (the 14 TB)? How would we do that?
[20:34:45] multichill, I guess it depends what you mean by get (all images, all versions etc)... And to where?
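
On the question of why ShortUrl needs a table ([20:12:33] onwards): the extension keeps an integer id per title and base36-encodes that id in the short URL. A hypothetical illustration of such a mapping, with made-up table and column names rather than the extension's real schema:

    # Toy id -> title map; the AUTO_INCREMENT key is what gets base36-encoded into the short URL.
    # Database name, credentials and all identifiers here are illustrative only.
    mysql -u root -p testwiki -e "
      CREATE TABLE IF NOT EXISTS demo_shorturl (
        su_id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        su_title VARBINARY(255) NOT NULL,
        UNIQUE KEY (su_title)
      );
      INSERT INTO demo_shorturl (su_title) VALUES ('Wikipedia:What_we_do_on_this_wiki');"
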
[20:35:30] A contact in the Netherlands who wants to have a big dataset for research
[20:36:06] 14TB is gonna take a while to transfer over the interwebs
[20:36:18] And sending and receiving physical drives is just asking for trouble
[20:37:48] If the data is coming from the NL datacenter it's pretty local traffic
[20:38:00] (all 10G I think)
[20:38:49] the upload squids in esams would have some, but certainly not all
[20:39:24] It'd have to come directly from ms\d
[20:39:30] I'm not sure, but isn't there an off-site copy over there?
[20:40:16] I don't think so
[20:40:26] oh
[20:40:30] ms6 is in esams..
[20:40:31] can't you just do the copy in DC and then carry the HDDs with you on the way home from GLAMcamp? or is DC not close enough? the data's in tampa i guess
[20:41:15] yeah
[20:41:46] multichill, wikitech suggests it might have some data from apr 2009
[20:41:55] gmaxwell used to have a completish copy himself. but that might be 5 years ago
[20:41:58] ms6?
[20:42:02] mmm
[20:42:07] listed as being a test server by river
[20:42:35] Getting it from the nl datacenter is easy compared to getting it all from the US
[20:42:36] [[wikitech:The ms servers]]
[20:42:47] Says remote upload backup, updated May 2009
[20:43:06] So that's about half
[20:43:19] Would be a good start ;-)
[20:43:26] heh
[20:43:43] Have to get someone to confirm what's there
[20:44:12] If it's a test server we can just borrow it for a couple of days :P
[20:44:41] Ganglia suggests it's busy-ish
[20:44:46] it's being used to serve images right now
[20:44:48] load just under 20%, constant network
[20:45:29] esams squids hit ms6 before traversing the ocean
[20:45:43] that makes sense
[20:45:51] as a more permanent cache
[20:46:09] multichill, just go in and remove it from the rack for a few days
[20:46:14] no one will notice ;)
[20:46:33] * multichill actually did that with a router this week
[20:46:41] :D
[20:47:22] One of my colleagues did ask if I made a mess of Nagios ...... again
[20:47:22] Reedy: can you reconcile what i said with needing a table? (shorturl)
[20:48:04] jeremyb, hm? The table is a map from an integer to ns/title
[20:48:09] Ask yuvipanda, he wrote it
[20:48:12] jeremyb: it is out of date.
[20:48:17] i remember editing it...
[20:48:55] Reedy: I'll just wait for apergos to wake up
[20:48:58] While this is the time for asking: does anybody know if the test ipv6 entry for upload.esams.wikimedia.org or removed or not?
[20:49:10] or → was
[20:52:13] jeremyb: krinkle made that change a while ago, not sure why it's gone. will fix
[20:58:46] yuvipanda: huh. well i thought i saw you talking about the method when i gave you the rewrite pointer and you said it used page id. maybe faulty memory
[20:59:15] jeremyb: it *initially* used pageids, and then I posted it to wikitech-l and got told how that's a bad idea
[20:59:19] so used a table.
[20:59:34] orly. /me searches wikitech-l
[20:59:39] krinkle then changed the wikipage, and i can see it in the history, but for some reason it isn't showing up
[21:00:03] well look at the diff :)
[21:00:15] i did
[21:00:16] what's your email?
[21:00:21] jeremyb: yuvipanda@gmail.com
[21:00:49] was it more than 2 months ago?
[21:00:52] i don't see it
[21:01:06] in my mail
[21:03:03] jeremyb: waaay more than 2 months ago :D
[21:03:14] jeremyb: that was my first mw related hack, about ~9 months ago, IIRC
[21:03:25] ohh
[21:03:32] * jeremyb expands search
[21:04:15] !log reedy synchronized wmf-config/InitialiseSettings.php 'wmgShortUrlPrefix'
[21:04:15] Logged the message, Master
[21:04:29] ohhh, you're GSoC
[21:05:04] But it isn't his gsoc project
[21:05:06] !log reedy synchronized wmf-config/CommonSettings.php 'wmgShortUrlPrefix'
[21:05:07] Logged the message, Master
[21:05:10] * Reedy hides from yuvipanda about html5
[21:05:25] :D
[21:05:37] jeremyb: yeah, but ShortURL was before GSoC
[21:13:24] * jeremyb has finished reading the thread, http://lists.wikimedia.org/pipermail/wikitech-l/2011-April/052804.html
[21:14:02] jeremyb: also built tawp.in/en/ta/India
[21:14:09] that's an interwiki redirector in python hosted elsewhere
[21:16:02] * jeremyb reads http://lists.wikimedia.org/pipermail/wikitech-l/2011-March/052418.html
[21:23:00] jeremyb: that was pretty much 'no-no'd
[21:23:32] but it was fun working with apergos and doing all that profiling on a tiny atom netbook
[21:23:39] i'm on like the 3rd msg of the thread
[21:23:59] what's the latest on dumps?
[21:24:41] jeremyb: from me? Nothing :D
[21:25:12] yuvipanda: well, anyone?
[21:25:36] jeremyb: you'd have to ask apergos. I think she put out a new type of dump a few days ago
[21:25:39] yuvipanda: 2 sentence reason for 'no-no'?
[21:25:57] oh, yeah... i can't remember the difference
[21:26:07] jeremyb: the problem isn't language, but the fact that dumps aren't parallelized
[21:26:19] yeah, exactly
[21:26:40] so rewriting in another language would be sort of pointless
[21:32:29] yuvipanda: oh, i remember it was block size?
[21:32:45] jeremyb: ?
[21:33:00] jeremyb: new ones, i think there are new dumps based on number of articles rather than raw size...
[21:33:00] yuvipanda: the latest iteration on dumps
[21:33:08] article count, I thought?
[21:33:14] with some block alignment magic
[21:33:14] ?
[21:33:24] idk what size means?
[21:33:38] raw size -> size of the files
[21:33:43] like 'stop after 10G'
[21:33:47] or something?
[21:33:49] not sure :D
[21:33:54] i read the same email you did :)
[21:34:57] he tried with each article compressed independently and 100 in a block and some things in between
[21:35:11] * jeremyb heads afk
[22:04:30] umpf Request: GET http://commons.wikimedia.org/wiki/Special:ListFiles/Nemo_bis, from 208.80.152.87 via sq66.wikimedia.org (squid/2.7.STABLE9) to ()
[22:04:31] Error: ERR_CANNOT_FORWARD, errno (11) Resource temporarily unavailable at Thu, 05 Jan 2012 22:03:59 GMT
[22:05:51] dammit, pt.wn is down
[22:06:57] Will anybody note that? ;)
[22:07:09] Looking
[22:08:38] mediawiki.org is up for me
[22:08:52] There's a bump in the graphs but load seems to be going back down now
[22:09:25] pt.wn works here too
[22:18:57] !log preilly synchronized php-1.18/extensions/MobileFrontend/MobileFrontend.php
[22:18:58] Logged the message, Master
[22:19:09] !log small fix for iPhone vary support
[22:19:10] Logged the message, Master
[22:20:26] New patchset: Ryan Lane; "Putting LDAP before files is insane, when most sudo is being handled by files." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1795
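
The patchset at [22:20:26] is about the lookup order for sudoers entries when sudo is backed by both LDAP and local files. Assuming that order lives in /etc/nsswitch.conf (the actual file and its management are in the puppet change, which this log does not show), the intent is roughly:

    # show the current lookup order for sudoers entries (path is an assumption)
    grep '^sudoers:' /etc/nsswitch.conf
    # desired ordering per the commit message: local files first, LDAP only as a fallback
    #   sudoers: files ldap
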
[22:21:39] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1795
[22:21:39] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1795
[22:48:01] nighty~
[22:52:24] sigh http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=652948
[22:58:10] Nemo_bis: the debian guys are not THAT fast, you know ;)
[23:52:18] New patchset: Lcarr; "Puppetizing ganglia and gangliaweb Puppetizing automatic saving and restoration of rrd's from tmpfs to disk Modifying gmetad startup to import rrd's" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1797
[23:56:10] gn8 folks
[23:57:40] PROBLEM - mobile traffic loggers on cp1042 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa
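
The final patchset ([23:52:18]) describes saving and restoring Ganglia RRDs between tmpfs and disk. A rough sketch of that idea with hypothetical paths (the real paths and hooks are in the puppet change, not in this log):

    # periodic save (e.g. from cron): copy RRDs out of tmpfs so a reboot doesn't lose them
    rsync -a /mnt/ganglia_tmp/rrds/ /var/lib/ganglia/rrds.saved/
    # restore before gmetad starts: seed the tmpfs from the last on-disk copy
    rsync -a /var/lib/ganglia/rrds.saved/ /mnt/ganglia_tmp/rrds/
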