[06:11:49] https://wikitech.wikimedia.org/wiki/Nova_Resource_Talk:Tools/Help#Hosted_jQuery_etc. [06:12:18] what could be the URL to load a copy jquery.min.js at on bits.wikimedia.org? [06:17:36] https://en.wikipedia.org/w/load.php?modules=jquery is wrapped in a mw.loader.load [06:17:44] er, .implement [06:18:51] https://bits.wikimedia.org/en.wikipedia.org/load.php?modules=jquery [06:18:58] Nemo_bis: ^ [06:23:33] hello [06:23:33] Nemo_bis: what does "free access" mean? [06:23:46] cortexA9: again? [06:23:52] Nemo_bis: have you issues ? [06:23:55] what ? [06:24:35] cortexA9: are you still having problems with connections to esams? [06:24:49] no.. [06:24:57] ok :) [06:25:03] :) [06:25:55] what is the issue ? [06:25:59] cortexA9: anyway, fyi, first 2 letters of datacenter is vender. last 3 letters is airport [06:26:06] no issue afaik [06:27:16] jeremyb: free as in not restricted by something ensuring it comes from MediaWIki [06:27:32] bits URIs are usually very weird [06:27:53] cortexA9: besides wtf, why am i awake at this hour?! [06:28:08] that's a better question :p [06:28:28] eheh jeremyb idk [06:28:35] :) [06:29:23] but the problem i thought was bits.wikimedia.org [06:29:31] loading time.. [06:29:32] earlier? [06:29:45] no, it was all of the domains across the board [06:30:21] jeremyb: suggested that to pathos, thanks [06:30:22] Nemo_bis have the same issue.. [06:32:34] Nemo_bis: i haven't looked at the bits URLs... idk how stable they are or not. but i guess we could just put a caching+anonymizing proxy in front of google apis? [06:32:37] idk [06:32:44] have to think about it :) [06:33:06] jeremyb: that URL looks rather stable, I've seen it around for a while :) [06:33:20] no idea what the content is though ;) [06:35:59] jeremyb: bits.wikimedia.org is an alias for bits-lb.esams.wikimedia.org. [06:36:15] jeremyb: bits-lb.esams.wikimedia.org has address 91.198.174.233 [06:36:29] i have that [06:37:25] cortexA9: ok, but your traceroute may have changed. or at least our BGP setup has. i think [06:37:35] cortexA9: what's your point? [06:37:49] jeremyb: the point is slows wikipedia.. [06:38:04] bits serves all the CSS and JS... [06:38:14] you can't exactly browse the site with no css [06:38:22] 11 06:24:35 < jeremyb> cortexA9: are you still having problems with connections to esams? [06:38:25] 11 06:24:49 < cortexA9> no.. [06:38:25] yes, just unstyled [06:38:28] well, it would just look weird. [06:38:28] you said no [06:38:33] are you saying yes now? [06:39:28] no problems here... ping 8 times faster than yesterday and 0 packet loss [06:40:16] but sometimes.. [06:40:21] not always [06:40:31] continue? [06:40:35] try full sentences [06:40:46] yesterday i mean.. [06:40:58] who cares about yesterday [06:41:01] focus on today [06:41:01] ok [06:41:07] today is good [06:41:14] full stop [06:41:19] :) [06:52:27] well howdy, I was directed here by the Powers that Be. [06:52:53] if there's anything I can assist with wrt DDoS mitigation, please let me know :) [06:55:07] hello nakon [06:55:52] howdy cortexA9, that's a name i haven't seen i a few years [06:55:55] :) [06:56:30] hehe nakon [06:57:16] 2 years makes me a long-time contributor [06:57:17] :) [06:57:45] well, probaly more than 2yr :) [06:58:26] wow [06:59:34] i wouldn't "contributor" after that... sry :c [07:01:12] nakon: why not ? [07:02:22] sometimes i join in this channel for report issues.. 
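A minimal sketch of the point made at [06:17:36]–[06:18:51] above, assuming the third-party `requests` library: fetching jQuery through the bits load.php URL and checking whether the payload comes back wrapped in mw.loader.implement. The wrapper text and the bits hostname are exactly as quoted in the log and may differ on other MediaWiki versions or after later infrastructure changes.

```python
# Sketch (assumes `requests`): fetch the jQuery module from the bits load.php URL
# quoted above and report whether it arrives wrapped in mw.loader.implement().
import requests

BITS_JQUERY = "https://bits.wikimedia.org/en.wikipedia.org/load.php?modules=jquery"

def fetch_jquery_module(url: str = BITS_JQUERY) -> str:
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.text

if __name__ == "__main__":
    js = fetch_jquery_module()
    print(len(js), "bytes; wrapped in mw.loader.implement:", "mw.loader.implement" in js)
```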
[07:02:46] :) [07:05:59] nakon: i like wikipedia i am an old visitor :) [07:06:45] no not indeed, I do apologize for elders :) [07:07:41] no [07:07:48] i mean i am young :D [07:07:53] yound old visitor :D [07:07:58] *young [07:09:27] i want to contribute in a better wikipedia. [07:09:37] :) [07:09:49] nakon [07:10:26] cortex hai [07:10:44] we all do :) [07:16:42] this is my favorite channel [07:16:45] :) [07:19:54] techs araound? [07:21:03] hello Steinsplitter [07:22:06] have you ever thought of doing a mirror backup of wikipedia? [07:22:43] like… dumps.wikimedia.org? [07:23:10] legoktm: the mediadabase is broken :/ [07:23:20] i cant do anything about it :x [07:23:41] i know. is evil :D [07:23:42] legoktm: like example: mirror.wikipedia.org [07:23:50] :) [07:24:08] cortexA9: what's the point exactly? what's wrong with what dumps gives? [07:25:12] i mean a very mirror of wikipedia [07:25:20] online [07:25:35] not a dump.. [07:27:29] for security purpose. [07:29:51] why not make a torrent of the all wikipedia too [07:30:19] people can host wikipedia [07:30:30] if they want :) [07:31:39] cortexA9: https://meta.wikimedia.org/wiki/Data_dumps [07:36:55] legoktm: with pictures ? [07:36:59] :) [07:37:12] that's a different dump, its on archive.org i think. Nemo_bis would know. [07:40:54] http://web.archive.org/web/20010727112808/http://www.wikipedia.org/ [07:40:54] wow [07:40:58] 2001 :) [07:45:04] hello apergos [07:45:08] hello [07:46:25] dump of what [07:47:09] cortexA9: if you mean off-site backups for volcanoes exploding in Virginia and the like, https://wikitech.wikimedia.org/wiki/Bacula [07:48:57] Nemo_bis: i mean all wikipedia in another host. mirror backup. [07:49:36] yes, see above [07:49:44] cortexA9: the full dumps are replicated to several organisations [07:49:58] the dumps are not a reliable source for us for a mirror [07:49:59] and we also have the dbs replicated accross DCs as well [07:50:11] (not for long) [07:50:12] we already have db snapshots, replicate the db [07:50:27] of course multiple hosts have the mediawiki installation [07:51:22] we can ask archive.org to host [07:51:35] they already have dump copies [07:51:37] no ? [07:51:45] we can't give them db copies; these contain private data [07:52:07] oh [07:53:11] I'm also archiving Commons files on archive.org as we speak https://archive.org/search.php?query=subject%3A%22Wikimedia+Commons%22 [07:53:27] so far, 10,617,718,431 KB [07:54:16] they could probably replicate from prelabsdb1 in realtime if they really wanted to, but I doubt there would be that much benefit compared to storing the dumps [07:56:50] well, WMF claims that maintaining replication of that sort is very expensive to them [07:58:01] wouldn't it basically just be bandwidth now, since that feed is maintained for labs? [07:58:59] I doubt it, it was mentioned as a reason to kill TS [07:59:09] (as in, TS replication) [08:01:01] sigh, Either this webhost hasn't responded to a ticket I filed on the 8th, or the web interface doesn't show responses and I have to wait till I go into the office... [08:01:11] and i'm in the wrong channel... [08:10:40] apergos: http://www.httrack.com/ [08:10:47] :) [08:11:11] that's not how we would or could go about it [08:12:01] text for old versions of pages, for example [08:12:41] doesn't sit in a file in a directory which can be recursively copied... 
revision data lives in a table in one database, the text lives in a database on another cluster [08:21:00] apergos: oh i understood [08:22:45] p858snake|l: turns out they might have changed their mind http://lists.wikimedia.org/pipermail/toolserver-l/2013-September/006289.html [08:23:12] the nda is the tough part I guess [08:25:04] what is NDA ? [08:25:14] non-disclosure agreement [08:25:32] apergos: well, that's easy, only roots have to sign it [08:25:52] anyone with access to that data would have to sign it, I think [08:26:09] if we are talking about replication of certain user data for example [08:26:19] "that data" being the private data [08:27:03] so if I were able to query those tables in the db (not required to be a root) for the purposes of my specific tool [08:34:33] among the other things that could be discussed on that thread is who would need to sign an nda, what the nda would have to say, what identifying information the user would have to provide (full, as checkusers do, I suppose), and for which sorts of data access [08:34:57] if there were a wiki page with a clear policy and set of requirements, that could be very useful [08:35:11] that's for tools, not backups [08:35:37] you can imagine archive.org taking a full copy on some hosts with only few employees of their having access [08:36:21] on a replica for tools only roots would have (potential) access to private data, as on toolserver, the normal users only public stuff [08:37:03] but yes, it would be very useful to have it documented somewhere, though I don't think it's the main roadblock for such an arrangemenet [08:40:03] yes, I thought we were talking about replication to another site that would host tools (according to the email) [08:46:34] apergos: wikipedia have different servers in europe ? [08:46:54] we have three data centers [08:47:28] european readers get content via amsterdam [08:47:29] apergos: if i am in europe how it can decide the best server for me ? [08:48:09] it's automatic ? [08:49:32] yes, it's automatic [08:50:04] dns resolution of the hostname will give you the right ip [08:50:09] and the rest just happens [08:51:27] https://developers.google.com/speed/pagespeed/ [08:51:31] this can help wikipedia ? [08:51:35] in terms of speed [08:52:44] people have worked a lot on the speed issue [08:53:22] from minimized js to serving only the pieces needed for initial page loading to caching output from the parser to serving all static resources from a separate cluster (and with a cache in front) [08:53:35] this is constant and ongoing work [08:54:39] if there is a particular area you are interested in, you could follow the discussion via bugzilla our the appropriate mailing list (or of course irc in wikimedia-dev, when such discussions are happening) [08:54:47] and you can contribute as well [08:56:43] apergos: https://developers.google.com/speed/pagespeed/module [08:56:48] for apache [08:57:30] ok so what I would highly suggest you do [08:57:42] (since I am not involved in that part of the development at all) [08:57:52] is to have a look at the archives of the wikitech-l mailing list [08:58:11] ok [08:58:13] http://lists.wikimedia.org/pipermail/wikitech-l/ [08:58:38] specifically around the issues of speed, page rendering and delivery [08:59:01] if you have trouble finding them you might subscribe and post to the list asking for pointers to such discussions [08:59:17] and from there you can see what has been done or what is still missing [08:59:42] how to open a new discussion ? 
[09:02:00] well you wuld want to subscribe to the list [09:02:32] just bear in mind that you're coming into the middle of a topic that has had some work done on it, so you'll want to find out what work has been done first [09:03:09] yes [09:07:29] apergos: subscribed [09:08:34] ok [09:20:11] apergos: gmane.science.linguistics.wikipedia.technical is this ? [09:20:39] p858snake|l: turns out they might have changed their mind http://lists.wikimedia.org/pipermail/toolserver-l/2013-September/006289.html uh, dunno, I don't read the archives from gmane [09:21:42] f you look at one of the messages in there it ought to have a link to the actual list though [09:25:54] apergos: http://lists.wikimedia.org/pipermail/toolserver-l/2013-September/006285.html [09:26:24] that was the comment from louis about transporting db contents off labs to a 3rd party to allow processing by non opensource tools [09:26:27] ok [09:26:43] presumably they have to have signed an nda also [09:27:02] apergos: under that theory, every user on labs has to as well [09:27:17] everything on labs is already sanitised anyway [09:27:23] oh, off of labs [09:27:36] yes, the labs data is public data I think [09:34:53] User has two different usernames on two different wikis. He wants to merge them. How is this done? [09:36:35] wmf wikis, or another project? [09:38:53] ok apergos just posted it. [09:40:34] wmf wikis [09:41:16] cortexA9: the speed mod email? [09:41:27] afaik thats already been discussed, check the archives [09:41:46] oh sorry [09:42:07] what about it ? [09:42:36] If i remember correctly, TimStarling researched it and it wouldn't be much use due to our caching layer in front [09:43:31] ok i understood sorry for double post. [09:44:05] https://www.mediawiki.org/wiki/User:MaxSem/mod_pagespeed [09:44:14] thats maxsem's research on it [09:45:13] good [09:45:19] nice to know [09:46:48] http://lists.wikimedia.org/pipermail/wikitech-l/2013-July/070310.html (start of one thread on it) and a even older thread http://lists.wikimedia.org/pipermail/wikitech-l/2010-November/050140.html [09:48:16] thanks for digging those up, p858snake|l [09:54:05] apergos: what about Prolexic ? in case of ddos. [09:55:39] well if we want to talk about ddos it's good for you to have an understanding of our basic architecture [09:55:51] p858snake|l, wmf wikis [09:56:06] so you can see where the specific weaknesses are, where we're more vulnerable and where we're not [09:56:51] most everything about our setup is available either on wikitech.wikimedia.org (look through the docs) [09:57:39] there are occasional talks folks give too about our setup, you can find those .. hm.. commons maybe? not too sure where those are gathered at this point [09:58:49] apergos: load balancing i think just enough right ? [09:58:52] we will in general prefer open source solutions to proprietary ones, and we will prefer local work as opposed to outsourcing when it goes to something in our core focus (such as keeping the site up) [09:59:13] we have load balancing of course [09:59:25] so what is the problem.. 
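As an aside on the geo-DNS explanation above ([08:50:04]: "dns resolution of the hostname will give you the right ip" and "the rest just happens"), a minimal sketch of looking at that resolution from the client side. The answer depends on where your resolver sits, and the hostnames and address quoted in the log (bits-lb.esams.wikimedia.org, 91.198.174.233) are a 2013 snapshot, not guaranteed to be current.

```python
# Sketch: inspect which front-end hostname/addresses GeoDNS hands back for a
# service name. Output varies by resolver location; the example hostnames are
# the ones discussed in the log and may no longer exist.
import socket

def resolve(hostname: str) -> None:
    canonical, _aliases, addresses = socket.gethostbyname_ex(hostname)
    print(hostname)
    print("  canonical name:", canonical)            # e.g. bits-lb.esams.wikimedia.org from Europe
    print("  addresses:     ", ", ".join(addresses))

if __name__ == "__main__":
    for name in ("bits.wikimedia.org", "en.wikipedia.org"):
        resolve(name)
```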
[09:59:30] :) [09:59:51] https://wikitech.wikimedia.org/wiki/LVS [09:59:57] there you go, load balancing [10:00:13] https://wikitech.wikimedia.org/wiki/Pybal and pooling/depooling hosts automatically [10:00:49] there are many types of ddos, not just 'give me this static html page' [10:01:09] anyways, again this is something where you want to get up to speed on the current setup and on past discussions [10:01:15] then you would be in position to jump in [10:01:46] apergos: i like the diagram [10:05:07] apergos: nameservers are protected ? from attacks. [10:05:29] we have some things we do [11:00:36] apergos: hello [11:14:44] hello [11:14:47] sorry, landlord [11:15:09] i don't see parent here [11:15:38] I'll ping [11:15:47] ok [11:18:07] pinged [11:18:15] I will be back in 5 mins [11:25:06] hmm [11:25:09] no parent [11:25:19] but rent paid so that's something [11:25:32] so how are things going? [11:25:58] ohh I see a pile of commits in the last little bit [11:26:24] quite well, it turns out deleting texts from a group was simpler than i expected [11:26:38] yay! [11:26:50] what were you thinkign you would have to do? [11:27:24] i wasn't sure, but until now, some problems always appeared when i was doing things like that [11:27:54] Eep. I need to get this week's tech report done [11:28:23] oh, i'm making some assumptions about the text of articles, and i wanted to verify if they are right [11:28:37] let's hear them [11:28:46] 1. the text of a page won't contain the zero byte [11:29:18] 2. the text of the page won't be UTF-8 encoded U+FFFF Unicode Noncharacter [11:29:52] if those are not true, i will have to figure out some other way to represent text groups [11:30:15] because i am using the zero byte as a delimiter between pages [11:30:43] and page text that is just U+FFFF means that the text was deleted from the group [11:31:11] (i'm updating the specification page to reflect the recent changes now) [11:31:14] wikipedia is the six website of the world :) [11:34:25] it's not possible to have a real time wiki? [11:34:40] a sort of a wikipedia 2.0 :P [11:37:53] Do you mean ONLY U+FFFF? [11:38:02] Or CONTAIN U+FFFF [11:38:28] Because the first is almost certainly safe, the second not so much [11:38:36] ? [11:38:39] what's that [11:38:45] Svick's question [11:39:00] U+FFFF being a Unicode null character [11:39:27] you remember google wave fail ? [11:39:28] i mean only U+FFFF [11:39:42] I'm going to try to make a zero-byte page [11:40:32] p858snake|l: that reply by Luis is later, not earlier :) [11:40:42] Given vandalism, it's not entirely a sasfe assumtion, but it's vanishingly unlikely [11:40:53] ? [11:40:58] it also doesn't say nothing on actual transfer of data [11:41:17] it was in September, the message I quoted in late August [11:42:10] https://en.wikipedia.org/w/index.php?title=User:Adam_Cuerden/Sandbox&action=history <- Okay, yes, a page can be zero bytes [11:42:19] Just a matter of someone blanking the page [11:42:33] AdamCuerden: it looks like U+FFFF is replaced with U+FFFF REPLACEMENT CHARACTER upon saving, so that should be okay [11:42:59] I'm pretty sure I make the assumption that null (\0) is not allowed in text [11:43:06] atfer having looked at the code [11:43:10] I am looking at it again though [11:43:16] Oh, do you mean the null character? [11:43:18] and i know a page can be empty, but that's not what i meant; i meant that it can't contain the zero byte '\0' [11:43:26] yeah [11:43:39] Oh, that I can't help with [11:44:13] Let's see.. 
[11:44:26] I suppose if any page'll have them, it'll be [[Null character]] [11:46:57] there is a wiki 2.0 ? [11:47:24] https://en.wikipedia.org/w/index.php?title=User:Adam_Cuerden/Sandbox&action=history <- I've attempted to add both null characters to this page. Are they there? [11:47:34] what is the most evolved wiki right now. [11:48:57] AdamCuerden: i see only U+25BA there [11:49:02] because you are xml encodings back you are likely going to be fine [11:49:35] in the future we can see a real time mediawiki ? [11:49:35] Then probably safe [11:50:04] I tried Alt+0, Alt+0000, and Alt+69904 [11:50:20] apergos: but i am not saving the texts XML encoded [11:50:27] I suppose a more precise test could be tried if you have them ready to paste into a page [11:50:52] of course you're not, what was I thinking [11:51:30] how can i suggest feature requests for mediawiki [11:51:55] http://www.fileformat.info/info/unicode/char/ffff/browsertest.htm has a pasteable U+FFFF, i tried that and it got saved as U+FFFD [11:53:30] Right [11:53:36] Then I think you're safe [11:54:25] yeah, it looks that way [11:54:45] you could try adding � and see what it does with that [11:55:01] I suppose it might be possible to bot-inject such codes. [11:55:41] Or, in theory, a parsoid bug. [11:55:56] parsoid bugs aren't theory :-D [11:56:15] Yes, but they seem to prefer chess pieces. [11:56:37] heh [11:59:01] apergos: if i write that to a wiki page normally, then it look like &#0000; in the XML, so i after decoding, i will get literally � back [11:59:21] if i edit the XML manually, then that won't tell me if MediaWiki can produce such XML [11:59:27] leaving your browser to do that final decoding [12:00:56] and it looks like MediaWiki itself does something with such code: if i write �* to a wiki page, the HTML contains &#0000;* [12:02:00] it's got some sanitize routines [12:02:00] but I'm not sure how they interact with the new contenthandler stuff [12:04:41] hi [12:04:49] http://www.mediawiki.org/wiki/Future/Real-time_collaboration [12:04:52] apergos [12:04:56] :) [12:06:32] i have page source of this page https://en.wikipedia.org/w/index.php?title=Portal:Arts how i can add the source to my wiki to use it and change on it???? [12:07:48] ? [12:11:08] i have page source of this page https://en.wikipedia.org/w/index.php?title=Portal:Arts how i can add the source to my wiki to use it and change on it???? [12:12:17] static function decodeChar( $codepoint ) { [12:12:18] got it [12:12:26] you should be good to go with null and ffff [12:12:29] marktraceur: hello [12:12:40] validateCodepoint [12:12:53] see Sanitizer.php [12:12:59] it was staring me right in the face [12:13:02] svick: [12:13:04] apergos: ok, i tried to add those to a page using the API, and the results were also good [12:13:21] sorry to take so bloomin long [12:13:32] thanks for looking into it for me [12:14:04] i have page source of this page https://en.wikipedia.org/w/index.php?title=Portal:Arts how i can add the source to my wiki to use it and change on it???? 
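A toy illustration of the text-group layout svick describes above ([11:28:46]–[11:30:43]) and of the two assumptions just verified (saved wikitext cannot contain a NUL byte, and a bare U+FFFF is stored as U+FFFD): page texts joined with a zero byte, and a text that is exactly U+FFFF standing for "deleted from the group". The function names are hypothetical; this is not the actual incremental-dump code, just a sketch of why the two assumptions matter.

```python
# Toy sketch (not the real incremental-dump code) of the group layout described
# above: texts separated by a zero byte, a lone U+FFFF marking a deleted text.
from typing import List, Optional

DELIMITER = b"\x00"        # assumption checked above: saved page text never contains NUL
DELETED_MARKER = "\uffff"  # assumption checked above: MediaWiki turns a bare U+FFFF into U+FFFD

def encode_group(texts: List[Optional[str]]) -> bytes:
    """None means the text was deleted from the group."""
    return DELIMITER.join(
        DELETED_MARKER.encode("utf-8") if t is None else t.encode("utf-8") for t in texts
    )

def decode_group(blob: bytes) -> List[Optional[str]]:
    return [None if c.decode("utf-8") == DELETED_MARKER else c.decode("utf-8")
            for c in blob.split(DELIMITER)]

if __name__ == "__main__":
    group = encode_group(["== Heading ==\nSome wikitext.", None, ""])  # "" = blanked page, still present
    assert decode_group(group) == ["== Heading ==\nSome wikitext.", None, ""]
```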
[12:14:05] sure [12:14:24] cortexA9: ehtereditor I guess, I've played with it and like the concept but the big deal will be attribution [12:14:34] *ethereditor [12:15:09] so, now that i think all highest-priority things are done, i'm going to work on lower-priority things; starting with using LZMA with groups for diff dumps [12:15:25] ok [12:15:37] apergos: what u mean for attribution [12:15:45] i have page source of this page https://en.wikipedia.org/w/index.php?title=Portal:Arts how i can add the source to my wiki to use it and change on it???? [12:16:27] ahmed__: http://en.wikipedia.org/wiki/Wikipedia:Reusing_Wikipedia_content [12:18:22] cortexA9: after 5 people have edited a revision on the etherpad popup and someone clicks 'save', who does the edit get attributed to? if you don't do it that way but every edit to the pad is considered a new version, how do you handle that (and is it reasonable)? [12:19:16] apergos: we need to find a new way. [12:20:34] apergos: i i don't needt the content of the page just need the forms and css and code [12:21:06] svick: how long do you think it would be before I could try testing with it in production? basically I will need to do the following, I think: convert an initial full xml dump to the new format, run an incremental of that same wiki, apply it to create a new full, convert the new full to xml, keeping track of space/time requirements [12:21:45] apergos: maybe an integration of the feature of etherpad in mediawiki [12:24:07] and I'd like to do that with one of the larger wikis to get a sense of things, and then with en (hmm, how will incrememtals applied to chunks work?) [12:25:08] apergos: i guess after i implement the better compression for diff dumps (which i think should be done by tomorrow) [12:25:24] nice! [12:25:37] ??? [12:25:40] do you get what I mean aboout the en dumps? right now they are done in 27 pieces [12:25:42] apergos?? [12:25:54] at the same time, the 27th is where the new pages end up [12:26:05] but old changed pages wind up anywhere in the first 26 [12:26:05] the only changes that i think could affect your results that i plan after that is better compression for metadata, but that shouldn't affect it that much [12:26:10] sure [12:26:31] ahmed__: I didn't understand what it is you want [12:27:29] d I just give the page range for each chunk? is that going to work out? [12:27:31] apergos: are u worried about the dumps ? [12:27:35] svick: [12:27:40] if you treated each piece basically as a separate wiki, then the current code will work [12:28:08] apergos : u see the link i need the source code for this page and put it in my wiki page to make page have the same style , where i can put page source code?? [12:28:28] apergos: if you need something more, then i would have to write that [12:29:02] ok so ahmed__ and cortexA9, I'm actually in a meeting right now so I'm going to ask you both if you want to talk to me, to wait a while (you can ask other folks of course) [12:29:20] svick: what I need to be able to do is update the pages from A to B say [12:29:32] so getting all changes for those pages as an incremental [12:29:45] apergos: cai i wait you [12:29:47] ?? [12:30:04] being able to convert the original xml for those pages to the new format [12:30:19] apply the incremental to that [12:30:38] there shouldn't be anything blocking me from doing that right? 
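A small sketch of the piece layout just described: the en.wikipedia history dump is split into pieces by start/end page id, old changed pages land in whichever piece their id falls into, and everything created since goes to the extra, final piece, with each piece then treated as a separate wiki. The ranges below are made-up placeholders, not the real enwiki chunk boundaries.

```python
# Sketch of the "27 pieces" layout discussed above: the first pieces cover fixed
# page-id ranges, and any page id past the last range (i.e. a page created after
# the full dump) belongs to the extra, final piece. Ranges here are placeholders.
from typing import List, Tuple

def piece_for_page(page_id: int, ranges: List[Tuple[int, int]]) -> int:
    """0-based piece index; ids beyond the last range go to the final, catch-all piece."""
    for index, (start, end) in enumerate(ranges):
        if start <= page_id <= end:
            return index
    return len(ranges)

if __name__ == "__main__":
    demo_ranges = [(1, 1000), (1001, 5000), (5001, 20000)]  # placeholder boundaries
    assert piece_for_page(1234, demo_ranges) == 1           # existing page, changed in place
    assert piece_for_page(999999, demo_ranges) == 3         # created later: lands in the final piece
```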
[12:30:51] yeah, if you want to convert a single XML piece to a single incremental piece, that should work [12:31:27] and if you then want to update the incremental piece, i think that should be just a matter of specifying the right parameters to dumpBackup [12:31:33] if a page is moved then we will see it gone in the one piece (so that's just a delete) and we'll see a new page (with old revs but the other chunk won't have those revs so ..) for a later chunk [12:31:55] i have page source of this page https://en.wikipedia.org/w/index.php?title=Portal:Arts how i can add the source to my wiki to use it and change on it???? any some one help? [12:32:03] the pieces are by title? [12:32:06] well I know how to get it to give me stubs for a given page range [12:32:13] they are by start page id to end page id [12:32:25] but if you move a page, its id doesn't change [12:32:38] ordered by page id and within a page the revs are typically by rev id but it's not 100%, as we discussed the other day [12:32:56] ok well the redirect will be new [12:33:04] whichever, I always get those screwed up [12:33:22] a delete and an undelete then [12:33:35] apergos: what u are doing ? [12:33:39] we'd see a delete in the one chunk and an apparent new page in the other [12:33:57] where by other I mean the last chunk which gets all new pages [12:34:20] ok, yeah, that will look like a delete in one piece and a completely new page in another piece (and all revisions for the undeleted page will be loaded from the database) [12:34:25] right [12:34:35] that's completely acceptable [12:34:45] heck it's what we do now [12:35:09] ok I'm going to assume there are no hidden gotchas for that and we'll see [12:35:11] right [12:37:21] i have page source of this page https://en.wikipedia.org/w/index.php?title=Portal:Arts how i can add the source to my wiki to use it and change on it???? any some one help? [12:38:05] if it turns out that everything works fine and the speed is more than okay, then i would like to combine all those pieces into one [12:38:16] btw have you tested the 'convert to xml' on something largish, like the tr wp ne format file? [12:38:37] what, all 27 pieces? it turns into a few hundred gb to download [12:38:44] this is another reason to split them up [12:38:56] why split ? [12:39:10] and don't make a one torrent [12:39:26] ok [12:39:58] and yeah, i tried incremental to XML on trwiki, and it worked fine [12:40:03] ah cool [12:40:34] how did memory usage for that look? [12:41:30] i'm not sure exactly, but i think it wasn't much [12:41:36] apergos: how many space all the files ? [12:42:53] ok, can't wait to do some real world playing [12:42:54] oh dump in progress.. [12:43:21] http://dumps.wikimedia.org/backup-index.html [12:43:57] apergos: i tried it again, and it looks like it peaks at ~200 MB [12:44:21] pretty nice [12:44:36] I'll try it on some others that are a bit beefier :-) [12:45:12] hey i used Export pages to the page i want it and it download xml file how i use it?? [12:45:45] well, the memory usage shouldn't depend on the size of the wiki, at least not by much [12:46:29] we'll find out! [12:46:55] any help from any one? [12:47:52] oh, i just realized that one of the fixes for memory usage i did won't work for reading, i'll have to fix that [12:48:01] ah ha [12:48:34] :) [12:48:41] apergos: wikipedians can seed torrents too each other :) [12:49:17] why use the bandwidth of the foundation.. [12:50:56] on sept 27 folks are supposed to start submitting their code... 
I'd like you to plan for at least some broad comments in the code before that happens (some of the main classes at least ought to get a couple lines) [12:51:23] enough so that other folks looking at it won't be left completely clueless [12:51:53] maybe we can start a tracker on wikimedia [12:52:40] i thought that i'll spend the next week writing documentation (if it takes that long), so that should be okay [12:52:47] cool! [12:53:19] I think that's all of my concerns/questions for right now [12:54:02] ok, i don't have anything else either, see you tomorrow [12:54:13] ok, see you then [12:54:44] ahmed__: special:import (make sure you exported the current revision only, plus all the templates) [12:55:08] i do that [12:55:24] bot how i import?? [12:55:34] cortexA9: volunteers torrent files already, which makes more sense than us doing it; we produce the dumps because a volunteer cannot, but a volunteer can download a dump and create a torrent, freeing us to work on things only we can do [12:55:43] Special:Import ahmed__ [12:55:55] good apergos :) [12:55:58] i found it [12:56:05] http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps [12:56:11] you don't need all that but anyways [12:57:01] but i change on oit after import (chane color, title, width,.....) [12:57:58] might depend on your local wiki's MediaWiki:x.css/js files plus your skin [12:58:04] you'll have to look at all that [12:58:56] greate , you are helpful man [12:58:58] realy thanks [12:59:11] http://dumps.wikimedia.org/enwiki/20130904/ [12:59:17] when this end ? [12:59:46] try looking at the previus run; you can make an estimate from that [13:01:06] apergos: can i try and ask you again about what happened with me? [13:01:17] go ahead [13:02:04] it give mt an error ( ! ) Fatal error: Maximum execution time of 30 seconds exceeded in C:\wamp\www\mediawiki\mediawiki-1.21.1\includes\parser\Preprocessor_DOM.php on line 977 [13:02:33] what is it you are trying to do, when it gives this error? [13:03:46] just import the xml [13:03:54] ah [13:04:12] right, it renders them all in order to update links tables >_< [13:04:43] mmmmm [13:05:10] so what i do?? [13:06:38] in your php.ini you need to change max_execution_time [13:06:45] bump it up to a few minutes I guess [13:07:09] alternatively you could try to import the file via the maintenance script [13:07:18] ok i i will do it now [13:07:50] maintenance/importDump.php and that's discussed on the mediawiki page I linked above as well [13:08:21] yea [13:08:46] i wanna make a new torrent of the 20130904 [13:08:49] i will try and tell you [13:09:19] tryin to download all the 27 parts [13:10:19] you might want to wait til the 7z files are done, cortexA9 [13:10:39] smaller that way, unless you only want to provide the pages_articles current versions [13:10:50] that's available as a single file so you could just seed that [13:11:57] ok i wait [13:12:45] there already is a torrent for pages-articles: http://burnbit.com/torrent/255157/enwiki_20130904_pages_articles_xml_bz2 [13:13:50] i mean for [13:14:02] pages-history-meta [13:14:05] apergos: it done but in the end this error appear (faild: No handler for model 'Scribunto'' registered in $wgContentHandlers) in the last [13:14:34] sorry pages-meta-history [13:14:34] ah, well you will need that for some content (the scribunto extension plus lua [13:14:35] ) [13:14:58] i wanna make one torrent [13:15:13] when finished [13:16:16] i will find this extension [13:16:32] but how i show the page?? 
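Given the "No handler for model 'Scribunto' registered in $wgContentHandlers" failure above, a minimal sketch of scanning an export file for the content models it uses before running Special:Import or maintenance/importDump.php, so the needed extensions (Scribunto plus Lua for that model, etc.) can be installed first. It assumes the export schema includes a <model> element per revision, as MediaWiki 1.21-era dumps do; older exports may not carry that information.

```python
# Sketch: list the content models used in an XML export so you know which
# extensions must be registered in $wgContentHandlers before importing.
# Assumes the export includes <model> elements (export schema 0.8+, MediaWiki 1.21+).
import sys
import xml.etree.ElementTree as ET
from collections import Counter

def content_models(dump_path: str) -> Counter:
    models: Counter = Counter()
    for _event, elem in ET.iterparse(dump_path, events=("end",)):
        if elem.tag.rsplit("}", 1)[-1] == "model":  # strip the export XML namespace
            models[(elem.text or "").strip()] += 1
        elem.clear()                                # drop element text as we go to limit memory use
    return models

if __name__ == "__main__":
    for model, count in content_models(sys.argv[1]).most_common():
        print(f"{count:8d}  {model}")
```

Run as, e.g., `python list_models.py export.xml` (script and file names here are just placeholders).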
[13:16:58] when the page is imported you'll be able to view it just like any other page, by the page title [13:17:17] with luck it will even show up in your recent changes feed [13:18:47] oh svick [13:19:34] maybe i can seed the burnbit one [13:21:14] i think that's a good idea, because that torrent is combined with HTTP, so it won't die as long as the file as available from dumps.wikimedia.org [13:22:19] no i can't fount it in recent or page title [13:22:55] if the import didn't succeed then you won't find it [13:23:11] you had an error message aboout a missing contenthandler [13:23:21] go get scribunto set up and try again [13:23:32] oh right [13:26:13] apergos: what about this error (Warning: XMLReader::open(): Unable to open source data in C:\wamp\www\mediawiki\mediawiki-1.21.1\includes\Import.php on line 51 Call Stack ) [13:26:33] looks like it's not seeing your fie [13:26:35] file [13:28:28] because the extension i think [13:31:21] what is infbox??? [13:36:18] apergos: if i come after 2 hours i can find you? [13:36:36] no, but there are plenty of other people [13:36:41] just ask whoever is here [13:41:02] ok thanks but another eq# apergos [13:43:13] apergos: after finishing i want this page to appear in every page in my wiki how, because i will use it like page template ? [13:44:08] you'll have to make it into a template to add to every page or [13:44:40] find an extension that lets you include given wikitext or something. sorry I don't really have a good answer for you about this [13:44:47] someone else might have a better idea [13:45:24] apergos: thanks you are great man [13:45:31] good luck! [13:46:27] thank you in my first project [13:48:54] apergos: thanks [15:19:42] I just googled something and saw: "Jump to mw.loader.load - [edit | edit source]." [15:19:51] (i searched for mw.loader.load) [15:20:01] but i dont think the edit labels should be showing up.... [15:23:38] legoktm: they are supposed to be visible for css-less clients [15:23:54] and googlebot happens to be one [15:24:07] i think it's they who should fix their algorithms in this case [15:24:28] ok [15:25:05] and tbh i don't see how we could avoid this apart from detecting googlebot [16:40:50] manybubbles or ^d: how does one search for an exact sentence with CirrusSearch? [16:41:27] Nemo_bis: you can force a phrase search by quoting the phrase [16:41:42] Nemo_bis: see if that does what you want [16:42:02] manybubbles: no it doesn't [16:42:15] e.g. https://www.mediawiki.org/w/index.php?title=Special:Search&limit=20&offset=20&redirs=0&profile=all&search=%22most+used%22 [16:44:42] Nemo_bis: Ah! so what you are seeing is that quotes don't turn off stemming. Also, I've probably set the phrase slop too high. [16:45:50] Nemo_bis: I'll file a bug about both. Turning off stemming might be somewhat annoying to implement but I'll get it. [16:46:15] from "most used" to "most of us" and "most commonly used" is definitely very aggressive "stemming" :) [16:47:02] maybe it works better in other languages! thanks for taking care of filing it [16:47:24] used -> use [16:48:00] the reason "commonly" appears is because of the phrase slop. which I've set way way too high. [16:48:21] used -> us (that is crazy!) 
[16:48:46] :) [16:56:56] Nemo_bis: 54020 and 54022, both of which I've added you as a cc [16:59:54] hi [17:00:41] thanks [17:04:19] "[04da3fd9] 2013-09-11 17:03:49: Fatal exception of type MWException" trying to go to https://en.wikipedia.org/wiki/Ellipse#Circumference [17:05:38] Still getting similar error messages upon refreshing several times [17:06:38] Reedy: ^ [17:07:15] <^d> On it. I think it's Aaron. [17:09:57] Hi, I think this is wrong - [fde601cf] 2013-09-11 17:08:16: Fatal exception of type MWException [17:10:15] <^d> Already known, fix is sync'ing out. [17:10:46] Yeah, it's fixed now :) [17:10:51] Thanks [17:11:21] Yup, fixed for me as well. Thanks! [20:18:39] search [20:18:41] elastic [20:18:45] manybubbles: [20:25:15] <^d> greg-g: Yo sup? [20:25:46] ^d: hehe, nothing, just pinging manybubbles [20:26:13] <^d> :) [21:16:46] [02:12:05 PM] i'm trying to update a map image on commons and it's tetlling me "Could not create directory "mwstore://local-multiwrite/local-public/f/fd"." [21:16:49] from -en [21:17:40] what up [21:18:15] AaronSchulz: ^ [21:19:05] has my tale been pasted here or should i retell it [21:20:20] i pasted it [21:20:44] mmm [21:21:12] i had a speed issue.. [21:21:25] another one [21:21:34] many seconds to open [21:23:45] legoktm: there are several such broken files, add to bugzilla [21:24:04] Nemo_bis: do you know which ticket? [21:24:24] nope [21:24:30] there are several iirc [21:25:19] ok [21:25:21] ill just [21:25:22] file a new one [21:25:25] ziggy_sawdust: directlic? [21:25:36] ziggy_sawdust: whats the filename of the image you're trying to overwrite? [21:31:33] legoktm: https://bugzilla.wikimedia.org/show_bug.cgi?id=53553 [21:32:43] mkay, i'll comment on that. [21:33:29] n8i [21:40:14] [17:25] ziggy_sawdust: whats the filename of the image you're trying to overwrite? [21:40:16] sorry i was afk [21:40:22] https://en.wikipedia.org/wiki/File:World-cannabis-laws.png [21:40:30] steinsplitter, legoktm [22:00:04] hey can someone explain me whats a gap limit? [22:00:09] got: WARNING: API warning (allpages): gaplimit may not be over 500 (set to 5000) for users [22:01:46] https://www.mediawiki.org/wiki/API:Query#Generators