[03:32:56] Hi, I'm trying to get a pywikipediabot running but I'm pretty inept. Is anybody here willing and able to help?
[03:37:28] Metaknowledge_: try asking in #pywikipediabot . also if you have a more specific question, people will usually answer it :P
[03:38:08] nobody online over there
[03:38:36] but in the meantime I figured out my problem, so I'll try to abstain from bothering people until I'm really stuck :)
[03:39:53] hehe, ok. well if there's no one on IRC and you still need help, there's https://www.mediawiki.org/wiki/Manual_talk:Pywikipediabot or our mailing list
[03:57:13] Thanks, currently it's not accepting anything I put into Terminal and spitting out (for example) -bash: replace.py: command not found
[06:38:37] ohnoes, the degraded enotifs by echo have arrived on Meta too now :(
[09:33:21] <|RicZzz|> Hi, for a few days I have been having massive connectivity problems reading Wikipedia articles from 82.113.121.210 and similar IPs. It is a mobile provider, sometimes falling back to EDGE, so high latency could be part of the problem. Other websites, however, are fully OK.
[09:41:11] <|RicZzz|> With wireshark I am noticing many "previous segment lost" and subsequently "double ack".
[09:41:23] no idea sorry :(
[09:42:16] |RicZzz|: do you have a way to traceroute to mobile-lb.esams.wikimedia.org
[09:42:18] ideally both ipv4 and ipv6 if your provider supports the latter
[09:44:45] <|RicZzz|> traceroute mobile-lb.esams.wikimedia.org
[09:44:46] <|RicZzz|> traceroute to mobile-lb.esams.wikimedia.org (91.198.174.236), 30 hops max, 60 byte packets
[09:44:46] <|RicZzz|> 1 andro (192.168.42.129) 0.519 ms 0.775 ms 1.455 ms
[09:44:46] <|RicZzz|> 2 * * *
[09:44:46] <|RicZzz|> 3 82.113.122.198 (82.113.122.198) 239.843 ms 257.883 ms 258.056 ms
[09:44:46] <|RicZzz|> 4 IARMUN1-Gi0-2-199.net.de.o2.com (82.113.122.2) 280.878 ms 280.560 ms 300.881 ms
[09:44:47] <|RicZzz|> 5 xmwc-mnch-de01-gigaet-5-1-510.nw.mediaways.net (195.71.164.209) 301.069 ms 317.633 ms 317.309 ms
[09:44:47] <|RicZzz|> 6 ge-1-0-5-core0.ixfr1.de.as6908.net (78.41.155.221) 318.106 ms 317.978 ms 340.174 ms
[09:44:48] <|RicZzz|> 7 xe-5-1-0-core0.nknik.nl.as6908.net (62.149.50.42) 340.015 ms 258.737 ms 240.896 ms
[09:44:48] <|RicZzz|> 8 * * *
[09:44:49] <|RicZzz|> 9 * * *
[09:44:49] <|RicZzz|> ... not got any further
[09:45:29] <|RicZzz|> don't think that I have ipv6 here
[09:45:38] yeah that is v4
[09:45:58] traceroute -I mobile-lb.esams.wikimedia.org
[09:46:02] that will send ICMP packets
[09:46:15] and you can paste the result on http://tools.wmflabs.org/paste/
[09:46:44] once you get traces, you can send an email to the Wikimedia operations team at ops-requests@rt.wikimedia.org ; that will create a ticket for them to investigate.
[09:47:28] <|RicZzz|> need an extra arg for '-l' what should that be?
[09:47:51] is this only for mobile?
[09:47:58] depending on your traceroute implementation, -I usually requests traceroute to send ICMP packets
[09:48:13] it's a capital I, not an l
[09:48:47] i :-)
[09:49:05] <|RicZzz|> ok -I is doing something now
[09:49:39] <|RicZzz|> yes, it's from my mobile connection that this happens
[10:31:06] <|RicZzz|> sent the mail to ops-requests@rt.wikimedia.org now
[10:39:15] thank you |RicZzz|
[12:44:01] Will 1.22wmf12 be the latest release for 1.22?
[12:44:11] last, not latest*
[12:50:39] no
[12:50:39] We should be going to 24
[12:54:27] That's what I thought; I just visited the roadmap and only saw versions up to 12
[12:54:38] thanks Reedy
[14:11:53] twkozlowski: Most likely because no one has bothered to work out the dates and do all of the copy pasting ;)
[14:14:22] https://www.mediawiki.org/wiki/MediaWiki_1.22/wmf12 points to 1.23/wmf1 Reedy
[14:14:53] | next = [[Special:MyLanguage/MediaWiki 1.23/wmf1|MediaWiki 1.23/wmf1]]
[14:14:55] It's a wiki..
[14:14:57] {{sofixit}}
[14:14:59] I know.
[14:15:13] I'll just introduce an awful red link then
[14:15:30] I think I did that originally as I wasn't sure if we were going to 24, or just having 12 weeks between version branches
[14:16:32] Create the page?
[14:16:37] On a different note, why mark those pages for translation?
[14:16:39] You can pretty much copy paste the boilerplate code
[14:16:47] Update the numbers/dates
[14:17:08] Not sure why we're marking them for translation
[14:17:12] Some of the bigger commits do get done
[15:02:49] Reedy, Krinkle|detached: are you guys planning to expand the incident page for yesterday's outage of Wikivoyage?
[15:02:57] not sure we should link to it from Tech News for the time being
[15:03:29] "Shit broke. So we fixed it"
[15:06:55] we have a project called wikivoyage?
[15:07:00] it broke?
[15:07:17] Nice deduction skills, p858snake|l
[15:21:24] Happy Sysadmin Appreciation Day. Thank you, systems administrators!
[16:16:29] apergos, parent5446: hello
[16:16:34] Hey
[16:17:02] hello!
[16:17:50] So anything interesting since yesterday?
[16:18:09] not much to report today: i'm working on XML output and trying to make it look as close to the original XML as possible
[16:19:27] indenting and all that eh?
[16:19:42] (indenting is the easy stuff)
[16:20:53] yeah, things like writing revisions with empty text the same and so on
[16:21:38] I'm just curious here and I don't have all the background, but will you be writing some kind of automated tests to guard against future regressions, since you already sort of have a spec to check against?
[16:21:42] i still don't have siteinfo and handling of deleted fields, so that's next
[16:22:22] i didn't think about tests much so far
[16:22:33] We could use Check or CUnit to do unit testing if we wanted (probably something for down the road).
[16:23:03] tests would be nice but realistically they probably won't fit in the timeframe
[16:23:09] you mean to check that the generated binary file is in the correct format? i think writing back the XML from it is the best test
[16:23:14] but I hope to keep our coder around afterwards so maybe then :-)
[16:23:28] Of course! ;)
[16:24:19] qchris wrote the dump test suites so he might be a good one for collaboration later
[16:24:48] i didn't even know something like that existed
[16:25:26] yup!
[16:25:30] Svick: Yes, they ran against 1.19 and 1.20.
[16:25:58] Svick: But since we were lacking machines back then (IIRC) they were never run automatically.
[16:26:22] are they in the dumps repository?
[16:26:26] they are!
[16:26:50] Svick: https://gerrit.wikimedia.org/r/#/admin/projects/operations/dumps/test
[16:27:07] git clone https://gerrit.wikimedia.org/r/operations/dumps/test
[16:27:14] you were faster :-)
[16:27:23] oh, it's a separate project in git
[16:27:29] that's the admin link, I dunno if he can see it (?)
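
(An aside on the round-trip testing idea above: the simplest regression check is to write the XML back out of the new dump format and diff it against the original dump. The project itself is not written in Python; this is only a rough Python sketch of that check, and the file names in it are hypothetical.)

    import difflib
    import sys

    def normalized_lines(path):
        # Ignore trailing whitespace so insignificant formatting noise
        # does not count as a mismatch.
        with open(path, encoding="utf-8") as f:
            return [line.rstrip() for line in f]

    # Hypothetical file names: the original XML dump, and the XML written
    # back out of the new (binary) dump format.
    original = normalized_lines("original-dump.xml")
    roundtripped = normalized_lines("roundtripped-dump.xml")

    diff = list(difflib.unified_diff(original, roundtripped,
                                     fromfile="original", tofile="roundtripped",
                                     lineterm=""))
    if diff:
        print("\n".join(diff[:40]))  # show only the first mismatches
        sys.exit(1)
    print("round trip OK: regenerated XML matches the original dump")
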
[16:27:30] https://www.mediawiki.org/w/index.php?diff=748124&oldid=747915 Reedy
[16:27:58] Thanks
[16:28:15] yeah, i can see it
[16:28:16] Hmm, seems like there's a whole test definition format. Interesting...
[16:28:22] There is no July 4 in August, September or October, right?
[16:28:56] Because if there is, changing those dates would be a bit painful ;-)
[16:29:05] (in the US, I mean.)
[16:35:40] Svick: OK, so today is more XML? Anything else to report/ask?
[16:35:41] There are other Independence Days.
[16:35:52] For various countries. Surely some of them are in August, September, or October.
[16:36:00] (Was this a serious question?)
[16:36:19] yeah, i'll continue with that today, i can't think of anything else
[16:37:55] ok, sounds good
[16:38:01] Elsie: Well, I dunno, Thanksgiving is in November?
[16:38:07] I don't have anything on my end
[16:38:40] ok, in that case, see you monday
[16:38:49] at some point we'll want to comb through the xml and look for all the edge cases as far as attributes go, but it doesn't have to happen today
[16:38:52] so
[16:38:56] have a good weekend!
[16:39:00] thanks all :)
[16:39:10] I'll be around from time to time if you need anything
[16:39:43] Have a good weekend!
[16:39:53] you two too, bye
[16:47:06] twkozlowski: Yep, third Thursday, I think.
[16:47:08] Or fourth?
[16:47:59] hm, twkozlowski
[16:48:07] you sure about "Users can now write their own edit summaries on Wikidata. [5]" in tech news?
[16:48:43] because i tried and i don't think i can :)
[16:48:51] this might apply to the API only
[16:49:51] that's guillom
[16:50:35] MatmaRex: You even commented on that patch in Gerrit
[16:51:19] i did, doesn't mean i know what it does :D
[16:51:56] oh yes, this works just through the API
[16:52:00] nicely spotted, thanks MatmaRex
[16:52:08] I'll update this while there aren't too many translations yet.
[16:59:26] Thanks for noticing, MatmaRex, and thanks for fixing, twkozlowski :)
[16:59:51] Polish mafia watches your every step, guillom.
[18:58:45] !seen JeLuF
[18:58:45] Did you mean @seen JeLuF?
[18:58:54] possibly
[18:58:58] @seen JeLuF
[18:58:58] Kronf: I have never seen JeLuF
[19:56:29] for all you sysadmins, happy sysadmin day!
[20:03:23] I have root access to my local system, should I celebrate, MartijnH? ;-)
[20:03:57] definitely twkozlowski
[20:04:54] * twkozlowski writes the date down in his calendar. Perhaps I can still order a pizza.
[20:21:09] hello everyone
[20:22:02] I need some suggestions on querying the Wikipedia database.
[20:22:15] ok
[20:22:27] what do you need to know?
[20:22:51] can you point me to some resources where I can start working with it?
[20:23:01] The database replicas?
[20:23:13] I need to get the content of a page so that I can probably train a classifier over it.
[20:23:36] I don't think page texts are in the public database replicas
[20:24:26] no. some tutorial that would help me with querying the dump to extract the content
[20:24:33] hmm :(
[20:25:00] Why not just query the API for stuff like that?
[20:26:50] can't I get the content from the "Page text and associated information" tables, as described in the following link? http://upload.wikimedia.org/wikipedia/commons/3/36/Mediawiki_database_Schema.svg
[20:27:33] I need to train it over a huge corpus. It would also be unfair to ping them at such length.
[20:28:00] MariaDB [enwiki_p]> EXPLAIN text;
[20:28:00] ERROR 1146 (42S02): Table 'enwiki_p.text' doesn't exist
[20:28:00] No.
[20:28:29] Besides, I also intend to use some link structure.
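
(For smaller jobs, the API route apergos suggests above is only a few lines. A minimal sketch, assuming the standard MediaWiki action API on en.wikipedia.org; the page title and User-Agent string are just examples.)

    import json
    import urllib.parse
    import urllib.request

    def fetch_wikitext(title, api="https://en.wikipedia.org/w/api.php"):
        """Fetch the current wikitext of one page via the MediaWiki API."""
        params = {
            "action": "query",
            "prop": "revisions",
            "rvprop": "content",
            "titles": title,
            "format": "json",
        }
        url = api + "?" + urllib.parse.urlencode(params)
        req = urllib.request.Request(
            url, headers={"User-Agent": "classifier-corpus-demo/0.1"})
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        # Legacy JSON format: one entry per page id, revision text under "*".
        page = next(iter(data["query"]["pages"].values()))
        return page["revisions"][0]["*"]

    print(fetch_wikitext("Wikipedia")[:200])
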
[20:28:59] So the dumps don't contain any of the text that is present on the page?
[20:29:13] if you import a dump you can then run queries against the text table
[20:29:14] *except for the changes in the change table?
[20:29:33] if you want to use the dups you can get the articles content dumps
[20:29:46] which have the current version of every article (not user talk pages and such though)
[20:29:50] *dumps
[20:30:05] The dumps?
[20:30:41] xml dumps
[20:31:04] there is even a skeleton python program that does very basic text extraction for a given page title, from the compressed multistream dumps
[20:31:13] so you could poke at that too
[20:31:49] which table do I need to query on this link? http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/MediaWiki_1.20_%2844edaa2%29_database_schema.svg/2500px-MediaWiki_1.20_%2844edaa2%29_database_schema.svg.png
[20:32:18] awesome. :) can you please share the link for the python code with me
[20:32:22] so if you want to run a query against *our* databases... text is stored in an external storage cluster
[20:32:28] sec lemme find
[20:35:34] https://git.wikimedia.org/tree/operations%2Fdumps.git/ca6cc3a0c15de239c4e684619924f18839312db9/toys%2Fbz2multistream
[20:45:13] thanks apergos. It'll take me some time to have a look. Will get back to the forum once I get stuck again. :)
[20:45:51] ok, have a look at the multistream file and the indices and how it's produced, then you can see how you want to use it,
[20:46:16] it may turn out that just importing the regular xml file into a db and running mysql queries of your choice on the text table would be the way to go
[20:46:20] anyways good luck!
[20:58:56] Anyone from tech around?
[20:59:11] Just ask your question.
[20:59:35] Or rather, i think i already see what went wrong here.
[21:01:11] I'll file a bug as well, but it seems that "&editintro=Template:BLP_editintro" is broken in Monobook. On Vector that link is present on the edit source button (where it belongs) and not on the edit page button. On Monobook the "&editintro=Template:BLP_editintro" is present on Edit Page instead of edit source.
[21:18:57] No Reedy?
[21:18:57] Bah.
[21:18:57] Who am I going to bother with shell requests?
[21:46:51] * hashar hides
[21:46:57] I am sleeping already anyway.
[21:47:00] hashar: not very well :P
[21:47:01] :-)
[21:47:08] I was going to ask about the UploadBlacklist log.
[21:47:20] But it's not urgent at all. Just curious how many hits it gets.
[21:47:26] folks from WMF mw/core team can handle it probably
[21:47:34] if not urgent, pause() till monday :-]
[21:47:52] :-)
[21:47:59] now really sleeping. have a good week-end everyone.
[21:48:20] Bye, hashar.
[22:43:56] eeee
[22:44:05] https://wikitech.wikimedia.org/wiki/Network_design
[22:44:09] what happened here.
[22:45:01] Ryan_Lane, ^
[22:46:29] hell if I know ;)
[22:47:01] This /used/ to work. Perhaps VIPS-related stuff?
[22:47:25] On wikitech?
[22:47:32] wikitech is largely outside the cluster.
[22:47:39] oh, true.
[22:49:07] twkozlowski: the image thumbnailing errors?
[22:49:09] hmm funky
[22:49:35] it's a png, why would convert say it was missing a delegate for this?
[22:49:58] Ryan_Lane: most likely it's hitting the memory limit that mediawiki sets and it's failing to load a library or something
[22:50:04] you get weeeeeeeird errors from that sort of thing
[22:50:07] ah. yeah, probably
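
(Going back to the dump discussion earlier: the approach of the bz2multistream toy linked above is, roughly, to use the companion index file to locate the bz2 block that holds a given title, decompress only that block, and pull the page text out of it. A loose Python sketch of that idea, not the actual script from the dumps repository; the file names and example title are placeholders, and the last block of a dump, which holds the closing mediawiki tag, would need extra handling.)

    import bz2
    import xml.etree.ElementTree as ET

    def page_text(dump_path, index_path, wanted_title):
        """Pull one page's wikitext out of a pages-articles-multistream dump."""
        # The index is a bz2-compressed text file with lines "offset:page_id:title".
        offsets = set()
        wanted_offset = None
        with bz2.open(index_path, "rt", encoding="utf-8") as idx:
            for line in idx:
                offset, _, title = line.rstrip("\n").split(":", 2)
                offsets.add(int(offset))
                if title == wanted_title:
                    wanted_offset = int(offset)
        if wanted_offset is None:
            raise KeyError(wanted_title)

        # Each offset marks the start of an independent bz2 stream holding a
        # batch of <page> elements; read from our offset up to the next one.
        later = [o for o in offsets if o > wanted_offset]
        with open(dump_path, "rb") as dump:
            dump.seek(wanted_offset)
            chunk = dump.read(min(later) - wanted_offset) if later else dump.read()
        xml_fragment = bz2.decompress(chunk).decode("utf-8")

        # The fragment has no root element, so wrap it before parsing.
        root = ET.fromstring("<pages>" + xml_fragment + "</pages>")
        for page in root.iter("page"):
            if page.findtext("title") == wanted_title:
                return page.findtext("revision/text")
        raise KeyError(wanted_title)

    # e.g. page_text("enwiki-pages-articles-multistream.xml.bz2",
    #                "enwiki-pages-articles-multistream-index.txt.bz2",
    #                "Sysadmin Day")
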
[22:50:28] http://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=12366
[22:50:40] oh
[22:50:49] I forgot to switch MW's image directory
[22:50:53] I made some changes with that
[22:50:53] hah
[22:52:23] @seen bawolff
[22:52:24] twkozlowski: Last time I saw bawolff they were quitting the network with reason: Ping timeout: 276 seconds N/A at 7/26/2013 9:11:55 PM (1h40m28s ago)
[22:52:54] fixed
[23:15:56] if a commit is related to two bugs, should I use two Bug: X lines in commit message or something like Bug: X, Y?
[23:16:17] jgonera: Two lines
[23:16:36] thanks
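
(For the record, the two-bug commit message jgonera asks about ends up with one Bug: footer line per bug, placed with the other footers above the Change-Id. A hypothetical example; the subject line, bug numbers, and Change-Id below are made up.)

    Make the frobnicator handle empty revisions

    Longer description of the change goes here.

    Bug: 12345
    Bug: 67890
    Change-Id: I0123456789abcdef0123456789abcdef01234567
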