[07:33:58] strange, I'm trying to use the api, and I don't get a continue record from action=query&list=prefixsearch but I do get one from action=query&list=allpages
[09:07:20] b_jonas: On which wiki?
[09:09:09] b_jonas: http://paste.ubuntu.com/10155722/
[09:11:09] cgt: on http://mw.lojban.org/api.php?action=query
[09:11:54] it is using MediaWiki 1.24.0
[09:14:19] cgt: for example http://mw.lojban.org/api.php?action=query&format=jsonfm&list=prefixsearch&pssearch=B&continue=
[09:14:20] I see it too
[09:15:11] increasing pslimit did return more results, and there's no 'batchcomplete' field, so it's not that there are no more results...
[09:15:17] odd
[09:15:37] thanks for looking
[09:15:57] I wonder if it is possible to disable query continuations
[09:16:40] https://www.mediawiki.org/wiki/API:Query#Continuing_queries says this style of continue is supposedly available starting from MediaWiki version 1.21
[09:16:44] yeah
[09:16:58] and before that there was the old style
[09:17:11] I also tried using rawcontinue, but it didn't help
[09:18:30] And if I use list=allpages instead, it gives a continue token: http://mw.lojban.org/api.php?action=query&format=jsonfm&list=allpages&apfrom=B&continue=
[09:19:00] yes, that's strange. Maybe there's a bug in this particular version of the MW API.
[09:19:14] try #mediawiki, if that exists
[09:19:18] or the mediawiki API mailing list
[10:00:34] that wiki will "soon" be moved to a different server and upgraded to a newer version of MediaWiki, so for now I'm simply ignoring this problem and hoping all those changes will fix it, or I'll use list=allpages instead
[13:51:03] jzerebecki, chasemp is in charge of Phabricator upgrades
[15:49:02] Hi guys, I have a question regarding the downloading of revisions of articles.
[15:49:16] So suppose I want to get all revisions of an article
[15:49:25] I know that there is Special:Export as well as the API
[15:49:36] Special:Export has a limit of 1000 revisions per call
[15:49:43] the API seems to have a limit of 50
[15:49:50] so I suppose Special:Export is the better way?
[15:50:02] You have to loop
[15:50:12] Yeah, I know that.
[15:50:13] Or you can use https://github.com/WikiTeam/wikiteam/blob/master/dumpgenerator.py
[15:50:16] But which is better?
[15:50:36] API in theory, but they are different beasts
[15:51:24] But the limit for the API seems so low, so I have to loop way more frequently.
[15:51:49] So what, do you have to push pedals every time your script loops? :)
[15:52:02] If you are lucky, continuation for the web API is already handled by your API library
[15:52:36] Does the API has some kind of restrictions of continuous calls?
[15:52:44] have*
[15:52:49] What do you mean "continuous"?
[15:52:56] looping
[15:53:03] barsch: there is a rate limit, yes
[15:53:36] i think it will just make you wait if you send requests too fast
[15:53:55] I see.
[15:54:26] And without being a bot there is no chance to increase the usual limit of 50?
[15:54:29] barsch: also, the API limit is a lot higher (500 i think) if your client uses a bot account
[15:54:35] No, there's not actually a rate limit for reading calls
[15:54:45] Just be reasonable to the servers ;)
[15:54:47] Is it easy to get a bot account?
[15:55:10] easy enough, but not necessarily fast
[15:55:19] may take a week or so to get approval. depends on the wiki
[15:55:33] Fair enough.
[15:56:02] I think I'll try to benchmark the speed of the API vs. Special:Export for my task and then decide.
[15:56:10] barsch: a read-only bot is a bit strange, since it's not really a bot. but if you explain what you want this for, there should be no problem
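As a concrete illustration of the looping and continuation being discussed, here is a minimal sketch of fetching every revision of one page through the Action API (prop=revisions), letting the server-supplied continue object drive the loop. It assumes Python 3 with the requests package; the endpoint URL and page title are placeholders rather than anything from the log, and a client library (e.g. pywikibot) would normally handle the continuation step for you, as noted above.

```python
# Sketch: iterate over all revisions of one page via the MediaWiki Action API,
# following the new-style continuation ("continue" object, MW >= 1.21).
# Placeholder endpoint and title; limits are 50 per request for normal clients,
# 500 with a bot account.
import requests

API = "https://en.wikipedia.org/w/api.php"   # placeholder endpoint

def all_revisions(title):
    session = requests.Session()
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": "max",      # ask for as many as the server allows per request
        "continue": "",        # opt in to the new-style continuation on 1.21-1.25
    }
    while True:
        data = session.get(API, params=params).json()
        for page in data["query"]["pages"].values():
            for rev in page.get("revisions", []):
                yield rev
        if "continue" not in data:
            break                              # no more data to fetch
        params.update(data["continue"])        # echo the continue values back as-is

for rev in all_revisions("Lojban"):            # placeholder article
    print(rev["revid"], rev["timestamp"])
```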
[15:56:19] Special:Export is working fine for me atm, but the API seems to be more reliable.
[15:56:27] Okay, cool.
[15:56:34] don't use Special:Export, it has all kinds of nasty issues.
[15:56:43] for one thing, if you hit the limit, it's not going to tell you
[15:56:48] it fails silently, omitting data
[15:57:20] That's not good.
[15:57:24] we should probably just disable export of old revisions via Special:Export
[15:57:35] it's prone to failure and data loss
[15:58:18] Don't do that for now please, some of my/our stuff is relying on that.
[15:58:39] But I'll check the API out and maybe it's better to use that anyhow.
[15:58:48] Omitted data is bad though.
[17:43:09] is it a known issue that sometimes the ordering of revisions is by timestamps in the history overview of an article (which seems to be correct), but when I browse through the old revisions one by one using the "newer revision" link, they are ordered by increasing revision_id instead (which obviously is non-sequential for some articles for the first couple of revisions)? I spotted this with some articles in the first (oldest) revisions, e.g. https://en.
[17:44:47] the api also returns the list as ordered by time stamps
[17:46:46] I know that the dumps retrieve revisions without specifying whether ordering should be by ts or id, and it turns out to be complicated
[17:46:52] that may be generic to mw
[17:47:50] i.e. the query with which revs for a page are retrieved may in fact not specify an ordering anywhere in mw, we'd have to look at the code I guess
[17:52:52] thx. do you know if that incoherence between revision_ids and timestamps only occurs in older revisions, maybe because of https://en.wikipedia.org/wiki/User:Conversion_script back in the day?
[17:53:42] that would make it less of a problem if that only happened for a couple of revisions back in 2001/2002
[17:55:26] ah ok, seems like it according to that user page
[19:59:49] FaFlo: I would use the timestamp
[20:00:41] deletions, imports and history merges can all cause issues too
[20:11:28] Betacommand: yes, thanks, I already switched to that strategy :) I was just wondering why the "previous/next revision" link in the old revisions view doesn't do that. but might be of low relevance if that mix-up really just exists for the first 10-20 revisions of some articles
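To make the timestamp-versus-rev_id point concrete, here is a rough sketch that lists a page's revisions oldest-first by timestamp (rvdir=newer) and flags any spot where the revision IDs are not increasing, i.e. where a "newer revision" link that follows rev_id would disagree with the history view's timestamp order. Again Python 3 with requests is assumed; the endpoint and article title are placeholders, not taken from the log.

```python
# Sketch: fetch a page's revisions oldest-first by timestamp (rvdir=newer) and
# report places where rev_ids fall out of that order, e.g. early revisions
# imported by the 2002 conversion script mentioned above.
import requests

API = "https://en.wikipedia.org/w/api.php"   # placeholder endpoint

params = {
    "action": "query", "format": "json", "prop": "revisions",
    "titles": "Lojban",                      # placeholder article
    "rvprop": "ids|timestamp",
    "rvdir": "newer",                        # enumerate oldest first
    "rvlimit": "max", "continue": "",
}
revs = []
while True:
    data = requests.get(API, params=params).json()
    for page in data["query"]["pages"].values():
        revs.extend(page.get("revisions", []))
    if "continue" not in data:
        break
    params.update(data["continue"])

for earlier, later in zip(revs, revs[1:]):
    if later["revid"] < earlier["revid"]:
        print("rev_id out of timestamp order:",
              earlier["revid"], "->", later["revid"], "at", later["timestamp"])
```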