[05:29:45] Hi. I need some help in understanding the data dumps provided by Wikimedia...
[05:34:34] Okay, I'm trying to understand the data dumps. I'm working on the 20180520 dumps.
[05:34:58] They contain many sections, each with different data.
[05:35:42] And I would like to know what each section's data represents. Although it's described briefly, I don't get it clearly.
[05:36:49] Like the section "All pages, current versions only." Is each and every article's current version present in this data?
[05:38:40] I just downloaded "enwiki-20180520-pages-meta-current1.xml-p10p30303.bz2"; the first page in it is "AccessibleComputing", but it does not have the complete article's text in it?
[06:14:20] [[Tech]]; 14.139.9.9; /* Queries regarding Wikimedia's Data Dumps. */ new section; https://meta.wikimedia.org/w/index.php?diff=18104707&oldid=18102456&rcid=11964974
[08:17:53] badman: "AccessibleComputing" is not a proper title, surely it's a redirect
[08:18:02] What are you trying to do?
[08:18:45] page_id 10, remarkable https://en.wikipedia.org/w/index.php?title=AccessibleComputing&action=info
[08:24:35] Actually, I have just started working on a research project, and for it I need English Wikipedia data.
[08:28:13] BTW, why are some pages redirects and others not? And the ones that are not redirects have the complete article's text?
[08:29:39] What data does the "All pages, current versions only." section even have?
[10:18:13] And how do I download the complete history of a Wikipedia article?
[10:18:40] https://www.mediawiki.org/wiki/Manual:Parameters_to_Special:Export says it doesn't work in some cases, where the number of revisions is very high
[10:19:12] Is there any alternative to download the complete history of, let's say, "India"?
[11:59:26] [[Tech]]; Ruslik0; /* Queries regarding Wikimedia's Data Dumps. */; https://meta.wikimedia.org/w/index.php?diff=18105401&oldid=18104707&rcid=11966060
[22:59:22] Hi.
[23:42:06] Is there any performance difference between (for example) doing one API query with 50 pageids and using continue, vs. using a smaller number of pageids per query so that no continue is needed?
[23:46:51] Probably not... What're you looking up?
[23:47:45] pageviews in this case. mostly just curious about it in general, though.
[23:48:28] If it's an indexed DB lookup... Both should basically be the same
[23:49:24] (I didn't realize that getting pageviews via the MediaWiki API means you can get data for multiple articles per query, unlike the wikimedia.org REST API... yay!)
[23:49:34] cool, thanks!
[23:57:50] ragesoss: The performance difference would just be HTTP overhead, since you'll be making more requests.
[23:58:09] Ursula: well, he said a smaller number. Not 1
[23:58:32] So, if he does count( $ids ) approx = MAX_IDs...
[23:58:36] Querying as many as you can at once is probably faster since you're only round-tripping once. But the data will take a bit longer to generate.
[23:58:59] I do lots of stuff one at a time.
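
The redirect question above (why "AccessibleComputing" carries no article text) can be checked directly against the dump. Below is a minimal sketch, assuming the filename mentioned in the log and the standard MediaWiki XML export layout: redirect pages carry a <redirect> element and only a short "#REDIRECT [[...]]" stub as wikitext, while non-redirect pages carry the full current wikitext. This is an illustrative sketch, not an official parser.

    import bz2
    import xml.etree.ElementTree as ET

    # Stream through a pages-meta-current dump and report, for each <page>,
    # whether it is a redirect or a page with full article text.
    # Path is the file named in the log above; adjust as needed.
    DUMP = "enwiki-20180520-pages-meta-current1.xml-p10p30303.bz2"

    def local(tag):
        # Strip the "{http://www.mediawiki.org/xml/export-...}" namespace prefix.
        return tag.rsplit("}", 1)[-1]

    with bz2.open(DUMP, "rb") as f:
        title, is_redirect, text_len = None, False, 0
        for event, elem in ET.iterparse(f, events=("end",)):
            tag = local(elem.tag)
            if tag == "title":
                title = elem.text
            elif tag == "redirect":
                is_redirect = True
            elif tag == "text":
                text_len = len(elem.text or "")
            elif tag == "page":
                kind = "redirect" if is_redirect else "article"
                print(f"{title}\t{kind}\t{text_len} chars of wikitext")
                title, is_redirect, text_len = None, False, 0
                elem.clear()  # free memory; these dumps are large

Streaming with iterparse (rather than loading the whole document) matters here because even a single dump part is far too large to hold in memory as a parsed tree.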
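For the question about getting the complete history of a page like "India" when Special:Export struggles with very high revision counts, one alternative is to page through the revisions module of the MediaWiki API with its continuation mechanism. A minimal sketch, assuming only standard API parameters; the chosen rvprop fields are illustrative, and fetching revision content as well would lower the per-request limit:

    import requests

    # Page through the full revision history of "India" via the API.
    API = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "format": "json",
        "titles": "India",
        "prop": "revisions",
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": "max",   # up to 500 revision metadata entries per request
        "rvdir": "newer",   # oldest first
    }

    session = requests.Session()
    revisions = []
    while True:
        data = session.get(API, params=params).json()
        for page in data["query"]["pages"].values():
            revisions.extend(page.get("revisions", []))
        if "continue" not in data:
            break
        params.update(data["continue"])  # carry rvcontinue/continue forward

    print(f"Fetched {len(revisions)} revisions")

The other option for bulk history is the pages-meta-history dump files, which contain every revision of every page but are much larger than the current-versions-only files.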
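To illustrate the batching trade-off discussed at the end of the log: sending as many pageids as allowed per request (50 for normal users) means most of the cost is a single HTTP round trip, at the price of a slightly slower response. A sketch against prop=pageviews (available where the PageViewInfo extension is installed, as on Wikipedia); the page ids here are placeholders:

    import requests

    API = "https://en.wikipedia.org/w/api.php"
    page_ids = [10, 12, 25, 39, 290]  # hypothetical ids

    def chunks(seq, size=50):
        # Split ids into batches of at most `size` per API request.
        for i in range(0, len(seq), size):
            yield seq[i:i + size]

    session = requests.Session()
    views = {}
    for batch in chunks(page_ids):
        params = {
            "action": "query",
            "format": "json",
            "prop": "pageviews",
            "pageids": "|".join(map(str, batch)),
        }
        data = session.get(API, params=params).json()
        for pid, page in data["query"]["pages"].items():
            views[pid] = page.get("pageviews", {})

    print(views)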