[06:13:36] Is the GeoData project already in place on the Wikipedia project https://blog.wikimedia.org/2013/01/31/geodata-a-new-age-of-geotagging-on-wikipedia/ ? If so I wondered why there is no field in the XML dump specifying the coordinate of each page, but instead I have to parse the text to find it, which is very cumbersome [06:15:31] BadDesign: IIRC, Wikidata is intended to eventually be able to centralize it [06:18:12] BadDesign: you don't have to parse it [06:18:48] Nemo_bis: but how else can I get *just* the coordinate of the subject's page, not the other coordinates the text of the page refers to? [06:19:33] Using the GeoData API to make zillions of requests for each page title to get the coords? [06:20:28] it's one per page, geodata tells you if it's subject coordinates or other coordinates [06:20:31] I have to parse the coordinates specified with the https://en.wikipedia.org/wiki/Template:Coord [06:22:13] Very little context from the page is abstracted [06:22:32] If we add co-ordinates, then we have people asking for a lot of other things to be pulled out [06:22:33] Nemo_bis: So, say I have 6 million article titles, do I make 6 million HTTP requests to get the primary coordinate? [06:23:36] Is there any assurance that an article has the same coordinate for each language it's written in? [06:24:15] Reedy: Geo coordinates are very important! [06:24:19] And? [06:24:35] And no, there is no guarantee that different languages have the same co-ordinates [06:24:40] For exactly the same reason [06:25:35] that is, until Wikidata collects and centralizes this info [06:25:51] yup [06:31:31] or rather, until wikis use wikidata coordinates [06:31:42] Better yet [06:32:13] or rather, until wikis use wikidata coordinates AND GeoData [06:32:19] ;P [06:33:23] But aren't the two mutually exclusive? Doesn't GeoData use Solr to store the coordinates in the database? Is Wikidata using the same database?
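[Editor's note] The "parse the text" approach BadDesign calls cumbersome looks roughly like this. Real {{Coord}} syntax has many variants (degrees/minutes/seconds, named parameters, `display=` options), so an actual parser is much hairier; this hypothetical sketch handles only the simple decimal form.

```python
import re

# Illustrative only: match the simple decimal form {{Coord|lat|lon|...}}.
# This does NOT cover the dms form ({{Coord|51|30|N|0|7|W}}) or other
# variants documented on the template page linked above.
COORD_RE = re.compile(r"\{\{[Cc]oord\|([0-9.+-]+)\|([0-9.+-]+)[|}]")

def first_coord(wikitext):
    """Return (lat, lon) for the first simple {{Coord}} found, else None."""
    m = COORD_RE.search(wikitext)
    return (float(m.group(1)), float(m.group(2))) if m else None
```

This also shows why the approach is fragile: a page can contain several {{Coord}} invocations, and the wikitext alone doesn't say which one is the subject's primary coordinate, which is exactly the distinction GeoData makes.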
[06:33:36] BadDesign: one thing you can do is check {{coord}} in all languages and help local sysops to implement the last version (which uses GeoData) where they don't yet [06:34:57] I don't see why, the templates would just fetch coordinate values from wikidata, but why should parser functions move? [06:35:15] * Nemo_bis knows nothing and is no authoritative source though [06:40:15] Nemo_bis: What do you mean by "implement the last version (which uses GeoData)"? I'm not sure I understood [06:50:26] BadDesign: see the history of the template [06:50:38] only a few wikis use the geodata syntax [06:52:58] BadDesign: you should ask for something like this to be done on each wiki https://en.wikipedia.org/w/index.php?title=Template:Coord&diff=526665541&oldid=368854805 [06:53:38] explain why it's needed and why it's important for you, say what code to replace the templates with, and find a sysop to update it where protected [07:21:28] * Jasper_Deng hopes Reedy is around [07:22:26] hey y'all. http://www.wikidata.org/wiki/Special:Contributions/10.64.0.127 is editing Wikidata again, and apparently it shouldn't be, since it's listed at https://noc.wikimedia.org/conf/highlight.php?file=squid.php [07:22:50] Susan might have an answer [07:27:00] it doesn't look like a blacklist to me [07:27:28] we don't want these edits? [07:27:45] yeah, b/c it hides the bot that's making the edits [07:27:55] uh [07:28:04] hm [07:28:11] we'd block it, but apparently ops weren't sure what kind of side effects that could have [07:29:09] ah now I see what you mean about the list [07:29:10] hm [07:33:04] cp1005? seriously? ugh [07:34:36] honestly I think you should go ahead and block it and keep an eye out for anything weird; anyone else editing would edit through an account and there should be no problem, right?
[07:39:58] apergos: if you think you can clarify things at all on-wiki, feel free to comment @ https://www.wikidata.org/wiki/Wikidata:Administrators%27_noticeboard#Should_we_block_10.64.0.127.3F [07:41:11] (not canvassing, of course; just since you're the only available sysadmin on call at the moment) [07:41:38] some words definitely found their way into that last sentence without my typing them [07:42:04] ok lemme read it and see [07:46:15] added [07:46:33] I might be totally wrong. but then what, folks revert the block [07:47:06] just make sure that people stick around for a while, announce it in wikitech-l and on wikidata, and watch for a while afterwards for consequences [07:47:37] * Jasper_Deng has yet to subscribe to wikitech-l [07:47:46] QuelqueChoseRose: I think that clears our block [07:47:48] er #wikimedia-tech [07:47:49] sorry [07:47:51] not the mailing list [07:48:22] we're in #wikimedia-tech :P [07:48:27] check with user addshore first [07:48:50] * Jasper_Deng knows [07:48:53] addshore is offline atm [07:48:56] yes. I am saying, if/when you decide to do the block, announce it here, announce it in the wikidata channel, and on wiki, then make sure folks are around for a while to undo it in case... [07:49:08] ok. best to get their sign-off [07:49:46] seems reasonable [07:50:03] ah ok. yeah, there's no massive rush. i just wanna take care of it before we wind up with a completely fucked-up RecentChanges. [07:50:24] sure [07:50:27] well, IPs do have rate limits [07:50:37] do we know the bot doing the edits? [07:51:02] not that I know of [07:54:21] apergos: someone more knowledgeable about our various bots might know. there are only a few bots that do category items. [07:54:29] BadDesign: https://en.wikipedia.org/w/api.php?action=query&prop=coordinates&titles=Main%20Page&coprimary=primary [07:54:32] well this one is adding fa [07:54:39] for all of them, so maybe that's a clue [07:54:43] as an api request to get the main coordinate of an individual page.
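[Editor's note] The GeoData request linked above can be sketched as follows. The endpoint, `prop=coordinates`, and `coprimary=primary` come straight from that URL; `format=json` and the helper names are additions for illustration.

```python
from urllib.parse import urlencode

def build_coord_query(title, endpoint="https://en.wikipedia.org/w/api.php"):
    """Build the GeoData query URL for one page's primary coordinate."""
    params = {
        "action": "query",
        "prop": "coordinates",      # GeoData extension module
        "titles": title,
        "coprimary": "primary",     # only the subject's own coordinate
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)

def extract_primary_coord(response):
    """Pull (lat, lon) out of a decoded JSON response, or None."""
    # With coprimary=primary the API returns at most one coordinate per
    # page, so the first entry (if any) is the one we want.
    for page in response.get("query", {}).get("pages", {}).values():
        coords = page.get("coordinates")
        if coords:
            return coords[0]["lat"], coords[0]["lon"]
    return None
```

This is what answers BadDesign's earlier question: the API distinguishes the subject's primary coordinate from other coordinates mentioned in the text, which plain wikitext parsing cannot do reliably.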
[08:01:33] thedj[wo1k]: yes, I know... I intend to get the coordinates of all the Wikipedia articles (which have coordinates) in all the languages, which means a lot of HTTP requests... I would also be making pointless HTTP requests, because I don't know beforehand which articles do or don't have geo coordinates [08:02:06] those that don't use {{coord}} most likely don't :) [08:03:23] Nemo_bis: ah, right, I could parse the dumps and grab all the titles that use {{coord}} in them, good point [08:03:51] and then use the GeoData api to get the coords instead of parsing the text of the article from the dumps [08:04:25] Is the GeoData API using the text of the article to determine the coordinates or something else? [08:05:01] why parse the dumps, just use templatelinks [08:05:25] it's a parser function, so yes it's defined in wikitext not in heaven :) [08:09:39] http://www.wikidata.org/wiki/User:Rezabot [08:09:39] I guess (but can't be 100% sure) that this is the bot [08:09:39] it's making a lot of (logged in) edits that look like the ones from the ip [08:09:59] hhm... [08:10:04] worth contacting the author to see if the ip edits are edits in the list the bot is going through [08:10:31] (I checked the 4 or 6 bots from fa speaking users on wikidata, that's the only candidate) [08:10:55] Nemo_bis: What can I use template links for? [08:11:23] *For what can I use template links? What can I determine with them?
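[Editor's note] Nemo_bis's templatelinks suggestion can be made concrete: the `list=embeddedin` API module (served from the templatelinks table) lists every page that transcludes a given template, so "which articles use {{coord}}?" needs no dump parse at all. The module and `ei*` parameter names are the real API ones; the helper names are illustrative.

```python
from urllib.parse import urlencode

def build_embeddedin_query(template="Template:Coord",
                           endpoint="https://en.wikipedia.org/w/api.php",
                           eicontinue=None):
    """Build a list=embeddedin query for pages transcluding `template`."""
    params = {
        "action": "query",
        "list": "embeddedin",
        "eititle": template,
        "einamespace": 0,       # main (article) namespace only
        "eilimit": "max",
        "format": "json",
    }
    if eicontinue:
        params["eicontinue"] = eicontinue   # resume a paginated listing
    return endpoint + "?" + urlencode(params)

def extract_titles(response):
    """Titles of transcluding pages from a decoded JSON response."""
    return [p["title"] for p in response.get("query", {}).get("embeddedin", [])]
```

Feeding the resulting titles into `prop=coordinates` then yields only requests that are likely to return a coordinate, avoiding the pointless requests mentioned above.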
[08:11:44] From what I see a template link "Shows a small list of useful links after a page name" [08:17:44] check this out [08:17:46] http://www.wikidata.org/w/index.php?title=Special:Contributions&offset=20130527171951&tagfilter=&contribs=user&target=10.64.0.127&namespace= [08:17:52] http://www.wikidata.org/w/index.php?title=Special:Contributions/Rezabot&offset=20130527172054&limit=500&target=Rezabot [08:18:10] http://www.wikidata.org/wiki/User_talk:Reza1615#Bot_editing_logged_out.3F [08:18:11] you can see that the edits from the ip at 17:10 and 17:09 are part of the same series [08:18:26] (Reza now edits as Yamaha5, but still uses his old talk page) [08:18:33] yeah I saw that [08:19:05] so it's definitely them, you want to add excerpts from those links maybe [08:19:18] I dunno if the bot is actually logged out for real or if something else is going on [08:19:34] could be a bug on the back end, who knows [08:19:52] yeah. I mean, it's happened in the past, and I think the bot-ops have been able to fix it in those cases. [08:19:55] ok [08:20:33] and yeah, definitely a plausible match [08:58:44] anyone from -tech here care to look into the DB for a useragent from wikidata again for me? :D [08:59:24] ? [08:59:43] oh btw you are keeping track of the bot-as-ip saga, right? [08:59:46] addshore: [09:00:05] yes, I just made an edit filter to keep it out :) [09:00:22] ok! [09:00:30] I don't think I actually need you, as the attempted edit is the same as all of the others so I'm pretty sure it's the bot [09:00:42] yeah, I'm about 100% sure.
[09:01:00] http://www.wikidata.org/wiki/User_talk:Reza1615#Bot_editing_logged_out.3F [09:01:21] I expect after looking that the user will say 'oh, huh those look like mine all right, I wonder how come they show an ip' [09:01:46] and no I don't think we should dig up user agents etc from the db for this [09:01:59] I just wanted to make sure the nop :P [09:02:21] (also I'm not a cu or a steward so I wouldn't) [09:08:47] apergos: I am guessing the logged out bots should only edit through the squid labeled as api? [09:09:10] well it seems unlikely to me that the bot actually logged out [09:09:19] *agrees* [09:09:22] what I mean is there's a 'logged out' edit at 17:09 and then one at 17:10 [09:09:31] in the middle of a bunch of 'logged in' ones [09:09:44] but any edits should only come through the api squids? [09:09:59] an anon edit would come in as an anon edit from the ip of the client [09:10:12] it shouldn't show the api of a squid or any other darn thing [09:10:27] s/api/ip/ [09:10:30] :P [09:11:04] but if the request uses the api and it slips through it will only be on cp1001-5 and sq33 sq34 sq36 ? 
[09:14:01] there's no reason for it to advertise the cp100x address [09:14:10] these are text squids [09:14:11] http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&s=by+name&c=Text%2520squids%2520eqiad&tab=m&vn= [09:14:41] * addshore is looking at https://noc.wikimedia.org/conf/highlight.php?file=squid.php with the ips :) [09:15:12] dunno, doesn't show it in puppet as such [09:19:14] 'specialParents' => array( [09:19:14] 'pmtpa' => array( [09:19:14] '=api_php' => array ( 'sq33.wikimedia.org', 'sq36.wikimedia.org' ), [09:19:14] ), [09:19:14] 'eqiad' => array( [09:19:15] '=api_php' => array ( 'cp1001.eqiad.wmnet', 'cp1002.eqiad.wmnet', 'cp1003.eqiad.wmnet', 'cp1004.eqiad.wmnet', 'cp1005.eqiad.wmnet' ), [09:19:16] ), [09:19:16] ), [09:19:31] it's from the squid files we parse to generate configuration [09:19:58] those I guess [09:20:07] heh [09:20:21] but given I have no idea why that address shows up for the user, I have no idea whether the address would be limited to just those [09:20:26] apergos you were hitting the antispam filter for flooding. :P [09:20:48] :-D [09:20:56] sorry, usually I would pastebin but [09:20:58] lazy [09:21:20] I know, AntiSpamMeta isn't very effective in a lot of channels. ;) [09:21:52] well sorry for the gratuitous wakeup :-D [09:22:26] Anyway, everything good? Are you guys back from AMS? [09:22:59] I didn't go, so I've been back the whole time :-D [09:23:13] Aww. :S [09:23:49] it's fine, I went out of town to give a talk and see friends, so it all worked out ;-) [09:23:59] AMS would prob. have been fun this time of the year. [09:24:03] oh, nice! [09:33:25] apergos: thanks for your help :) [09:34:31] you're welcome! [15:27:15] Hi, my wikimedia seems to be giving me a 404 whenever I add too many external links into the article... what is it doing and is there any way around it? [15:30:17] Are you sure they're 404s, not some other kind of error? [15:30:28] Could you check your Apache error logs for lines that look related?
[15:31:47] It says I don't have access to /wiki/index.php, one moment [15:31:59] !wikipmediawiki [15:32:00] Confused about the differences between MediaWiki, Wikimedia, Wikipedia and wiki? See https://www.mediawiki.org/wiki/Wikipmediawiki [15:32:24] Ah, crap. I'm using MediaWiki -.- [15:33:33] The-Sky: Don't worry, everyone gets confused at one point or another :) [15:35:43] Unless I can find it without SSH, I don't believe I have access to the Apache logs. I'm on a shared hosting server atm. [15:36:43] Forbidden: You don't have permission to access /wiki/index.php on this server. Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request. [15:38:18] I'm trying to contact my host now to see if they will allow me access. [15:38:50] hm, try /w/index.php ? [15:40:41] It seems as if /w/ does not work but /wiki/ does [15:47:23] If I switch the links around (remove one link and add another), it doesn't matter. It still errors on me. It's not one specific link messing it up. [16:02:24] I guess I'll have to install it on a dedi to get the apache logs [20:47:56] hi guys, this might be an operations issue, but the API (for commons) is returning 504 HTTP errors [20:48:14] I have a bot currently in a retry loop, and it is echoing [20:48:17] HTTPError: 504 Gateway Time-out [20:48:17] WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. [20:48:17] Maybe the server is down. Retrying in 30 minutes... [20:48:26] :-((( [20:49:18] Er [20:49:24] Are you trying to save a really large page? [20:49:45] OHHHHHHHHH! [20:49:48] I might... [20:49:51] let me check [20:50:25] api is certainly working fine [20:50:32] when in doubt try some other api link... [20:51:12] http://commons.wikimedia.org/wiki/Commons:Quality_images/Subject/Objects/Statues,_Monuments_and_Plaques [20:51:18] yeah... it is pretty big [21:19:43] dschwen: yeah that's why.
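[Editor's note] The 504 in the retry loop quoted above typically means the gateway gave up waiting while the save of the very large page was still completing server-side, so blindly re-submitting risks duplicate edits. A hypothetical sketch of a saner retry policy (nothing here is the bot's actual code; `do_edit` is a stand-in for the real API call returning an HTTP status):

```python
import time

def save_with_retry(do_edit, retries=3, delay=1800):
    """Retry a save, but treat a gateway timeout as a probable success."""
    for _ in range(retries):
        status = do_edit()
        if status == 200:
            return "saved"
        if status == 504:
            # Gateway timeout: on a huge page the edit has often already
            # landed; verify the page (or assume success, as dschwen does
            # below) rather than re-submitting blindly.
            return "assumed-saved"
        time.sleep(delay)   # other transient error: back off and retry
    return "failed"
```

A stricter variant would follow the 504 with a cheap read request to confirm the revision actually exists before moving on.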
I configured my bot to just take a 504 as an "edit successful" message and keep going [23:19:46] gn8 folks [23:28:15] Nobody's asking how the hell a caching proxy's IP is making edits? [23:28:25] A local block is completely tangential. [23:28:59] Where, Susan? [23:29:29] https://www.wikidata.org/wiki/Special:Contributions/10.64.0.127 [23:30:27] It must be an extension issue [23:30:46] Susan: it was discussed before [23:30:49] You say "must be" based on... [23:30:49] Or it may be a malfunctioning script [23:30:59] Clouseau! [23:31:30] Jasper_Deng: Yeah... everybody seems to have forgotten to report it. [23:31:34] I am quite the detective. [23:31:42] Susan: it was reported last night [23:31:49] In Bugzilla? [23:31:55] no [23:31:57] on IRC [23:32:04] Right... [23:32:12] These types of edits are extremely problematic. [23:32:24] IRC is the new chat-based Bugzilla. [23:32:27] False attribution is a big issue. [23:35:14] * RoanKattouw looks up 10.64.0.127 [23:35:22] I betcha that's the new eqiad maintenance host [23:36:17] RoanKattouw: wasn't it having something to do w/ XFF? [23:37:11] https://bugzilla.wikimedia.org/show_bug.cgi?id=48919 [23:37:17] RoanKattouw: ^ [23:37:32] It's really bad to have false attribution on edits. [23:38:54] cp1005 [23:38:55] eww [23:39:01] Yeah that's XFF [23:39:12] We should really just whitelist 10.0.0.0/8 for XFF IMO [23:39:38] Let's fix that [23:40:01] Did you see the Wikipedia Review article that was all like "ZOMG the people at the WMF office are making penis edits" and the IPs weren't office IPs, they were datacenter IPs? [23:40:03] It appears the 208... range has also caused issues in the past. [23:40:09] RoanKattouw: I linked it on the bug. ;-) [23:40:19] That's part of the reason I think this is a high priority. [23:40:26] It creates confusion and misunderstanding. [23:40:33] And it undermines the integrity/credibility of page histories.
[23:40:50] WTF [23:40:55] It's *in* $wgSquidServersNoPurge [23:41:23] When you said XFF, I thought you meant TrustedXFF. [23:41:24] That address is explicitly listed as being a caching proxy in the MW config [23:41:36] TrustedXFF is for external proxies [23:41:48] We also trust XFF for internal proxies [23:42:03] I think it's something to do with Labs. [23:42:18] But I'm mostly just speculating. [23:42:21] Hmm
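[Editor's note] RoanKattouw's "whitelist 10.0.0.0/8 for XFF" suggestion amounts to: when attributing a request, skip X-Forwarded-For hops that fall inside trusted internal ranges, so attribution falls through to the real client IP instead of a proxy like cp1005. A minimal sketch of the idea (not MediaWiki's actual XFF code; all names here are illustrative):

```python
import ipaddress

# Assumed trusted internal range, per the suggestion in the log above.
TRUSTED = [ipaddress.ip_network("10.0.0.0/8")]

def effective_client_ip(remote_addr, xff_chain):
    """Walk the XFF chain right-to-left, skipping trusted proxy hops.

    remote_addr: IP the edge server saw directly.
    xff_chain:   X-Forwarded-For entries, leftmost = original client.
    """
    hops = xff_chain + [remote_addr]
    for ip in reversed(hops):
        if not any(ipaddress.ip_address(ip) in net for net in TRUSTED):
            return ip   # first untrusted hop is the attributable client
    # Every hop was internal (e.g. a maintenance script): fall back to
    # the leftmost address rather than attributing to a cache proxy.
    return hops[0]
```

With the whitelist in place, an edit relayed through cp1005 (10.64.0.127) would be attributed to the bot's real address rather than the proxy, which is exactly the false-attribution problem discussed above.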