[09:41:14] what's the deal with search? It seems broken.
[10:40:03] srdjan_m: in what way? Works for me
[10:40:26] Nemo_bis: I think it might be broken on wikis that use LanguageConverter
[10:41:33] Nemo_bis: because no suggestions come up when I type something in Latin if the title of the page I'm looking for is in Cyrillic on the Serbian wiki.
[10:41:43] Special:Search works for me
[10:41:48] Ah, suggestions
[10:42:19] This is quite a specific issue; your original line sounded like everything was on fire
[10:42:30] oh.
[10:42:56] Suggestions work for me, but I've not tested cross-script suggestions
[10:43:27] you could try going to sr.wiki and typing "Mesečina". Nothing should come up.
[10:44:07] even though this link does work: https://sr.wikipedia.org/wiki/Mese%C4%8Dina_(film_iz_2016)
[10:44:25] and it did show up in the suggestions before
[10:44:53] well, before the search tab in the preferences disappeared, I think.
[10:45:00] I wonder why sr.wiki still has the search bar down the sidebar
[10:45:34] I get as suggestions https://sr.wikipedia.org/wiki/Mese%C4%8Deve_mene https://sr.wikipedia.org/wiki/Mesechinus_hughi https://sr.wikipedia.org/wiki/Mezeklazon https://sr.wikipedia.org/wiki/%D0%9C%D0%B5%D1%81%D0%B5%D1%87%D0%B5%D0%B2%D0%B8_%D1%87%D0%B2%D0%BE%D1%80%D0%BE%D0%B2%D0%B8 when entering "Meseč"
[10:45:52] well, that's odd
[10:45:53] (via https://sr.wikipedia.org/w/index.php?title=Meseclazone&redirect=no https://sr.wikipedia.org/w/index.php?title=Mesecevi_cvorovi&redirect=no )
[10:46:46] ah, it's a redirect
[10:47:02] Probably has to do with the new diacritics folding
[10:47:19] But I have no idea how things used to work
[10:47:37] this practically makes the suggestions useless on sr.wiki
[10:48:39] I can confirm that typing "Mesečina (film iz 2016)" shows no suggestions, although when clicking "go" I do reach the article
[10:48:53] This is probably the easiest way to describe the issue in a report
[10:49:06] because I'd have to know the exact title of the page in Cyrillic, and even then it won't show up
[10:49:46] Not true: if I type "Месечина", I get "Месечина (филм из 2016)" as a suggestion
[10:50:06] (Where by "type" I mean "copy and paste", of course. :P)
[10:50:10] it is true if I'm typing it in Latin
[10:51:40] which is not the "exact title of the page in Cyrillic"
[10:52:08] BTW I didn't like that film much :)
[10:52:25] well, it would be quite difficult to mess up simply matching the title of a page...
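(The prefix-suggestion behaviour being debugged above can be reproduced from a script instead of the search box, since the box is backed by the public `action=opensearch` API endpoint. This is a minimal sketch; the `suggest` helper and its `fetch` hook are illustrative, not anything from the discussion.)

```python
import json
from urllib.request import urlopen
from urllib.parse import urlencode

def suggest(prefix, site="sr.wikipedia.org", fetch=None):
    """Return the titles the search box would suggest for `prefix`,
    via the public opensearch endpoint."""
    url = "https://%s/w/api.php?%s" % (
        site, urlencode({"action": "opensearch",
                         "search": prefix,
                         "format": "json"}))
    if fetch is None:  # default: actually query the wiki
        fetch = lambda u: json.load(urlopen(u))
    # opensearch replies with [query, titles, descriptions, urls];
    # the second element is the suggestion list
    return fetch(url)[1]
```

(With this, `suggest("Mesečina (film iz 2016)")` returning an empty list while the article URL itself resolves would demonstrate the bug reported above; passing a stub `fetch` lets you exercise the parsing offline.)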
[10:52:44] Maybe
[11:07:39] Nemo_bis: welp, I'm guessing this is descriptive enough: https://phabricator.wikimedia.org/T160896
[15:16:26] hello there
[15:17:27] https://en.wikipedia.org/wiki/Special:AllPages has been disabled on all wikis
[15:17:35] this is a known issue
[15:18:01] and hopefully a temporary one
[15:19:03] we will add more information on https://phabricator.wikimedia.org/T160916
[15:57:47] jynus: Looks like someone was trying to get all page titles by scraping that... https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Special:AllPages_disabled
[15:59:15] we have a dump of titles; that would be much, much faster!
[15:59:24] Yes...
[15:59:43] or if someone needs real time, the public replica databases
[16:03:21] They were creating a huge number of requests too... How can they be that desperate for the information...
[16:03:25] Or is it just not knowing any better?
[16:05:04] Possibly just too used to other sites, which offer no way of getting any data off them at all, and assumed Wikipedia would be the same? I'd think at least looking around for options, or asking, would be the first step anyway, though...
[16:05:30] The account is a few years old. Haven't looked at how active they actually are
[16:05:43] I guess it's the "it worked on other wikis"...
[16:07:32] well, there is also the ideal of "requesting public pages can't take down the site", which isn't an unreasonable expectation.
[16:07:45] it may not be optimal, but it should have "worked"
[16:09:56] ebernhardson: It seems they were requesting a lot, at the same time
[16:12:25] ahh, that's a rather annoying way to do it :S We've pondered how to block the same in search but come up with no reasonable answers. We ran per-IP pool counters for a while but it didn't seem a good solution
[16:13:37] How many pages would they need to fetch to see all article titles?
[16:14:33] depends, there are ~5M content pages and ~22M non-content pages
[16:29:53] there is a far easier way to get that list than scraping the page
[16:30:13] Database dumps
[16:30:13] API
[16:30:16] Labs replicas
[16:31:08] Reedy: yep. I'm running an SQL query right now for that issue for you.
[16:31:17] https://dumps.wikimedia.org/enwiki/20170301/enwiki-20170301-all-titles-in-ns0.gz
[16:31:23] That dump has exactly what he's looking for
[16:31:29] Granted, a bit out of date
[16:32:12] hah, just beat me to replying...
[16:33:02] Except I don't think the first one you gave is what they wanted?
[16:33:32] What exactly do they want?
[16:33:47] I'm not building the API query for them
[16:33:48] I think the list of titles...
[16:33:56] If they can't work it out from the help documentation...
[16:33:56] As that's what they'd get from AllPages
[16:34:21] So that second file you gave should be perfect.
[16:34:37] I mean, hell, use the example, remove the start from
[16:34:38] https://en.wikipedia.org/w/api.php?action=query&list=allpages
[16:34:48] They'll soon realise there's a limit parameter, and increase that
[16:34:54] Then realise they've still got to paginate
[16:35:03] Many many ways to skin a cat
[16:35:25] The dump is still quickest and easiest... Then maybe use the API to keep the list up to date...
[16:36:17] Depends really what they're trying to do
[16:36:31] Doesn't really help with deletions
[16:50:26] Reedy: http://tools.wmflabs.org/betacommand-dev/reports/en_articles.txt
[16:51:04] It's not for me ;)
[18:01:40] Reedy: I can't post on the VPP
[18:02:11] Reply to https://phabricator.wikimedia.org/T160920#3114875 ? :)
[19:13:14] Reedy: done
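(The "increase the limit, then realise they've still got to paginate" workflow described above amounts to following the `apcontinue` token returned by `action=query&list=allpages`. A minimal sketch of that loop; the `all_titles` helper and its injectable `fetch` hook are illustrative, not from the discussion.)

```python
import json
from urllib.request import urlopen
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"

def all_titles(fetch=None, limit=500):
    """Yield every main-namespace title via list=allpages,
    following the 'apcontinue' token until the API stops returning one."""
    if fetch is None:  # default: actually query the API
        fetch = lambda url: json.load(urlopen(url))
    params = {"action": "query", "list": "allpages",
              "apnamespace": 0, "aplimit": limit, "format": "json"}
    while True:
        data = fetch(API + "?" + urlencode(params))
        for page in data["query"]["allpages"]:
            yield page["title"]
        cont = data.get("continue", {}).get("apcontinue")
        if cont is None:
            return
        params["apcontinue"] = cont  # resume where the last batch ended
```

(At ~500 titles per request this is still thousands of sequential calls for ~5M content pages, which is why the dump remains the quickest route and the API is better suited to keeping an existing list up to date.)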