[02:52:23] > sent from a desktop device. please excuse my verbosity.
[02:52:24] <3
[06:55:48] !admin
[06:55:50] !admin
[06:56:02] !log
[15:54:57] if I wanted a dynamically-updated list of all pages on enwiki which fit a certain criterion, would this be a suitable place to ask for someone to help me with that?
[15:55:04] or would you recommend a different channel?
[15:55:31] Depends how you're wanting to find/be provided that list
[15:56:43] a while back, Coren made a tool for me that's hosted on WMFLabs. It generates a list for me when I go to the URL.
[15:56:50] I'd like a similar tool.
[15:56:57] I have no idea how to even begin doing this
[15:57:06] presumably there's a learning curve to climb
[15:57:22] Likely
[15:57:31] Depends how far away the current tool is from what you want
[15:57:44] And how it's written
[15:57:57] mm
[15:58:34] so, what the current tool does (and I still use it, and do not want it altered): it lists all the files whose names are... I believe it's 9 characters or less (not including file extension)
[15:58:51] Link to the tool?
[15:58:51] because short filenames usually are very ambiguous, and usually benefit from being changed
[15:58:57] https://tools.wmflabs.org/shortnames/
[15:59:01] I'd suspect, Coren's tool would be open source
[15:59:03] right
[15:59:07] what I'd like
[15:59:27] is a list of all articles whose references include.... well, I call them "referrer strings"
[15:59:48] consider the difference between
[15:59:49] I suspect looking at article text is an amount more complex than just doing a title search
[15:59:49] > www.tor.com/2016/09/28/the-city-born-great/?utm_source=exacttarget&utm_medium=newsletter&utm_term=tordotcom-tordotcomnewsletter&utm_content=na-readblog-blogpost&utm_campaign=9780765393456
[15:59:51] and
[15:59:54] > www.tor.com/2016/09/28/the-city-born-great
[16:00:00] one has a referrer string, one doesn't
[16:00:08] probably, yes.
[16:00:10] granted.
[16:00:20] you see what I mean by referrer strings, though?
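Editorial note: the two tor.com URLs quoted above differ only in their `utm_*` query parameters. A minimal Python sketch of how such "referrer strings" might be detected and stripped follows. The function names and the choice to match only the `utm_` prefix are assumptions made for illustration; nothing here is taken from an existing tool or bot, and real references would still need human verification.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only "utm_*" parameters count as tracking parameters here.
TRACKING_PREFIXES = ("utm_",)


def _is_tracking(name):
    """True if a query parameter name looks like a tracking parameter."""
    return name.lower().startswith(TRACKING_PREFIXES)


def has_tracking_params(url):
    """Report whether the URL's query string has any tracking parameter."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return any(_is_tracking(name) for name, _ in pairs)


def strip_tracking_params(url):
    """Return the URL with tracking parameters removed, others kept in order."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not _is_tracking(name)
    ]
    # urlencode may normalise the escaping of the parameters that remain.
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    example = (
        "https://www.tor.com/2016/09/28/the-city-born-great/"
        "?utm_source=exacttarget&utm_medium=newsletter"
    )
    print(has_tracking_params(example))
    print(strip_tracking_params(example))
```

Because `urlencode` re-serialises whatever parameters remain, a cleaned URL can differ in escaping from the original even when no tracking parameter was present, so output of this kind is better suited to building a list for review than to unattended edits.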
[16:00:33] Yeah
[16:00:36] kiiiiillllllll
[16:00:43] and those *are* in loads of articles
[16:01:14] I initially thought that perhaps a bot could identify and truncate them, but I was advised that there'd be substantial risk of both false positives and false negatives
[16:01:26] So, MW stores all external links in a db table
[16:01:31] so a list of all such pages would make them human-verifiable
[16:01:34] externallinks
[16:01:46] doing a LIKE query would do a lot of the work
[16:01:53] I dunno how efficient it's gonna be with so many rows etc
[16:02:13] at this point, you're talking over my head.
[16:02:21] though, most of them won't fit the indexes
[16:02:38] I honestly don't know if we have anywhere obvious to make requests for things like this
[16:02:53] do you concur that removing such strings will be a good thing?
[16:03:32] indeed
[16:03:41] They really serve no useful purpose in references
[16:04:28] I do this manually when sharing urls ;)
[16:04:34] * Dragonfly6-7 nodnodnodnods
[16:06:55] Reedy, Dragonfly6-7: you could write up a task and tag it with https://phabricator.wikimedia.org/project/profile/2235/
[16:07:21] Not sure that will actually get someone to write the bot quickly, but it's at least a place to document such things
[16:07:24] Aha
[16:07:24] That seems relevant
[16:09:02] uh
[16:09:16] bd808 - you're presuming a skillset not in evidence
[16:09:36] Dragonfly6-7: you can't type in a text box? Seems unlikely :)
[16:10:11] which text box?
[16:10:24] https://phabricator.wikimedia.org/maniphest/task/edit/form/1/
[16:10:34] I don't see an edit link.
[16:10:35] okay
[16:10:45] Click the MediaWiki button below to connect your Wikimedia unified account.
[16:10:45] Alternatively, you can introduce your Labs/Gerrit LDAP credentials.
[16:11:00] click the mediawiki button
[16:11:05] I'm just a bit wary about creating new accounts.
[16:11:07] * Dragonfly6-7 goes ahead
[16:11:27] just so you know, if this goes wrong in some way, I'm going to hold you personally responsible. Got it?
[16:11:42] * bd808 accepts responsibility
[16:12:06] it wants my e-mail.
[16:12:10] I don't want to give it my e-mail.
[16:12:23] We've already got your email
[16:12:36] then why is it asking for it again?
[16:13:01] so that phabricator can send you emails (like watchlist notifications)
[16:13:31] this site is run by the Wikimedia Foundation. It is not scarier than enwiki
[16:13:32] what if I don't want it to?
[16:13:40] I prefer to just check my watchlist.
[16:14:18] things that happen on phab won't be in your watchlist, but you can set it to only give you in-app notices
[16:14:36] if you want that just give it a bogus email address
[16:15:12] you can change it some other day when you realize that your tin foil hat was covering your eyes ;)
[16:15:20] You must verify your email address to login.
[16:16:11] huh. I don't remember that part, but I created my account using LDAP creds and years ago
[16:19:10] well, this is going wrong in some way.
[16:19:15] Therefore, you're responsible.
[16:19:17] As you accepted.