[01:07:36] Heh, Platonides discovers how insane Commons' categorization is.
[01:07:42] discovered, rather.
[01:07:57] scfc_de: That subcategory function only works with a sane category structure, BTW.
[01:15:29] Susan: Okay, then we're lucky that stub categories are sane :-).
[01:17:45] We hope!
[01:17:55] If there are recursive categories or whatever, it'll probably explode horribly.
[01:18:08] scfc_de: How is Tool Labs?
[01:18:13] Is it nice/fast?
[01:18:38] Recursive doesn't matter, because the function ignores them. Worst case is "all categories" :-).
[01:19:30] I think you have more faith in that code than I do. ;-)
[01:20:42] Tool Labs is like ... I should watch more Top Gear for idioms. Brilliant! Fast, no replag (http://ganglia.wmflabs.org/ -> tools -> tools-login -> "Replication Lags metrics"), no Solaris/Linux confusion.
[01:22:33] Maybe I'll poke at it this weekend.
[01:23:06] Is there any documentation for the database reports project on Tool Labs?
[01:23:15] I saw you have a new crontab.tools file.
[01:23:27] But I don't really know what host to log in to, etc.
[01:23:45] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Tools/dbreps
[01:24:30] tools-login.wmflabs.org
[01:25:26] We can't port all reports yet to Tools, as some depend on toolserver.namespace and others on Commons cross-database joins. But I do them one by one.
[01:25:49] I can probably help rewrite some of them.
[01:25:57] Thanks for the link, that looks great!
[01:26:17] So no more mysqldb?
[01:26:41] Coren is working (will be working) on both of those, and I wouldn't want to do something that makes his work unnecessary :-).
[01:27:10] There are over 100 reports. Some of them have crappy logic or evil subqueries.
[01:27:16] Lots of low-hanging fruit.
[01:27:21] On Tools, no, I only port reports there after I switched them to the "distributor concept" that uses oursql.
[01:27:23] I'll leave plenty of work for Coren, I promise. ;-)
[01:27:35] Cool.
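(Editor's sketch, not part of the log.) The remark at 01:18:38, that recursive categories don't matter because the function ignores them and the worst case is "all categories", can be illustrated with a traversal that remembers which categories it has already visited. This is not the actual dbreps subcategory function; the name `all_subcategories` and the `children` mapping are hypothetical and stand in for whatever category data the real code reads.

```python
from collections import deque

def all_subcategories(root, children):
    """Return root plus every category reachable from it.

    `children` is a hypothetical mapping: category -> list of direct
    subcategories. Categories that were already seen are skipped, so a
    cycle cannot cause an endless loop; at worst every category is visited once.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        for sub in children.get(cat, ()):
            if sub not in seen:      # ignore anything visited before (breaks cycles)
                seen.add(sub)
                queue.append(sub)
    return seen

# Made-up example data with a cycle: Stubs -> Science stubs -> Stubs
EXAMPLE = {
    "Stubs": ["Science stubs", "History stubs"],
    "Science stubs": ["Physics stubs", "Stubs"],
    "History stubs": [],
}
RESULT = all_subcategories("Stubs", EXAMPLE)
```

With the cycle in the example data the function still terminates, and `RESULT` holds the four category names.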
[01:27:46] It'll be nice if everything can be stabilized.
[01:27:52] And then expandable to other wikis... :-)
[01:27:59] speaking of toolserver.namespace I just copied it into my tool's directory a few minutes ago
[01:28:05] Commons has some database reports, but they're kind of in disrepair.
[01:28:17] Hey carl-cbm.
[01:28:28] I think we can probably do away with toolserver.namespace in dbreps.
[01:28:34] It's easy enough to just query the API.
[01:28:36] It's one query.
[01:28:59] Surely that's less costly than a table join.
[01:29:06] No! No! No! :-) We just have to idle for a week or two, and it will come to Labs.
[01:29:15] Heh.
[01:29:24] I guess it'll be renamed.
[01:29:27] Probably even more efficient in database.
[01:29:28] The join used to cause issues.
[01:29:37] MySQL is cruel.
[01:29:54] I don't know how much of that was due to Toolserver.
[01:29:58] It used to cause occasional locks when you joined a long-running query on that table.
[01:30:24] Anyway, I'll wait a bit. :-)
[01:30:31] I really should get a key set up, though.
[01:30:32] I think on Tools we have experienced personnel that can diagnose that stuff much more easily.
[01:30:33] Maybe this weekend.
[01:30:57] Take your time, you could use the same key.
[01:31:10] I have to set up a key in Gerrit, I think.
[01:31:12] Or somewhere.
[01:31:16] I'm not really sure.
[01:31:55] There are other reports not under the namespace "Wikipedia:Database reports/" that may be good to put in revision control.
[01:32:14] I'm not sure how you'd feel about that.
[01:32:24] https://en.wikipedia.org/wiki/Wikipedia:WBE is one.
[01:32:47] https://en.wikipedia.org/wiki/Wikipedia:Featured_articles/By_length is another.
[01:33:01] Very fine with me. Alkamid had some for plwiki as well where he effectively did in Python "allcategories - somecategories" - would be nice to do that in SQL :-).
[01:33:20] K.
[01:33:26] I'll see about getting those in to the repo.
[01:33:32] It'd be really nice to get them tracked.
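(Editor's sketch, not part of the log.) The suggestion at 01:28:34 to drop toolserver.namespace and "just query the API" could look roughly like the following. The MediaWiki API returns namespace names for `action=query&meta=siteinfo&siprop=namespaces&format=json`; in the legacy JSON format the local name sits under the `"*"` key. To keep the example self-contained, no request is made: `SAMPLE_RESPONSE` is a hand-written, abbreviated stand-in for a real response, and the helper names `namespace_names` and `full_title` are hypothetical, not taken from dbreps.

```python
import json

# Hypothetical, abbreviated sample of a siteinfo response (legacy JSON format,
# where the local namespace name is stored under the "*" key).
SAMPLE_RESPONSE = """
{"query": {"namespaces": {
  "0":  {"id": 0,  "*": ""},
  "1":  {"id": 1,  "*": "Talk", "canonical": "Talk"},
  "4":  {"id": 4,  "*": "Wikipedia", "canonical": "Project"},
  "14": {"id": 14, "*": "Category", "canonical": "Category"}
}}}
"""

def namespace_names(response_text):
    """Build {namespace id: local name} from a siteinfo JSON response."""
    data = json.loads(response_text)
    namespaces = data["query"]["namespaces"]
    return {info["id"]: info["*"] for info in namespaces.values()}

def full_title(ns_id, title, names):
    """Prefix a page title with its namespace name (main namespace has none)."""
    prefix = names[ns_id]
    return f"{prefix}:{title}" if prefix else title

NAMES = namespace_names(SAMPLE_RESPONSE)
print(full_title(4, "Database_reports", NAMES))
```

A report would fetch the mapping once per run and apply it to the (namespace, title) pairs its SQL query returns, instead of joining against a namespace table.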
[01:33:59] https://en.wikipedia.org/wiki/Wikipedia:List_of_Wikipedians_by_article_count is much more annoying.
[01:34:18] Could you just add the links to TODO? We can import the source afterwards.
[01:34:42] I encountered a few cases where my "report" model doesn't work properly, so I have to rethink some of it.
[01:34:54] Heh, yeah, that always stopped me from rewriting the project.
[01:34:59] I always got caught up in the details.
[01:35:02] Like localization support.
[01:35:50] Yep, another of those things. BTW, do you run BernsteinBot on another machine as well? Do you run ... indefsemiarticles.py or something similar still?
[01:39:12] I only run the reports from the Toolserver.
[01:39:16] Not quite sure what you're asking.
[01:40:01] http://p.defau.lt/?_MM8n5EqA9Op44LVEkVi7w
[01:40:05] I think that may answer it, though.
[01:41:52] Yes :-). I wondered why some database report was reset :-), and now I see why.
[01:49:16] Susan: Could you remove all $HOME/scripts/database-reports/*.py stuff? I checked, and it's all in dbreps.
[01:53:58] Lemme look.
[02:01:19] https://en.wikipedia.org/wiki/Wikipedia:Database_reports/Broken_section_anchors
[02:01:23] It'd be nice to get reports like that up and running again.
[02:01:27] I enjoyed that one.
[02:03:42] scfc_de: So... will crazy queries like mostrevisions.py work now?
[02:03:43] :D
[02:04:28] http://p.defau.lt/?CiDdBMY_JYwA62uKLbck_A
[02:05:37] Nice! I haven't tried mostrevisions.py yet because it relies (in my rewrite :-)) on toolserver.namespace. Some tables aren't available (yet), so I don't know how to port mostwatchedsomething.py (yet or at all).
[02:06:13] Broken section anchors is stuck with me because of the table you use in your DB, that I don't understand yet how it's built.
[02:06:22] Must spend more time there.
[02:06:35] I completely forget how that report works.
[02:06:41] I will log in over the holiday weekend.
[02:06:45] So I can start to help out.
[02:07:26] I have a rough idea, but it's bad to start work on a wrong assumption :-). So a little diligence, and everything is easier.
[02:35:26] got a reply
[02:35:28] > Date: Fri, May 24, 2013 at 02:13 UTC
[02:35:28] > Subject: What could possibly make
[02:35:28] >
[02:35:28] > You ask me a.math question?
[02:35:31] > Your answer is 4.
[02:35:34] offlist
[02:42:09] Susan: the key has to be on wikitechwiki
[02:42:11] not gerrit
[02:42:12] jeremyb: The news is full of people who behave irrationally, so I much prefer them sending some incomprehensible stuff than picking up knives, guns, and whatever :-).
[02:42:40] you mean like in britain?
[02:43:00] anyway, we can also send them to /dev/null
[02:43:30] * jeremyb runs away
[02:45:27] They're human!
[02:49:20] Oh, if it was confined to Britain -- if I look at the police report in my city, there are a lot of confrontations where people draw weapons where any rational person would think: "Really?" Last year or the year before a kid was murdered because he owed his dealers a two-digit Euro sum, sometimes it was just the way of looking at the (later) attacker. I agree though that the mailing list moderator could deal with "her" mails
[02:49:20] himself.
[06:58:05] Patricia Pintilie * Re: [Toolserver-l] Encoding issue using SGE
[06:58:06] Patricia Pintilie * Re: [Toolserver-l] Patricia Pintilie
[06:58:06] Patricia Pintilie * Re: [Toolserver-l] Toolserver db outperformed by labs
[07:33:49] 2013/05/24 07:31 OK z-dat-s3-a SMTP SMTP OK - 0.203 sec. response time
[07:33:49] 2013/05/24 07:32 OK z-dat-s3-a SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[07:33:49] 2013/05/24 07:31 OK z-dat-s4-a SMTP SMTP OK - 0.223 sec. response time
[07:33:49] 2013/05/24 07:32 OK z-dat-s4-a SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[07:33:49] 2013/05/24 07:31 OK z-dat-s6-a SMTP SMTP OK - 0.216 sec. response time
[07:33:50] 2013/05/24 07:31 OK z-dat-s6-a SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[07:33:50] 2013/05/24 07:32 OK z-dat-s7-a SMTP SMTP OK - 0.328 sec. response time
[07:33:50] 2013/05/24 07:32 OK z-dat-s7-a SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[07:39:49] 2013/05/24 07:34 WARN nightshade Load avg. WARNING - load average: 6.71, 13.96, 15.12
[07:41:49] 2013/05/24 07:40 OK nightshade Load avg. OK - load average: 12.36, 13.86, 14.94
[07:49:49] 2013/05/24 07:44 WARN nightshade Load avg. WARNING - load average: 8.86, 14.43, 15.65
[07:53:16] 2013/05/24 07:51 OK nightshade Load avg. OK - load average: 10.93, 13.05, 14.88
[08:57:04] Johannes Kroll * Re: [Toolserver-l] Toolserver db outperformed by labs
[09:38:04] seth * Re: [Toolserver-l] Patricia Pintilie
[10:22:06] hi Platonides
[12:34:41] [[Special:Log/newusers]] create 10 * Nicolasa889 * (New user account)
[15:21:31] [[Special:Log/newusers]] create 10 * Karol578 * (New user account)
[17:01:17] jeremyb the filenames would be sufficient
[17:02:00] I am not trying to find dupes
[17:02:07] it's meant to be a quick statistic
[17:02:34] perhaps you could treat the 3 categories separately listing the number of duplications afterwards
[17:20:40] ToAruShiroiNeko: Did you see Platonides's data?