[00:38:11] !log tools.lexeme-forms kubectl delete deployment lexeme-forms.purge-all-lexemes # T225510 done
[00:38:14] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[11:44:39] Hello,
[11:44:39] I'm trying to count the editors in the German Wikipedia replica ('SELECT COUNT(DISTINCT rev_actor), rev_page FROM revision GROUP BY rev_page') and it dies every time.
[11:44:39] It has been failing for the entire week, with both the full query and batches (based on rev_page).
[11:44:41] It seems that since the rev_actor field was introduced, the revision tables are much, much slower.
[11:44:43] Does anyone know when it is going to work as usual?
[14:20:22] mmecor: try to count the user table
[14:21:11] but I can confirm that the replicas have lately been very slow, compared to a few months ago.
[14:38:28] Steinsplitter: the user table? but I want the number of editors who contributed to each article,
[14:38:43] not the number of edits per user
[14:39:23] in a matter of two weeks the replicas became incredibly slow (and this time it doesn't work :( it is not that it makes me wait, it just does not work)
[14:41:22] mmecor: have you tried using the revision_userindex view instead of revision?
[14:41:46] nope. what does it contain? I didn't know about this table
[14:42:25] same data, but redacted differently so that other indexes can be used
[14:42:34] see https://wikitech.wikimedia.org/wiki/Help:MySQL_queries#Alternative_Views (though it hasn't been updated for the actor migration yet)
[14:43:11] but https://wikitech.wikimedia.org/wiki/News/Actor_storage_changes_on_the_Wiki_Replicas#What_changed%3F says those views were also updated to account for the actor migration
[14:43:16] so that might be more efficient
[14:44:09] see also https://phabricator.wikimedia.org/T221339
[14:44:12] Lucas, the db is generally slow (analytics host), even if a specific index is used.
[14:44:29] and not just revision
[14:44:42] months ago it was faster, years ago it was very fast :-D
[14:47:18] mmecor: also, querying revision_actor_temp (https://www.mediawiki.org/wiki/Manual:Revision_actor_temp_table) might work
[14:47:22] SELECT revactor_page, COUNT(DISTINCT revactor_actor) FROM revision_actor_temp GROUP BY revactor_page;
[14:47:37] though that's a temporary table, of course, so not something you want to put in a long-term tool, I suppose
[14:47:59] (revision on the toolforge dbs is a view that joins the real revision table with revision_actor_temp under the hood)
[14:48:04] (I think)
[14:52:14] I might use this temporarily - but this is a script that should be automated on a monthly basis
[14:52:19] that is the problem
[14:54:07] I'll try using this other table. thanks Lucas_WMDE
[14:54:16] good luck
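For reference, the rewrite suggested above keeps the shape of mmecor's original query and only swaps the table for the revision_userindex view, which exposes the same columns as revision. A sketch only; whether the actor-aware indexes actually get used on the replicas would still need checking with EXPLAIN:

    -- Distinct editors per page, via the revision_userindex view instead of
    -- the plain revision view, so the user/actor indexes can be used.
    SELECT rev_page, COUNT(DISTINCT rev_actor)
    FROM revision_userindex
    GROUP BY rev_page;

The revision_actor_temp variant quoted at 14:47:22 returns the same counts under different column names, at the cost of depending on a temporary table that is expected to go away after the actor migration.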
[14:55:07] Toolforge SQL Optimizer seems to be broken at the moment unfortunately https://tools.wmflabs.org/sql-optimizer?use=dewiki_p&sql=SELECT+COUNT%28DISTINCT+rev_actor%29%2C+rev_page%0D%0AFROM+revision_userindex%0D%0AGROUP+BY+rev_page%3B
[14:57:43] what is this SQL Optimizer doing?
[14:58:52] it's supposed to show you the EXPLAIN output
[14:59:04] because otherwise it's a bit annoying to get (https://wikitech.wikimedia.org/wiki/Help:MySQL_queries#Optimizing_queries)
[15:51:06] !log toolsbeta deleted toolsbeta-k8s-lb-01 since it isn't being actively worked on just now
[15:51:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[15:55:12] !log toolsbeta T221721 deleted toolsbeta-proxy-01 until it can be actively worked on.
[15:55:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[15:55:14] T221721: Puppet broken on several vms in toolsbeta - https://phabricator.wikimedia.org/T221721
[16:03:28] !log toolsbeta T221721 hard rebooted toolsbeta-sgegrid-master because it had oomkilled basically everything
[16:03:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[16:03:32] T221721: Puppet broken on several vms in toolsbeta - https://phabricator.wikimedia.org/T221721
[16:16:59] bstorm_: you were talking about a possible "change made by the DBAs in how the replicas are arranged". Did we find out exactly what happened?
[16:17:10] SQL Optimizer no longer works. It used to work by making a query on s1 and running SHOW EXPLAIN FOR [process ID]. My guess is that connections to s1 are no longer guaranteed to reach the same database server?
[16:17:20] Weird.
[16:17:53] We know that the CPU was churning out of control on the web replicas, so they rebalanced the servers so that two serve web while one serves analytics
[16:18:20] okay, let me try using analytics then
[16:18:22] nothing to do with s1
[16:18:29] Yes :)
[16:18:32] I was just going to say that
[16:18:57] We'd want to switch from web to analytics (for now).
[16:21:07] hmm, that didn't seem to work. I tried connecting to s1.analytics.db.svc.eqiad.wmflabs
[16:22:29] oh wait, it does work. All good then. Thank you!
[16:23:25] Cool! :)
[16:23:30] Thank you for getting it working
[16:24:20] bstorm_: there is analytics and web, and slow things should run on analytics .... but queries on analytics are never killed automatically for being too slow, or are they?
[16:24:21] !log toolsbeta Manually failed "back" to the toolsbeta-sgegrid-master to get the grid functioning again in toolsbeta
[16:24:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[16:25:11] no problem
[16:25:20] Lucas_WMDE: SQL Optimizer is back online
[16:25:40] Steinsplitter: I'd have to check the config of the query killer at the moment (which I'll do in a bit). When servers need depooling, things get killed either way (which happened a lot recently).
[16:25:57] musikanimal: cool, thanks
[16:26:06] If it was set up to be different between replicas, then it probably needs adjusting :)
[16:26:30] Part of it is to keep the web queries quick enough to be used by web things.
[16:27:00] So it is partly just "best practice". However, I know efforts have been made to make things technically match that as well.
[16:27:53] :)
[17:10:43] Steinsplitter: yes, the query killer is adjusted accordingly
[17:10:54] The web killer is much more aggressive than the analytics one
[17:11:59] bstorm_: thanks for the information!
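For reference, the manual workflow that the SQL Optimizer automated (per 16:17:10, and the Help:MySQL_queries#Optimizing_queries page) is MariaDB's SHOW EXPLAIN, which needs two connections to the same database server. A sketch, with 12345 standing in for whatever thread id the processlist actually reports:

    -- Connection 1: start the slow query.
    SELECT rev_page, COUNT(DISTINCT rev_actor)
    FROM revision_userindex
    GROUP BY rev_page;

    -- Connection 2, on the same server: look up the thread id of the query
    -- above, then fetch the plan it is actually running with.
    SHOW PROCESSLIST;
    SHOW EXPLAIN FOR 12345;  -- placeholder thread id from the processlist

This also matches musikanimal's guess about why the tool broke: once two connections to s1 can land on different servers after the rebalancing, the second connection can no longer see the first one's thread.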
:)
[17:18:58] !help https://phabricator.wikimedia.org/T225838
[17:18:58] RF1dle: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[17:19:13] Can someone look at that task while I'm around?
[17:22:12] added it to the general "delete tools" tracking task
[17:22:20] not much more I can do, I'm not a cloud admin or anything
[17:22:32] Lucas_WMDE: thx
[17:24:13] Lucas_WMDE: That tracking task has been replaced with a tag, so I've put that on
[17:25:29] oh ok, thanks
[17:25:32] I wasn't aware of that
[17:25:53] No problem
[19:44:35] !log tools.lexeme-forms deployed c48127f696 (update Russian translations)
[19:44:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[20:23:38] !log quarry framawiki: Deployed b303ce8 on -web-01
[20:23:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Quarry/SAL
[22:06:42] !log ores ran "sudo service nagios-nrpe-server restart" on ores-web-01
[22:06:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL