[00:02:50] Tools: Tool "wikidata-primary-sources" loads assets from online.swagger.io - https://phabricator.wikimedia.org/T172970#3514775 (zhuyifei1999) Or alternatively you could ask for consent before loading it. Eg.: replace the icon with a button that says something like "load validator on online.swagger.io", and i...
[00:09:56] RECOVERY - Puppet errors on tools-worker-1010 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:58:06] PROBLEM - Puppet errors on tools-exec-1439 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[01:38:06] RECOVERY - Puppet errors on tools-exec-1439 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:20:23] Striker, User-bd808: Can’t log into Striker, potentially because of accents in logins - https://phabricator.wikimedia.org/T172949#3514877 (bd808) To check to see if this is likely to be a unicode handling bug, I tried to create the user `unicode test 🦄` through wikitech's Special:CreateAccount. This actu...
[03:40:44] PROBLEM - Puppet errors on tools-exec-1408 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[03:45:29] Toolforge, Wikisource, Bengali-Sites: Update the "tesseract-ben" package on Toolforge for OCR on Bengali Wikisource - https://phabricator.wikimedia.org/T167566#3514886 (Bodhisattwa) a:Tpt
[04:28:18] Tools: Tool "wikidata-slicer" loads fonts from google - https://phabricator.wikimedia.org/T172971#3514894 (Ricordisamoa) As you can see, the googleapis request comes from https://tools-static.wmflabs.org/cdnjs/ajax/libs/bootswatch/3.3.7/flatly/bootstrap.min.css. Unfortunately, there doesn't seem to be an eas...
[04:58:22] Tools, Toolforge-standards-committee, Privacy: Hunt for Toolforge tools that loads resources from third party sites - https://phabricator.wikimedia.org/T172065#3514918 (Ricordisamoa)
[04:58:24] Tools: Tool "ricordisamoa" loads fork-me-on-github ribbon from Amazon AWS - https://phabricator.wikimedia.org/T172828#3514916 (Ricordisamoa) Open>Resolved Replaced with the CSS version in https://github.com/ricordisamoa/labs/commit/e64c85a243c0db3c808d1bb0bb75bab79eb00405
[06:05:44] RECOVERY - Puppet errors on tools-exec-1408 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:32:58] Data-Services, XTools, Patch-For-Review: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3515047 (jcrespo) > Hard to believe we were causing all this lag, since the lag was continuing to be a problem even when none of...
[09:06:35] Data-Services, XTools, Patch-For-Review: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3515083 (jcrespo) This keeps happening: ``` p50380g50692__DPL_p Command: Query Time: 79 State: Sending data Info: CR...
[09:11:03] Data-Services, XTools, Patch-For-Review: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3515085 (jcrespo) Both memory, aria and myisam have table-level locking, and not row-level or mvcc: ``` DELETE FROM ch_pl W...
[09:32:17] cloud-services-team, Operations, Patch-For-Review: notebook100[12] - Invalid relationship: Apt::Pin[r-base] - https://phabricator.wikimedia.org/T171924#3515121 (elukey) Open>Resolved a:elukey
[09:34:15] PROBLEM - Puppet errors on tools-exec-1420 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[09:44:24] Data-Services, cloud-services-team: Disk utilization of labsdb1001 went through the roof - https://phabricator.wikimedia.org/T172981#3515128 (jcrespo)
[10:15:36] Data-Services, Toolforge: Drop database s53003__xtools_prod - https://phabricator.wikimedia.org/T170645#3515151 (Samwilson) Open>Resolved a:Samwilson I am very sorry @bd808 I honestly did try that before creating this ticket! It worked this time. :-P
[10:27:40] Data-Services, XTools, Patch-For-Review: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3515157 (jcrespo) Also: ``` CREATE TABLE IF NOT EXISTS `s51187__xtools_tmp`.`201708101025461935804d7bc9bfde231b57ce8330830b_paren...
[11:28:35] Tools: Tool "wikidata-slicer" loads fonts from google - https://phabricator.wikimedia.org/T172971#3515175 (zhuyifei1999) >>! In T172971#3514894, @Ricordisamoa wrote: > As you can see, the googleapis request comes from https://tools-static.wmflabs.org/cdnjs/ajax/libs/bootswatch/3.3.7/flatly/bootstrap.min.css....
[11:34:18] RECOVERY - Puppet errors on tools-exec-1420 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:38:25] !help Are you allowed to test features on the beta cluster?
[12:38:25] sau226: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[12:40:52] Thank you bot
[12:42:37] sau226: what kind of features? the features already there? sure
[12:45:35] Are you allowed to get sysop?
If the answer's yes then I'm formally requesting them on the beta cluster
[12:45:45] on which wiki?
[12:46:23] sau226: and can you give me your username there at on the normal cluster?
[12:47:37] Is this wiki ok? https://simple.wikipedia.beta.wmflabs.org/wiki/Main_Page
[12:47:53] Also my username for testing is thetrentus (new account just for wikitech)
[12:48:49] sau226: beta is not connected to wikitech, so you will need to create an account at beta as well
[12:49:45] Done Sagan
[12:51:54] sau226: and can you give me your username on the normal cluster? (username at en.wikipedia.org etc.)?
[12:52:07] thetrentus
[12:52:23] Or i can provide my main acc that I use for editing
[12:52:36] sure
[12:53:21] sau226
[12:53:23] There
[12:54:39] Sagan
[12:54:55] sau226: what kind of features do you want to check?
[12:56:48] Maybe examine the test cases and just look for bugs using chrome and explorer. My version of explorer is a little old so I may be able to run compatibility checks
[12:57:16] sau226: you can do that at any time there. my question is just why you say you would need sysop rights
[12:58:27] Delete pages and clean up any potential vandalism
[13:00:23] sau226: hm, ok. I've assigned you them for 1 month at that wiki. you can ping me again a week or so before they expire
[13:00:50] thank you
[13:01:06] sau226: correction: I will assign it in a moment, currently my login is misbehaving
[13:02:34] ok, worked now
[13:02:36] No worries
[13:08:24] Data-Services, XTools, Patch-For-Review: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3515268 (russblau) ENGINE=MEMORY changed to ENGINE=InnoDB for all dplbot jobs.
[13:09:40] I'm no serious developer but found this bug: Lua error in package.lua at line 80: module 'Module:Citation/CS1/Date_validation' not found.
[13:09:49] Can that error be rectified please?
[13:10:04] that looks like an error in the local module there
[13:10:45] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[13:11:07] Time to fix the module then
[13:12:42] it looks like the lua script tried to load a module from the mentioned page, but even simplewiki has no such module
[13:13:53] Can any of you go ssh into the server and delete the reference to the module from the code?
[13:14:19] where did you get that error?
[13:14:28] https://simple.wikipedia.beta.wmflabs.org/wiki/Hoxne_treasure
[13:14:31] reference 7
[13:15:57] that looks like because cite web is used there and the module it needs is not available there. not a server side issue
[13:16:05] since all lua things are managed onwiki
[13:17:58] Do we need to take this to the wiki itself?
[13:18:00] looks like I fixed it now
[13:18:06] Thanks
[13:50:45] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1416 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:03:19] (PS1) BryanDavis: Add new wheels [labs/striker/wheels] - https://gerrit.wikimedia.org/r/371036 (https://phabricator.wikimedia.org/T149458)
[14:06:51] Data-Services, XTools, Patch-For-Review: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3515566 (jcrespo) Open>Resolved a:jcrespo Right now the lag is 0 everywhere, maybe it is just a coincidence, but that...
[14:09:40] Data-Services, XTools, Patch-For-Review: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3515593 (MusikAnimal) >>! In T172882#3515157, @jcrespo wrote: > Also: > ``` > CREATE TABLE IF NOT EXISTS `s51187__xtools_tmp`.`20...
[14:11:47] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[14:23:32] Data-Services, XTools, Patch-For-Review: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3515648 (jcrespo) Thank you both of you- even if this wasn't the cause, it certainly helps catching up faster and avoiding unnece...
[14:42:57] Tool-wikiloves, Wiki-Loves-Monuments (2017): Ensure wikiloves tool detects first uploader, not latest - https://phabricator.wikimedia.org/T173003#3515735 (JeanFred)
[14:44:05] Tool-wikiloves, Wiki-Loves-Monuments (2017): Ensure wikiloves tool detects first uploader, not latest - https://phabricator.wikimedia.org/T173003#3515735 (JeanFred)
[14:46:47] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1416 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:48:40] Tool-wikiloves: Generalize wikiloves tool beyond the "Wiki Loves X" format - https://phabricator.wikimedia.org/T173005#3515782 (JeanFred)
[14:57:57] any toolforge admin available to take a look at some tool?
[14:58:03] it looks like the webservice just needs a restart
[14:58:25] Sagan: which?
[14:58:39] chasemp: https://tools.wmflabs.org/stimmberechtigung and https://tools.wmflabs.org/intersect-contribs
[14:58:47] both do not load, 502
[14:59:03] and both ownsers are currently inactive
[14:59:10] *owners
[14:59:56] !log tools 'become stimmberechtigung && restart' && 'become intersect-contribs && restart'
[15:00:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:00:20] chasemp: wow, that was fast.
Thank you very much :)
[15:00:26] np
[15:14:14] Data-Services, cloud-services-team (Kanban): Promote beta test of new Wiki Replica servers - https://phabricator.wikimedia.org/T172704#3515878 (bd808)
[15:35:54] Tool-wikiloves: Wikilove tool should credit first uploader, not last - https://phabricator.wikimedia.org/T173010#3515949 (JeanFred)
[15:36:51] Tool-wikiloves: Wikilove tool should credit first uploader, not last - https://phabricator.wikimedia.org/T173010#3515965 (JeanFred)
[15:36:53] Tool-wikiloves, Wiki-Loves-Monuments (2017): Ensure wikiloves tool detects first uploader, not latest - https://phabricator.wikimedia.org/T173003#3515968 (JeanFred)
[15:37:17] Tool-wikiloves: Wikilove tool should credit first uploader, not last - https://phabricator.wikimedia.org/T173010#3515949 (JeanFred)
[15:37:19] Tool-wikiloves, Wiki-Loves-Monuments (2017): Ensure wikiloves tool detects first uploader, not latest - https://phabricator.wikimedia.org/T173003#3515735 (JeanFred)
[15:51:10] Data-Services, cloud-services-team (Kanban): Promote beta test of new Wiki Replica servers - https://phabricator.wikimedia.org/T172704#3516043 (bd808) Pinging @Samwilson, @MusikAnimal, @Niharika, and @kaldari here to see if they would be interested in moving some of their personal/CommTech tools over to...
[16:38:48] Tool-wikiloves, Wiki-Loves-Monuments (2017): Ensure wikiloves tool detects first uploader, not latest - https://phabricator.wikimedia.org/T173003#3516287 (JeanFred) I deployed and rescheduled a database update. Here are the differences: # Wiki Loves Earth | Year | Countries | Uploads | Images used in th...
[16:43:32] Tool-wikiloves, Wiki-Loves-Monuments (2017): Ensure wikiloves tool detects first uploader, not latest - https://phabricator.wikimedia.org/T173003#3516319 (JeanFred) I’m not actually sure it worked :-( There are still eg SteinsplitterBot in http://tools.wmflabs.org/wikiloves/earth/2016/Pakistan :-(
[16:49:26] Striker, User-bd808: Can’t log into Striker, potentially because of accents in logins - https://phabricator.wikimedia.org/T172949#3514139 (valhallasw) A few observations. The encoding of the username in the production ldap seems OK: ``` >>> base64.b64encode('Jean-Frédéric'.encode('utf-8')) b'SmVhbi1Gcs...
[16:57:59] Tool-wikiloves: Allow to export a wikiloves competition roster, eg as Wikimetrics cohort - https://phabricator.wikimedia.org/T173024#3516364 (JeanFred)
[17:07:37] Tool-wikiloves: Allow to export a wikiloves competition roster, eg as Wikimetrics cohort - https://phabricator.wikimedia.org/T173024#3516414 (JeanFred) Can I somehow create a Wikimetrics cohort via API ? Or sticking as a URL parameter to https://metrics.wmflabs.org/cohorts/upload ?
cc @Milimetric
[17:15:11] Tool-wikiloves: Allow to export a wikiloves competition roster, eg as Wikimetrics cohort - https://phabricator.wikimedia.org/T173024#3516426 (Milimetric) Someone could change the code to do that, but right now you need to be authenticated to access /cohorts/upload
[17:15:47] Tool-wikiloves: Allow to export a wikiloves competition roster, eg as Wikimetrics cohort - https://phabricator.wikimedia.org/T173024#3516427 (Milimetric) once you authenticate, and if you can construct the post the way it expects it, then yes, you can POST to /cohorts/upload
[17:16:23] Cloud-Services, DBA, User-MarcoAurelio: Prepare and check storage layer for hi.wikivoyage - https://phabricator.wikimedia.org/T173027#3516428 (MarcoAurelio)
[17:21:45] Tool-wikiloves: Allow to export a wikiloves competition roster, eg as Wikimetrics cohort - https://phabricator.wikimedia.org/T173024#3516453 (JeanFred) @Milimetric Thanks for the swift answer ! That does sound like a bit work − I think I’ll just export the roster as a Wikimetrics-compatible CSV, and let user...
[17:23:33] Cloud-Services, DBA, User-MarcoAurelio: Prepare and check storage layer for hi.wikivoyage - https://phabricator.wikimedia.org/T173027#3516472 (Marostegui) Thanks - ping us once the database is actually created. Note for the DBAs, if this is done before - T172514 gets merged, we need to apply the ALTE...
[18:05:51] VPS-project-Wikistats, User-MarcoAurelio: Add hi.wikivoyage to wikistats - https://phabricator.wikimedia.org/T173033#3516616 (MarcoAurelio)
[18:41:14] !log rcm Neon: Running update
[18:41:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[18:53:56] hm, "User 's52261' has exceeded the 'max_user_connections' resource"
[18:54:07] madhuvishy: can you help us?
[18:54:22] I'm not seeing any open connections, and we just started getting this error, which is killing wikimetrics
[18:55:48] milimetric: i can look in a moment
[18:55:56] yay, thanks, no rush
[19:06:40] it says there's existing 10 connections in the error
[19:07:12] milimetric: ^
[19:10:42] madhuvishy: "in the error"?
[19:11:10] I'm doing like show processlist; on the different dbs
[19:11:16] like jawiki_db for example
[19:11:24] uh jawiki_p
[19:11:55] (1226, "User 's52261' has exceeded the 'max_user_connections' resource (current value: 10)")
[19:12:12] looking at logs from wikimetrics-queue
[19:12:12] oh, it's saying that's the setting, right
[19:12:35] but I don't see any open connections, not sure if I should look differently
[19:12:59] hmmm let me see
[19:13:01] also, I think the setting for this tool was set to like 100 at some point, because of this reason
[19:13:15] so if that's possible, we should do that (but maybe I'm mis-remembering)
[19:19:55] Data-Services, cloud-services-team (Kanban): Promote beta test of new Wiki Replica servers - https://phabricator.wikimedia.org/T172704#3516936 (bd808)
[19:20:55] Data-Services, cloud-services-team (Kanban): Promote beta test of new Wiki Replica servers - https://phabricator.wikimedia.org/T172704#3506381 (bd808) New service names are live. ``` $ mysql -h wikireplica-analytics.eqiad.wmnet Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB con...
[19:22:27] milimetric: can you try running something now?
[19:22:36] (on wikimetrics - like a report)
[19:25:53] doing
[19:26:37] madhuvishy: k, just ran the validate job that has been failing
[19:26:45] (failed again)
[19:26:46] yeah
[19:27:03] well, didn't quite fail, sort of worked but only on a small subset of the wikis
[19:27:11] it looks like there are open connections to the other wikis
[19:28:02] Hi
[19:28:54] milimetric: You are opening at least 10 connections, so mysql complains (rightly): https://phabricator.wikimedia.org/P5877
[19:29:20] (this is from labsdb1001)
[19:30:22] marostegui: yeah, this tool has been doing that forever, because it runs big jobs in parallel on all the wikis
[19:30:33] like for example when it has to validate across 280 wikis
[19:31:04] I think maybe the problem is we made an exception for wikimetrics a while back and it got reverted?
[19:31:04] milimetric: can that be done in batches?
[19:31:08] yeah, and the limit hasn't changed too
[19:31:13] may be
[19:31:25] milimetric: as far as I know neither me or jaime have touched this for weeks
[19:31:37] right, we haven't touched the code in over a year though
[19:32:00] milimetric: Then I don't know :-) I can ask Jaime tomorrow if he has touched it but I really doubt it
[19:32:06] and we could, of course, change it any way we like, but we're not really maintaining this tool beyond just checking on it from time to time
[19:32:13] Could it be that queries are taking longer and pile up?
[19:32:30] that's possible for some tasks, but this is just user lookups, so not for this
[19:32:46] Because of this: https://phabricator.wikimedia.org/T172981
[19:33:33] what's labsdb1009/10/11? Are all wikis available on each of those?
[19:33:37] yep
[19:33:46] that's the new one yall set up?
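Editor's note on the "can that be done in batches?" question above: one common way to keep a parallel job under a per-user connection cap is to run it through a fixed-size worker pool, so that no more than a chosen number of lookups are in flight at once. The following is only a minimal sketch under stated assumptions, not the wikimetrics code: `validate_on_wiki`, `validate_all`, `MAX_PARALLEL` and the placeholder wiki names are hypothetical, and the per-wiki query is simulated with a short sleep. The only figures taken from the chat are the limit of 10 and the 280 wikis.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Assumption: stay a little below the per-user limit quoted in the error (10).
MAX_PARALLEL = 8

_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of simultaneous lookups observed


def validate_on_wiki(wiki):
    """Hypothetical stand-in for one per-wiki user lookup.

    A real job would open a replica connection here and close it
    before returning; this sketch only simulates the work.
    """
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    try:
        time.sleep(0.001)  # simulate a short query
        return wiki, True
    finally:
        with _lock:
            _active -= 1


def validate_all(wikis, max_parallel=MAX_PARALLEL):
    """Run the per-wiki job with at most max_parallel lookups at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return dict(pool.map(validate_on_wiki, wikis))


# Placeholder database names, one per wiki mentioned in the chat (280).
wikis = ["wiki%d_p" % i for i in range(280)]
results = validate_all(wikis)
```

The pool size is the only knob: the job still covers every wiki, but the number of connections open at any moment is bounded by `max_parallel` rather than by the number of wikis.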
[19:34:03] yup
[19:34:15] madhuvishy: is going to try on labsdb1009
[19:34:17] I mean, I can switch, but it'll probably have the same problem, because of how it opens everything up in parallel
[19:34:20] I'll try it anyway
[19:34:56] milimetric: not really, they are faster, so if the issue is that the queries are piling up...it should not happen there
[19:35:24] I don't think they're piling up, I think they're just being opened in parallel and not being closed properly, but let's see
[19:35:44] milimetric if that is the issue, I would suggest you open fewer connections then?
[19:35:46] madhuvishy: did you just change it to 1009?
[19:35:53] yeah
[19:35:58] i have puppet disabled
[19:36:00] :) k, i was like... magic!!
[19:36:26] marostegui: it's not really possible without a lot of re-writing, and this tool is very low priority
[19:36:44] k, madhuvishy lemme know and I can retry the job
[19:37:01] milimetric: try now?
[19:41:36] milimetric: yeah it happens in the new dbs too
[19:41:50] do you have an idea on how many concurrent connections the tool would need
[19:42:06] madhuvishy: this error could be just the wikimetrics db on tools.labsdb
[19:42:31] but the main question is how do I see these connections? Because show processlist doesn't seem to show them anymore?
[19:43:09] milimetric: You cannot see them because they are gone really quickly
[19:43:52] oh, so they're not stale connections or things that take too long, it's simply that it's opening too many things in parallel
[19:44:11] then...
it seems that there must have been some exception for wikimetrics that got inadvertently over-written
[19:44:47] I definitely remember discussing such an exception I just don't remember if it was implemented or not
[19:47:41] but we are seeing hanging connections on tools.db
[19:47:44] like not being closed
[19:48:56] right now there are 2 of them
[19:51:04] yeah, I see those with show processlist too, that I think was always there
[19:51:23] I think that's the connection pool doing its thing
[19:52:52] milimetric: how many connections you think you'd need to open at the same time?
[19:53:15] We are trying to figure from which host the error comes from specifically
[19:56:05] Cloud-Services, DBA: Prepare and check storage layer for hi.wikivoyage - https://phabricator.wikimedia.org/T173027#3517111 (MarcoAurelio)
[19:56:30] VPS-project-Wikistats: Add hi.wikivoyage to wikistats - https://phabricator.wikimedia.org/T173033#3517116 (MarcoAurelio)
[19:56:35] I'm not sure marostegui, I seem to remember at some point we were talking about 100
[19:57:02] we could try it and see if it works, and if not, file a task and timebox it, we just don't have any resources to work on it
[19:57:43] milimetric: I can ask jaime tomorrow if he remembers the history about this
[20:00:27] Because going from 10 to 100 sounds a bit extreme :)
[20:03:25] if I remember correctly, this was before jaime got involved, I think Coren set up that limit
[20:03:55] meanwhile I will send an update that wikimetrics is down, stop the services, and look at the code
[20:06:38] milimetric: ok - then it is even weirder, because we have not touched that user I am 99% in the last months
[20:07:16] 99% sure I meant
[20:07:30] (PS3) BryanDavis: [WIP] DDL changes for tool account management [labs/striker/deploy] - https://gerrit.wikimedia.org/r/370139 (https://phabricator.wikimedia.org/T149458)
[20:07:33] yeah, that's ok, I stopped it, and honestly it should have been stopped as soon as it was
deprioritized
[20:07:56] having a service this complicated running without anyone being assigned to maintain it is a recipe for disaster
[20:08:26] milimetric: i'm not sure if the max_connections thing is having trouble with toolsdb or labsdb
[20:08:37] if you can think of a way to log that we can look more
[20:08:52] if it's labsdb we can try switching to new servers and bumping limit some
[20:09:00] if it's toolsdb, i'm not sure
[20:09:36] however, bumping something from 10 to 100 looks quite too much to me
[20:09:51] but at least, as madhuvishy says, we should try to log which specific host is complaining
[20:09:53] yeah, agreed
[20:09:59] as we are kind of blind now :(
[20:10:06] i'm switching back my local changes now
[20:10:58] milimetric: also, puppet will bring up the services when it run
[20:10:59] runs
[20:11:32] you can keep puppet stopped for a bit, but i recommend not having it run stale - since that causes problems when we do puppet updates and such
[20:13:29] should I just shut down the box madhuvishy ?
[20:15:14] marostegui: I can see why 100 connection would seem like a lot, this tool can do a lot of work at once though, and it needs to be able to do that. And I could tell you more details if I had looked at the code in the past two years, but that's not the case :)
[20:17:30] milimetric: Yeah, my main concern is how that can affect other users and/or the server itself :)
[20:17:49] milimetric: I think if you are killing the service - remove from puppet, delete the boxes.
I am not sure if that's what you are saying though
[20:17:50] That is why I am so careful when bumping connections, especially that many
[20:18:08] madhuvishy: no, I just want to stop it, not kill it
[20:18:33] of course, marostegui, that makes perfect sense, I don't think we should do anything until we find out the problem
[20:18:42] milimetric: ah yes, then just keep puppet stopped, and not let it get stale for more than a few days or so
[20:19:08] or you can remove the service => running declarations from puppet
[20:19:08] ok. I guess I can just change puppet to not run the services too
[20:19:11] yes
[20:19:12] yea
[20:19:17] jynx :)
[20:19:21] :)
[20:22:24] madhuvishy: for the 1009 change there was a different error: "Can't connect to MySQL server on 'labsdb1009.eqiad.wmnet' (110)")
[20:24:00] madhuvishy: also I'm only seeing the max connections error on the wiki dbs, not the wikimetrics db
[20:26:22] milimetric: yeah, i changed it to the service url (labsdb-analytics) to fix it
[20:26:26] milimetric: the can't connect was fixed by using the dns instead, it was fixed
[20:26:59] oh ok, but I don't see any other errors with the validation after that though
[20:30:41] milimetric It was mostly me testing stuff after
[20:30:57] i was running a report from my account
[20:31:57] oh ok, I see, then it's definitely just the wiki dbs failing
[20:34:07] and I see the error consistently there since at least August 2nd, so it's not something recent
[20:34:33] but I also see stuff working, so who knows how broken this thing is, sheesh
[20:35:31] Cloud-VPS, Huggle: huggle.huggle.eqiad.wmflabs does not have puppet installed - https://phabricator.wikimedia.org/T166588#3517304 (Krenair) We found this box has actually already become disconnected from LDAP. @bd808 added our keys to root's authorized_keys and I've figured out how to insert the new WMF...
[20:45:28] Can someone restart https://tools.wmflabs.org/mrmetadata/ ? It seems down..
(linked from https://meta.wikimedia.org/wiki/File_metadata_cleanup_drive #Progress )
[20:46:10] Not sure for how long or whether it is intentionally down or broken
[20:49:40] Cloud-VPS, Huggle: huggle.huggle.eqiad.wmflabs does not have puppet installed - https://phabricator.wikimedia.org/T166588#3517345 (Petrb) Open>Resolved a:Petrb instance is down
[20:53:32] Cloud-Services, Toolforge, Labs-Team-Backlog, Mail: Set up A-based SPF for tools.wmflabs.org - https://phabricator.wikimedia.org/T104733#1425752 (Platonides) @valhallasw is there anything left to do here?
[22:43:41] legoktm: do you think you can make http://tools.wmflabs.org/extreg-wos/ sortable?
[22:52:52] TabbyCat: I have no idea how to do that outside of MW :(
[22:53:05] oh, well, it doesn't matter :)
[22:53:31] I'm trying to figure out how to run the php maintenance/extensions/converttoextensionregistration
[23:25:59] !log rcm Neon: Updating packages, including git
[23:26:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[23:26:56] !log rcm Oxygen: Updating packages, including git
[23:26:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[23:28:35] !log rcm Xenon: Updating packages
[23:28:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[23:29:31] !log rcm Tin: Removed broken source from package list
[23:29:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[23:30:25] !log rcm Tin: Updating packages
[23:30:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[23:30:55] !log rcm CAC: Removed broken source from package list
[23:30:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[23:32:02] !log rcm CAC: Updating packages
[23:32:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[23:37:21] !log rcm Xenon: Running update.sh
[23:37:23] Logged the message at
https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[23:37:32] !log rcm CAC: Running vagrant git-update
[23:37:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[23:55:27] madhuvishy: was there anything else I needed to do besides restarting the queue / uwsgi if I make a change? It doesn't seem to pick up my logging
[23:55:37] I saw that you added some logging, did that work?
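Editor's note on the "log which specific host is complaining" idea raised earlier in the channel: one possible shape for such logging is a thin wrapper around whatever function opens the connection, which records the host name whenever the driver reports error 1226 (the code shown in the error quoted above) and then re-raises. This is only a sketch under stated assumptions and is not taken from wikimetrics: `connect_logged`, `fake_connect`, `FakeOperationalError` and the logger name are hypothetical, and the fake driver stands in for a real one whose exceptions carry `(code, message)` in `args`. The host names are reused from the chat purely as sample strings.

```python
import logging

# Hypothetical logger name; a real service would use its own.
logger = logging.getLogger("wikimetrics.connections")

# Error code shown in the message quoted in the chat:
# (1226, "User 's52261' has exceeded the 'max_user_connections' resource ...")
ER_USER_LIMIT_REACHED = 1226


class FakeOperationalError(Exception):
    """Stand-in for a driver error whose args are (code, message)."""


def fake_connect(host):
    """Hypothetical connect function: fails for one host, works for others."""
    if host == "labsdb1001":
        raise FakeOperationalError(
            ER_USER_LIMIT_REACHED,
            "User 's52261' has exceeded the 'max_user_connections' "
            "resource (current value: 10)",
        )
    return "connection to %s" % host


def connect_logged(connect, host, **kwargs):
    """Call connect(host=...) and log the host if the user limit is hit."""
    try:
        return connect(host=host, **kwargs)
    except Exception as exc:
        code = exc.args[0] if exc.args else None
        if code == ER_USER_LIMIT_REACHED:
            logger.error("max_user_connections hit on host %s: %s", host, exc)
        raise  # keep the original failure visible to the caller


# Example: the failing host is named in the log record, then the error
# propagates as before.
try:
    connect_logged(fake_connect, "labsdb1001")
except FakeOperationalError:
    pass
```

If messages from a logger like this never appear, it may be worth checking that the process actually configures a handler and level for it; that is a general property of Python logging, not a diagnosis of the problem discussed here.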