[01:04:34] Labs, Tool-Labs, Mail, ToolLabs-Goals-Q4: Make tools-mail redundant - https://phabricator.wikimedia.org/T96967#2862452 (scfc)
[01:53:13] Labs, Labs-Infrastructure, Operations, Reading-Web-Backlog, and 2 others: https://wikitech.m.wikimedia.org/ serves wikimedia.org portal - https://phabricator.wikimedia.org/T120527#2862604 (Krenair) From the sounds of what @MaxSem wrote on T87633, it actually can't support this, and we don't need...
[02:03:48] PROBLEM - Free space - all mounts on tools-worker-1017 is CRITICAL: CRITICAL: tools.tools-worker-1017.diskspace._var_lib_docker.byte_percentfree (No valid datapoints found) tools.tools-worker-1017.diskspace.root.byte_percentfree (<22.22%)
[03:23:07] Labs-project-Phabricator: Authentication Failure in phab-03.wmflabs.org - https://phabricator.wikimedia.org/T152891#2862758 (Pokefan95)
[04:35:02] Hey jsub doesn't appear to be working right now, and hasn't been working for like 15 hours. None of my cron jobs are running
[04:53:03] Labs, Labs-Infrastructure: Change upper-bound system uid range to 499 - https://phabricator.wikimedia.org/T45795#2862805 (scfc)
[04:53:17] Labs, Labs-Infrastructure: Change upper-bound system uid range to 499 - https://phabricator.wikimedia.org/T45795#515306 (scfc) a:scfc
[05:07:58] nevermind it was a bug in my tool
[05:28:05] Labs, Tool-Labs, Tool-Labs-tools-Database-Queries: supercount: High Replication Lag - https://phabricator.wikimedia.org/T152894#2862812 (JustBerry)
[05:43:25] Labs, DBA: high s1 replag to labs - https://phabricator.wikimedia.org/T152894#2862842 (Krenair)
[05:44:48] Labs, DBA: high s1 replag to labs - https://phabricator.wikimedia.org/T152894#2862812 (Krenair) https://tools.wmflabs.org/replag/ shows it too - but https://tendril.wikimedia.org/tree does not ```MariaDB [heartbeat_p]> select * from heartbeat; +-------+----------------------------+-------+ | shard | las...
[08:55:37] Hi, why does the SQL query "select page_title from page where page_namespace=0 and page_is_redirect=0 and page_title!="Hlavní_strana" and page_id not in (select cl_from from categorylinks) and page_id not in (select tl_from from templatelinks where tl_title="Kategorizovat");" run on cswiki_p return Leicester_City_FC_(rezerva_a_akademie), although this page should be in select cl_from from categorylinks (as there are some cats in it)?
[08:55:38] https://tools.wmflabs.org/replag shows no lag for cswiki. Can anybody help me?
[08:55:45] MariaDB [cswiki_p]> select page_title from page where page_namespace=0 and page_is_redirect=0 and page_title!="Hlavní_strana" and page_id not in (select cl_from from categorylinks) and page_id not in (select tl_from from templatelinks where tl_title="Kategorizovat");
[08:55:45] +----------------------------------------+
[08:55:46] | page_title |
[08:55:46] +----------------------------------------+
[08:55:47] | Leicester_City_FC_(rezerva_a_akademie) |
[08:55:47] +----------------------------------------+
[08:55:49] 1 row in set (8.85 sec)
[08:55:51] MariaDB [cswiki_p]>
[09:34:48] Hi all, I run an unofficial tool on Tool Labs. Are there any monitoring services that I can take advantage of?
[09:35:57] dargasea: you mean "tool" as a web tool?
[09:36:23] zhuyifei1999_, nah, those ones you jstart into the Grid.
[09:36:49] uh, afaik, no
[09:37:32] there are services to keep restarting if it quits however
[09:38:29] ah okay, I was looking for something that can check returns. WM Ops' got icinga.
[09:38:33] I guess I'll roll my own then
[09:38:38] yeah :(
[09:38:44] thanks tho!
[09:40:01] It might be part of some other services in the future
[09:40:18] I mean k8s or Community-labs-monitoring project
[09:41:56] cool, I hope it moves forward.
[09:42:14] But in reality we've got BigBrother already, so their is little incentive to actually do this.
[09:42:20] there is*
[10:15:41] Labs, DBA: high s1 replag to labs - https://phabricator.wikimedia.org/T152894#2862812 (Marostegui) That is because replication is broken on db1069 (sanitarium) for the S1 instance and that is why labservers are lagging behind the primary master: ``` Error 'Duplicate entry '0-Levent_Karahan' for key 'nam...
[11:19:55] andrewbogott: is it possible for v2c to store the encoded files on NFS? (T152899)
[11:19:55] T152899: Disk space went not cool again on the video encoding instances - https://phabricator.wikimedia.org/T152899
[11:20:30] the service user is not attached to ldap
[11:22:42] zhuyifei1999_: technically I think that would probably still work (although uid=X may be resolved to a different username on a different host)
[11:22:57] I think the more important question is 'do we want hundreds of GBs of encoded videos on NFS' :-)
[11:24:27] zhuyifei1999_: you suggest 'add more hosts' as a solution -- does that mean that there's an automatic failover when a drive is full?
[11:25:20] it's a pool load-balanced with unknown and unsmart mechanism
[11:26:17] there's no failover, but the new hosts can takeover if the job isn't running already
[11:27:02] but if the job already started, there's no failover unless the user decides to restart it after abort or fail
[11:27:14] ah, just a central queue and whichever host picks up the job first runs it?
[11:27:23] I see
[11:27:24] kind of like that
[11:28:15] * valhallasw`cloud ponders
[11:28:39] and is NFS okay for hundreds of GBs of videos, each storing for like 15-16 days (if nobody cleans it up manually)?
[11:30:20] I'm inclined to say 'rather not'
[11:30:41] btw: idek how a host decides whether to pick up a job or not
[11:30:53] zhuyifei1999_: a simple improvement could be 'only accept jobs if > 10GB free space'
[11:31:05] ah, ok.
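
Editor's note: the suggestion at 11:30:53 ('only accept jobs if > 10GB free space') could be sketched roughly as below. This is only an illustration under stated assumptions, not v2c's actual code: the function names, the `MIN_FREE_BYTES` constant and the `work_dir` argument are all made up here, and how a worker would then decline or requeue a job is deliberately left out.

```python
import shutil

# Hypothetical threshold, taken from the "> 10GB free space" suggestion in the log.
MIN_FREE_BYTES = 10 * 1024**3


def has_enough_free_space(path, min_free_bytes=MIN_FREE_BYTES):
    """Return True if the filesystem holding `path` has more than min_free_bytes free."""
    usage = shutil.disk_usage(path)  # named tuple with total, used and free, in bytes
    return usage.free > min_free_bytes


def should_accept_job(work_dir):
    """Hypothetical pre-flight check a worker could run before taking a job."""
    return has_enough_free_space(work_dir)
```

A worker would call `should_accept_job()` on the directory it encodes into before starting work. The check only looks at free space at that moment, so a job that grows past the threshold while running would still need separate handling.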
[11:32:59] there are many workers without a job, and there are currently 2 pending jobs without a worker
[11:35:30] yuvipanda: http://docs.celeryproject.org/en/latest/reference/celery.exceptions.html#celery.exceptions.Reject <= does this work? (just asking since quarry uses celery as well)
[11:49:40] Tool-Labs-tools-Other: https://tools.wmflabs.org/pirsquared/iw.php crashes with fatal error - https://phabricator.wikimedia.org/T150592#2790492 (MarcoAurelio) We're discussing at T152793#2863012 about an interwiki conflict with a namespace. While to tool gives some results, it's throwing all kind of errors a...
[11:51:13] Tool-Labs-tools-Other: CVNBot: not able to reload wikis - https://phabricator.wikimedia.org/T152494#2863019 (MarcoAurelio) It requires me to create an account there. I'm not sure if it's worth for just this.
[11:56:23] Tool-Labs-tools-Other: CVNBot: not able to reload wikis - https://phabricator.wikimedia.org/T152494#2850237 (valhallasw) I have copied the issue to https://github.com/countervandalism/CVNBot/issues/25 -- please follow the discussion there (I'm not going to copy messages and responses back and forth). Yes, g...
[12:00:28] Tool-Labs-tools-Other: CVNBot: not able to reload wikis - https://phabricator.wikimedia.org/T152494#2863030 (MarcoAurelio) @valhallasw Thanks for reporting there. I'm fine with that and will try to follow the issue. Hopefully CVN will migrate to Phabricator sometime.
[14:15:16] Labs-project-Phabricator: Authentication Failure in phab-03.wmflabs.org - https://phabricator.wikimedia.org/T152891#2863199 (Paladox) Open>Resolved I managed to login and added the auth to login.
There will probably be some breakages which I have reported here T152902
[15:36:37] !log video depooling encoding02, less than 5G of disk space left
[15:36:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Video/SAL
[15:40:06] apparently asking celeryd to stop is no longer wait-for-done-then-exit like it used to be, but immediately exit
[17:15:27] (PS1) MarcoAurelio: Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356
[17:16:29] (CR) jenkins-bot: [V: -1] Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[17:25:59] (PS2) MarcoAurelio: Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356
[17:26:47] (CR) jenkins-bot: [V: -1] Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[17:48:28] Labs, Tool-Labs, Community-Tech-Tool-Labs, Developer-Relations, and 4 others: Set up process / criteria for taking over abandoned tools - https://phabricator.wikimedia.org/T87730#2863459 (bd808)
[17:50:37] (PS3) MarcoAurelio: Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356
[17:51:23] (CR) jenkins-bot: [V: -1] Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[17:52:39] (PS4) MarcoAurelio: Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356
[17:53:27] (CR) jenkins-bot: [V: -1] Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[17:53:45] argh damn you
[17:54:35] bd808: any tool to autofix flake8 issues?
I already run autopep8 and docformatter
[17:56:31] (CR) MarcoAurelio: "Ran autopep8 (aggressive x4) on BLWatcher.py to fix most of the coding issues, still tox keeps failing for other stuff." [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[17:58:49] MarcoA: the auto tools won't fix everything. "local variable assigned but never used" warnings need human intervention, for example, because there could be side effects that are needed
[18:10:17] (PS5) MarcoAurelio: Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356
[18:10:58] (CR) jenkins-bot: [V: -1] Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[18:12:32] (PS6) MarcoAurelio: Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356
[18:13:21] (CR) jenkins-bot: [V: -1] Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[18:15:28] Labs, DBA, Wikimedia-Site-requests: Reduce watchlist_count threshold - https://phabricator.wikimedia.org/T150548#2789278 (Dereckson) I concur with @jcrespo: it makes sense to agree on a main API value instead to have several values according the tool used for consultation.
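
Editor's note: the point made at 17:58:49 about "local variable assigned but never used" warnings (flake8 code F841) can be illustrated with a small made-up example. None of these names come from BLWatcher.py; `register` is a hypothetical helper. An automatic fixer that simply deleted the whole assignment would also delete the call, and with it the side effect, which is why a person has to decide what to keep.

```python
# Shared state that the hypothetical helper below mutates.
log = []


def register(name):
    """Hypothetical helper whose call has a side effect: it records the name."""
    log.append(name)
    return len(log)


def flagged_by_flake8():
    # flake8 reports F841 here: 'count' is assigned to but never used.
    count = register("watcher")


def fixed_by_hand():
    # Drop only the unused name and keep the call, so the side effect survives.
    register("watcher")


flagged_by_flake8()
fixed_by_hand()
```

Both functions append to `log`, so removing the entire flagged line instead of just the `count =` part would change what the program does.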
[18:29:11] (PS7) MarcoAurelio: Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356
[18:29:50] (CR) jenkins-bot: [V: -1] Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[18:42:39] (PS8) MarcoAurelio: Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356
[18:43:23] (CR) jenkins-bot: [V: -1] Resurrecting old IRC watchbot project [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[18:44:55] (CR) MarcoAurelio: "All it rests to do is to work on BLWatcher.py code to fix several flake8 issues. I'm not smart enough to do that." [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/326356 (owner: MarcoAurelio)
[20:06:31] Labs, DBA: high s1 replag to labs - https://phabricator.wikimedia.org/T152894#2863688 (jcrespo) Open>Resolved a:jcrespo Fixed, replication lag is going down and it will be 0 in 6-8 hours from now https://tools.wmflabs.org/replag/
[21:30:11] Labs, DBA: high s1 replag to labs - https://phabricator.wikimedia.org/T152894#2863767 (JustBerry) Thanks Jaime!
[22:33:56] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/JustBerry was modified, changed by JustBerry link https://wikitech.wikimedia.org/w/index.php?diff=1121159 edit summary:
[23:30:12] Labs, Tool-Labs, Operations, Phabricator, and 2 others: Install Arcanist in toollabs::dev_environ - https://phabricator.wikimedia.org/T139738#2863925 (Dereckson) As there is a consensus to merge this right now and take care of Precise later if needed, this change has been scheduled to [[ https:/...
[23:59:32] Labs, Tool-Labs, Operations, Phabricator, and 2 others: Install Arcanist in toollabs::dev_environ - https://phabricator.wikimedia.org/T139738#2863940 (Dereckson) p:Triage>Normal