[01:09:21] 06Labs, 10wikitech.wikimedia.org: Upgrade SMW to 1.9 or later - https://phabricator.wikimedia.org/T62886#659361 (10bd808) It is more likely that {T53642} will be completed. Newer versions of SMW make use of Composer in a way that Wikimedia's production environment is not really ready to deal with.
[02:12:54] PROBLEM - Puppet run on tools-worker-1002 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[02:14:12] PROBLEM - Puppet run on tools-webgrid-lighttpd-1205 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:14:27] PROBLEM - Puppet run on tools-checker-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[02:14:57] PROBLEM - Puppet run on tools-exec-1214 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[02:15:51] PROBLEM - Puppet run on tools-exec-1217 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:17:09] PROBLEM - Puppet run on tools-webgrid-lighttpd-1210 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[02:17:11] PROBLEM - Puppet run on tools-webgrid-lighttpd-1209 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[02:17:21] PROBLEM - Puppet run on tools-exec-1410 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[02:17:22] PROBLEM - Puppet run on tools-bastion-03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:17:40] PROBLEM - Puppet run on tools-exec-1409 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:17:51] PROBLEM - Puppet run on tools-webgrid-lighttpd-1409 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[02:17:59] PROBLEM - Puppet run on tools-webgrid-lighttpd-1408 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[02:18:13] PROBLEM - Puppet run on tools-exec-1212 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[02:18:23] PROBLEM - Puppet run on tools-mail is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[02:18:26] PROBLEM - Puppet run on tools-worker-1011 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[02:18:44] PROBLEM - Puppet run on tools-exec-1216 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[02:18:52] PROBLEM - Puppet run on tools-exec-gift is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[02:18:57] PROBLEM - Puppet run on tools-exec-1418 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[02:19:14] PROBLEM - Puppet run on tools-exec-1420 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:20:08] PROBLEM - Puppet run on tools-webgrid-lighttpd-1202 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[02:20:16] PROBLEM - Puppet run on tools-checker-01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:20:20] PROBLEM - Puppet run on tools-exec-1412 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[02:20:40] PROBLEM - Puppet run on tools-webgrid-lighttpd-1415 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[02:21:02] PROBLEM - Puppet run on tools-grid-shadow is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[02:21:10] PROBLEM - Puppet run on tools-cron-01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:21:14] PROBLEM - Puppet run on tools-worker-1021 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:23:07] PROBLEM - Puppet run on tools-grid-master is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:23:19] PROBLEM - Puppet run on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[02:24:11] PROBLEM - Puppet run on tools-webgrid-lighttpd-1203 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[02:24:23] PROBLEM - Puppet run on tools-webgrid-lighttpd-1411 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:24:25] PROBLEM - Puppet run on tools-webgrid-lighttpd-1401 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:24:47] PROBLEM - Puppet run on tools-webgrid-lighttpd-1404 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[02:25:11] PROBLEM - Puppet run on tools-webgrid-lighttpd-1201 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[02:25:25] PROBLEM - Puppet run on tools-exec-1219 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[02:25:31] PROBLEM - Puppet run on tools-docker-registry-02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[02:25:44] PROBLEM - Puppet run on tools-exec-1405 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:26:00] PROBLEM - Puppet run on tools-worker-1008 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[02:26:12] PROBLEM - Puppet run on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:27:08] PROBLEM - Puppet run on tools-webgrid-lighttpd-1414 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[02:27:46] PROBLEM - Puppet run on tools-worker-1010 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[02:28:02] PROBLEM - Puppet run on tools-exec-1415 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[02:28:38] PROBLEM - Puppet run on tools-docker-registry-01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:28:39] PROBLEM - Puppet run on tools-webgrid-lighttpd-1407 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[02:28:40] PROBLEM - Puppet run on tools-worker-1022 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[02:29:08] PROBLEM - Puppet run on tools-exec-1419 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[02:29:55] PROBLEM - Puppet run on tools-worker-1005 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[02:31:13] PROBLEM - Puppet run on tools-webgrid-lighttpd-1207 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[02:41:16] 06Labs, 10Labs-Infrastructure, 06Operations, 10ops-eqiad, 07Wikimedia-Incident: Replace fans (or paste) on labservices1001 - https://phabricator.wikimedia.org/T154391#2909243 (10Andrew)
[02:42:25] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:42:50] 06Labs, 10Labs-Infrastructure, 06Operations, 10ops-eqiad, 07Wikimedia-Incident: Replace fans (or paste) on labservices1001 - https://phabricator.wikimedia.org/T154391#2909256 (10chasemp) p:05Triage>03High
[02:43:12] 06Labs, 10Labs-Infrastructure, 06Operations, 10ops-eqiad, 07Wikimedia-Incident: Replace fans (or paste) on labservices1001 - https://phabricator.wikimedia.org/T154391#2909257 (10Andrew) a:03Cmjohnson Chris -- I'm not sure what the procedure is here. If you need to power down the machine for this we'll...
[02:47:54] RECOVERY - Puppet run on tools-worker-1002 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:49:26] RECOVERY - Puppet run on tools-checker-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:49:54] RECOVERY - Puppet run on tools-exec-1214 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:50:50] RECOVERY - Puppet run on tools-exec-1217 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:52:12] RECOVERY - Puppet run on tools-webgrid-lighttpd-1210 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:52:13] RECOVERY - Puppet run on tools-webgrid-lighttpd-1209 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:52:21] RECOVERY - Puppet run on tools-exec-1410 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:52:29] RECOVERY - Puppet run on tools-exec-1409 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:52:51] RECOVERY - Puppet run on tools-webgrid-lighttpd-1409 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:53:00] RECOVERY - Puppet run on tools-webgrid-lighttpd-1408 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:53:14] RECOVERY - Puppet run on tools-exec-1212 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:53:20] RECOVERY - Puppet run on tools-mail is OK: OK: Less than 1.00% above the threshold [0.0]
[02:53:24] RECOVERY - Puppet run on tools-worker-1011 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:53:42] RECOVERY - Puppet run on tools-exec-1216 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:53:54] RECOVERY - Puppet run on tools-exec-gift is OK: OK: Less than 1.00% above the threshold [0.0]
[02:54:14] RECOVERY - Puppet run on tools-webgrid-lighttpd-1205 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:55:22] RECOVERY - Puppet run on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:56:00] RECOVERY - Puppet run on tools-grid-shadow is OK: OK: Less than 1.00% above the threshold [0.0]
[02:56:12] RECOVERY - Puppet run on tools-cron-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:58:06] RECOVERY - Puppet run on tools-grid-master is OK: OK: Less than 1.00% above the threshold [0.0]
[02:58:19] RECOVERY - Puppet run on tools-webgrid-lighttpd-1416 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:58:59] RECOVERY - Puppet run on tools-exec-1418 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:59:13] RECOVERY - Puppet run on tools-exec-1420 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:59:25] RECOVERY - Puppet run on tools-webgrid-lighttpd-1411 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:59:27] RECOVERY - Puppet run on tools-webgrid-lighttpd-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:00:09] RECOVERY - Puppet run on tools-webgrid-lighttpd-1202 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:00:09] RECOVERY - Puppet run on tools-webgrid-lighttpd-1201 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:00:13] RECOVERY - Puppet run on tools-checker-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:00:31] RECOVERY - Puppet run on tools-docker-registry-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:00:39] RECOVERY - Puppet run on tools-webgrid-lighttpd-1415 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:00:59] RECOVERY - Puppet run on tools-worker-1008 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:01:13] RECOVERY - Puppet run on tools-worker-1021 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:02:08] RECOVERY - Puppet run on tools-webgrid-lighttpd-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:02:46] RECOVERY - Puppet run on tools-worker-1010 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:03:36] RECOVERY - Puppet run on tools-webgrid-lighttpd-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:03:39] RECOVERY - Puppet run on tools-docker-registry-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:03:54] 10Quarry, 07Community-Wishlist-Survey-2016: "Running" query being displayed as unsubmitted - https://phabricator.wikimedia.org/T71176#2909387 (10Liuxinyu970226)
[03:03:56] 10Quarry, 07Community-Wishlist-Survey-2016: Time limit on quarry queries - https://phabricator.wikimedia.org/T111779#2909388 (10Liuxinyu970226)
[03:04:00] 10Quarry, 07Community-Wishlist-Survey-2016: Quarry task running for a while - https://phabricator.wikimedia.org/T133738#2909389 (10Liuxinyu970226)
[03:04:04] 10Quarry, 07Community-Wishlist-Survey-2016: Wrong status of queries in Recent Queries list - https://phabricator.wikimedia.org/T137517#2909390 (10Liuxinyu970226)
[03:04:10] 10Quarry, 07Community-Wishlist-Survey-2016: Query runs over 5 hours without being killed - https://phabricator.wikimedia.org/T139162#2909391 (10Liuxinyu970226)
[03:04:11] RECOVERY - Puppet run on tools-webgrid-lighttpd-1203 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:04:47] RECOVERY - Puppet run on tools-webgrid-lighttpd-1404 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:04:53] RECOVERY - Puppet run on tools-worker-1005 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:05:21] RECOVERY - Puppet run on tools-exec-1219 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:05:44] RECOVERY - Puppet run on tools-exec-1405 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:06:12] RECOVERY - Puppet run on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:06:12] RECOVERY - Puppet run on tools-webgrid-lighttpd-1207 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:08:00] RECOVERY - Puppet run on tools-exec-1415 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:08:40] RECOVERY - Puppet run on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:09:10] RECOVERY - Puppet run on tools-exec-1419 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:10:17] 10Tool-Labs-tools-Other, 06Commons, 06Community-Tech, 10Internet-Archive, 07Community-Wishlist-Survey-2016: Create a new DerivativeFX after the Toolserver shutdown [AOI] - https://phabricator.wikimedia.org/T110409#2909439 (10Liuxinyu970226)
[03:16:20] 10Tool-Labs-tools-Xtools, 06Community-Tech, 07Community-Wishlist-Survey-2016: Epic: Rewriting XTools - https://phabricator.wikimedia.org/T153112#2909490 (10Liuxinyu970226)
[04:39:50] PROBLEM - Free space - all mounts on tools-exec-gift is CRITICAL: CRITICAL: tools.tools-exec-gift.diskspace._public_dumps.byte_percentfree (No valid datapoints found) tools.tools-exec-gift.diskspace.root.byte_percentfree (<20.00%)
[08:39:51] 06Labs, 10Tool-Labs, 10community-labs-monitoring: Implement a system to monitor tools on tool-labs - https://phabricator.wikimedia.org/T53434#2909575 (10Matthewrbowker) Hi, all. Apologies about the delay, I didn't see emails related to this task for some reason... >>! In T53434#2902974, @scfc wrote: > (Lab...
[08:55:04] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[09:35:00] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:47:03] PROBLEM - Free space - all mounts on tools-worker-1003 is CRITICAL: CRITICAL: tools.tools-worker-1003.diskspace._var_lib_docker.byte_percentfree (No valid datapoints found) tools.tools-worker-1003.diskspace._public_dumps.byte_percentfree (No valid datapoints found) tools.tools-worker-1003.diskspace.root.byte_percentfree (<100.00%)
[10:44:52] RECOVERY - Free space - all mounts on tools-exec-gift is OK: OK: tools.tools-exec-gift.diskspace._public_dumps.byte_percentfree (No valid datapoints found)
[12:09:42] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[13:49:31] 06Labs, 10Tool-Labs, 10community-labs-monitoring: Implement a system to monitor tools on tool-labs - https://phabricator.wikimedia.org/T53434#2909723 (10zhuyifei1999) >>! In T53434#2909575, @Matthewrbowker wrote: > Could we do management as part of Striker? That would make this very easy, at least on the To...
[16:18:34] 06Labs, 10Labs-Infrastructure, 10DBA: Replication inconsistency - https://phabricator.wikimedia.org/T154398#2909863 (10Huji)
[16:20:00] 06Labs, 10Labs-Infrastructure, 10DBA: Replication inconsistency - https://phabricator.wikimedia.org/T154398#2909876 (10Huji) PS: it used to be possible to connect to specific Labs DB instances using commands like `mysql -h labsdb1001 fawiki` (like T123985#1943926) but that does not work anymore. Can you plea...
[17:35:02] 06Labs, 10Tool-Labs, 10community-labs-monitoring: Implement a system to monitor tools on tool-labs - https://phabricator.wikimedia.org/T53434#2909937 (10bd808) >>! In T53434#2909575, @Matthewrbowker wrote: > Could we do management as part of Striker? That would make this very easy, at least on the Tool Labs...
[17:40:20] am i doing something stupid with this sql query?
[17:40:22] select * from revision where rev_user_text='Jackmcbarn' order by rev_timestamp desc limit 50;
[17:40:42] it takes upwards of 5 minutes to run, even though looking at my contributions page does the same thing in less than a second
[17:54:12] 10PAWS: Paws display 504 - Bad gateway time-out - https://phabricator.wikimedia.org/T143493#2909948 (10Ivanhercaz) Hi @yuvipanda! I am working again with CanaryBot in a easy task that I think is very light. The task was stopped several times (in terminal) with a message "killed". I tried to reload the page and I...
[18:41:03] 06Labs, 10Labs-Infrastructure, 10DBA: Replication inconsistency - https://phabricator.wikimedia.org/T154398#2909863 (10Krenair) Nope, production definitely matches I'm afraid: ```mysql:wikiadmin@db1086 [fawiki]> select * from pagelinks where pl_from = 3466098; Empty set (0.00 sec)``` Several other productio...
[18:44:42] 06Labs, 10Labs-Infrastructure, 10DBA: Replication inconsistency - https://phabricator.wikimedia.org/T154398#2909992 (10Krenair) >>! In T154398#2909876, @Huji wrote: > PS: it used to be possible to connect to specific Labs DB instances using commands like `mysql -h labsdb1001 fawiki` (like T123985#1943926) bu...
[20:07:48] 10Wikibugs: Wrong comment anchors linked - https://phabricator.wikimedia.org/T141837#2910053 (10MZMcBride) I guess the issue is somewhere around here? https://phabricator.wikimedia.org/diffusion/TWBT/browse/master/wikibugs.py;54181a0aadaf3ab6ec249e03c7ab654ddca4720b$163 https://phabricator.wikimedia.org/diffus...
[21:21:24] 06Labs, 10Labs-Infrastructure, 10DBA: Replication inconsistency - https://phabricator.wikimedia.org/T154398#2910119 (10Huji) >>! In T154398#2909985, @Krenair wrote: > Nope, production definitely matches I'm afraid: > ```mysql:wikiadmin@db1086 [fawiki]> select * from pagelinks where pl_from = 3466098; > Empty...
[21:53:50] 06Labs, 10Labs-Infrastructure, 10DBA: Replication inconsistency - https://phabricator.wikimedia.org/T154398#2910124 (10Krenair) I agree it's strange, it's just not a replication inconsistency.
[22:57:15] 06Labs, 10Tool-Labs, 10community-labs-monitoring: Implement a system to monitor tools on tool-labs - https://phabricator.wikimedia.org/T53434#2910133 (10scfc) >>! In T53434#2909575, @Matthewrbowker wrote: > […] >>>! In T53434#2902974, @scfc wrote: >> (Labs is slowly moving authorative information about //ins...
[23:09:41] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:59:41] jackmcbarn: I wonder if we have an index on rev_user_text in the replica dbs? I'm not sure I even know how to figure that out. I don't think SHOW INDEX works through the *_p views
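
Editor's note: a minimal sketch of how the [17:40:22] slowness and the [23:59:41] index question could be checked from a replica session. Everything below is illustrative rather than confirmed by this log: the assumption that `information_schema` is readable on the replicas, and the `revision_userindex` view (documented on wikitech for user-based lookups) are assumptions about the Labs replica setup of the time.

```
-- Sketch only: run against the relevant wiki's *_p replica database.

-- 1. Ask the optimizer what it does with the slow query; a full scan of the
--    revision view (no usable index on rev_user_text) would explain the ~5 minutes
--    versus the indexed path Special:Contributions uses.
EXPLAIN
SELECT *
FROM revision
WHERE rev_user_text = 'Jackmcbarn'
ORDER BY rev_timestamp DESC
LIMIT 50;

-- 2. SHOW INDEX does not work against the *_p views, but the indexes of the
--    underlying base tables may be visible via information_schema, assuming
--    the replica grants allow reading it.
SELECT table_schema, table_name, index_name,
       GROUP_CONCAT(column_name ORDER BY seq_in_index) AS cols
FROM information_schema.statistics
WHERE table_name = 'revision'
GROUP BY table_schema, table_name, index_name;

-- 3. If the replicas provide the revision_userindex view for user-based
--    lookups (an assumption here), the same query through it should be able
--    to use an index on (rev_user_text, rev_timestamp).
SELECT rev_id, rev_timestamp
FROM revision_userindex
WHERE rev_user_text = 'Jackmcbarn'
ORDER BY rev_timestamp DESC
LIMIT 50;
```

If the EXPLAIN output shows a full scan of the plain revision view, that alone would account for the difference between the five-minute query and the sub-second contributions page.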