[00:08:43] PROBLEM - Puppet errors on tools-exec-1405 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[00:43:00] PROBLEM - Puppet errors on tools-exec-1418 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[00:48:41] RECOVERY - Puppet errors on tools-exec-1405 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:18:00] RECOVERY - Puppet errors on tools-exec-1418 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:25:24] PROBLEM - Puppet errors on tools-exec-1433 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:16:13] Tool-Labs-tools-Xtools: xtools edit counter throwing a 500 error - https://phabricator.wikimedia.org/T170061#3418348 (Cameron11598)
[03:00:25] RECOVERY - Puppet errors on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:24:41] Data-Services, Tool-Labs-tools-Xtools, cloud-services-team (Kanban), Community-Tech-Sprint: ukwikimedia_p needs to be removed from meta_p table and production CentralAuth tables - https://phabricator.wikimedia.org/T170005#3418452 (Legoktm)
[03:33:18] PROBLEM - Puppet errors on tools-exec-1402 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[03:42:33] Tool-Labs-tools-Xtools: xtools edit counter throwing a 500 error - https://phabricator.wikimedia.org/T170061#3418473 (MusikAnimal) Open>Resolved a:MusikAnimal A good ole restart did the job. Look forward to the new XTools being released very soon -- with stability as our top priority, along with...
[03:58:18] RECOVERY - Puppet errors on tools-exec-1402 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:13:31] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[04:14:10] <[[Cameron]]> musikanimal: thanks!
[04:14:26] np
[04:48:28] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1416 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:12:51] PROBLEM - Puppet errors on tools-worker-1007 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[05:52:52] RECOVERY - Puppet errors on tools-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:13:52] PROBLEM - Puppet errors on tools-worker-1007 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[06:23:22] PROBLEM - Puppet errors on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[06:42:19] PROBLEM - Free space - all mounts on tools-logs-02 is CRITICAL: CRITICAL: tools.tools-logs-02.diskspace._srv.byte_percentfree (<55.56%)
[06:52:19] RECOVERY - Free space - all mounts on tools-logs-02 is OK: OK: All targets OK
[06:53:51] RECOVERY - Puppet errors on tools-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:58:22] RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:38:23] PROBLEM - Free space - all mounts on tools-worker-1020 is CRITICAL: CRITICAL: tools.tools-worker-1020.diskspace.root.byte_percentfree (<100.00%)
[07:44:00] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1414 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[07:44:53] PROBLEM - Puppet errors on tools-worker-1007 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[07:52:22] Toolforge: Delete tool yunomi - https://phabricator.wikimedia.org/T170070#3418541 (4shadoww)
[08:19:03] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:19:53] RECOVERY - Puppet errors on tools-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:35:22] PROBLEM - Puppet errors on tools-worker-1017 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[09:10:23] RECOVERY - Puppet errors on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:01:04] Toolforge: Node.js on gridengine - https://phabricator.wikimedia.org/T166830#3308975 (zhuyifei1999) >>! In T166830#3418317, @edsu wrote: > It just takes longer to run out of memory. Memory leak? There will be an upper limit on the memory that are available to you anyhow. > I would like to upgrade Node if po...
[13:51:25] (Draft2) Mess: Queries fix in order to prevent computational overruns [labs/tools/lists] - https://gerrit.wikimedia.org/r/364103
[14:52:36] PROBLEM - Puppet errors on tools-prometheus-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:01:21] PROBLEM - Puppet errors on tools-worker-1017 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:07:58] PROBLEM - Puppet errors on tools-prometheus-02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:38:31] Tool-Labs-tools-anagrimes, MediaWiki-extension-requests, Wiktionary, User-Dereckson: Create a MediaWiki extension to supersede http://tools.wmflabs.org/anagrimes/hasard.php - https://phabricator.wikimedia.org/T154730#3418829 (Framawiki)
[16:11:22] RECOVERY - Puppet errors on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:02:22] PROBLEM - Puppet errors on tools-worker-1017 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[20:37:23] RECOVERY - Puppet errors on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:31:07] (PS1) BryanDavis: Upgrade parsley.js to v2.7.2 [labs/striker] - https://gerrit.wikimedia.org/r/364132
[22:31:09] (PS1) BryanDavis: Create new tools [labs/striker] - https://gerrit.wikimedia.org/r/364133 (https://phabricator.wikimedia.org/T149458)
[22:31:11] (PS1) BryanDavis: Update to use Toolforge branding [labs/striker] - https://gerrit.wikimedia.org/r/364134 (https://phabricator.wikimedia.org/T168480)
[22:31:13] (PS1) BryanDavis: Manage tool maintainers [labs/striker] - https://gerrit.wikimedia.org/r/364135 (https://phabricator.wikimedia.org/T149458)
[22:31:15] (PS1) BryanDavis: Refactor striker.tools.views [labs/striker] - https://gerrit.wikimedia.org/r/364136
[22:31:41] (CR) BryanDavis: "Ready for review" [labs/striker] - https://gerrit.wikimedia.org/r/353909 (https://phabricator.wikimedia.org/T149458) (owner: BryanDavis)
[22:33:23] PROBLEM - Puppet errors on tools-worker-1017 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[23:05:34] Striker, Patch-For-Review: Striker error log events not getting into ELK cluster due to UDP truncation of JSON payload - https://phabricator.wikimedia.org/T151422#3419208 (bd808) Switching to `logstash.TCPLogstashHandler` does not seem to have worked. Errors can be triggered by loading https://toolsadmin...
[23:08:24] RECOVERY - Puppet errors on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0]