[00:12:33] PROBLEM - Free space - all mounts on tools-bastion-03 is CRITICAL: CRITICAL: tools.tools-bastion-03.diskspace._public_dumps.byte_percentfree (No valid datapoints found) tools.tools-bastion-03.diskspace.root.byte_percentfree (<40.00%)
[00:42:34] PROBLEM - Puppet errors on tools-exec-1442 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[01:02:05] PROBLEM - Puppet errors on tools-exec-1439 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[01:14:16] PROBLEM - Puppet errors on tools-exec-1424 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[01:17:35] RECOVERY - Puppet errors on tools-exec-1442 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:37:05] RECOVERY - Puppet errors on tools-exec-1439 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:48:02] PROBLEM - Puppet errors on tools-bastion-03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[01:53:20] (PS110) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[02:02:12] (CR) Ricordisamoa: [C: -2] "PS110 updates jQuery from 3.1.0 to 3.2.1" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[02:19:16] RECOVERY - Puppet errors on tools-exec-1424 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:22:01] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:28:04] RECOVERY - Puppet errors on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:29:07] (PS111) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[02:40:06] (CR) Ricordisamoa: [C: -2] "PS111 updates grunt-stylelint from ~0.7.0 to ~0.8.0 and adds stylelint 7.8.0" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[02:52:00] RECOVERY - Puppet
errors on tools-webgrid-lighttpd-1413 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:48:51] (PS112) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[03:52:00] (CR) Ricordisamoa: [C: -2] "PS112 uses json extension for .stylelintrc" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[03:55:58] PROBLEM - Puppet errors on tools-exec-1426 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[04:25:58] RECOVERY - Puppet errors on tools-exec-1426 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:03:26] (PS113) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[05:10:35] (CR) Ricordisamoa: [C: -2] "PS113 replaces JSHint and JSCS with ESLint" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[06:03:06] PROBLEM - Puppet errors on tools-exec-1439 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[06:33:16] PROBLEM - Puppet errors on tools-exec-1404 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[06:38:07] RECOVERY - Puppet errors on tools-exec-1439 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:13:11] RECOVERY - Puppet errors on tools-exec-1404 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:36:15] Toolforge, Wiki-Loves-Monuments-Database: Toolforge tool.heritage webservice keeps crashing - https://phabricator.wikimedia.org/T176110#3616941 (Emijrp) p:Normal>High My map doesn't show any monuments https://tools.wmflabs.org/wlm-maps/ I guess this bug is related. If you connect to sql local, u...
[07:36:43] cloud-services-team (Kanban), Patch-For-Review: Replace kernel and reboot labvirt1015, 1016, 1017, 1018 - https://phabricator.wikimedia.org/T176044#3616944 (MoritzMuehlenhoff) Was network connectivity lost to the server at large or to the VMs running on that labvirt instance? If it's the latter I'm wonde...
[07:57:31] PROBLEM - Puppet errors on tools-worker-1006 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[08:01:32] Data-Services, DBA: Significant replication lag for the s1, s2, and s4 wikis on labsdb100[13] - https://phabricator.wikimedia.org/T175487#3616955 (jcrespo) Open>Resolved a:jcrespo
[08:32:28] RECOVERY - Puppet errors on tools-worker-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:58:42] Cloud-VPS, Huggle: https://huggle.wmflabs.org gives ERR_NAME_NOT_RESOLVED - https://phabricator.wikimedia.org/T175901#3616998 (Petrb) Yes, I am kind of wondering, what purpose should huggle.wmflabs.org serve @MarcoAurelio? Did that website even exist in past? What is supposed to be there?
[09:12:33] PROBLEM - Puppet errors on tools-worker-1026 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[09:17:15] Toolforge, Wiki-Loves-Monuments-Database: Toolforge tool.heritage webservice keeps crashing - https://phabricator.wikimedia.org/T176110#3617021 (JeanFred) >>! In T176110#3616941, @Emijrp wrote: > My map doesn't show any monuments https://tools.wmflabs.org/wlm-maps/ I guess this bug is related. > > If yo...
[09:25:25] Toolforge, Wiki-Loves-Monuments-Database: Toolforge tool.heritage webservice keeps crashing - https://phabricator.wikimedia.org/T176110#3617039 (Emijrp) Yes, it is ok now. Thanks.
[09:30:34] cloud-services-team, Wikimedia-Hackathon-2018-Organization, Developer-Relations (Jul-Sep 2017): Featured Projects related to Wikimedia Cloud Services and/or Technical Operations?
- https://phabricator.wikimedia.org/T170242#3617044 (Qgil) Well, somehow it looks like this is not going to happen? Or am...
[09:31:09] Cloud-VPS, Huggle: https://huggle.wmflabs.org gives ERR_NAME_NOT_RESOLVED - https://phabricator.wikimedia.org/T175901#3617045 (MarcoAurelio) @Petrb I discovered the site browsing the on-wiki docs. If it's not to exist, then lets remove it from the wiki pages.
[09:44:43] (PS4) Jean-Frédéric: Add Bash linting using bashate [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378006 (https://phabricator.wikimedia.org/T175906)
[09:46:10] Cloud-VPS, Huggle: https://huggle.wmflabs.org gives ERR_NAME_NOT_RESOLVED - https://phabricator.wikimedia.org/T175901#3617133 (Petrb) link me
[09:50:19] Cloud-VPS, Huggle: https://huggle.wmflabs.org gives ERR_NAME_NOT_RESOLVED - https://phabricator.wikimedia.org/T175901#3617144 (MarcoAurelio) In https://www.mediawiki.org/wiki/Manual:Huggle the link https://huggle.wmflabs.org/data/wl.php?wp=en.wikipedia&action=display which is used via https://www.mediawi...
[09:52:32] RECOVERY - Puppet errors on tools-worker-1026 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:58:07] (CR) Jean-Frédéric: "> > is there a particular reason for using bashate instead of" (1 comment) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378006 (https://phabricator.wikimedia.org/T175906) (owner: Jean-Frédéric)
[10:32:20] Cloud-VPS, Huggle: https://huggle.wmflabs.org gives ERR_NAME_NOT_RESOLVED - https://phabricator.wikimedia.org/T175901#3617196 (Petrb) Open>Resolved a:Petrb OK, I corrected that link
[10:40:18] Cloud-VPS, Huggle: https://huggle.wmflabs.org gives ERR_NAME_NOT_RESOLVED - https://phabricator.wikimedia.org/T175901#3617206 (MarcoAurelio) And I've marked the page for translation so the new link is pushed to the translation pages. Thanks.
[10:48:00] PROBLEM - Puppet errors on tools-docker-builder-05 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[10:48:02] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[11:22:59] RECOVERY - Puppet errors on tools-docker-builder-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:23:03] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1413 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:37:24] (PS4) Jean-Frédéric: Allow to choose the year in wlm-latest tool [labs/tools/heritage] - https://gerrit.wikimedia.org/r/377743
[12:17:58] (PS1) Jean-Frédéric: Skip SPARQL type in missing_commonscat_links.py [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378886 (https://phabricator.wikimedia.org/T175899)
[12:19:54] PROBLEM - Puppet errors on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[12:54:54] RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:25:02] !log tools wall Bastion disk is full and needs attention and reboot in 60
[13:25:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:25:15] :-(
[13:39:08] !log tools bastion-03 someone dropped 8.6G in /tmp which is /not/ seemingly on a temp file system
[13:39:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:41:12] that's a big file
[13:47:30] RECOVERY - Free space - all mounts on tools-bastion-03 is OK: OK: tools.tools-bastion-03.diskspace._public_dumps.byte_percentfree (No valid datapoints found)
[13:56:30] !log bastion add profile::openstack::main::cumin_auth_group: cumin_real_masters for puppet prefix bastion-restricted
[13:56:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Bastion/SAL
[14:00:41] !help My username (sau226) on wikitech is not a member of the bastion
project. I'd like to be added as a member if possible. Thanks
[14:00:41] sau226: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[14:01:57] sau226: you don't get added to the bastion project until you are added to a project that requires it I believe
[14:02:04] are you a member of another project?
[14:02:08] Ok
[14:02:28] Not yet but I was wondering why I could not ssh to bastion
[14:33:05] sau226: when you get added to a VPS project (tools or other) you will get automatically added to bastion I believe. So not being able to ssh now is the expected behavior
[14:34:10] cloud-services-team (Kanban), Patch-For-Review: Replace kernel and reboot labvirt1015, 1016, 1017, 1018 - https://phabricator.wikimedia.org/T176044#3617664 (Andrew) > Was network connectivity lost to the server at large or to the VMs running on that labvirt instance? It was the host itself. For the mos...
[14:37:20] We're doing a bit of maintenance and there may be a brief DNS interruption in the next few minutes. I'll post again when the change is over.
[14:49:03] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[14:49:56] cloud-services-team (Kanban), Patch-For-Review: Replace kernel and reboot labvirt1015, 1016, 1017, 1018 - https://phabricator.wikimedia.org/T176044#3617731 (faidon) >>! In T176044#3617664, @Andrew wrote: >> Was network connectivity lost to the server at large or to the VMs running on that labvirt instanc...
[14:59:44] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1406 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[14:59:50] PROBLEM - Puppet errors on tools-exec-1428 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[14:59:56] cloud-services-team (Kanban), Patch-For-Review: Replace kernel and reboot labvirt1015, 1016, 1017, 1018 - https://phabricator.wikimedia.org/T176044#3617753 (chasemp) A summary from the IRC conversation I had with @robh on 2017-09-12 chasemp: I rebooted labtestvirt2001 and it never came back for SSH. So...
[15:00:06] PROBLEM - Puppet errors on tools-worker-1014 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:00:13] andrewbogott: ^ fyi
[15:00:27] chasemp: when you guys figure out what the fuck is wrong with those
[15:00:29] let me know!
[15:00:40] im lurking on the task
[15:01:13] PROBLEM - Puppet errors on tools-exec-1441 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:01:55] PROBLEM - Puppet errors on tools-worker-1010 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:01:55] robh: yeah man will do
[15:02:06] such a strange issue
[15:02:17] i noticed in the ops meeting no one had a quick answer ;_;
[15:02:45] PROBLEM - Puppet errors on tools-static-11 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:03:03] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1422 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:03:19] PROBLEM - Puppet errors on tools-redis-1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:03:24] PROBLEM - Puppet errors on tools-worker-1017 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:03:42] PROBLEM - Puppet errors on tools-exec-1417 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:04:12] PROBLEM - Puppet errors on tools-exec-1404 is CRITICAL:
CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:04:22] PROBLEM - Puppet errors on tools-worker-1016 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:04:27] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1418 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:04:38] PROBLEM - Puppet errors on tools-k8s-etcd-03 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:04:38] PROBLEM - Puppet errors on tools-webgrid-generic-1404 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:04:39] PROBLEM - Puppet errors on tools-exec-1440 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:05:26] PROBLEM - Puppet errors on tools-paws-worker-1016 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:05:27] hi it seems i am getting puppet errors
[15:05:30] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Failed when searching for node puppet-phabricator.phabricator.eqiad.wmflabs: Failed to find puppet-phabricator.phabricator.eqiad.wmflabs via exec: Execution of '/usr/local/bin/puppet-enc puppet-phabricator.phabricator.eqiad.wmflabs' returned 1:
[15:05:30] Warning: Not using cache on failed catalog
[15:05:51] PROBLEM - Puppet errors on tools-worker-1019 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:06:01] PROBLEM - Puppet errors on tools-k8s-etcd-02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:06:05] PROBLEM - Puppet errors on tools-worker-1009 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:07:01] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1411 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:07:04] paladox: known
[15:07:10] ok
[15:07:17] PROBLEM - Puppet errors on tools-checker-01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:07:17]
PROBLEM - Puppet errors on tools-worker-1015 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:07:27] PROBLEM - Puppet errors on tools-flannel-etcd-03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:07:31] PROBLEM - Puppet errors on tools-checker-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:07:39] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1409 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:07:59] PROBLEM - Puppet errors on tools-worker-1020 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:08:28] PROBLEM - Puppet errors on tools-paws-worker-1005 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:08:36] PROBLEM - Puppet errors on tools-exec-1442 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:08:36] PROBLEM - Puppet errors on tools-prometheus-02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:08:58] PROBLEM - Puppet errors on tools-exec-1414 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:09:04] PROBLEM - Puppet errors on tools-bastion-03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:10:35] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1419 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:10:45] PROBLEM - Puppet errors on tools-exec-1438 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:10:45] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:10:55] PROBLEM - Puppet errors on tools-worker-1025 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:10:56] PROBLEM - Puppet errors on tools-exec-1401 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:12:55] PROBLEM - Puppet errors on
tools-paws-worker-1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:13:11] PROBLEM - Puppet errors on tools-exec-1405 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:13:27] PROBLEM - Puppet errors on tools-logs-02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:14:08] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1426 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:14:46] PROBLEM - Puppet errors on tools-exec-1427 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:15:00] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1420 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:15:30] PROBLEM - Puppet errors on tools-bastion-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:15:34] PROBLEM - Puppet errors on tools-docker-registry-01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:15:35] PROBLEM - Puppet errors on tools-exec-1423 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:15:37] PROBLEM - Puppet errors on tools-worker-1012 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:15:55] PROBLEM - Puppet errors on tools-puppetmaster-01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:16:17] PROBLEM - Puppet errors on tools-paws-worker-1007 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:17:39] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1415 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:17:44] waiting for our wave of recoveries :)
[15:18:22] :D
[15:22:32] !log tools tools-clushmaster-01:~$ clush -f 5 -g all 'sudo puppet agent --test'
[15:22:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:23:38] RECOVERY - Puppet errors on tools-exec-1417 is OK: OK: Less
than 1.00% above the threshold [0.0]
[15:24:04] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1413 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:24:44] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1406 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:29:03] RECOVERY - Puppet errors on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:32:29] RECOVERY - Puppet errors on tools-checker-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:35:05] cloud-services-team, Wikimedia-Hackathon-2018-Organization, Developer-Relations (Jul-Sep 2017): Featured Projects related to Wikimedia Cloud Services and/or Technical Operations? - https://phabricator.wikimedia.org/T170242#3617844 (Aklapper) >>! In T170242#3425293, @bd808 wrote: >I'll bring it up wit...
[15:35:30] RECOVERY - Puppet errors on tools-bastion-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:35:32] RECOVERY - Puppet errors on tools-docker-registry-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:35:58] RECOVERY - Puppet errors on tools-worker-1025 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:37:16] RECOVERY - Puppet errors on tools-checker-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:37:26] RECOVERY - Puppet errors on tools-flannel-etcd-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:37:40] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1409 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:37:44] RECOVERY - Puppet errors on tools-static-11 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:37:58] RECOVERY - Puppet errors on tools-worker-1020 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:38:01] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1422 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:38:17] RECOVERY - Puppet errors on tools-redis-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:38:24] RECOVERY - Puppet errors on
tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:38:30] RECOVERY - Puppet errors on tools-paws-worker-1005 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:38:34] RECOVERY - Puppet errors on tools-exec-1442 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:23] RECOVERY - Puppet errors on tools-worker-1016 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:29] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1418 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:37] RECOVERY - Puppet errors on tools-webgrid-generic-1404 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:39] RECOVERY - Puppet errors on tools-exec-1440 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:54] RECOVERY - Puppet errors on tools-exec-1428 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:40:06] RECOVERY - Puppet errors on tools-worker-1014 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:40:24] RECOVERY - Puppet errors on tools-paws-worker-1016 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:40:36] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1419 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:40:52] RECOVERY - Puppet errors on tools-worker-1019 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:41:00] RECOVERY - Puppet errors on tools-k8s-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:41:04] RECOVERY - Puppet errors on tools-worker-1009 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:41:14] RECOVERY - Puppet errors on tools-exec-1441 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:41:56] RECOVERY - Puppet errors on tools-worker-1010 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:42:03] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1411 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:42:17] RECOVERY - Puppet errors on tools-worker-1015 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:43:36]
RECOVERY - Puppet errors on tools-prometheus-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:43:56] RECOVERY - Puppet errors on tools-exec-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:44:08] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1426 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:44:10] RECOVERY - Puppet errors on tools-exec-1404 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:44:38] RECOVERY - Puppet errors on tools-k8s-etcd-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:45:02] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1420 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:45:35] RECOVERY - Puppet errors on tools-worker-1012 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:45:45] RECOVERY - Puppet errors on tools-exec-1438 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:45:45] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1416 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:45:59] RECOVERY - Puppet errors on tools-exec-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:46:17] RECOVERY - Puppet errors on tools-paws-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:47:37] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1415 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:47:48] cloud-services-team, Wikimedia-Hackathon-2018-Organization, Developer-Relations (Jul-Sep 2017): Featured Projects related to Wikimedia Cloud Services and/or Technical Operations? - https://phabricator.wikimedia.org/T170242#3617883 (bd808) >>! In T170242#3617844, @Aklapper wrote: > @bd808: Wondering i...
[15:47:58] RECOVERY - Puppet errors on tools-paws-worker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:48:14] RECOVERY - Puppet errors on tools-exec-1405 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:48:28] RECOVERY - Puppet errors on tools-logs-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:49:46] RECOVERY - Puppet errors on tools-exec-1427 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:50:35] RECOVERY - Puppet errors on tools-exec-1423 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:50:55] RECOVERY - Puppet errors on tools-puppetmaster-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:25:57] (CR) Lokal Profil: [C: 2] Add Bash linting using bashate [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378006 (https://phabricator.wikimedia.org/T175906) (owner: Jean-Frédéric)
[16:31:32] (CR) Lokal Profil: [C: 1] "typo in commit message otherwise good to go. maybe create a follow up task for making it work with sparql harvests" (1 comment) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378886 (https://phabricator.wikimedia.org/T175899) (owner: Jean-Frédéric)
[16:38:59] (Merged) jenkins-bot: Add Bash linting using bashate [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378006 (https://phabricator.wikimedia.org/T175906) (owner: Jean-Frédéric)
[16:50:32] (PS2) Jean-Frédéric: Skip SPARQL type in missing_commonscat_links.py [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378886 (https://phabricator.wikimedia.org/T175899)
[17:04:49] (CR) Jean-Frédéric: [C: 2] Skip SPARQL type in missing_commonscat_links.py [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378886 (https://phabricator.wikimedia.org/T175899) (owner: Jean-Frédéric)
[17:06:45] (Merged) jenkins-bot: Skip SPARQL type in missing_commonscat_links.py [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378886 (https://phabricator.wikimedia.org/T175899) (owner:
Jean-Frédéric)
[17:11:35] !log tools.heritage Deploy latest from Git master: 5e5c828, 2486630 (T175906), 3ac13b6 (T175899)
[17:11:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL
[17:11:40] T175899: ErfgoedBot missing_commonscat_links crashes with 'NoneType' object is not iterable - https://phabricator.wikimedia.org/T175899
[17:11:40] T175906: Add Shell linting to heritage repo - https://phabricator.wikimedia.org/T175906
[17:14:53] Hi! I ran the webservice command. The webservice starts, but when I go to https://tools.wmflabs.org// in the web browser I get a 502 gateway error. I tried a different browser as well, but I'm facing the same problem.
[17:19:36] Mridu: what is the tool name you are using in place of the placeholder in that link?
[17:20:01] The name of my tool. That is thankyou.
[17:20:50] https://tools.wmflabs.org/thankyou/
[17:21:40] Mridu: did you fix the error that is shown in thankyou's uwsgi.log file?
[17:22:02] It was showing permission denied.
[17:22:33] the pod is in "CrashLoopBackOff" which means it is failing to start the service due to that error
[17:22:46] running `kubectl get pods` will show you that.
[17:23:10] Ideally `webservice status` would as well, but apparently it does not
[17:24:57] should I run `kubectl get pods`?
[17:25:42] the webservice --backend=kubernetes python start command is starting the webservice though.
[17:26:51] your job is already running it says
[17:28:02] (CR) jenkins-bot: Add Bash linting using bashate [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378006 (https://phabricator.wikimedia.org/T175906) (owner: Jean-Frédéric)
[17:52:07] bd808: kubectl get pods gives no output.
[17:52:48] Mridu: I ran `webservice stop` there because it was just in a perpetual death loop
[17:53:40] oh.. okay. Do I need to run webservice again?
[17:54:28] Mridu: first you need to fix the code
[17:55:02] Mridu: run: less /data/project/thankyou/uwsgi.log -- and then fix the code
[17:56:14] ok. yes going through it.
[18:00:20] bd808: resolved the error it was showing in line 1.
[18:01:03] Mridu: good. :) now try: webservice start -- and then look for new errors if it doesn't start
[18:02:21] are the webservice start and webservice --backend=kubernetes python start commands the same?
[18:04:08] no, webservice start uses your service.manifest to see which backend / webservice type to use
[18:04:43] and iirc, if your webservice is stopped the service.manifest is almost empty
[18:05:17] and with an empty service.manifest webservice start will use default configuration
[18:05:29] iirc it's lighttpd on grid
[18:05:53] `webservice --backend=kubernetes python start` overrides whatever is in your manifest
[18:06:11] zhuyifei1999_: unless this changed recently, webservice /restart/ uses the manifest, but /start/ will use the command line arguments
[18:06:30] valhallasw`cloud: o.O
[18:06:59] service.manifest is the cache on how to figure out how to restart after a crash (what bigbrother used to do), not a configuration file
[18:07:12] at least, that's how it used to work -- maybe it changed.
[18:07:56] valhallasw`cloud: https://github.com/wikimedia/operations-software-tools-webservice/blob/7e8da902d4ac6a7bc05f548ffa7bb3f6267e65ed/scripts/webservice#L50
[18:08:12] I don't know why start would never read from that manifest
[18:08:36] uh wrong line
[18:08:47] https://github.com/wikimedia/operations-software-tools-webservice/blob/7e8da902d4ac6a7bc05f548ffa7bb3f6267e65ed/scripts/webservice#L75
[18:09:55] start uses cli args only because the manifest is meaningless for a stopped service, afaict
[18:10:35] I tried running webservice start and it gave me a "Could not find a public_html folder or a .lighttpd.conf file in your tool home" error, while webservice --backend=kubernetes python start runs perfectly fine.
[18:11:07] zhuyifei1999_: after reading the code, I remember the point. `webservice start` doesn't do anything unless you first `webservice stop`, and `webservice stop` clears the manifest
[18:11:20] yeah
[18:11:59] Mridu: lighttpd on grid (default) expects public_html or .lighttpd.conf
[18:12:11] otherwise it has nothing to serve
[18:13:06] oh.. okay. So the second command explicitly specifies that we are using kubernetes instead of grid engine.
[18:13:21] yeah
[18:14:08] (CR) jenkins-bot: Skip SPARQL type in missing_commonscat_links.py [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378886 (https://phabricator.wikimedia.org/T175899) (owner: Jean-Frédéric)
[18:16:09] bd808: the "Improvements for the Toolforge `webservice` command" project is mentioned on Wikimedia's landing page along with other Outreachy projects, but it is not listed on the official Outreachy website.
[18:16:23] Cloud-VPS (Project-requests), User-Zppix: Request creation of Zppix-Wiki-AI VPS project - https://phabricator.wikimedia.org/T175846#3605167 (Andrew) Is this something that could be done within the existing ores project?
[18:26:53] (PS1) Jean-Frédéric: Add SQL table for `statistics` [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378974
[18:26:55] (PS1) Jean-Frédéric: Remove warnings on statistics page when there are undefined fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378975 (https://phabricator.wikimedia.org/T174503)
[18:39:24] Mridu: that's .... probably something that srish_aka_tux_ can take care of if it is time for it to be officially posted. I just made the Phabricator task for it a week ago so maybe she has not had time to post it everywhere?
[18:40:57] okay.
[18:41:50] Mridu: it falls under the 4 spots for Wikimedia. I don't think it's a big deal for you starting the application process that there isn't a bullet point listed there. It is on our local landing page
[18:43:11] yes.
Through the local landing page only I came to know about this one. And it looks interesting :) [18:48:42] I proceeded with doing the add a configuration file part. It's again giving a bad gateway error. On running less /data/project/thankyou/uwsgi.log. I am getting the same error related to line 1. But, that is already resolved. I rechecked the source code. [18:52:35] (PS1) Jean-Frédéric: Harvest the source page of unknown fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378981 (https://phabricator.wikimedia.org/T117330) [18:57:27] Mridu: look at the end of the file. There is an error about "NameError: name 'os' is not defined". [18:57:53] which would indicate a missing "import os" statement [18:59:56] I will recheck. I did not see this error. [19:02:33] (PS1) Jean-Frédéric: Skip SPARQL type in unused_monument_images.py [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378982 [19:07:39] bd808: I am only seeing *** Operational MODE: preforking *** mounting /data/project/thankyou/www/python/src/app.py on /thankyou File "/data/project/thankyou/www/python/src/app.py", line 1 coding: utf-8 -*- ^ [19:07:50] error [19:08:10] Have resolved this already [19:10:49] Did import os as well. But, still getting error 502. [19:14:37] Mridu: if it's not loading, there's likely another error at the bottom of the uwsgi.log [19:16:11] I am unable to see the os error as well. I am able to see just the coding utf error though. I will go through the logs again. [19:16:59] Mridu: did you scroll all the way down? (page down until it shows '(END)' on the bottom left) [19:18:39] No. I was using Ctrl+C to exit. As same lines were getting repeated again. [19:20:24] Right, there can be some repetition (because the server tried to restart before the core issue was fixed). But new issues get added to the end of the file, so that's where any new information is. 
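Editor's sketch of the "look at the end of the log" advice above. It is a minimal Python illustration, not part of any Toolforge tooling, and the helper names `tail_lines` and `last_line_containing` are made up for this example. The idea is the one stated in the conversation: older errors repeat because the server kept restarting, so the newest information is at the end of the file.

```python
from collections import deque


def tail_lines(path, count=20):
    """Return the last `count` lines of a text file, oldest first."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent lines while the file is read once.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


def last_line_containing(path, needle):
    """Return the last line that contains `needle`, or None if no line does."""
    match = None
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if needle in line:
                match = line.rstrip("\n")
    return match
```

For the log discussed here, something like `tail_lines("/data/project/thankyou/uwsgi.log")` would show the most recent entries, and `last_line_containing(path, "Error")` would pick out the newest line mentioning an error rather than the first repeated one.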
[19:21:51] (CR) jerkins-bot: [V: -1] Harvest the source page of unknown fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378981 (https://phabricator.wikimedia.org/T117330) (owner: Jean-Frédéric) [19:22:42] okay, will run the command again [19:30:04] https://gist.github.com/mridubhatnagar/d65f69c63dad52708874cd606267e1a9 this error is getting repeated. [19:33:38] Mridu valhallasw`cloud: there's a good keyboard shortcut in less to scroll to bottom: G [19:33:53] and g scrolls to top [19:34:09] or you can less +G filename [19:36:31] zhuyifei1999_: thanks. Wasn't knowing about it. Reached EOF. [19:36:40] cloud-services-team (Kanban), DC-Ops, Operations, ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3619052 (Cmjohnson) Open>Resolved Resolving this, if you have any further issues please reopen the task [19:36:50] np [19:40:34] cloud-services-team (FY2017-18), Goal, User-bd808: Perform initial Cloud Services rebranding - https://phabricator.wikimedia.org/T168480#3619069 (RobH) [19:40:36] cloud-services-team (FY2017-18), Goal, Patch-For-Review: Program 10 Outcome 2: Rebranding - https://phabricator.wikimedia.org/T166404#3619070 (RobH) [19:40:39] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3619067 (RobH) Open>Resolved Sorry about this, Ok labs-admin disabled. Archives should be available at current location, a... 
[19:41:32] cloud-services-team (FY2017-18), Goal, User-bd808: Perform initial Cloud Services rebranding - https://phabricator.wikimedia.org/T168480#3365842 (RobH) [19:41:34] cloud-services-team (FY2017-18), Goal, Patch-For-Review: Program 10 Outcome 2: Rebranding - https://phabricator.wikimedia.org/T166404#3295273 (RobH) [19:41:36] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3619071 (RobH) Resolved>Open [19:43:13] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3319354 (RobH) Ok, removed my old comment, it was incorrect. The new list is in place with settings and Chase has the new admin em... [19:44:58] Errors resolved. It's working now. There was a yaml error and keyerror. [19:46:45] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3619118 (RobH) Adding @herron as he has done quite a bit of mailman work recently and may have insight. [19:46:56] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3619121 (RobH) a:RobH>None [19:49:14] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3319354 (MarcoAurelio) Can't we just rename it? I think there was a command for that? [19:51:39] bd808: Step 4 is still left to do from https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Flask_OAuth_tool tutorial. Meanwhile, I have jotted down couple of changes that we can do in documentation. 
Should I create a new task or subtask for documentation updates? [19:56:33] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3619180 (RobH) >>! In T167155#3619139, @MarcoAurelio wrote: > Can't we just rename it? I think there was a command for that? >>!... [19:57:04] (do we really want to integrate OAuth into this task? sure OAuth is quite interesting to learn, but imo it's not that relevant to improving the webservice command) [19:57:12] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3619182 (RobH) While I put the new list in place, it seems it may have been premature. I've pulled the new list off the server unt... [19:58:36] and not all tools use oauth... [19:58:43] into which task? [20:00:37] into the test tool for that outreachy project [20:05:40] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3619190 (bd808) >>! In T167155#3619182, @RobH wrote: > While I put the new list in place, it seems it may have been premature. I'v... [20:07:29] zhuyifei1999_: there is actually nothing in the outreachy microtask about getting a Toolforge account or using it to setup a tool. It's not a horrible thing for someone to try out, but it was not in what I specified as the task requirements. 
[20:11:57] Cloud-Services, Wikimedia-Mailing-lists: Create cloud mailman list and archive labs-l list - https://phabricator.wikimedia.org/T175190#3619201 (bd808) [20:13:11] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3619202 (RobH) The old list alias can continue to deliver to the new list alias by our inclusion of it in operations/puppet:/module... [20:13:15] Mridu: if you would like to make a task to track changes that you want to make/have made to the docs that is ok with me. You could also just make your edits on the wiki without a specific task. It's up to you. [20:13:18] bd808: updated https://phabricator.wikimedia.org/T167155 and i promise im not trying to block you guys [20:13:21] =] [20:13:28] i forgot about it entirely [20:13:43] and then quiddity emailed me and it added the redirect comment so i wanted to double check [20:13:54] his email also reminded me it existed. [20:14:08] ive been working dc ops stuff, while lists is technically any ops ;D [20:15:04] robh: I saw you get pulled into other things a couple of weeks ago and hadn't come back to poke you yet myself. I know *exactly* how all of this goes. :) [20:15:28] (I wasn't sure if redirect-aliases were implicit, or not. I think we were all assuming they were! I'm occasionally useful at noticing unspecified assumptions. ;- [20:15:28] your latest comment there seems right to me, but I'll +1 on task for posterity [20:15:30] ) [20:16:38] Cloud-Services, cloud-services-team (Kanban), Wikimedia-Mailing-lists: Create cloud-admin and archive labs-admin mailing list - https://phabricator.wikimedia.org/T167155#3619228 (bd808) >>! In T167155#3619202, @RobH wrote: > * Do NOT rename, as it requires downtime of the list server. It could also... 
[20:17:12] quiddity: assume nothing is implied (as you did via your email) is always best, hehe [20:17:48] if the mbox didnt love to reindex and break links [20:17:59] and if it didnt require mailman downtime to rename, id prefer rename over all this as well ;] [20:18:03] but meh [20:18:58] none of the 3 labs lists is highly active currently. It would be a much bigger deal if we were talking about renaming wikitech-l :) [20:20:15] * quiddity lunches. [20:40:26] bd808 robh if it is to avoid downtime then okay :) [20:40:32] new things I keep learning [20:40:45] mailman is a fickle beast :) [20:40:52] truth. [20:41:08] can we just switch to mailman 3.0? [20:41:14] there was a task for that [20:41:23] I'm not sure what is blocking it right now [20:41:23] there is an entire workboard column for it =] [20:41:28] throughput [20:41:30] lol true [20:41:30] tabbycat: sure, just find us the engineering time to work on it... [20:42:01] bd808: what do you need? how much? I have my checkbook next to me ;) [20:42:30] earmarked grants are problematic, but ... I can put in touch with people ;) [20:43:58] Cloud-VPS (Project-requests), User-Zppix: Request creation of Zppix-Wiki-AI VPS project - https://phabricator.wikimedia.org/T175846#3605167 (bd808) Broadly speaking, having a Cloud VPS project that is 'owned' by a Foundation team (or a single person) is something that has happened in the past but that we tr... [20:45:10] bd808: heh, I know. Well, I'm sure the day will come... I am currently trying to understand if I can install PHP to work on Windows so I can run phpcbf before uploading my patches and find jerkins-bot complains because of a blank space :P [20:45:15] bd808: k [20:46:26] tabbycat: in theory MediaWiki-Vagrant would give you a place to run those kinds of tests if you can't get a bare metal install to work for you on Windows [20:47:00] bd808: last time I looked at MW-Vagrant documentation I had to take an aspirin [20:47:42] not blaming anyone... 
I'm total noob in that area so it's my fault [20:48:20] the main place that I have seen Windows users tripped up is getting git installed and working. [20:48:44] but if you can already submit gerrit patches that sounds like a solved problem. [20:49:28] git clone --recursive https://gerrit.wikimedia.org/r/mediawiki/vagrant; cd vagrant; setup.bat; vagrant up [20:49:30] Data-Services: Puppetize and setup initial lvms and directory structures for labstore1006|7 - https://phabricator.wikimedia.org/T171539#3619347 (madhuvishy) Steps to set up lvms ``` # Unmount existing data LV sudo umount /dev/labstore1006-vg/data # Delete the data LV sudo lvremove labstore1006-vg/data # Rem... [20:49:36] at least in a perfect world :) [20:55:05] bd808: thank you, I'll try that for when I feel like giving it again a try [20:57:13] (CR) Lokal Profil: "I think this script is actually supposed to work for sparql. BUT we should be respecting the skip and skip_wd parameters. to ensure nl_wd " [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378982 (owner: Jean-Frédéric) [20:57:28] (CR) Lokal Profil: "... and processed" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378982 (owner: Jean-Frédéric) [22:04:47] (PS1) Lokal Profil: Harvest the source page of unknown fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379126 (https://phabricator.wikimedia.org/T117330) [22:08:33] (CR) Lokal Profil: "This is of course an alternative implementation of https://gerrit.wikimedia.org/r/#/c/378981/ which combined some of the code and ideas I'" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379126 (https://phabricator.wikimedia.org/T117330) (owner: Lokal Profil) [22:13:19] cloud-services-team (Kanban), Patch-For-Review: Replace kernel and reboot labvirt1015, 1016, 1017, 1018 - https://phabricator.wikimedia.org/T176044#3619725 (Andrew) I've moved labvirt1018 to 4.4.0-83 but can't reproduce this issue. 
[22:17:05] (CR) Lokal Profil: "I added an alternate implementation in https://gerrit.wikimedia.org/r/#/c/379126" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378981 (https://phabricator.wikimedia.org/T117330) (owner: Jean-Frédéric) [22:34:07] PROBLEM - Puppet errors on tools-exec-1439 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:39:15] madhuvishy: nfs on tools-exec-1439 is pretty out to lunch. It's working but multiple seconds for an ls command. Pointers on what I should look for? [22:52:27] (PS2) Lokal Profil: Add mechanism for storing wikipage locally instead of writing to wiki [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378800 (https://phabricator.wikimedia.org/T174614) [22:52:43] (PS1) Lokal Profil: Make all erfgoedbot scripts respect the skipping mechanisms. [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379140 [22:53:42] (CR) Lokal Profil: "Alternative solution at https://gerrit.wikimedia.org/r/#/c/379140/" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378982 (owner: Jean-Frédéric) [23:03:11] (CR) jerkins-bot: [V: -1] Make all erfgoedbot scripts respect the skipping mechanisms. 
[labs/tools/heritage] - https://gerrit.wikimedia.org/r/379140 (owner: Lokal Profil) [23:04:07] RECOVERY - Puppet errors on tools-exec-1439 is OK: OK: Less than 1.00% above the threshold [0.0] [23:29:09] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [23:33:07] (PS1) Lokal Profil: [WIP]Group unused images per source page [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379141 (https://phabricator.wikimedia.org/T117327) [23:35:38] (CR) jerkins-bot: [V: -1] [WIP]Group unused images per source page [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379141 (https://phabricator.wikimedia.org/T117327) (owner: Lokal Profil) [23:36:00] (PS114) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 [23:45:02] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [23:47:52] (CR) Ricordisamoa: [C: -2] "PS114 enables the ESLint valid-jsdoc rule and fixes errors" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)