[00:00:29] cloud-services-team (FY2017-18), Goal: Program 4 Outcome 1: improve documentation - https://phabricator.wikimedia.org/T166401#3455001 (bd808) [00:00:31] cloud-services-team (FY2017-18), Research, Epic, User-bd808: [FY17-18] Program 4: Technical community building - https://phabricator.wikimedia.org/T171120#3454674 (bd808) [00:01:59] cloud-services-team (FY2017-18), Research, Goal: [FY17-18] Program 4: Technical community building - https://phabricator.wikimedia.org/T171120#3455005 (bd808) p:High>Normal a:bd808>None [00:05:10] Heh. /That/ is also new. :-P [00:05:36] I remember reading about the rename some time ago. And chuckling. "Cloud". :-) [00:05:50] it's all fancy :) [00:06:15] new paint and carpet, but the same old place -- https://phabricator.wikimedia.org/phame/post/view/59/labs_and_tool_labs_being_renamed/ [00:07:08] Coren: so what instances are you trying to hit? We had a big ldap meltdown and you may be getting messed up by that still [00:09:06] I was simply trying to hit a shell on the tools' bastion [00:09:41] ok. are you going in directly or via another bastion? [00:10:03] Directly. [00:10:31] Lemme try a general bastion, see if that works. [00:10:56] Nope. Key not liked. [00:11:59] yeah I see "Failed publickey for marc" and the fingerprint SHA256:9PEFM86VzgkjcsxMskl0/tdkNi1hgmAMPy6Kg1PdBm8 [00:12:34] so if you go to https://toolsadmin.wikimedia.org/profile/settings/ssh-keys it will show the fingerprints of the keys you have in ldap [00:12:37] * Coren waves at andrewbogott. [00:12:49] ... except I can't log in /there/ either. [00:12:54] doh! [00:13:19] want a password reset? We can do a hangout to verify id real quick? [00:13:33] But that may be a case of forgotten password. I didn't use a password auth anywhere near the wmf in ages. :-) Yeah, hangout works. [00:13:42] Lemme go grab my tablet.
[00:14:43] PM'd you a hangout link [00:22:03] * andrewbogott is here but plumbing and, hence, angry [00:41:58] andrewbogott: do you have a minute to help with an ldap password reset? [00:42:00] andrewbogott: Ping when you have a minute [00:42:04] my skillz are failing [00:49:10] Tool-Labs-tools-XTools: Page titles in URLs are being HTML-encoded and not URL-encoded - https://phabricator.wikimedia.org/T171133#3455069 (MusikAnimal) [00:49:24] Tool-Labs-tools-XTools, Community-Tech-Sprint: Page titles in URLs are being HTML-encoded and not URL-encoded - https://phabricator.wikimedia.org/T171133#3455081 (MusikAnimal) [00:49:37] Tool-Labs-tools-XTools, Community-Tech-Sprint: Page titles in URLs are being HTML-encoded and not URL-encoded - https://phabricator.wikimedia.org/T171133#3455069 (MusikAnimal) PR https://github.com/x-tools/xtools/pull/55 [00:49:49] Tool-Labs-tools-XTools, Community-Tech-Sprint: Page titles in URLs are being HTML-encoded and not URL-encoded - https://phabricator.wikimedia.org/T171133#3455098 (MusikAnimal) [00:50:25] Tool-Labs-tools-XTools, Community-Tech-Sprint: Page titles in URLs are being HTML-encoded and not URL-encoded - https://phabricator.wikimedia.org/T171133#3455069 (MusikAnimal) [00:50:35] Tool-Labs-tools-XTools, Community-Tech-Sprint: Page titles in URLs are being HTML-encoded and not URL-encoded - https://phabricator.wikimedia.org/T171133#3455069 (MusikAnimal) [00:56:15] PROBLEM - Puppet errors on tools-exec-1408 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [00:56:22] Tool-Labs-tools-XTools, Community-Tech-Sprint: AutomatedEditsHelper isn't using a project URL when referencing the automated tools list - https://phabricator.wikimedia.org/T171135#3455112 (MusikAnimal) [00:56:43] Tool-Labs-tools-XTools, Community-Tech-Sprint: AutomatedEditsHelper isn't using a project URL when referencing the automated tools list - https://phabricator.wikimedia.org/T171135#3455112
(MusikAnimal) [01:00:43] Tool-Labs-tools-XTools, Community-Tech, User-Matthewrbowker: Rewrite XTools: Edit Summaries - https://phabricator.wikimedia.org/T170905#3455134 (Matthewrbowker) [01:00:46] Tool-Labs-tools-Matthewrbowker's-tools, User-Matthewrbowker: Create a bash script or a script that can be run from the cron daily to download new i18n messages - https://phabricator.wikimedia.org/T170859#3455136 (Matthewrbowker) [01:00:48] Tool-Labs-tools-XTools, Documentation, User-Matthewrbowker: Document algorithm for AdminScore - https://phabricator.wikimedia.org/T170892#3455135 (Matthewrbowker) [01:00:50] Tool-Labs-tools-XTools, User-Matthewrbowker: Allow use of the profiler on xtools-dev.wmflabs.org - https://phabricator.wikimedia.org/T170615#3455138 (Matthewrbowker) [01:00:53] Tool-Labs-tools-XTools, Community-Tech, Epic, User-Matthewrbowker: Epic: Rewrite Xtools: RfX Vote Calculator - https://phabricator.wikimedia.org/T165710#3455137 (Matthewrbowker) [01:00:57] Tool-Labs-tools-XTools, Community-Tech, Epic, User-Matthewrbowker: Epic: Rewrite Xtools: RfX Analysis - https://phabricator.wikimedia.org/T165709#3455139 (Matthewrbowker) [01:01:01] Tool-Labs-tools-XTools, Community-Tech-Sprint, User-Matthewrbowker: Convert xtools intuition to its own repository - https://phabricator.wikimedia.org/T165708#3455140 (Matthewrbowker) [01:02:03] Sorry [01:16:31] bd808, Coren, as far as I know an ldap password reset == a wikitech password reset [01:17:27] andrewbogott: Tested and not working. [01:17:39] andrewbogott: The hashed password did not change. [01:19:30] I was able to log onto wikitech with a temp password (bypasses auth); from there a password reset did not actually work - logging off and back on didn't pick up on the new password and the LDAP hash didn't change. [01:24:16] Coren: I fscked up the novaadmin password which would explain the lack of wikitech's reset working :/ [01:24:42] Ah, so you _did_ change it.
:-) I hope you remember what you set it to? [01:26:31] In unrelated news, I now have an Oculus Touch. Much fun ensued. :-) [01:29:15] Coren, bd808, I think that novaadmin is back to normal... [01:29:24] So /now/ I would expect a wikitech password change to work [01:29:29] * Coren attempts it. [01:29:38] …maybe [01:30:35] striker is happy again too [01:30:51] I'll write this up after I eat :/ [01:31:01] We use that account for a lot! Although, I would've thought, not for openstack-browser... [01:31:11] Oh, I guess it wasn't us talking to openstack it was openstack talking to ldap that was busted [01:31:15] yeah [01:31:21] It worked on Wikitech, at least. [01:31:35] Coren: that's a good sign... [01:31:42] Indeed. [01:31:49] Trying on the newfangled thing. [01:32:15] I missed the beginning of this, is this about you trying to get on striker? [01:32:29] Tool-Labs-tools-XTools, Community-Tech-Sprint: XTools prod environment: unknown "profiler_dump" function - https://phabricator.wikimedia.org/T170233#3455174 (Samwilson) a:Samwilson PR https://github.com/x-tools/xtools/pull/56 [01:32:30] Got into the console. [01:33:01] It's actually quite pretty. [01:33:21] I get a 500 when I try to go see my ssh keys. [01:33:30] Which explains why /that/ doesn't work, I guess. [01:33:52] Request ID a3e7ac343d304386843c150d8f985aed [01:36:16] RECOVERY - Puppet errors on tools-exec-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [01:36:56] Coren: I hate to send you in circles, but the key interface on https://wikitech.wikimedia.org/wiki/Special:Preferences#mw-prefsection-openstack should also still work... [01:37:00] does that error out as well? [01:37:50] No; but it shows the keys I expected. Which means I should be able to ssh into bastions. Which I cannot. [01:38:02] So there is something smelly regardless. [01:38:24] I shall attempt to delete and readd them. [01:38:31] this is dumb but can you try...
[01:38:35] that's what I was going to say :) [01:39:20] Well, some sort of desync seemed most likely. :-) [01:39:38] Still don't show up on toolsadmin though. [01:39:56] Nor can I log in using it. [01:40:12] and the key still isn't in ldap [01:40:16] (toolsadmin ssh thing still 500s) [01:41:02] How is wikitech seeing it then? [01:41:13] I mean, it pulls from LDAP doesn't it? [01:41:24] Or is it /also/ in the DB? [01:43:02] I've more-or-less never touched the key-handling bits… looking at the wikitech source now [01:43:30] Where's Ryan when we need him? :-P [01:45:24] sure looks to me like it's getting it from ldap [01:46:02] ok, it's in ldap... [01:46:15] maybe I missed it before or maybe there was a delay after you turned it off and on again... [01:46:22] ldap says... [01:46:29] https://www.irccloud.com/pastebin/uL3K5C8u/ [01:46:37] does that look familiar? [01:46:59] Still cannot log in or get a non-500 on labsconsole. [01:47:05] Yep, that's the right public key [01:47:27] does ssh restricted.bastion.wmflabs.org work by chance? [01:47:49] Permission denied (publickey). [01:48:03] and you've already been trying ssh bastion.wmflabs.org right? [01:48:54] Yes, and login.tools.wmflabs.org [01:50:52] FWIW, I think my last login is ~ 1m ago so it worked not all that far back. [01:51:01] 1month* [01:52:21] Is there any chance that dss keys have been disabled since then? I know they're being phased out in upcoming ssh releases [01:52:38] I'm looking for logs or emails about same... [01:53:00] dss doesn't work on stretch, but should still work on jessie/trusty I think [01:53:07] They still work fine in Xenial. Labs is mostly Trusty? [01:53:22] I think the boxes you're hitting are jessie [01:53:29] I do have a task to deprecate them in Cloud VPS and Toolforge [01:54:03] Lemme push my RSA key too then, if only to eliminate the possibility. [01:54:31] bd808: are you able to see why he's getting a 500 on striker? [01:55:17] ah-HA!
[01:55:19] ERROR: Failed to parse "b'ssh-dss' key data can not be longer than 1024 bits (was 2047)" [01:55:42] rsa key worked. [01:55:55] so maybe we disabled dss and didn't know it :( [01:56:00] But also, what? key length over 1024 is perfectly legit! [01:56:08] * Coren removes that key. [01:56:26] And, there we go. [01:56:34] probably something in the python lib I'm using to validate keys [01:56:37] No more 500 on toolsadmin. [01:57:19] And I can log in. [01:57:29] So yeah. Something updated behind your back and made DSS keys broke. [01:57:41] You may want to make that known, just in case. :-) [01:57:51] Thanks guys. [01:57:53] o/ [01:58:43] Cloud-VPS, cloud-services-team: dss keys disabled prematurely - https://phabricator.wikimedia.org/T171136#3455186 (Andrew) [01:58:46] * andrewbogott creates https://phabricator.wikimedia.org/T171136 for pondering purposes [01:59:08] Coren: yeah, seems like something we should understand and announce :) [02:00:12] Also, surprise semantic changes are always... iffy. :-) [02:01:51] Cloud-VPS, cloud-services-team: dss keys disabled prematurely - https://phabricator.wikimedia.org/T171136#3455201 (bd808) [02:01:53] Cloud-VPS, Toolforge, cloud-services-team (Kanban): Deprecate DSA (ssh-dss) SSH keys for Labs users - https://phabricator.wikimedia.org/T168433#3455200 (bd808) [02:02:14] PROBLEM - Puppet errors on tools-exec-1408 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [02:02:20] andrewbogott: did we patch up to openssh 7.x maybe? [02:02:33] not as far as I could tell [02:05:29] * Coren lols. [02:05:49] I read up on *why* DSA keys are being deprecated and the reason is just downright daft. [02:06:33] I see 6.7 on one bastion and 6.9 on another [02:07:31] The short of it, forcing DSA keys to be only 1024 bits is a misreading of FIPS 186; and 1024 bit keys are hypothetically a little weak. Openssh refused to generate longer keys, and because its keys are limited they deemed them "weak".
One commenter summarizes: [02:07:39] "Another way of seeing the very same sequence of decisions is that OpenSSH developers blundered badly at some point because of some poor reading of FIPS 186, and then sought to cover it in the equivalent of dumping at sea the corpse of the inconvenient husband." [02:10:20] hah! [02:13:27] PROBLEM - Puppet errors on tools-bastion-03 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [02:14:21] PROBLEM - Puppet errors on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0] [02:16:26] PROBLEM - Puppet errors on tools-exec-1419 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [02:17:08] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1417 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [02:17:54] PROBLEM - Puppet errors on tools-exec-1429 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [02:18:24] PROBLEM - Puppet errors on tools-worker-1022 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [02:19:04] PROBLEM - Puppet errors on tools-elastic-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [02:19:08] PROBLEM - Puppet errors on tools-worker-1005 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [02:19:30] PROBLEM - Puppet errors on tools-k8s-master-01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [02:20:29] PROBLEM - Puppet errors on tools-proxy-01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [02:21:00] PROBLEM - Puppet errors on tools-exec-1430 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [02:21:04] PROBLEM - Puppet errors on tools-worker-1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [02:21:16] PROBLEM - Puppet errors on tools-exec-1402 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold 
[0.0] [02:21:20] grrr puppet fail storm [02:21:22] PROBLEM - Puppet errors on tools-worker-1002 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [02:21:36] PROBLEM - Puppet errors on tools-exec-1413 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [02:21:44] PROBLEM - Puppet errors on tools-flannel-etcd-02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [02:21:50] PROBLEM - Puppet errors on tools-exec-1425 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [02:22:08] PROBLEM - Puppet errors on tools-worker-1004 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [02:22:59] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1404 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [02:23:03] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1410 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [02:23:28] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Failed when searching for node tools-elastic-01.tools.eqiad.wmflabs: Failed to find tools-elastic-01.tools.eqiad.wmflabs via exec: Execution of '/usr/local/bin/puppet-enc tools-elastic-01.tools.eqiad.wmflabs' returned 1: [02:23:38] but a manual run right after is fine?
[02:23:39] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1424 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [02:24:15] PROBLEM - Puppet errors on tools-worker-1018 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [02:24:35] PROBLEM - Puppet errors on tools-prometheus-01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [02:25:19] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [02:25:52] PROBLEM - Puppet errors on tools-worker-1023 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [02:25:54] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1407 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [02:26:22] PROBLEM - Puppet errors on tools-worker-1021 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [02:26:34] It looks like maybe the puppetmaster had a hiccup? [02:26:44] manual runs are working fine [02:26:48] PROBLEM - Puppet errors on tools-exec-1422 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [02:26:56] PROBLEM - Puppet errors on tools-exec-1420 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [02:27:53] but the servers I've checked all seemed to have that "failed to find via exec" message on their last cron-based run [02:28:06] PROBLEM - Puppet errors on tools-exec-1409 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [02:28:10] PROBLEM - Puppet errors on tools-exec-1431 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [02:28:45] PROBLEM - Puppet errors on tools-proxy-02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [02:29:28] PROBLEM - Puppet errors on tools-exec-1434 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [02:29:54] PROBLEM - Puppet errors on tools-exec-1403 is CRITICAL: CRITICAL: 40.00% of 
data above the critical threshold [0.0] [02:30:24] PROBLEM - Puppet errors on tools-worker-1017 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [02:30:30] PROBLEM - Puppet errors on tools-webgrid-generic-1404 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [02:30:56] PROBLEM - Puppet errors on tools-worker-1009 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [02:31:08] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1418 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [02:32:12] PROBLEM - Puppet errors on tools-static-11 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [02:33:03] PROBLEM - Puppet errors on tools-checker-02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [02:33:11] PROBLEM - Puppet errors on tools-k8s-etcd-02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [02:34:05] RECOVERY - Puppet errors on tools-elastic-01 is OK: OK: Less than 1.00% above the threshold [0.0] [02:34:59] PROBLEM - Puppet errors on tools-prometheus-02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [02:35:23] PROBLEM - Puppet errors on tools-exec-1414 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [02:35:33] PROBLEM - Puppet errors on tools-worker-1020 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [02:35:59] PROBLEM - Puppet errors on tools-flannel-etcd-03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [02:36:17] PROBLEM - Puppet errors on tools-exec-1438 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [02:36:23] PROBLEM - Puppet errors on tools-exec-1401 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [02:36:25] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1419 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [02:36:29] PROBLEM 
- Puppet errors on tools-exec-1442 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [02:36:33] PROBLEM - Puppet errors on tools-checker-01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [02:36:51] PROBLEM - Puppet errors on tools-worker-1025 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [02:37:03] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1427 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [02:37:08] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1411 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [02:37:56] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [02:39:12] PROBLEM - Puppet errors on tools-worker-1011 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [02:39:28] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [02:39:42] PROBLEM - Puppet errors on tools-exec-1405 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [02:40:55] PROBLEM - Puppet errors on tools-exec-1424 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [02:41:09] PROBLEM - Puppet errors on tools-exec-1435 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [02:44:41] PROBLEM - Puppet errors on tools-docker-builder-05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [02:45:07] PROBLEM - Puppet errors on tools-static-10 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [02:46:46] PROBLEM - Puppet errors on tools-worker-1003 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [02:47:08] PROBLEM - Puppet errors on tools-redis-1002 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0] [02:53:25] RECOVERY - Puppet errors on tools-bastion-03 is OK: OK: 
Less than 1.00% above the threshold [0.0] [02:54:09] RECOVERY - Puppet errors on tools-worker-1005 is OK: OK: Less than 1.00% above the threshold [0.0] [02:54:21] RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [02:56:01] RECOVERY - Puppet errors on tools-exec-1430 is OK: OK: Less than 1.00% above the threshold [0.0] [02:56:03] RECOVERY - Puppet errors on tools-worker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [02:56:17] RECOVERY - Puppet errors on tools-exec-1402 is OK: OK: Less than 1.00% above the threshold [0.0] [02:56:22] RECOVERY - Puppet errors on tools-worker-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [02:56:24] RECOVERY - Puppet errors on tools-exec-1419 is OK: OK: Less than 1.00% above the threshold [0.0] [02:56:44] RECOVERY - Puppet errors on tools-flannel-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0] [02:56:52] RECOVERY - Puppet errors on tools-exec-1425 is OK: OK: Less than 1.00% above the threshold [0.0] [02:57:08] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1417 is OK: OK: Less than 1.00% above the threshold [0.0] [02:57:08] RECOVERY - Puppet errors on tools-worker-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [02:57:54] RECOVERY - Puppet errors on tools-exec-1429 is OK: OK: Less than 1.00% above the threshold [0.0] [02:58:26] RECOVERY - Puppet errors on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0] [02:59:30] RECOVERY - Puppet errors on tools-k8s-master-01 is OK: OK: Less than 1.00% above the threshold [0.0] [02:59:37] RECOVERY - Puppet errors on tools-prometheus-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:00:20] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0] [03:00:28] RECOVERY - Puppet errors on tools-proxy-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:00:50] RECOVERY - Puppet errors on tools-worker-1023 is OK: OK: Less 
than 1.00% above the threshold [0.0] [03:01:20] RECOVERY - Puppet errors on tools-worker-1021 is OK: OK: Less than 1.00% above the threshold [0.0] [03:01:38] RECOVERY - Puppet errors on tools-exec-1413 is OK: OK: Less than 1.00% above the threshold [0.0] [03:03:03] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1410 is OK: OK: Less than 1.00% above the threshold [0.0] [03:03:41] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1424 is OK: OK: Less than 1.00% above the threshold [0.0] [03:03:45] RECOVERY - Puppet errors on tools-proxy-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:04:15] RECOVERY - Puppet errors on tools-worker-1018 is OK: OK: Less than 1.00% above the threshold [0.0] [03:05:53] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1407 is OK: OK: Less than 1.00% above the threshold [0.0] [03:06:06] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1418 is OK: OK: Less than 1.00% above the threshold [0.0] [03:06:48] RECOVERY - Puppet errors on tools-exec-1422 is OK: OK: Less than 1.00% above the threshold [0.0] [03:06:54] RECOVERY - Puppet errors on tools-exec-1420 is OK: OK: Less than 1.00% above the threshold [0.0] [03:07:08] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1420 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:07:10] RECOVERY - Puppet errors on tools-static-11 is OK: OK: Less than 1.00% above the threshold [0.0] [03:07:12] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1426 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [03:07:14] RECOVERY - Puppet errors on tools-exec-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [03:08:08] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1415 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:08:08] RECOVERY - Puppet errors on tools-exec-1409 is OK: OK: Less than 1.00% above the threshold [0.0] [03:08:08] RECOVERY - Puppet errors on tools-exec-1431 is OK: OK: Less than 1.00% above 
the threshold [0.0] [03:08:10] PROBLEM - Puppet errors on tools-bastion-02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:08:49] PROBLEM - Puppet errors on tools-puppetmaster-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:08:51] PROBLEM - Puppet errors on tools-worker-1012 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [03:08:55] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1421 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:09:22] PROBLEM - Puppet errors on tools-paws-worker-1001 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [03:09:24] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:09:28] RECOVERY - Puppet errors on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0] [03:09:56] RECOVERY - Puppet errors on tools-exec-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [03:10:10] PROBLEM - Puppet errors on tools-exec-1427 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:10:22] RECOVERY - Puppet errors on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0] [03:10:23] PROBLEM - Puppet errors on tools-paws-master-01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [03:10:29] RECOVERY - Puppet errors on tools-webgrid-generic-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [03:10:37] PROBLEM - Puppet errors on tools-logs-02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [03:10:52] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1428 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:10:54] RECOVERY - Puppet errors on tools-worker-1009 is OK: OK: Less than 1.00% above the threshold [0.0] [03:10:58] RECOVERY - Puppet errors on tools-flannel-etcd-03 is OK: OK: Less than 1.00% above the 
threshold [0.0] [03:11:02] PROBLEM - Puppet errors on tools-exec-1423 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:11:10] PROBLEM - Puppet errors on tools-exec-1410 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:11:24] PROBLEM - Puppet errors on tools-docker-registry-01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:11:34] RECOVERY - Puppet errors on tools-checker-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:11:52] PROBLEM - Puppet errors on tools-worker-1007 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:12:01] PROBLEM - Puppet errors on tools-services-01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [03:12:03] PROBLEM - Puppet errors on tools-exec-1418 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [03:12:03] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1414 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [03:12:09] PROBLEM - Puppet errors on tools-grid-master is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:12:17] PROBLEM - Puppet errors on tools-grid-shadow is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:12:51] PROBLEM - Puppet errors on tools-cron-01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [03:13:05] RECOVERY - Puppet errors on tools-checker-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:13:07] PROBLEM - Puppet errors on tools-exec-1415 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:13:11] PROBLEM - Puppet errors on tools-puppetmaster-02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:13:13] PROBLEM - Puppet errors on tools-exec-gift-trusty-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:13:14] PROBLEM - Puppet errors on tools-exec-1432 is CRITICAL: 
CRITICAL: 44.44% of data above the critical threshold [0.0] [03:13:18] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1405 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:13:30] PROBLEM - Puppet errors on tools-exec-1426 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:13:38] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1401 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:13:52] PROBLEM - Puppet errors on tools-exec-1421 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:14:26] PROBLEM - Puppet errors on tools-bastion-03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [03:14:44] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:14:44] PROBLEM - Puppet errors on tools-exec-1416 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:14:44] RECOVERY - Puppet errors on tools-exec-1405 is OK: OK: Less than 1.00% above the threshold [0.0] [03:15:23] PROBLEM - Puppet errors on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [03:16:23] RECOVERY - Puppet errors on tools-exec-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [03:17:02] PROBLEM - Puppet errors on tools-exec-1430 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:17:17] PROBLEM - Puppet errors on tools-exec-1402 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:17:21] PROBLEM - Puppet errors on tools-worker-1002 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [03:17:25] PROBLEM - Puppet errors on tools-exec-1419 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:17:46] PROBLEM - Puppet errors on tools-flannel-etcd-02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:17:52] PROBLEM - Puppet 
errors on tools-exec-1425 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:18:08] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1417 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:18:56] PROBLEM - Puppet errors on tools-exec-1429 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [03:19:00] PROBLEM - Puppet errors on tools-exec-1437 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:19:25] PROBLEM - Puppet errors on tools-worker-1022 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:19:33] PROBLEM - Puppet errors on tools-docker-registry-02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [03:19:43] PROBLEM - Puppet errors on tools-worker-1008 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [03:20:05] PROBLEM - Puppet errors on tools-elastic-01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:20:09] PROBLEM - Puppet errors on tools-worker-1005 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:20:11] PROBLEM - Puppet errors on tools-worker-1027 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:20:32] PROBLEM - Puppet errors on tools-k8s-master-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [03:20:36] PROBLEM - Puppet errors on tools-elastic-03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [03:21:29] PROBLEM - Puppet errors on tools-proxy-01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:22:03] PROBLEM - Puppet errors on tools-worker-1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:22:19] PROBLEM - Puppet errors on tools-worker-1021 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:22:37] PROBLEM - Puppet errors on tools-exec-1413 is CRITICAL: CRITICAL: 50.00% 
of data above the critical threshold [0.0] [03:23:10] PROBLEM - Puppet errors on tools-worker-1004 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:23:26] PROBLEM - Puppet errors on tools-exec-1433 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:23:56] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1404 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:24:04] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1410 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:24:40] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1424 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:24:48] PROBLEM - Puppet errors on tools-worker-1006 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:25:16] PROBLEM - Puppet errors on tools-webgrid-generic-1402 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:25:17] PROBLEM - Puppet errors on tools-worker-1018 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:25:22] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1425 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:25:25] PROBLEM - Puppet errors on tools-flannel-etcd-01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [03:25:39] PROBLEM - Puppet errors on tools-prometheus-01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:25:39] PROBLEM - Puppet errors on tools-exec-1439 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:25:47] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1403 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:25:57] PROBLEM - Puppet errors on tools-bastion-05 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [03:26:19] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 66.67% of 
data above the critical threshold [0.0] [03:26:25] PROBLEM - Puppet errors on tools-exec-1411 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:26:50] PROBLEM - Puppet errors on tools-worker-1023 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:26:52] PROBLEM - Puppet errors on tools-mail is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [03:26:52] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1407 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [03:27:51] PROBLEM - Puppet errors on tools-exec-1422 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [03:27:55] PROBLEM - Puppet errors on tools-exec-1420 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:28:11] PROBLEM - Puppet errors on tools-package-builder-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:28:13] PROBLEM - Puppet errors on tools-exec-1408 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:28:45] PROBLEM - Puppet errors on tools-elastic-02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [03:28:55] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1408 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [03:28:55] PROBLEM - Puppet errors on tools-worker-1013 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:29:07] PROBLEM - Puppet errors on tools-exec-1409 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:29:45] PROBLEM - Puppet errors on tools-proxy-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:30:28] PROBLEM - Puppet errors on tools-exec-1434 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:30:34] PROBLEM - Puppet errors on tools-worker-1010 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [03:30:42] PROBLEM - 
Puppet errors on tools-redis-1001 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [03:30:52] PROBLEM - Puppet errors on tools-exec-1403 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:31:02] PROBLEM - Puppet errors on tools-webgrid-generic-1403 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:31:28] PROBLEM - Puppet errors on tools-webgrid-generic-1404 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [03:31:33] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1412 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:31:55] PROBLEM - Puppet errors on tools-worker-1009 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [03:32:07] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1418 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:33:12] PROBLEM - Puppet errors on tools-static-11 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:33:34] PROBLEM - Puppet errors on tools-exec-1404 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:33:42] PROBLEM - Puppet errors on tools-exec-1441 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [03:34:33] PROBLEM - Puppet errors on tools-exec-1417 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:34:33] PROBLEM - Puppet errors on tools-worker-1015 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [03:34:53] PROBLEM - Puppet errors on tools-k8s-etcd-03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:37:54] !log tools Restarted apache on tools-puppetmaster-01 [03:37:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [03:40:34] RECOVERY - Puppet errors on tools-worker-1020 is OK: OK: Less than 1.00% above the threshold [0.0] [03:43:11] RECOVERY - Puppet errors on 
tools-k8s-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:44:21] RECOVERY - Puppet errors on tools-paws-worker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [03:44:29] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1416 is OK: OK: Less than 1.00% above the threshold [0.0] [03:44:47] !log tools tools-puppetmaster-01:~# service nslcd restart && service nscd restart [03:44:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [03:44:59] RECOVERY - Puppet errors on tools-prometheus-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:45:01] !log tools tools-puppetmaster-01:~# service nslcd restart && service nscd restart [03:45:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [03:45:10] RECOVERY - Puppet errors on tools-exec-1427 is OK: OK: Less than 1.00% above the threshold [0.0] [03:45:20] RECOVERY - Puppet errors on tools-exec-1414 is OK: OK: Less than 1.00% above the threshold [0.0] [03:46:02] RECOVERY - Puppet errors on tools-exec-1423 is OK: OK: Less than 1.00% above the threshold [0.0] [03:46:14] RECOVERY - Puppet errors on tools-exec-1438 is OK: OK: Less than 1.00% above the threshold [0.0] [03:46:24] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1419 is OK: OK: Less than 1.00% above the threshold [0.0] [03:46:28] RECOVERY - Puppet errors on tools-docker-registry-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:46:30] RECOVERY - Puppet errors on tools-exec-1442 is OK: OK: Less than 1.00% above the threshold [0.0] [03:46:52] RECOVERY - Puppet errors on tools-worker-1025 is OK: OK: Less than 1.00% above the threshold [0.0] [03:47:06] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1427 is OK: OK: Less than 1.00% above the threshold [0.0] [03:47:06] RECOVERY - Puppet errors on tools-grid-master is OK: OK: Less than 1.00% above the threshold [0.0] [03:47:08] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1420 is OK: OK: Less than 
1.00% above the threshold [0.0] [03:47:08] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1411 is OK: OK: Less than 1.00% above the threshold [0.0] [03:47:09] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1426 is OK: OK: Less than 1.00% above the threshold [0.0] [03:47:17] RECOVERY - Puppet errors on tools-grid-shadow is OK: OK: Less than 1.00% above the threshold [0.0] [03:48:07] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1415 is OK: OK: Less than 1.00% above the threshold [0.0] [03:48:11] RECOVERY - Puppet errors on tools-bastion-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:48:11] RECOVERY - Puppet errors on tools-puppetmaster-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:48:13] RECOVERY - Puppet errors on tools-exec-1432 is OK: OK: Less than 1.00% above the threshold [0.0] [03:48:38] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [03:48:51] RECOVERY - Puppet errors on tools-exec-1421 is OK: OK: Less than 1.00% above the threshold [0.0] [03:48:52] RECOVERY - Puppet errors on tools-worker-1012 is OK: OK: Less than 1.00% above the threshold [0.0] [03:48:58] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1421 is OK: OK: Less than 1.00% above the threshold [0.0] [03:49:12] RECOVERY - Puppet errors on tools-worker-1011 is OK: OK: Less than 1.00% above the threshold [0.0] [03:49:22] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1413 is OK: OK: Less than 1.00% above the threshold [0.0] [03:49:44] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0] [03:50:08] RECOVERY - Puppet errors on tools-static-10 is OK: OK: Less than 1.00% above the threshold [0.0] [03:50:20] RECOVERY - Puppet errors on tools-paws-master-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:50:34] RECOVERY - Puppet errors on tools-logs-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:50:50] RECOVERY - Puppet errors on 
tools-webgrid-lighttpd-1428 is OK: OK: Less than 1.00% above the threshold [0.0] [03:50:55] RECOVERY - Puppet errors on tools-exec-1424 is OK: OK: Less than 1.00% above the threshold [0.0] [03:51:11] RECOVERY - Puppet errors on tools-exec-1410 is OK: OK: Less than 1.00% above the threshold [0.0] [03:51:11] RECOVERY - Puppet errors on tools-exec-1435 is OK: OK: Less than 1.00% above the threshold [0.0] [03:51:47] RECOVERY - Puppet errors on tools-worker-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [03:51:53] RECOVERY - Puppet errors on tools-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0] [03:51:59] RECOVERY - Puppet errors on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:52:02] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1414 is OK: OK: Less than 1.00% above the threshold [0.0] [03:52:05] RECOVERY - Puppet errors on tools-redis-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [03:52:27] RECOVERY - Puppet errors on tools-exec-1419 is OK: OK: Less than 1.00% above the threshold [0.0] [03:52:51] RECOVERY - Puppet errors on tools-cron-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:53:08] RECOVERY - Puppet errors on tools-exec-1415 is OK: OK: Less than 1.00% above the threshold [0.0] [03:53:14] RECOVERY - Puppet errors on tools-exec-gift-trusty-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:53:16] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1405 is OK: OK: Less than 1.00% above the threshold [0.0] [03:53:28] RECOVERY - Puppet errors on tools-exec-1426 is OK: OK: Less than 1.00% above the threshold [0.0] [03:54:26] RECOVERY - Puppet errors on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0] [03:54:40] RECOVERY - Puppet errors on tools-docker-builder-05 is OK: OK: Less than 1.00% above the threshold [0.0] [03:54:44] RECOVERY - Puppet errors on tools-exec-1416 is OK: OK: Less than 1.00% above the threshold [0.0] [03:55:06] RECOVERY - Puppet 
errors on tools-elastic-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:55:22] RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [03:55:32] RECOVERY - Puppet errors on tools-k8s-master-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:56:28] RECOVERY - Puppet errors on tools-proxy-01 is OK: OK: Less than 1.00% above the threshold [0.0] [03:57:01] RECOVERY - Puppet errors on tools-exec-1430 is OK: OK: Less than 1.00% above the threshold [0.0] [03:57:05] RECOVERY - Puppet errors on tools-worker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [03:57:17] RECOVERY - Puppet errors on tools-exec-1402 is OK: OK: Less than 1.00% above the threshold [0.0] [03:57:21] RECOVERY - Puppet errors on tools-worker-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [03:57:41] !log tools Restarted cron, nscd, nslcd on tools-cron-01 [03:57:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [03:57:45] RECOVERY - Puppet errors on tools-flannel-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:57:51] RECOVERY - Puppet errors on tools-exec-1425 is OK: OK: Less than 1.00% above the threshold [0.0] [03:57:58] !log tools tools-exec-1428:~# service nslcd restart && service nscd restart [03:58:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [03:58:09] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1417 is OK: OK: Less than 1.00% above the threshold [0.0] [03:58:11] RECOVERY - Puppet errors on tools-worker-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [03:58:25] RECOVERY - Puppet errors on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0] [03:58:55] RECOVERY - Puppet errors on tools-exec-1429 is OK: OK: Less than 1.00% above the threshold [0.0] [03:58:57] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [03:59:01] 
RECOVERY - Puppet errors on tools-exec-1437 is OK: OK: Less than 1.00% above the threshold [0.0] [03:59:25] RECOVERY - Puppet errors on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0] [03:59:33] RECOVERY - Puppet errors on tools-docker-registry-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:59:45] RECOVERY - Puppet errors on tools-worker-1008 is OK: OK: Less than 1.00% above the threshold [0.0] [04:00:32] RECOVERY - Puppet errors on tools-elastic-03 is OK: OK: Less than 1.00% above the threshold [0.0] [04:00:36] RECOVERY - Puppet errors on tools-prometheus-01 is OK: OK: Less than 1.00% above the threshold [0.0] [04:00:46] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [04:00:49] !log tools tools-webgrid-lighttpd-1402:~# service nslcd restart && service nscd restart [04:00:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [04:01:19] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0] [04:01:51] RECOVERY - Puppet errors on tools-worker-1023 is OK: OK: Less than 1.00% above the threshold [0.0] [04:02:22] RECOVERY - Puppet errors on tools-worker-1021 is OK: OK: Less than 1.00% above the threshold [0.0] [04:02:38] RECOVERY - Puppet errors on tools-exec-1413 is OK: OK: Less than 1.00% above the threshold [0.0] [04:03:16] RECOVERY - Puppet errors on tools-exec-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [04:03:54] RECOVERY - Puppet errors on tools-worker-1013 is OK: OK: Less than 1.00% above the threshold [0.0] [04:04:04] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1410 is OK: OK: Less than 1.00% above the threshold [0.0] [04:04:40] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1424 is OK: OK: Less than 1.00% above the threshold [0.0] [04:04:46] RECOVERY - Puppet errors on tools-proxy-02 is OK: OK: Less than 1.00% above the threshold [0.0] [04:04:50] RECOVERY - Puppet 
errors on tools-worker-1006 is OK: OK: Less than 1.00% above the threshold [0.0] [04:05:14] RECOVERY - Puppet errors on tools-worker-1018 is OK: OK: Less than 1.00% above the threshold [0.0] [04:05:17] RECOVERY - Puppet errors on tools-webgrid-generic-1402 is OK: OK: Less than 1.00% above the threshold [0.0] [04:05:23] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1425 is OK: OK: Less than 1.00% above the threshold [0.0] [04:05:27] RECOVERY - Puppet errors on tools-flannel-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0] [04:05:29] RECOVERY - Puppet errors on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0] [04:05:37] RECOVERY - Puppet errors on tools-exec-1439 is OK: OK: Less than 1.00% above the threshold [0.0] [04:05:53] RECOVERY - Puppet errors on tools-exec-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [04:05:57] RECOVERY - Puppet errors on tools-bastion-05 is OK: OK: Less than 1.00% above the threshold [0.0] [04:06:03] RECOVERY - Puppet errors on tools-webgrid-generic-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [04:06:25] RECOVERY - Puppet errors on tools-exec-1411 is OK: OK: Less than 1.00% above the threshold [0.0] [04:06:33] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1412 is OK: OK: Less than 1.00% above the threshold [0.0] [04:06:53] RECOVERY - Puppet errors on tools-mail is OK: OK: Less than 1.00% above the threshold [0.0] [04:06:54] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1407 is OK: OK: Less than 1.00% above the threshold [0.0] [04:07:04] RECOVERY - Puppet errors on tools-exec-1428 is OK: OK: Less than 1.00% above the threshold [0.0] [04:07:08] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1418 is OK: OK: Less than 1.00% above the threshold [0.0] [04:07:48] RECOVERY - Puppet errors on tools-exec-1422 is OK: OK: Less than 1.00% above the threshold [0.0] [04:07:56] RECOVERY - Puppet errors on tools-exec-1420 is OK: OK: Less than 1.00% above the threshold [0.0] 
[04:08:10] RECOVERY - Puppet errors on tools-package-builder-01 is OK: OK: Less than 1.00% above the threshold [0.0] [04:08:12] RECOVERY - Puppet errors on tools-static-11 is OK: OK: Less than 1.00% above the threshold [0.0] [04:08:36] RECOVERY - Puppet errors on tools-exec-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [04:08:44] RECOVERY - Puppet errors on tools-exec-1441 is OK: OK: Less than 1.00% above the threshold [0.0] [04:08:46] RECOVERY - Puppet errors on tools-elastic-02 is OK: OK: Less than 1.00% above the threshold [0.0] [04:08:56] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [04:09:08] RECOVERY - Puppet errors on tools-exec-1409 is OK: OK: Less than 1.00% above the threshold [0.0] [04:09:32] RECOVERY - Puppet errors on tools-worker-1015 is OK: OK: Less than 1.00% above the threshold [0.0] [04:09:33] RECOVERY - Puppet errors on tools-exec-1417 is OK: OK: Less than 1.00% above the threshold [0.0] [04:09:55] RECOVERY - Puppet errors on tools-k8s-etcd-03 is OK: OK: Less than 1.00% above the threshold [0.0] [04:10:35] RECOVERY - Puppet errors on tools-worker-1010 is OK: OK: Less than 1.00% above the threshold [0.0] [04:10:43] RECOVERY - Puppet errors on tools-redis-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [04:11:31] RECOVERY - Puppet errors on tools-webgrid-generic-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [04:11:55] RECOVERY - Puppet errors on tools-worker-1009 is OK: OK: Less than 1.00% above the threshold [0.0] [04:18:48] RECOVERY - Puppet errors on tools-puppetmaster-01 is OK: OK: Less than 1.00% above the threshold [0.0] [04:22:00] RECOVERY - Puppet errors on tools-exec-1418 is OK: OK: Less than 1.00% above the threshold [0.0] [04:25:10] RECOVERY - Puppet errors on tools-worker-1005 is OK: OK: Less than 1.00% above the threshold [0.0] [04:25:12] RECOVERY - Puppet errors on tools-worker-1027 is OK: OK: Less than 1.00% above the threshold [0.0] 
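The PROBLEM/RECOVERY flood above is easier to follow when reduced to "which hosts are still failing". A minimal sketch, assuming the alert wording stays exactly as it appears in this log; the function name `unresolved_hosts` is hypothetical, and the sample lines are copied from the alerts above:

```python
import re

# Matches one alert as it appears in this log, for example:
# "[03:20:05] PROBLEM - Puppet errors on tools-elastic-01 is CRITICAL: ..."
ALERT = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\] (PROBLEM|RECOVERY) - Puppet errors on (\S+) is"
)


def unresolved_hosts(log_text):
    """Return hosts whose most recent alert is PROBLEM, sorted by name."""
    state = {}
    for _time, kind, host in ALERT.findall(log_text):
        state[host] = kind  # a later alert overwrites an earlier one
    return sorted(host for host, kind in state.items() if kind == "PROBLEM")


sample = (
    "[03:20:05] PROBLEM - Puppet errors on tools-elastic-01 is CRITICAL: "
    "CRITICAL: 55.56% of data above the critical threshold [0.0] "
    "[03:20:09] PROBLEM - Puppet errors on tools-worker-1005 is CRITICAL: "
    "CRITICAL: 66.67% of data above the critical threshold [0.0] "
    "[03:55:06] RECOVERY - Puppet errors on tools-elastic-01 is OK: "
    "OK: Less than 1.00% above the threshold [0.0]"
)
print(unresolved_hosts(sample))  # ['tools-worker-1005']
```

Because the log wraps several messages onto one physical line, the sketch scans the whole text with `findall` rather than splitting on newlines.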
[04:37:14] RECOVERY - Puppet errors on tools-worker-1014 is OK: OK: Less than 1.00% above the threshold [0.0] [04:37:24] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1406 is OK: OK: Less than 1.00% above the threshold [0.0] [04:37:46] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1402 is OK: OK: Less than 1.00% above the threshold [0.0] [04:38:32] RECOVERY - Puppet errors on tools-worker-1016 is OK: OK: Less than 1.00% above the threshold [0.0] [04:38:38] RECOVERY - Puppet errors on tools-exec-1406 is OK: OK: Less than 1.00% above the threshold [0.0] [04:38:47] Tool-Labs-tools-XTools, Community-Tech-Sprint: Page titles in URLs are being HTML-encoded and not URL-encoded - https://phabricator.wikimedia.org/T171133#3455322 (Samwilson) Open>Resolved a:Samwilson Merged. [04:38:54] RECOVERY - Puppet errors on tools-exec-1436 is OK: OK: Less than 1.00% above the threshold [0.0] [06:01:54] PROBLEM - Puppet errors on tools-worker-1019 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [06:33:09] (PS1) Giuseppe Lavagetto: Add my new pubkey [labs/private] - https://gerrit.wikimedia.org/r/366516 [06:33:35] (CR) Giuseppe Lavagetto: [V: 2 C: 2] Add my new pubkey [labs/private] - https://gerrit.wikimedia.org/r/366516 (owner: Giuseppe Lavagetto) [06:41:55] RECOVERY - Puppet errors on tools-worker-1019 is OK: OK: Less than 1.00% above the threshold [0.0] [07:25:42] Cloud-Services, wikitech.wikimedia.org: Cannot login/change password to MABot@wikitech - https://phabricator.wikimedia.org/T171069#3455449 (MarcoAurelio) @Andrew If that will let me use the shell instance name `mabot` then sure. However if that (the shell name) depends on LDAP then, to avoid breaking thi... 
[07:35:51] PROBLEM - Puppet errors on tools-k8s-etcd-03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [07:45:13] Cloud-Services, DBA, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3455456 (Marostegui) s2 has been imported to labsdb1009 and labsdb1010. I will start with labsdb1011 in a bit. The reason I don't do all the h... [08:10:52] RECOVERY - Puppet errors on tools-k8s-etcd-03 is OK: OK: Less than 1.00% above the threshold [0.0] [08:19:03] wikitech.wikimedia.org, Wikidata: Allow accessing Wikidata data in Wikitech - https://phabricator.wikimedia.org/T171143#3455522 (Peachey88) [08:23:05] Tool-Labs-tools-XTools: "Top Edited Pages" missing from new XTools - https://phabricator.wikimedia.org/T171150#3455528 (Matthewrbowker) [08:44:10] Cloud-VPS, Release-Engineering-Team (Kanban): Labs Jessie images come with puppet 3.7.2, should be 3.8.5 - https://phabricator.wikimedia.org/T168511#3455552 (hashar) Open>Resolved a:hashar I have booted a Jessie instance with the latest labs image and it comes with puppet 3.8.5: ``` apt-cache... [08:46:15] Cloud-Services, Puppet: Make changing puppetmasters for Labs instances more easy - https://phabricator.wikimedia.org/T152941#3455560 (hashar) [08:51:48] wikitech.wikimedia.org, Wikidata: Allow accessing Wikidata data in Wikitech - https://phabricator.wikimedia.org/T171143#3455342 (zhuyifei1999) > An outage on the main cluster should not bring down the documentation. https://wikitech-static.wikimedia.org/wiki/Main_Page / https://wikitech.wikimedia.org/wik... 
[09:23:33] Cloud-Services, Puppet: Make changing puppetmasters for Labs instances more easy - https://phabricator.wikimedia.org/T152941#3455698 (hashar) [09:24:02] Cloud-Services, Puppet: Make changing puppetmasters for Labs instances more easy - https://phabricator.wikimedia.org/T152941#2864105 (hashar) I have updated the workaround using the one I originally wrote on T148929. The proposed one did not work for me on CI instances with a self puppet master. [09:27:12] jynus: ping [09:28:47] Is it possible to run Python code on Toolforge and activate it using only a browser? [09:31:19] !help [09:31:19] TabbyCat: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team [09:31:31] TabbyCat: ? [09:31:53] jynus: is it possible to query for https://phabricator.wikimedia.org/T144779 ? [09:32:00] I cannot find the table that does this [09:33:01] I don't think I am the right person for that- you need a mediawiki expert [09:33:26] I can add missing tables, but know nothing of what each one does [09:34:28] jynus: and after that info we could create the replicas on labs? [09:34:33] so we could query them? [09:34:46] the problem is, I do not fully understand the question [09:34:56] if there is something on mediawiki [09:34:58] that is public [09:35:13] and it is requested to be on labs- it can be added [09:35:27] (in most cases) [09:35:44] r96340: you mean a webservice? 
yes [09:36:00] the question is that in the past siebrand offered us that data, but I don't understand where he got that data from [09:36:02] if you tell me which table or fields you ask for, I can answer if it is there [09:36:18] it's the TranslationNotifications extension [09:36:29] Special:TranslatorSignup page [09:36:35] I can try to see what that is [09:36:41] I'm sorry that I do not know any further [09:36:59] yeah, the problem is I don't either- especially if they are extensions [09:37:15] I may not be too familiar with them [09:37:24] zhuyifei1999_: How can I it? [09:37:37] *How can I do it? [09:37:56] jynus: I've asked Nikerabbit where that data is stored [09:38:10] r96340: https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web [09:38:17] meanwhile I'll take my mid-morning coffee [09:39:07] Okay, thanks for answering [09:40:53] TabbyCat: if it is on x1 shard- it may be complicated- there is a bunch of private data there, so we don't replicate to labsdbs from there [09:41:13] I don't know jynus [09:44:41] jynus: they say it's in the user_preferences table [09:46:26] ok [09:46:30] then that seems good [09:46:42] some of those are considered private, though [09:47:53] who did you say offered you that dataset? [09:48:00] jynus: could you see if there's a ug_property called "translationnotifications" ? [09:48:07] siebrand [09:48:18] cf. task [09:48:40] [11:46] Nikerabbit something like select * from user_properties where up_property like 'translationnotifications-%'; <- maybe? 
[09:48:50] * TabbyCat ssh [09:49:43] I need to know if those are filtered [09:50:03] for future reference, there is a place where you can search that for yourself [09:51:03] where (`enwiki`.`user_properties`.`up_property` in ('disablemail','fancysig','gender','nickname')) [09:51:08] only those are exposed [09:51:31] so, what I would suggest is for you to create a request to cloud [09:51:55] to expose that field [09:52:07] however, if it is private, it will not be shown [09:52:32] anything that is private on the website is not shown on labs [09:54:01] TabbyCat: you can see what is private and what is public at https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/role/templates/labs/db/views/maintain-views.yaml [09:54:24] I've queried myself and I saw that only fancysig, gender and signature are exposed [10:01:40] jynus which is the command I should use to see in user_properties the options available on "up_properties" ? [10:01:45] show columns doesn't work [10:02:48] TabbyCat: in a meeting [10:02:54] will answer you later [10:02:58] ok :) [11:01:05] Cloud-VPS, Continuous-Integration-Infrastructure: contintcloud instance refuses to launch due to "Maximum number of fixed ips exceeded - https://phabricator.wikimedia.org/T171158#3455895 (hashar) [11:01:35] Cloud-VPS, Continuous-Integration-Infrastructure: contintcloud instance refuses to launch due to "Maximum number of fixed ips exceeded - https://phabricator.wikimedia.org/T171158#3455907 (hashar) [11:15:51] Cloud-VPS, Continuous-Integration-Infrastructure: contintcloud instance refuses to launch due to "Maximum number of fixed ips exceeded - https://phabricator.wikimedia.org/T171158#3455925 (hashar) labnet1001.eqiad.wmnet has a lot of such errors in /var/log/nova/nova-network.log* The first suspicious one:... 
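The user_properties exchange above comes down to two points: the replica view only passes rows whose up_property is on a fixed list, and `SHOW COLUMNS` lists columns, not the values stored in one. A rough sketch of both, using an in-memory SQLite database as a stand-in for the MariaDB replicas. The table name `user_properties_raw`, the sample rows, and the property name `translationnotifications-freq` are hypothetical; the four exposed names are the ones jynus quoted:

```python
import sqlite3

# Property names copied from the view filter quoted in the log for enwiki.
EXPOSED = ("disablemail", "fancysig", "gender", "nickname")

conn = sqlite3.connect(":memory:")
# Hypothetical underlying table; the replicas expose only the filtered view.
conn.execute(
    "CREATE TABLE user_properties_raw "
    "(up_user INTEGER, up_property TEXT, up_value TEXT)"
)
conn.executemany(
    "INSERT INTO user_properties_raw VALUES (?, ?, ?)",
    [
        (1, "gender", "female"),
        (1, "translationnotifications-freq", "weekly"),  # hypothetical row
        (2, "nickname", "example"),
    ],
)

# The view keeps only rows whose property name is on the exposed list.
in_list = ", ".join(f"'{name}'" for name in EXPOSED)
conn.execute(
    "CREATE VIEW user_properties AS "
    f"SELECT * FROM user_properties_raw WHERE up_property IN ({in_list})"
)


def distinct_properties(connection):
    """List the property names visible through the view."""
    rows = connection.execute(
        "SELECT DISTINCT up_property FROM user_properties ORDER BY up_property"
    )
    return [name for (name,) in rows]


def matching(connection, pattern):
    """Run the LIKE query suggested in the chat against the view."""
    rows = connection.execute(
        "SELECT up_user, up_property FROM user_properties "
        "WHERE up_property LIKE ?",
        (pattern,),
    )
    return rows.fetchall()


print(distinct_properties(conn))                     # ['gender', 'nickname']
print(matching(conn, "translationnotifications-%"))  # []
```

`SELECT DISTINCT up_property` is the kind of query that answers "which options exist", and the empty second result shows why the suggested LIKE query finds nothing until the view is changed to expose those properties.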
[11:16:19] Cloud-Services, Operations, Release-Engineering-Team, Patch-For-Review: contintcloud project thinks it is using 206 fixed-ip quota errantly - https://phabricator.wikimedia.org/T158350#3034394 (hashar) That is happening again after something got restarted yesterday. Filed as T171158 [11:25:02] Cloud-VPS, Continuous-Integration-Infrastructure: contintcloud instance refuses to launch due to "Maximum number of fixed ips exceeded - https://phabricator.wikimedia.org/T171158#3455941 (hashar) The Nodepool launch errors https://grafana.wikimedia.org/dashboard/db/nodepool?panelId=12&fullscreen&orgId=1&... [11:37:15] Cloud-VPS, Continuous-Integration-Infrastructure: contintcloud instance refuses to launch due to "Maximum number of fixed ips exceeded - https://phabricator.wikimedia.org/T171158#3455950 (hashar) Seems the nova database is on `m5-master.eqiad.wmnet` db name `nova`. [11:58:05] hi, who is mgrabovsky? I found that user had been chowned onto /home/paladox on gerrit-test3 [12:01:37] Cloud-VPS, Continuous-Integration-Infrastructure: contintcloud instance refuses to launch due to "Maximum number of fixed ips exceeded - https://phabricator.wikimedia.org/T171158#3455996 (Luke081515) p:Triage>High [12:34:36] PROBLEM - Puppet errors on tools-exec-1406 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [13:09:37] RECOVERY - Puppet errors on tools-exec-1406 is OK: OK: Less than 1.00% above the threshold [0.0] [13:19:10] Cloud-Services, Cloud-VPS, monitoring, Wikimedia-Incident: toolschecker fell to pieces when labs-ns0 went down - https://phabricator.wikimedia.org/T152369#2846085 (faidon) What's needed to be done here, from whom and with what priority? 
(Asking because it shows up in our #monitoring workboard) [13:21:44] Cloud-Services, Shinken, Upstream: shinken.wmflabs.org redirects on https-login to http - https://phabricator.wikimedia.org/T85326#3456155 (faidon) [13:36:56] andrewbogott: good morning :-) When you have had your coffee, I could use a quota fix for contintcloud. It refuses to spawn more instances because the quota of 200 fixed-ips is reached [13:37:04] there are some entries that leaked in the nova database apparently [13:37:31] the summary is at https://phabricator.wikimedia.org/T171158#3455925 :} [13:38:39] hashar: sometimes those quotas don't refresh properly even if there aren't leaks… I'm trying to force a recalculation before digging too deep [13:38:49] ahh [13:39:19] we had an issue last summer with the # of instances quota. But that is supposedly definitely fixed via a max_age = 30 # seconds [13:39:22] or something like that [13:39:30] but maybe fixed-ips uses a different system [13:40:00] I was unable to find the fixed-ips quota usage for the project. It does not show up in nova absolute-limits :( [13:42:06] The quota engine is famously inaccurate… I know because in n they introduced a 'quota reset' facility rather than actually fixing it... [13:42:09] e.g. https://ask.openstack.org/en/question/494/how-to-reset-incorrect-quota-count/ [13:42:35] I set everything to zero in contintcloud which forces a recalc... [13:42:37] I think it's better now [13:43:07] (This is probably a side-effect of last night's mess where 1,000,000 vms were scheduled but never actually came up) [13:46:05] hashar: better now? 
[13:54:33] !log tools upgrading apache2 on tools-puppetmaster-01 [13:54:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [13:55:14] andrewbogott: will check [13:55:49] andrewbogott: yeah looks all fine [13:56:09] seems there was some issue at midnight utc [13:56:40] and something somehow recovered at 6am but then the fixedip quota issue prevented the pool from being replenished [13:56:46] so in short, since 6am nodepool has been trying to spawn instances in a loop [13:56:49] Cloud-VPS, Continuous-Integration-Infrastructure: contintcloud instance refuses to launch due to "Maximum number of fixed ips exceeded - https://phabricator.wikimedia.org/T171158#3456299 (Andrew) Open>Resolved a:Andrew I resolved this by running the query in https://ask.openstack.org/en/quest... [13:57:10] andrewbogott: thank you very much :} [13:57:21] hashar: there was quite a while when no VMs would start at all. I was up late fixing that [13:57:39] Cloud-VPS, Continuous-Integration-Infrastructure: contintcloud instance refuses to launch due to "Maximum number of fixed ips exceeded - https://phabricator.wikimedia.org/T171158#3456302 (hashar) I can confirm that resolved the issue completely. Thank you! [13:57:39] it was some kind of fallout from the cert expiration, I still don't totally understand what happened. [13:57:55] ah yeah the certs [13:57:55] :( [13:58:11] I have been fixing labs instances all day long :\ [13:58:45] huh, with restarting nslcd? 
[13:58:48] PROBLEM - Puppet errors on tools-exec-1422 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [13:58:51] I tried to do that everywhere but maybe I should have another go [13:58:54] PROBLEM - Puppet errors on tools-exec-1420 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [13:59:14] PROBLEM - Puppet errors on tools-exec-1408 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [13:59:35] on CI that is sorted out at least [13:59:48] on beta the puppetmaster is broken somehow. I am trying to fix it [13:59:52] and yeah nslcd fixed a few :} [14:02:55] so I take it y'all know about how creating/deleting VPSs is messy, or at least was several hours ago [14:03:05] I'm going to restart nslcd and nscd cloud-wide again [14:03:24] harej: if 'several hours ago' is 8 hours ago then yes [14:03:29] if less than that then no [14:03:48] Damn did I sleep 8 hours? [14:04:31] andrewbogott: thank you for your work on it [14:05:44] this was all fallout from https://wikitech.wikimedia.org/wiki/Incident_documentation/20170719-ldap [14:05:50] which… probably we haven't seen the last of it [14:09:13] hashar: I'm restarting nslcd/nscd on every VM that I can find. Hopefully that will improve things, at least slightly [14:09:25] Is the deployment puppetmaster still giving you trouble? 
[14:13:47] RECOVERY - Puppet errors on tools-exec-1422 is OK: OK: Less than 1.00% above the threshold [0.0] [14:15:30] cloud-services-team, Patch-For-Review: Build updated labvirt-star cert - https://phabricator.wikimedia.org/T171116#3456367 (Andrew) Open>Resolved [14:21:23] andrewbogott: yeah the configuration is f*** up :( [14:21:28] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to submit 'replace facts' command for deployment-puppetmaster02.deployment-prep.eqiad.wmflabs to PuppetDB at puppetdb:8081: getaddrinfo: Name or service not known [14:21:29] ;D [14:22:13] there is an issue while retrieving facts apparently [14:22:21] it shouldn't be trying to use puppetdb at all... [14:22:27] yeah that as well [14:23:03] andrewbogott or write some puppet code that applies everywhere to restart the service (only needs to be temp) :) [14:27:31] hashar: do you want me to look? [14:28:36] /etc/puppet/routes.yaml [14:28:45] looks like there used to be a puppetdb installed :D [14:28:51] it got the fats! [14:28:53] facts [14:30:02] andrewbogott: puppet pass :-] I was just ranting out loud [14:30:14] great! [14:31:32] hey folks, I'm having trouble logging into a labs instance sistersearch.search.eqiad.wmflabs heard there was an ldap issue yesterday? [14:31:52] I am off to write an incident report [14:33:55] jan_drewniak: it looks like puppet has been broken on that instance for many months… which is never a good idea [14:33:56] RECOVERY - Puppet errors on tools-exec-1420 is OK: OK: Less than 1.00% above the threshold [0.0] [14:34:03] but, can you confirm that it still doesn't work, right this minute?
[14:34:15] RECOVERY - Puppet errors on tools-exec-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [14:34:35] yup, still broken "Permission denied" [14:34:47] Cloud-Services, Toolforge, DBA: p50380g50816__pop_stats (popularpages) using 53G on labsdb1001 (enwiki) - https://phabricator.wikimedia.org/T133326#2228235 (Marostegui) What should we do with this ticket? [14:35:09] it probably did not get the new cert if puppet was broken [14:37:06] jan_drewniak: you ought to have been getting daily alert emails about broken puppet, are they not showing up for you? [14:37:50] andrewbogott: should I hold off on creating and deleting new instances for now? [14:37:55] andrewbogott: nope, haven't gotten any puppet emails [14:38:12] harej: it should be fine [14:45:26] jan_drewniak: I had to remove a class from your puppet config that was breaking puppet runs. It should be fine now [14:45:53] Cloud-Services, Toolforge, DBA: p50380g50816__pop_stats (popularpages) using 53G on labsdb1001 (enwiki) - https://phabricator.wikimedia.org/T133326#3456453 (bd808) @kaldari {T118508} was declined (not for great reasons IMO, but whatever), but I thought that the rate limits were raised so that https:/... [14:45:57] andrewbogott: yippee! thanks, it works now :) [14:46:00] In theory all projectadmins should be getting nag emails about broken puppet [14:46:34] Is that a preference I should enable somewhere? [14:48:07] no, it should be unavoidable.
[14:48:14] Maybe your emails are getting spamtrapped [14:48:26] I'm not sure, I don't have time to investigate right now [14:53:08] Cloud-Services, Beta-Cluster-Infrastructure, Operations, Services, Release-Engineering-Team (Kanban): a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3456488 (hashar) [14:55:01] Cloud-Services, Beta-Cluster-Infrastructure, Operations, Services, Release-Engineering-Team (Kanban): a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3456488 (Paladox) Now that puppet is fixed, you can either wait a few hours for puppet t... [14:55:54] Cloud-Services, Beta-Cluster-Infrastructure, Operations, Services, Release-Engineering-Team (Kanban): a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3456519 (hashar) [15:07:17] hey yalllll [15:07:23] i just created a new instance in deployment-prep [15:07:26] deployment-eventlog01 [15:07:30] i can't log in [15:07:34] in instance log [15:07:36] just lots of [15:07:41] Error: Could not request certificate: Connection refused - connect(2) [15:07:48] puppet not working there?
[15:09:57] ottomata: talk to hashar and the folks in #wikimedia-releng [15:10:17] I know hashar was trying to fix up some puppet badness earlier [15:11:19] PROBLEM - High iowait on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: tools.tools-webgrid-lighttpd-1416.cpu.total.iowait (>11.11%) [15:12:15] ottomata: bunch of beta instances are broken ( https://phabricator.wikimedia.org/T171174 ) [15:12:47] ottomata: and I haven't verified what happens when a new instance is created [15:13:04] maybe the stock cloud image would not grant access [15:13:40] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Services, Release-Engineering-Team (Kanban): a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3456611 (bd808) [15:13:44] ottomata: file a task about it? [15:13:49] haha ok [15:14:04] we had a major outage overnight [15:14:41] outage? You mean the VM creation issues? [15:14:47] or something else? [15:15:13] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Services, Release-Engineering-Team (Kanban): New instance in deployment prep can't run puppet for the first time - https://phabricator.wikimedia.org/T171177#3456618 (Ottomata) [15:15:14] hashar: https://phabricator.wikimedia.org/T171177 [15:15:29] VMs could not spawn, CI was brought to a halt due to an instance that exploded thanks to a puppet/conf change, most instances were not sshable and I had 2 or 3 instances deadlocked (no ssh / no salt) [15:15:40] it is mostly sorted out luckily :] [15:16:02] *nod* [15:18:31] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Services, Release-Engineering-Team (Kanban): New instance in deployment prep can't run puppet for the first time - https://phabricator.wikimedia.org/T171177#3456656 (hashar) [15:20:07] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Services, Release-Engineering-Team (Kanban): New instance in deployment prep can't run puppet for the first time
- https://phabricator.wikimedia.org/T171177#3456618 (hashar) Seems the initial puppet run refuses to process for whatever rea... [15:21:15] RECOVERY - High iowait on tools-webgrid-lighttpd-1416 is OK: OK: All targets OK [15:24:20] I tried deleting a failed-to-launch VM and it said I wasn't allowed to delete it. I take it I should just wait? [15:25:15] andrewbogott: ^ harej has a stuck vm that needs cleanup [15:27:42] harej: what project? [15:28:06] wpx and the instance is wpx-data-01 [15:29:49] harej: looks like that was created during the everything-is-broken period so it's in an impossible state that nova doesn't know how to deal with [15:29:53] I'll just kill it by hand [15:31:15] harej: ok, I think that did it [15:31:22] andrewbogott: that's what I figured. Thank you. (It never succeeded in building so no data loss) [15:39:20] cloud-services-team (FY2017-18), Datasets-General-or-Unknown, Goal: Begin migrating customer-facing Dumps endpoints to Cloud Services - https://phabricator.wikimedia.org/T168486#3456759 (ArielGlenn) [16:01:19] Cloud-Services, Cloud-VPS, Operations, ops-eqiad: rack/setup/install labstore100[67].wikimedia.org - https://phabricator.wikimedia.org/T167984#3456817 (madhuvishy) @Cmjohnson Do we have an estimate on when these will be racked? These servers being setup are part of our quarterly goal for Q1 - T16... [16:15:40] Cloud-Services, Operations: Move the main WMCS puppetmaster into the Labs realm - https://phabricator.wikimedia.org/T171188#3456889 (faidon) [16:30:04] Cloud-VPS, Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Wikimedia-Incident: contintcloud instance refuses to launch due to "Maximum number of fixed ips exceeded - https://phabricator.wikimedia.org/T171158#3456946 (hashar) https://wikitech.wikimedia.org/wiki/Incident_d...
[16:30:19] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Services, and 2 others: a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3456948 (hashar) https://wikitech.wikimedia.org/wiki/Incident_documentation/20170719-ldap#CI.2Fbeta [16:35:37] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Services, and 2 others: a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3456966 (hashar) So the state as I understand it right now: The puppet master was broken, I had it fixed by removi... [16:35:43] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Services, and 2 others: a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3456969 (hashar) p:Triage>High [16:41:31] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Services, and 2 others: a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3456998 (hashar) Announced on the QA list pointing back to this task [16:45:26] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [16:56:22] andrewbogott: nodepool/openstack is all fine :-} I am heading back home. Thank you for the quick fix earlier! [16:58:46] Cloud-Services, Toolforge, DBA: p50380g50816__pop_stats (popularpages) using 53G on labsdb1001 (enwiki) - https://phabricator.wikimedia.org/T133326#3457049 (kaldari) I just deleted everything older than 2014, which was about half the tables.
[17:30:30] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1416 is OK: OK: Less than 1.00% above the threshold [0.0] [17:36:52] Tool-Labs-tools-XTools, Community-Tech-Sprint: XTools prod environment: unknown "profiler_dump" function - https://phabricator.wikimedia.org/T170233#3457178 (kaldari) Open>Resolved [17:46:30] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:16:30] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1416 is OK: OK: Less than 1.00% above the threshold [0.0] [18:28:59] Cloud-Services, wikitech.wikimedia.org, BetaFeatures, Edit-Review-Improvements, and 3 others: ERI requesting opt-in on wikitech but not available - https://phabricator.wikimedia.org/T165822#3457377 (Jdforrester-WMF) p:Triage>Low [18:29:56] PROBLEM - Puppet errors on tools-exec-1436 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [18:34:20] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Release-Engineering-Team (Kanban), Services (watching): New instance in deployment prep can't run puppet for the first time - https://phabricator.wikimedia.org/T171177#3457390 (mobrovac) [18:35:02] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Release-Engineering-Team (Kanban), and 2 others: a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3457392 (mobrovac) [18:42:17] wikitech.wikimedia.org, Operations: Update mediawiki on wikitech-static - https://phabricator.wikimedia.org/T170854#3457410 (Andrew) Open>Resolved a:Andrew Now running 1.29.0 (52abe24) [18:43:16] wikitech.wikimedia.org: Deploy TemplateStyles for wikitech-static - https://phabricator.wikimedia.org/T171005#3450553 (Andrew) The extension is now installed and loading on wikitech-static. It looks terrible, for now -- waiting to see if a re-sync fixes things.
[18:59:45] hey there, since yesterday, all my cronjobs on tool labs are not working anymore. [18:59:54] Is this a known problem? [19:09:54] RECOVERY - Puppet errors on tools-exec-1436 is OK: OK: Less than 1.00% above the threshold [0.0] [19:12:59] FNDE_: we had some issues with jobs being rejected because of an LDAP server problem, but we thought we had that fixed yesterday evening (US west coast time) [19:13:25] do you have jobs stuck in an error state that are keeping new ones from launching? [19:18:18] cloud-services-team, wikitech.wikimedia.org: Cannot login/change password to MABot@wikitech - https://phabricator.wikimedia.org/T171069#3457546 (MarcoAurelio) [19:19:50] cloud-services-team, wikitech.wikimedia.org: Requesting 'bot' access for MABot - https://phabricator.wikimedia.org/T171066#3457550 (MarcoAurelio) [19:23:14] cloud-services-team, wikitech.wikimedia.org: Cannot login/change password to MABot@wikitech - https://phabricator.wikimedia.org/T171069#3457554 (Andrew) Ok, I renamed MABot to 'MABot former'. I think when you retry you should log out entirely and create the account as though you are a new user -- that's... [19:25:44] cloud-services-team, wikitech.wikimedia.org: contentadmin has suddenly less permissions - https://phabricator.wikimedia.org/T171208#3457562 (MarcoAurelio) [19:26:37] cloud-services-team (Kanban), wikitech.wikimedia.org, User-bd808: Remove Wikitech rights for WikiSysop - https://phabricator.wikimedia.org/T171090#3457577 (bd808) Open>Resolved a:bd808 Rights removed -- https://wikitech.wikimedia.org/w/index.php?title=Special%3ALog&type=rights&user=&page=... [19:33:03] bd808: I don't know. How can I access hanging jobs?
[19:33:22] run qstat as your tool [19:34:11] if you see state "Eqw" then run `qmod -cj ` [19:34:28] cloud-services-team, wikitech.wikimedia.org: Cannot login/change password to MABot@wikitech - https://phabricator.wikimedia.org/T171069#3457593 (MarcoAurelio) @Andrew Much appreciated. I've created the account now and everything seems to work fine. I gave you some love on your talk page to test editing... [19:34:28] that should let the stuck job die and get out of the way [19:36:44] cloud-services-team, wikitech.wikimedia.org: Requesting 'bot' access for MABot - https://phabricator.wikimedia.org/T171066#3457646 (MarcoAurelio) Issues seem to have been fixed. [19:40:37] cloud-services-team (Kanban), wikitech.wikimedia.org, User-bd808: Requesting 'bot' access for MABot - https://phabricator.wikimedia.org/T171066#3457650 (bd808) Open>Resolved a:bd808 https://wikitech.wikimedia.org/w/index.php?title=Special%3ALog&type=&user=&page=User%3AMABot&year=&month=-1... [19:42:37] bd808: thank you very much, it works :) [19:42:53] FNDE_: awesome! sorry you got tripped up by that [19:43:51] no worries :) thx! [19:48:33] !log tools Clearing all Eqw state jobs in all queues with: qstat -u '*' | grep Eqw | awk '{print $1;}' | xargs -L1 qmod -cj [19:48:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [19:49:06] ouch. we should have done this yesterday... [19:49:22] bd808 let me guess related to ldap issue last night? [19:50:25] yeah. jobs that tried to start while ldap was messed up got stuck in error state [19:51:01] but there are handy wikitech docs on fixing it!
-- https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin#Clearing_error_state [20:02:33] cloud-services-team (Kanban), Project-Admins, User-bd808: Rename and update Cloud Services Phabricator projects - https://phabricator.wikimedia.org/T167244#3457743 (Aklapper) Trying to provide a list of tools in Toolforge (née Tool Labs) using Phab for task tracking, which are supposed to become subp... [20:05:49] cloud-services-team (Kanban), Project-Admins, User-bd808: Rename and update Cloud Services Phabricator projects - https://phabricator.wikimedia.org/T167244#3457745 (bd808) >>! In T167244#3457743, @Aklapper wrote: > Trying to provide a list of tools in Toolforge (née Tool Labs) using Phab for task tra... [20:18:01] Toolforge, Wikisource, Bengali-Sites: OCR for Bengali Wikisource needs to be updated at tools labs by updating the "tesseract-ben" package - https://phabricator.wikimedia.org/T167566#3457787 (Aklapper) @Bodhisattwa: Sorry this task has not received a reply yet. So this seems to be a followup to T6735... [20:18:24] Toolforge, Wikisource, Bengali-Sites: Update the "tesseract-ben" package on Toolforge for OCR on Bengali Wikisource - https://phabricator.wikimedia.org/T167566#3457791 (Aklapper) [20:19:19] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Release-Engineering-Team (Kanban), and 2 others: a lot of beta cluster instances are not reachable over SSH - https://phabricator.wikimedia.org/T171174#3457796 (hashar) [20:19:22] VPS-Projects, Beta-Cluster-Infrastructure, Operations, Release-Engineering-Team (Kanban), Services (watching): New instance in deployment prep can't run puppet for the first time - https://phabricator.wikimedia.org/T171177#3457793 (hashar) Open>Resolved a:Ottomata Andrew has delet...
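The Eqw clean-up logged at 19:48:33 (qstat -u '*' | grep Eqw | awk '{print $1;}' | xargs -L1 qmod -cj) can be sketched in Python roughly as below. This is only an illustration, not what was run: the function names eqw_job_ids and clear_commands are made up for the example, and it assumes that real job rows in the qstat listing start with a numeric job ID. Unlike grep, it matches "Eqw" as a whole whitespace-separated field rather than as a substring.

```python
from typing import List


def eqw_job_ids(qstat_output: str) -> List[str]:
    """Return the job IDs of qstat rows that carry the state token 'Eqw'.

    Assumption: job rows begin with a numeric job ID, so the header and
    the dashed separator line are skipped by the isdigit() check.
    """
    job_ids = []
    for line in qstat_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # blank line, header, or separator
        if "Eqw" in fields:
            job_ids.append(fields[0])  # first column, like awk '{print $1;}'
    return job_ids


def clear_commands(job_ids: List[str]) -> List[List[str]]:
    """Build one 'qmod -cj <job id>' argv per job, like xargs -L1."""
    return [["qmod", "-cj", job_id] for job_id in job_ids]
```

To use it, feed the text printed by qstat -u '*' into eqw_job_ids and review the argv lists from clear_commands before handing them to something like subprocess.run. Building the commands separately from running them keeps a dry run cheap.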
[20:29:29] cloud-services-team, wikitech.wikimedia.org: Cannot login/change password to MABot@wikitech - https://phabricator.wikimedia.org/T171069#3457815 (Andrew) Open>Resolved a:Andrew Yep, all looks good to me. [20:41:29] andrewbogott: bot account seems to work this time :) [20:41:34] thanks for fixing that [20:41:37] great [20:41:41] bd808: thanks for flagging the bot [20:42:05] TabbyCat: yw. glad to help people fix our docs up :) [20:42:20] can I get 'confirmed' temporary for a week? wiki won't let me create an OAuth consumer w/o being confirmed or autoconfirmed [20:42:34] you can set it temporary via the userrights form now :) [20:42:45] (for MABot I mean) [20:43:49] TabbyCat: {{done}} [20:43:53] :D [20:44:08] thanky [20:47:34] I'll investigate later how could I use tools.mabot to run this [20:47:43] as user-config will differ [20:48:49] welcome d3r1ck [21:03:39] hmmm [21:03:42] \pywikibot\comms\http.py:345: UserWarning: Invalid authentication tokens for wikitech.wikimedia.org set in `config.authenticate` [21:03:43] 'set in `config.authenticate`' % path) [21:17:23] cloud-services-team (FY2017-18), Research, Goal: [FY17-18] Program 4: Technical community building - https://phabricator.wikimedia.org/T171120#3458019 (DarTar) [21:17:50] cloud-services-team (FY2017-18), Research, Goal: [FY17-18] Program 4: Technical community building - https://phabricator.wikimedia.org/T171120#3454674 (DarTar) [21:19:51] (fixed) [21:19:54] cloud-services-team (FY2017-18), Goal: Program 4 Outcome 1: improve documentation - https://phabricator.wikimedia.org/T166401#3295217 (DarTar) @bd808 I don't know how you all feel about this, but the rule of thumb I am following in #research-programs (currently being populated) is to create just the prog...
[21:21:33] bd808: Thanks :) [21:31:10] VPS-project-XTools: "Top Edited Pages" missing from new XTools - https://phabricator.wikimedia.org/T171150#3455528 (MusikAnimal) This goes back to the idea of having the Edit Counter be a one-stop shop for all data related to a single editor. My recommendation is we load this in asynchronously just like we a... [21:41:17] VPS-project-XTools, Community-Tech-Sprint: EditCounter's "pages created" does not match Pages tool - https://phabricator.wikimedia.org/T169955#3458145 (MusikAnimal) All things considered, I'm thinking we should just show the raw pages created count as we are now, and leave a footnote explaining that all... [22:10:52] cloud-services-team (Kanban), Project-Admins, User-bd808: Rename and update Cloud Services Phabricator projects - https://phabricator.wikimedia.org/T167244#3458314 (bd808) @mmodell: I've done a big batch of renaming and have these projects ready to be re-parented: ``` /home/twentyafterfour/move_benea... [22:31:11] Tool-Labs-standards-committee, Toolforge, Tools, Developer-Relations: Make sure abandoned useful tools are properly advertised so potentially interested new maintainers could find them - https://phabricator.wikimedia.org/T159595#3458470 (Aklapper) Trying to rephrase the last comment and hoping I... [23:00:45] (Draft1) Paladox: Disable ldap usage [labs/icinga2] - https://gerrit.wikimedia.org/r/366751 [23:00:48] (PS2) Paladox: Disable ldap usage [labs/icinga2] - https://gerrit.wikimedia.org/r/366751 [23:06:37] (PS3) Paladox: Disable ldap usage [labs/icinga2] - https://gerrit.wikimedia.org/r/366751 [23:13:11] Tool-Labs-standards-committee, Toolforge, Tools, Developer-Relations: Make sure abandoned useful tools are properly advertised so potentially interested new maintainers could find them - https://phabricator.wikimedia.org/T159595#3458735 (bd808) >>! In T159595#3458470, @Aklapper wrote: > Yeah, sou...
[23:13:40] (PS4) Paladox: Disable ldap usage [labs/icinga2] - https://gerrit.wikimedia.org/r/366751 [23:13:44] (CR) Paladox: [V: 2 C: 2] Disable ldap usage [labs/icinga2] - https://gerrit.wikimedia.org/r/366751 (owner: Paladox) [23:21:03] !log git shut down gerrit temp on gerrit-test3 for preparations for installing local ldap [23:21:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL [23:21:08] bd808 ^^ [23:21:27] RainbowSprinkles we will need to make the ldap connect configurable [23:21:29] awesome. thanks paladox [23:21:33] You're welcome [23:22:48] VPS-project-XTools, Community-Tech-Sprint: Internal Server Error from new articleinfo interface in XTools - https://phabricator.wikimedia.org/T169767#3458772 (kaldari) @Samwilson: Is the XTools DB set up now? Still getting any of these errors? [23:31:32] !log git applying role::openldap::labtest to gerrit-test3 [23:31:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL