[00:20:23] Labs, MediaWiki-Cache, wikitech.wikimedia.org, MW-1.28-release-notes, Wikimedia-log-errors: "Memcached::touch(): touch is only supported with binary protocol" from wikitech as it's not running HHVM - https://phabricator.wikimedia.org/T143464#3346719 (demon) As I said on IRC: why do we even at...
[00:34:58] Labs, MediaWiki-Cache, wikitech.wikimedia.org, MW-1.28-release-notes, Wikimedia-log-errors: "Memcached::touch(): touch is only supported with binary protocol" from wikitech as it's not running HHVM - https://phabricator.wikimedia.org/T143464#3346730 (bd808) >>! In T143464#3346719, @demon wrot...
[00:47:08] PROBLEM - Puppet errors on tools-exec-1410 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[01:03:03] Labs: Need support on hosting an RStudio Shiny Server on a Labs instance behind a proxy - https://phabricator.wikimedia.org/T167702#3346795 (zhuyifei1999) >>! In T167702#3346298, @GoranSMilovanovic wrote: > @zhuyifei1999 Are you a Shiny Server user too? Nope. I simply like doing challenges ;) Ok so the ste...
[01:08:39] Labs: Need support on hosting an RStudio Shiny Server on a Labs instance behind a proxy - https://phabricator.wikimedia.org/T167702#3346798 (zhuyifei1999) So setup a web proxy at http://commonsarchive-test.wmflabs.org/, got `Welcome to Shiny Server!`. Cannot reproduce 504, nor I have any idea about which ste...
[01:22:08] RECOVERY - Puppet errors on tools-exec-1410 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:43:09] PROBLEM - Puppet errors on tools-exec-1410 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[02:53:08] RECOVERY - Puppet errors on tools-exec-1410 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:30:22] Labs: Need support on hosting an RStudio Shiny Server on a Labs instance behind a proxy - https://phabricator.wikimedia.org/T167702#3346953 (zhuyifei1999) @GoranSMilovanovic Can you `curl localhost:3838` from within the instance to verify if the port is open from within? In my case it's something like: ``` z...
[05:12:22] Labs, MediaWiki-Cache, wikitech.wikimedia.org, MW-1.28-release-notes, Wikimedia-log-errors: "Memcached::touch(): touch is only supported with binary protocol" from wikitech as it's not running HHVM - https://phabricator.wikimedia.org/T143464#3346970 (demon) I mean the performance gain over a...
[05:57:32] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3346985 (Samwilson)
[06:39:13] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1419 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[07:14:12] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1419 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:37:22] Labs: Need support on hosting an RStudio Shiny Server on a Labs instance behind a proxy - https://phabricator.wikimedia.org/T167702#3347044 (GoranSMilovanovic) @zhuyifei1999 curl localhost:3838 from my Labs instance returns the content of the Welcome to Shiny page.
[08:23:48] (PS1) Gehel: maps - add dummy redis password [labs/private] - https://gerrit.wikimedia.org/r/358906
[08:24:11] (CR) Gehel: [V: 2 C: 2] maps - add dummy redis password [labs/private] - https://gerrit.wikimedia.org/r/358906 (owner: Gehel)
[10:04:05] !log wikispeech Deploy latest from Git master: e641fcb (T165540), dfee00f (T148623)
[10:04:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikispeech/SAL
[10:04:10] T148623: Highlight recited word - https://phabricator.wikimedia.org/T148623
[10:04:10] T165540: Add CODE_OF_CONDUCT.md to Wikimedia repositories - https://phabricator.wikimedia.org/T165540
[11:55:35] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1417 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[12:04:25] Labs, Operations, hardware-requests: Codfw: (1) hardware access request for labtestnet2003 [region 2] - https://phabricator.wikimedia.org/T161764#3347571 (faidon)
[12:04:29] Labs, Operations, hardware-requests: Codfw: (1) hardware access request for labtestneutron refresh - https://phabricator.wikimedia.org/T154706#3347572 (faidon)
[12:06:53] Labs, Operations, hardware-requests: Eqiad: (2) hardware access request for labcontrol1003/1004 - https://phabricator.wikimedia.org/T158207#3347577 (faidon)
[12:15:29] Do individual tools bother to have an associated phabricator project for tickets? If so, is there a way to automatically make one or do I need to file a ticket? Is there a typical naming scheme people use for these projects? I had a look but couldn't see any examples.
[12:25:37] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1417 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:35:42] there is a convention, but I do not know if it is created semi-automatically or manually
[12:36:09] https://phabricator.wikimedia.org/tag/tool-labs-tools-quentinv57's-tools/
[12:36:17] https://phabricator.wikimedia.org/tag/tool-labs-tools-erwin's-tools/
[12:36:42] awesome; I'll file a manual ticket
[12:53:37] (PS1) Gehel: maps - add dummy redis password for tilerator / tileratorui [labs/private] - https://gerrit.wikimedia.org/r/358950 (https://phabricator.wikimedia.org/T167871)
[14:44:41] Labs, Labs-Infrastructure, Operations, ops-eqiad: rack/setup/install labweb100[12].wikimedia.org - https://phabricator.wikimedia.org/T167820#3348249 (Cmjohnson)
[14:46:18] Labs, Operations, hardware-requests: Codfw: (1) hardware access request for labtestneutron refresh - https://phabricator.wikimedia.org/T154706#3348255 (RobH) Open>Resolved we've purchased this system, and setup is via T167160
[14:46:39] Labs, Operations, hardware-requests: Codfw: (1) hardware access request for labtestnet2003 [region 2] - https://phabricator.wikimedia.org/T161764#3348261 (RobH) Open>Resolved We've purchased this system, setup via T167160.
[15:08:29] Labs, Labs-Infrastructure, LDAP-Access-Requests, Operations: Make all ldap users have a sane shell (/bin/bash) - https://phabricator.wikimedia.org/T86668#3348420 (hashar) The summary is roughly: Only service groups still have `sillyshell` as a login. OpenStackManager is no more adding it and de...
[17:53:16] 10Labs, 10Labs-Infrastructure, 10Tool-Labs, 10Wikimedia-Incident, 10cloud-services-team (Kanban): Write a simple script that handles failovering proxies - https://phabricator.wikimedia.org/T143639#3348997 (10madhuvishy) a:05madhuvishy>03None [17:53:58] 10Labs, 10Tool-Labs, 10cloud-services-team (Kanban): Setup monitoring for cdnjs git pull - https://phabricator.wikimedia.org/T144215#3348998 (10madhuvishy) a:05madhuvishy>03None [17:54:32] bd808: I'm trying to finish T165624 but I'm getting stuck. It won't let me update the cn. [17:54:32] T165624: Request to rename LegoFan4000 to MacFan4000 on WikiTech - https://phabricator.wikimedia.org/T165624 [17:54:40] Step 1.3 [17:54:51] Search failed: No such object [17:56:05] RainbowSprinkles: that suggests it's misinterpreting the shell user name [17:56:55] Yeah. Someone keeps breaking the ldap tools [17:57:04] And I can't figure out ldapvi to save my life [18:15:24] 10Labs, 10Labs-Infrastructure, 10Operations, 10ops-eqiad: rack/setup/install labpuppetmaster100[12].wikimedia.org - https://phabricator.wikimedia.org/T167905#3349071 (10RobH) [18:29:29] hello all [18:29:42] need a help on getting an account at tools lab [18:30:04] I registered for a tool account. [18:30:05] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Info-farmer [18:30:20] logged into toolsadmin.wikimedia.org and created LDAP account with shell account name. [18:30:20] Username [18:30:20] Tha uzhvan [18:30:20] Shell account name [18:30:20] thauzhavan [18:30:32] Created ssh RSA keys and pasted in settings. [18:30:32] Then, trying to login in shell using ssh, but getting connection closed error. [18:30:32] ssh -i .ssh/id_rsa thauzhavan@login.tools.wmflabs.org [18:30:32] Connection closed by 208.80.155.163 [18:30:45] I am following this link [18:30:45] https://wikitech.wikimedia.org/wiki/Help:Tool_Labs#Quick_start [18:30:45] Please hep to solve this issue. [18:33:58] shrini: there are no errors shown? i.e. 
'connection closed' is immediately on the first line of output? [18:35:39] shrini: try ssh -vvv -i thauzhavan@login.tools.wmflabs.org [18:36:49] basically, I would expect something along the lines of 'Permission denied (publickey,hostbased).'. Just 'Connection closed' is odd, and suggests a network issue (e.g. an outgoing firewall that blocks the connection) [18:50:09] 10Labs: Need support on hosting an RStudio Shiny Server on a Labs instance behind a proxy - https://phabricator.wikimedia.org/T167702#3349224 (10zhuyifei1999) Something interesting about the security groups is that it only affects communications between different projects (i.e. communication between [[https://wi... [18:57:49] valhallasw`cloud: No errors found [18:57:54] just that message only [18:58:20] shrini: Hm. Odd. If you try telnet login.tools.wmflabs.org 22 [18:58:35] does that show 'SSH-2.0-OpenSSH_6.9p1 Ubuntu-2~trusty1' or just a disconnect? [19:01:09] it works for me with other account [19:01:16] ssh tshrinivasan@login.tools.wmflabs.org [19:01:20] this works [19:01:28] ssh -i id_rsa thauzhavan@login.tools.wmflabs.org [19:01:32] this is not working [19:01:46] Did I miss something on this account creation? [19:01:54] valhallasw`cloud: ^^ [19:02:47] shrini: from the same computer? [19:03:07] no [19:03:13] from different computers [19:03:19] they both have different keys [19:04:39] Right. Ok. [19:05:25] Need help on fixing this [19:05:38] did I miss any process? [19:06:03] So. 
I don't understand the error you're seeing, but in the server error logs, I find some connections for user 'thauzhavan', trying three different keypairs (SHA256:LpQDq6nAiOoGOCl7VuuXB+AC7HRs4Aoni8JWheFyRMM, SHA256:q5yT1yI8hveK/wWa+T8SharbpaMMU0i0ugP/s6CpTa0, SHA256:lJamxcWxT6b/iInLNZJcqE6oH7KerdYO6gnvttMUJ0c) [19:06:06] neither of them is correct [19:07:08] SHA256:lJamxcWxT6b/iInLNZJcqE6oH7KerdYO6gnvttMUJ0c [19:07:22] is the one we set in https://toolsadmin.wikimedia.org/profile/settings/ssh-keys [19:07:34] for the user account Tha uzhavan [19:08:01] shall I try creating new key and add there? [19:08:19] shrini: try removing the newlines [19:08:33] at least, when I look in ldap, the key is split over 6 lines [19:08:54] 10Labs: Need support on hosting an RStudio Shiny Server on a Labs instance behind a proxy - https://phabricator.wikimedia.org/T167702#3349351 (10zhuyifei1999) Also, you previously said: >>! In T167702#3341941, @GoranSMilovanovic wrote: > I already have a security group for Shiny Server, port 3838 opened. Can you... [19:10:10] valhallasw`cloud: http://storage4.static.itmages.com/i/17/0614/h_1497467378_9129754_0447158943.jpeg [19:10:20] see here for the screenshot of the settings [19:10:27] I dont see any newline there [19:10:43] hm, indeed, it's the ldap display that seems to cause this [19:10:49] hrmmm [19:12:29] oh [19:14:06] I'm at a loss here. The log shows the following: https://tools.wmflabs.org/paste/view/70fb246b [19:14:24] the only thing I can think of is to rearrange the pubkey order, trying the correct one before the other two [19:14:27] when I query ldap I see [19:14:28] https://phabricator.wikimedia.org/P5581 [19:14:29] but that really shouldn't be a problem [19:15:08] will creating new keypair help? [19:15:14] yes, and ssh-keygen -lf on that gives the SHA256:lJamxcWxT6b/iInLNZJcqE6oH7KerdYO6gnvttMUJ0c we were discussing. Weird. 
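The `SHA256:...` strings compared above are what `ssh-keygen -lf` prints for a public key. As a rough sketch of how that value can be derived (this assumes the usual single-line `<type> <base64 blob> [comment]` public key format; the function name `sha256_fingerprint` is made up for illustration and is not part of any tool mentioned in the channel):

```python
import base64
import hashlib


def sha256_fingerprint(public_key_line):
    """Return an OpenSSH-style SHA256 fingerprint for one public key line.

    Hypothetical helper: expects '<type> <base64 blob> [comment]'.
    """
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64 blob> [comment]'")
    # The fingerprint is computed over the decoded key blob only;
    # the key type prefix and the trailing comment are not hashed.
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 with the '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")
```

Running this on the key stored in LDAP and on the local `.pub` file, and comparing both with the fingerprint in the `Failed publickey` log line, is one way to tell a storage problem apart from the client offering a different key.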
[19:15:22] shrini: maybe [19:15:34] valhallasw`cloud: just confirming :) [19:15:57] idk what, seems like it has to be a weird char in a key from paste or not client not trying the right key [19:16:08] or a key storage issue, I haven't used toolsadmin to add a keypair yet [19:16:21] shrini: try it with "ssh -i " [19:16:30] (vs. using agent or so) [19:16:46] mutante: tried that too. same issue [19:16:57] -i points to the private key fyi [19:17:09] yes [19:17:29] if "ssh -i /path/to/private_key tools-bastion-03.eqiad.wmflabs" does not log you in and it shows 'Failed publickey' in the server side logs [19:17:39] it has to be a generation or a storing issue [19:18:01] chasemp: it's even weirder [19:18:07] shrini: can you try ot change to a new key via wikitech.wikimedia.org? [19:18:16] the logs show: Jun 14 19:00:17 tools-bastion-03 sshd[18089]: Failed publickey for thauzhavan from [IP] port 38796 ssh2: RSA SHA256:lJamxcWxT6b/iInLNZJcqE6oH7KerdYO6gnvttMUJ0c [19:18:30] in other words, the key we're discussing is being presented by the client [19:19:17] so it has to be something weird with the key -- just storing it wrong would cause these hashes to mismatch [19:19:43] can't argue with your reasoning [19:20:01] if it's me I'm starting over with a newly generated key and trying to add via wikitech [19:20:10] if that still fails now it's really interesting [19:20:17] chasemp: where to add new key ion wikitech.wikimedia.org? [19:20:30] is the public key for that really on the bastion? 
[19:20:38] mutante: no it's from ldap [19:20:43] oh, right [19:20:46] looked up at the time they try to login [19:21:06] shrini: https://wikitech.wikimedia.org/wiki/Special:Preferences#mw-prefsection-openstack [19:23:30] exploring this [19:24:20] shrini: it should store things in teh same place [19:24:27] just fyi [19:28:02] I dont have login details for wikitech.wikimedia.org now [19:28:11] will explore that tomorrow morning [19:28:32] shrini: should be your LDAP username/password [19:28:45] so user 'Tha uzhavan' [19:29:12] shrini: yes it's all connected to ldap. Your username and login already setup should work, and then the key added on this page will be tied to your shell username. [19:29:42] shrini: if you don't get it to work please log a ticket via phabricator.wikimedia.org and/or come back here and use !help and we'll figure it out [19:30:01] got it to the site using LDAP account [19:30:09] nice. thanks for the info [19:30:38] will add new key here [19:34:38] http://storage4.static.itmages.com/i/17/0614/h_1497468871_3079660_6c8850c899.jpeg [19:34:48] added new pair [19:34:52] shrini: delete the old? [19:35:00] ok [19:35:26] deleted [19:35:30] still can not login [19:36:01] ssh -i id_rsa_thagaval thauzhavan@login.tools.wmflabs.org [19:36:08] Connection to login.tools.wmflabs.org closed by remote host. [19:36:08] Connection to login.tools.wmflabs.org closed [19:36:17] is the user name I use correct? [19:36:29] Is there any spelling mistake on the username? [19:36:48] chasemp: ^^ [19:37:09] username is right [19:37:13] thauzhavan [19:37:33] shrini: try again [19:38:05] same error [19:38:25] do I have to paste the key on toolsadmin.wikimedia.org too? [19:40:06] shrini: are you specifying a key via -i? [19:40:23] yes chasemp [19:40:32] ssh -i id_rsa_thagaval thauzhavan@login.tools.wmflabs.org [19:40:46] does ssh client still try multiple keys when that happens? 
that's weird [19:41:00] when I tried to add the key in toolsadmin.wikimedia.org [19:41:04] getting some error [19:41:21] 6d15867b7e944dd59aa750067840db1a is the error id [19:41:26] internal server error [19:41:58] http://storage2.static.itmages.com/i/17/0614/h_1497469309_8941539_e4add0a2ad.jpeg [19:42:45] bd808: when you are about shrini is having some issues with adding an ssh key, tried first via toolsadmin and seems to be getting errors [19:45:24] shrini: I'm at a loss at the moment and I have to step away for a minute, please make a task on https://phabricator.wikimedia.org/ (just register via that same ldap) [19:45:28] chasemp, shrini: hi! [19:45:42] I just got off of a conference call [19:45:48] * bd808 reads backscroll [19:45:56] ok chasemp thanks [19:46:02] bd808: hai [19:46:54] bd808: synopsis: shrini tries to add a key via toolsadmin following getting started guide, seemingly no luck getting into toolsadmin, valhallasw`cloud and I scratch our heads that it seems like he's using the right key to attempt, I asked to try anew via wikitech with a new key, same result. Either some client side process is wrong or our key management afaict [19:46:59] * chasemp brb [19:47:27] s/seemingly no luck getting into toolsadmin/getting it to work after using toolsadmin [19:47:30] really brb [19:47:57] 10Labs, 10Labs-Infrastructure, 10Operations, 10ops-eqiad: rack/setup/install labpuppetmaster100[12].wikimedia.org - https://phabricator.wikimedia.org/T167905#3349561 (10Cmjohnson) [19:48:02] good times. Let me look for that crash report first and see what tha says [19:48:52] Thanks bd808 [19:49:03] ask if you have any queries [19:49:08] or actions to do [19:51:06] hmmm... the error at toolsadmin looks like some kind of bug with writing the key to the LDAP backend [19:51:40] seems like that only [19:51:52] can not add new keys via toolsadmin site [19:52:52] yeah. I'll make a note of that and look later. 
Let's get back to your ssh auth problem [19:53:10] sure [19:53:39] bd808: https://tools.wmflabs.org/paste/view/70fb246b -- auth.log plus key lookup (old key) [19:53:39] shrini: what is your shell account name? I'd like to look at the data that is already in LDAP [19:53:57] ah. thanks valhallasw`cloud [19:54:17] bd808: thauzhavan [19:57:37] shrini: ok. so I see one SSH public key in LDAP for your user. That key has a fingerprint of "SHA256:T7OzxBar163u6Eob7DIy9D9sfC8zImg4dd6W2Pzeq+A". [19:58:28] The paste that valhallasw`cloud shared with us does not show that key now [19:58:43] bd808: no, that's an older snippet from auth.log [19:58:58] after that we changed keys, but that also did not resolve the issue [19:59:02] The fingerprint is for old key [19:59:19] I created new key and pasted here - wikitech.wikimedia.org [19:59:39] auth.log shows: Jun 14 19:33:42 tools-bastion-03 sshd[14104]: Failed publickey for thauzhavan from [IP] port 39188 ssh2: RSA SHA256:T7OzxBar163u6Eob7DIy9D9sfC8zImg4dd6W2Pzeq+A [19:59:49] ok [19:59:52] when tried to add the new key at toolsamin, got error. cant add new key [20:00:02] so the problem is somehting outside of the key [20:00:28] shrini: I would suggest trying to figure out what the other two keys are that are being presented [20:00:35] shrini: -vvv could help for that [20:00:39] and then prevent those from being loaded [20:00:53] 10Labs, 10Cassandra, 10Services (blocked), 10User-bd808, 10cloud-services-team (Kanban): Request increased quota for services-testbed labs project - https://phabricator.wikimedia.org/T163375#3349582 (10Eevans) >>! In T163375#3230116, @bd808 wrote: > [ ... ] > Raised to 16G. Let me know when you are done... [20:00:56] shrini: is you account newly created, say in the last month or so? [20:01:08] bd808: yes. newly created [20:01:21] on may 3 2017 [20:01:33] We had at least one person who didn't get properly added to the bastion group I think. 
[20:01:37] * bd808 looks for that bug [20:01:45] 10Labs-project-other, 10Math: Instances in math project show high system CPU usage - https://phabricator.wikimedia.org/T160824#3349584 (10hashar) Can one of you possibly check the `drmf` instance at least? It had super high system CPU for months: https://grafana-labs.wikimedia.org/dashboard/db/labs-project-bo... [20:03:58] valhallasw`cloud: https://pastebin.com/LEDrwp0g [20:04:07] see here for ssh command -vvv option [20:04:26] debug1: Authentication succeeded (publickey). [20:04:30] ?! [20:05:08] magic!? [20:05:20] in any case. It's offering two other keys first, from the ssh agent [20:08:35] shrini: try SSH_AUTH_SOCK="" ssh -i id_rsa_thagaval thauzhavan@login.tools.wmflabs.org -vvv ? [20:08:42] that prevents the ssh-agent from being used [20:09:44] ssh -i id_rsa_thagaval thauzhavan@login.tools.wmflabs.org [20:09:45] The authenticity of host 'login.tools.wmflabs.org (208.80.155.163)' can't be established. [20:09:45] ECDSA key fingerprint is SHA256:TybNtIoEmUacZxKi83BRYP3Q+TMeK5llxuMI6duBKEQ. [20:09:45] Are you sure you want to continue connecting (yes/no)? yes [20:09:45] Warning: Permanently added 'login.tools.wmflabs.org,208.80.155.163' (ECDSA) to the list of known hosts. [20:09:45] Load key "id_rsa_thagaval": Permission denied [20:09:45] Permission denied (publickey,hostbased). [20:10:05] copied the keys to new user in my linux [20:10:12] from there tried to login [20:10:17] got the above error [20:10:27] shrini: it can't read id_rsa_thagaval. Check the permissions. 
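The two client-side checks suggested above (keep the agent from offering other keys, and make sure the private key file is readable) can be sketched roughly as follows. This is only an illustration on a POSIX system: the helper names are invented here, and `IdentitiesOnly=yes` is used as an alternative to the `SSH_AUTH_SOCK=""` trick from the chat.

```python
import os
import stat


def private_key_problems(path):
    """Return a list of human-readable problems with a private key file.

    Hypothetical helper; an empty list means nothing obvious is wrong.
    """
    if not os.path.isfile(path):
        return [path + ": no such file"]
    problems = []
    # 'Load key ...: Permission denied' means the current user cannot read it.
    if not os.access(path, os.R_OK):
        problems.append(path + ": not readable by the current user")
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # ssh also refuses private keys that group or other can access.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("%s: mode %04o is open to group/other" % (path, mode))
    return problems


def ssh_command_single_key(key_path, user, host):
    """Build an ssh command line that offers only the given key."""
    # IdentitiesOnly=yes tells ssh to ignore extra identities from ssh-agent.
    return ["ssh", "-o", "IdentitiesOnly=yes", "-i", key_path, user + "@" + host]
```

For example, `private_key_problems("id_rsa_thagaval")` could be run as the new local user before retrying the login, and the list from `ssh_command_single_key` can be passed to `subprocess.run`.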
[20:12:20] ok [20:12:26] gave read permissions [20:12:55] and here are the error message for [20:12:56] SSH_AUTH_SOCK="" ssh -i id_rsa_thagaval thauzhavan@login.tools.wmflabs.org -vvv [20:13:01] https://pastebin.com/szrmKBx3 [20:14:29] ok, so the pubkey auth is not the problem [20:14:46] Jun 14 20:11:01 tools-bastion-03 sshd[29297]: fatal: Access denied for user thauzhavan by PAM account configuration [preauth] [20:14:56] bd808: ^ that might actually be that bastion issue then [20:15:19] but then again I don't really understand ssh when it works, so when it breaks... [20:16:53] chasemp: so im trying to make racking tasks for https://phabricator.wikimedia.org/T154664 [20:17:03] and not sure what to call them [20:17:19] valhallasw`cloud: yeah. I'm trying to track down the bug that we had for that before [20:17:20] the order itself is 4 systems, covering both https://phabricator.wikimedia.org/T154664 and https://phabricator.wikimedia.org/T161766 [20:17:26] the https://phabricator.wikimedia.org/T161766 is easier, since its listed already ;] [20:17:38] they also have not arrived yet [20:18:39] robh: right, ok [20:19:08] sec [20:19:21] if you arent sure i dont have to make the task now, you can think about it =] [20:19:28] since they havent arrived yet, its not exactly urgent [20:19:33] they arent due to ship until the 22nd [20:20:00] 10Tool-Labs-tools-Xtools, 10Community-Tech: Create an XTools logo - https://phabricator.wikimedia.org/T167345#3349653 (10kaldari) Apparently we can get 3 logo designs for $69 at https://worthylogollc.com/. [20:20:14] robh: so we have https://phabricator.wikimedia.org/T161766 which denotes labtestservices2003 and labtestcontrol2003 [20:20:21] if I change that and update both tasks is taht cool? [20:20:30] yeah you wanna call them something else? [20:20:42] well, looking [20:20:43] i havent made anything from them yet other than the order [20:20:50] so much concurrent ordering! 
[20:20:54] so is fine, just update both hw-request tasks with what to call them [20:21:04] and ill take from them when i make the setup tasks [20:21:20] shrini: I think I figure out the problem. It does not look like you are member of the tools project yet. [20:21:51] bd808: ah ha, geez on me for not checking that, for some reason I thought toolsadmin ensured it or something [20:21:56] id thauzhavan = uid=16971(thauzhavan) gid=500(wikidev) groups=500(wikidev) [20:22:01] oh [20:22:02] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Info-farmer [20:22:17] this seems request completed [20:22:30] shrini: that's a completely differnent username [20:22:49] ooops [20:23:00] the shell user thauzhavan is https://wikitech.wikimedia.org/wiki/User:Tha_uzhvan [20:23:04] robh: does the edit make sense? I just specified for https://phabricator.wikimedia.org/T154664 and https://phabricator.wikimedia.org/T161766 should be ok [20:23:33] oh [20:23:39] seems odd each one has a labtestservices [20:23:39] messed up with the usernames [20:23:47] shrini: looks like you have multiple accounts which is ok [20:23:58] but the end result is the same since the hardware is all identical [20:24:05] where to apply now? 
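The `id thauzhavan` output quoted above is what revealed the missing project membership: only `wikidev` is listed. A small sketch of that check, assuming the usual `id` output format; the group name `project-tools` and the gid `50380` in the usage note are placeholders, since the log does not say what the Tools project group is called:

```python
import re


def groups_from_id_output(id_output):
    """Extract group names from a line of `id <user>` output.

    Hypothetical helper for illustration; returns a set of names.
    """
    match = re.search(r"groups=(\S+)", id_output)
    if not match:
        return set()
    names = set()
    for entry in match.group(1).split(","):
        # Entries look like '500(wikidev)'; keep the raw entry if no name is shown.
        named = re.fullmatch(r"\d+\((.+)\)", entry)
        names.add(named.group(1) if named else entry)
    return names


def is_member(id_output, group_name):
    """Return True if the `id` output lists the given group."""
    return group_name in groups_from_id_output(id_output)
```

With the line from the chat, `is_member("uid=16971(thauzhavan) gid=500(wikidev) groups=500(wikidev)", "project-tools")` is false, which matches the PAM "Access denied" seen in auth.log even though the public key itself was accepted.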
[20:24:06] robh: it's for two different region setups, i.e two distinct control plans with different labtestservices hosts [20:24:09] shrini: the Info-farmer wiki user has the shell account name "info-farmer" [20:24:11] ah [20:24:21] shrini: and no ssh keys uploaded [20:24:23] chasemp: ok, no worries since the end result is the same [20:24:27] robh: a region is like a means of HA or multitenancy in openstack so we are testing a second region setup here [20:24:35] and each needs a services host [20:24:39] sure, just clarifying :) [20:24:57] shrini: there should be a link to apply for the Tool Labs membership at https://toolsadmin.wikimedia.org/tools/ [20:25:38] https://toolsadmin.wikimedia.org/tools/membership/status/32 [20:25:44] bd808: I wonder if toolsadmin could throw a giant banner if someone is not a member of Tools [20:25:46] applied there. it is pending stattus [20:25:51] for their sake as well as ours [20:25:54] chasemp: or ssh :P [20:25:58] eheh [20:26:01] but toolsadmin would probably be easier [20:26:23] valhallasw`cloud: we learned our lesson! check the project and don't assume :D [20:26:46] I blindly assumed we didn't check new accounts anymore after moving to toolsadmin. Which is also silly :D [20:27:34] ooh, I have 39 alerts [20:27:48] shrini: approved! I'm not seeing the group membership change yet though... [20:28:04] bd808: approving in toolsadmin should also set the ldap groups correctly? [20:28:15] well.... [20:28:20] in theory [20:28:21] in theory, at least? ;-) [20:28:24] :-) [20:28:30] but it does it indirectly [20:28:40] awesome [20:28:48] got into the server [20:28:54] there's a hook in keystone somewhere that does it [20:28:58] shrini: w00t! 
[20:29:01] thanks all [20:29:53] valhallasw`cloud: if you go to https://toolsadmin.wikimedia.org/tools/membership/?o=-status you can see the requests that are still open [20:30:12] *nod* [20:30:56] the 2 that are there I have skipped mostly because the requests were a bit sketchy (brand new accounts that did not link to SUL identities) [20:31:10] very nice :-) Much better than the 'please click these three links in order and please don't try to do two requests at the same time' wikitech experience [20:31:18] yes :) [20:31:32] one click does it all [20:31:35] could we add a 'stalled' status maybe? [20:31:41] to write down exactly those notes [20:31:57] and clarifying in the status that it's waiting for the user to clarify [20:32:20] yeah it could be patched to have a "feedback needed" state [20:32:27] bd808: huh. https://toolsadmin.wikimedia.org/tools/membership/status/22 vs https://wikitech.wikimedia.org/wiki/User:BarrelRoller [20:32:34] ah, seems appropriate, and a text field coudl link to a task or something if we needed [20:32:48] I'd like to add emails for notices too [20:32:54] bd808: the user is in ldap but does not have a wikitech user attached....? [20:33:17] valhallasw`cloud: yeah, that state is expected when the account was created via striker and has never authed to wikitech [20:33:31] Aha. Yes, that makes sense. [20:33:34] but there should be no way to do that without having a SUL account linked [20:33:41] which makes it extra fishy [20:33:46] I hadn't realized one could create an LDAP account in striker [20:33:57] bd808: do they have a way to associate post with a wikitech account? [20:34:02] is that a dumb question... [20:34:17] the wikitech side is magic. account created on first auth [20:34:26] bd808: maybe they did login with oauth, but it somehow did not store that info correctly? 
[20:34:27] there is a way to associate the SUL account too [20:34:31] awesome bd808 valhallasw`cloud chasemp [20:34:37] thanks for the great help [20:34:48] valhallasw`cloud: that's possible. [20:34:52] shrini: you are welcome [20:35:04] can login and create a novaservicegroup [20:35:15] can become a tool now [20:38:48] valhallasw`cloud: you should try out striker's account creation workflow. I think it's pretty nice [20:39:05] bd808: it tells me my account is already connected :-o [20:39:09] (what a surprise) [20:39:12] you can use https://striker.wmflabs.org/ if you don't want to make junk in the prod LDAP [20:39:18] oh, that's a good idea [20:39:33] which ... is busted? ;/ [20:39:57] :( [20:40:09] * bd808 looks to see why [20:40:49] 10Labs, 10Tracking: Existing Labs project quota increase requests (Tracking) - https://phabricator.wikimedia.org/T140904#3349740 (10Luke081515) [20:40:52] 10Labs, 10Cassandra, 10Services (blocked), 10User-bd808, 10cloud-services-team (Kanban): Request increased quota for services-testbed labs project - https://phabricator.wikimedia.org/T163375#3349739 (10Luke081515) 05stalled>03Open [20:45:07] bd808: I assume it's not trivial to run locally? [20:45:29] valhallasw`cloud: there is a MediaWiki-Vagrant role! [20:45:40] should setup everything needed [20:45:56] vagrant roles enable striker; vagrant provision [20:45:56] oooh, shiny [20:46:15] and that gives me a git checkout somewhere? [20:46:29] yeah, in srv/striker [20:47:18] the role sets up a couple of wikis, a phab instance, an ldap server, a keystone instance, and striker [20:47:56] Yeah, that's probably worth a try. 
[20:48:07] but not now -- time for bed :-) [20:48:22] there are a few manual steps that are documented at http://dev.wiki.local.wmftest.net:8080/wiki/VagrantRoleStriker once you do the initial provision [20:48:58] I was too lazy to figure out how to automate all the phab setup that is needed [20:50:23] *grin* [20:50:43] bd808: wmftest.net is not something that's accessible for me [20:50:48] * bd808 is not sure why ldap connections are failing on the test project [20:51:02] or is that the local wiki in the vagrant role? [20:51:14] *.local.wmftest.net == 127.0.0.1 [20:51:17] yeah [20:56:53] (03PS5) 10BryanDavis: Add toolinfo.json style data [labs/striker] - 10https://gerrit.wikimedia.org/r/353909 (https://phabricator.wikimedia.org/T149458) [20:56:56] (03PS2) 10BryanDavis: Add support for "tagging" toolinfo records [labs/striker] - 10https://gerrit.wikimedia.org/r/358505 (https://phabricator.wikimedia.org/T149458) [20:56:59] (03PS2) 10BryanDavis: Expose all toolinfo data for indexing [labs/striker] - 10https://gerrit.wikimedia.org/r/358506 (https://phabricator.wikimedia.org/T149458) [20:57:02] (03PS1) 10BryanDavis: Change #wikimedia-labs to #wikimedia-cloud [labs/striker] - 10https://gerrit.wikimedia.org/r/359042 (https://phabricator.wikimedia.org/T166420) [20:58:28] A query on PAWS [20:58:38] How to use python2.7 in PAWS? [20:58:45] it has only python 3 [21:00:31] valhallasw`cloud: bd808 ^^ [21:01:04] shrini: I don't know. If there's no 'Python 2' in the kernel list, you probably can't [21:02:12] * valhallasw`cloud is off to bed [21:02:51] fine valhallasw`cloud [21:02:53] thanks [21:02:55] good night [21:03:41] shrini: in a shell from PAWS or in a notebook page? [21:03:56] in a shell [21:03:59] is enough [21:04:03] for now [21:06:02] shrini: looks like python2 is not installed for PAWS containers [21:06:26] ok [21:06:29] do you have a package that is only 2.x compatible? [21:06:38] can anyone install it/ [21:06:53] yes. 
Many of my tools depends on only python 2.7 [21:07:06] many libraries are not ported to python3 yet [21:07:14] example - wikitools python library [21:07:52] 10Tool-Labs-tools-Xtools, 10Community-Tech: Create an XTools logo - https://phabricator.wikimedia.org/T167345#3349826 (10MusikAnimal) >>! In T167345#3349653, @kaldari wrote: > Apparently we can get 3 logo designs for $69 at https://worthylogollc.com/. Dead link? [21:08:32] shrini: we should fix the library :) [21:08:45] hahaha [21:08:47] python3 just had it's 10th birthday [21:09:21] if there is apossibility add python2.7 to paws, please do [21:09:28] will solve many issues [21:09:56] when we train tools creation to newbies, paws helps a lot [21:10:09] but when our tools are not working there, we stuck [21:10:24] shrini: please file a phabricator task if there isn't one already [21:10:32] Not every newbie can apply for a server in wikitech [21:10:43] will raise a task there [21:11:01] pretty much anyone can use tool labs, but I agree that paws is much easier [21:11:17] shrini: :) -- https://github.com/alexz-enwp/wikitools/issues/47 [21:11:57] :-) [21:12:53] shrini: porting all your scripts may not be easy, but https://github.com/mwclient/mwclient is py3 compatible [21:13:07] that's the library I use with the toolsadmin django app [21:13:16] yes. 
[21:13:39] but it is really sad to see most of the libraries still on python2.7 [21:14:16] when we miss a library on python3, and it works on python 2.7, we are sad :-( [21:14:26] even python3.3 is nearly 5 years old [21:14:54] some would argue that was the first stable 3.x release [21:17:05] 10PAWS: Install Python 2.7 in PAWS - https://phabricator.wikimedia.org/T167926#3349849 (10Tshrinivasan) [21:17:27] wow [21:17:40] nice to see the ping here on new bugs reported [21:25:47] another query [21:25:49] 10 * * * * /usr/bin/jsub -N cron-tools.shrinitools-2 -once -quiet /bin/bash /data/project/shrinitools/dev/get_tawikisource_report.sh [21:26:06] this cron entry is not working on my tools server [21:26:13] how to fix this? [21:26:32] are there any error messages that get logged? [21:27:41] 10Tool-Labs-tools-Xtools, 10Community-Tech: Create an XTools logo - https://phabricator.wikimedia.org/T167345#3330053 (10Luke081515) As we posted the link, the link worked. Now it does not work for me too. [21:28:17] where to check the error logs of cron? [21:29:21] shrini: look for cron-tools.shrinitools-2.err and cron-tools.shrinitools-2.out files [21:30:05] super [21:30:09] got the issue [21:30:10] thanks [21:30:18] yw [21:30:55] good night [21:31:02] thanks for the great support [21:31:17] o/ you showed up on a good day [21:31:32] you can help by being helpful to others :) [21:32:19] true [21:32:22] :-) [21:38:35] Storing 100 GB of audio fingerprints on Labs isn't a problem, right? [21:40:49] Dispenser: "it depends" [21:41:00] 100G is quite a bit [21:41:11] we'd need to find a place for that [21:41:21] but it's not outrageous [21:41:46] We've wasted 1 TB on pirated material from WP0 [21:42:07] constructive discussion is helpful [21:45:45] no single project/tool uses 1TB of storage [21:52:20] 100 GB is a lot of space but in this era of cloud computing it's totally doable.
[21:54:56] Still, I don't know if it's trivial enough that we could just offer it to whoever asked for it [21:56:55] 100 GB isn't crazy if the project using it has strong value for the movement. I'd be sad to see it wasted though, certainly. [21:59:42] madhuvishy: 1TB is about the bad wikipedia zero uploads to openstack swift [22:03:02] Dispenser: if you are going for instance-local storage, m1.xlarge will have 160G of storage (20G will be mounted at /, and the rest can be configured to whatever you like) [22:03:19] PROBLEM - Puppet errors on tools-static-11 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:04:01] PROBLEM - Puppet errors on tools-redis-1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:04:25] PROBLEM - Puppet errors on tools-worker-1015 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:04:40] PROBLEM - Puppet errors on tools-exec-1404 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:04:48] * bd808 looks at these puppet errors [22:05:29] catalog failures from the master? [22:05:31] PROBLEM - Puppet errors on tools-webgrid-generic-1404 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:07:28] PROBLEM - Puppet errors on tools-exec-1417 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:07:44] So doable, I'd have to see how echoprint uses the fingerprints. Nobody is really offering free fingerprinting, as it seems to use non-trivial resources.
[22:08:30] PROBLEM - Puppet errors on tools-k8s-etcd-03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:08:34] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1422 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:08:46] PROBLEM - Puppet errors on tools-checker-01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [22:08:47] PROBLEM - Puppet errors on tools-worker-1017 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:08:47] PROBLEM - Puppet errors on tools-exec-1414 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:08:59] PROBLEM - Puppet errors on tools-worker-1009 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:09:03] PROBLEM - Puppet errors on tools-checker-02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:09:13] PROBLEM - Puppet errors on tools-flannel-etcd-03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:09:22] !log tools Restarted apache2 proc on tools-puppetmaster-02 [22:09:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [22:09:47] PROBLEM - Puppet errors on tools-worker-1020 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:09:51] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1409 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:10:15] PROBLEM - Puppet errors on tools-k8s-etcd-02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:10:19] PROBLEM - Puppet errors on tools-prometheus-02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:11:15] madhuvishy: hmmm.. puppet seems very unhappy. you have time to help me look for the cause? [22:11:37] bd808: yeah sure [22:11:41] Looking [22:11:46] I restarted apache on the puppermaster. 
now manual runs seem to just be hanging [22:11:59] trying on tools-redis-1001 now [22:12:12] PROBLEM - Puppet errors on tools-docker-registry-01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:12:18] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1416 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:12:50] PROBLEM - Puppet errors on tools-exec-1442 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:12:52] andrewbogott: could this be from the labspuppetbackend change you merged? [22:12:56] PROBLEM - Puppet errors on tools-exec-1435 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:12:58] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1411 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:13:14] PROBLEM - Puppet errors on tools-exec-1401 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:13:30] PROBLEM - Puppet errors on tools-exec-1405 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:13:31] yeah puppet run hangs for me [22:13:32] probably, I'll check [22:13:40] PROBLEM - Puppet errors on tools-worker-1011 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [22:13:42] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:13:59] PROBLEM - Puppet errors on tools-exec-1438 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:13:59] PROBLEM - Puppet errors on tools-grid-shadow is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:14:09] PROBLEM - Puppet errors on tools-exec-1410 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [22:14:17] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1426 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:14:19] PROBLEM - Puppet errors on 
tools-exec-1427 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:14:23] PROBLEM - Puppet errors on tools-exec-1418 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [22:14:24] Hi, puppet is failing for me [22:14:27] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1427 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:14:33] paladox: known, investigating [22:14:33] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1420 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:14:34] when i got to run puppet agent -tv it stalls [22:14:37] ok thanks [22:14:43] yes, anything with a local puppetmaster is temporarily broken, I will fix it shortly [22:14:53] PROBLEM - Puppet errors on tools-worker-1025 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:14:55] PROBLEM - Puppet errors on tools-worker-1012 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:14:59] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1421 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:15:13] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1419 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:15:17] PROBLEM - Puppet errors on tools-bastion-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:15:26] PROBLEM - Puppet errors on tools-worker-1026 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:15:28] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1405 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:15:32] PROBLEM - Puppet errors on tools-exec-1423 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:15:38] PROBLEM - Puppet errors on tools-exec-1421 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:15:53] !log shinked Stopped service ircecho [22:15:54] 
madhuvishy: Unknown project "shinked" [22:15:58] !log shinken Stopped service ircecho [22:16:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Shinken/SAL [22:16:22] 10Labs, 10DBA: Prepare and check storage layer for atjwiki - https://phabricator.wikimedia.org/T167715#3350090 (10Reedy) Wiki has been created! [22:16:33] thanks madhuvishy [22:26:16] 10Labs-project-Wikistats: Add atjwiki - https://phabricator.wikimedia.org/T167929#3350102 (10Reedy) [22:28:40] 10Labs-project-Wikistats: Add atjwiki - https://phabricator.wikimedia.org/T167929#3350118 (10Dzahn) a:03Dzahn [22:30:32] puppet recoveries should be coming in now. Sorry about the noise! [22:31:57] PROBLEM - Puppet errors on tools-exec-1439 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:32:09] PROBLEM - Puppet errors on tools-worker-1016 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:32:09] PROBLEM - Puppet errors on tools-exec-1409 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:32:09] PROBLEM - Puppet errors on tools-package-builder-01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:32:09] PROBLEM - Puppet errors on tools-worker-1013 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:32:13] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1425 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:32:22] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1412 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:32:27] PROBLEM - Puppet errors on tools-exec-1441 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:32:38] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1408 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:32:42] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1407 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold 
[0.0] [22:32:46] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1424 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:32:54] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1418 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:32:58] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1403 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:33:03] I love how everything isn't just critical...it's CRITICAL: CRITICAL [22:33:06] PROBLEM - Puppet errors on tools-exec-1408 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:33:16] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1402 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:33:19] PROBLEM - Puppet errors on tools-flannel-etcd-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:33:29] PROBLEM - Puppet errors on tools-bastion-05 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:33:31] PROBLEM - Puppet errors on tools-exec-1403 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:33:32] PROBLEM - Puppet errors on tools-exec-1436 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:33:35] 10Striker: Fatal error when adding a duplicate SSH key - https://phabricator.wikimedia.org/T167931#3350140 (10bd808) [22:33:44] we get it shinken-wm theres a nuclear meltdown it was supposely fixed [22:33:48] PROBLEM - Puppet errors on tools-exec-1406 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:33:48] PROBLEM - Puppet errors on tools-worker-1010 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:33:54] PROBLEM - Puppet errors on tools-worker-1019 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:33:55] Now a 10 minute period in which shinken is wrong about everything... 
[22:34:00] PROBLEM - Puppet errors on tools-worker-1018 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:34:00] PROBLEM - Puppet errors on tools-mail is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:34:24] PROBLEM - Puppet errors on tools-exec-1411 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:34:30] PROBLEM - Puppet errors on tools-proxy-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:34:54] PROBLEM - Puppet errors on tools-webgrid-generic-1403 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:34:54] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:34:58] PROBLEM - Puppet errors on tools-exec-1428 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:35:10] PROBLEM - Puppet errors on tools-elastic-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:35:14] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1406 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:35:18] PROBLEM - Puppet errors on tools-worker-1006 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:35:25] PROBLEM - Puppet errors on tools-worker-1014 is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [0.0] [22:36:12] meeple27: ori has given some great rants before about the horrible irc error reporting we have :) [22:36:34] 10Labs-project-Wikistats: Add atjwiki - https://phabricator.wikimedia.org/T167929#3350152 (10Dzahn) 05Open>03Resolved ``` MariaDB [wikistats]> insert into wikipedias (prefix,lang,loclang,method) values ("atj","Atikamekw","Atikamekw Nehiromowin",'8'); ``` ``` @wikistats-cowgirl:~# /usr/bin/php /usr/lib/wik... 
[22:36:56] shinken needs a 'pretty much everywhere' reporting mode [22:40:15] RECOVERY - Puppet errors on tools-k8s-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:40:29] RECOVERY - Puppet errors on tools-k8s-master-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:41:03] "It's going to recover so much, you'll be sick of recovering" [22:42:38] at my last gig we worked out how to do x% of puppet is failing across the env alerts instead of per bd808 [22:42:45] iirc it was a check_mk magic [22:43:14] RECOVERY - Puppet errors on tools-exec-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [22:43:34] RECOVERY - Puppet errors on tools-exec-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [22:43:44] RECOVERY - Puppet errors on tools-checker-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:43:46] RECOVERY - Puppet errors on tools-exec-1414 is OK: OK: Less than 1.00% above the threshold [0.0] [22:44:00] RECOVERY - Puppet errors on tools-exec-1438 is OK: OK: Less than 1.00% above the threshold [0.0] [22:44:01] RECOVERY - Puppet errors on tools-checker-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:44:17] RECOVERY - Puppet errors on tools-exec-1427 is OK: OK: Less than 1.00% above the threshold [0.0] [22:44:45] RECOVERY - Puppet errors on tools-worker-1020 is OK: OK: Less than 1.00% above the threshold [0.0] [22:44:53] RECOVERY - Puppet errors on tools-worker-1025 is OK: OK: Less than 1.00% above the threshold [0.0] [22:44:57] RECOVERY - Puppet errors on tools-worker-1012 is OK: OK: Less than 1.00% above the threshold [0.0] [22:45:13] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1419 is OK: OK: Less than 1.00% above the threshold [0.0] [22:45:18] RECOVERY - Puppet errors on tools-bastion-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:45:20] RECOVERY - Puppet errors on tools-prometheus-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:45:32] RECOVERY - Puppet errors on 
tools-exec-1423 is OK: OK: Less than 1.00% above the threshold [0.0] [22:46:00] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1415 is OK: OK: Less than 1.00% above the threshold [0.0] [22:46:57] 10Labs, 10DBA: Prepare and check storage layer for atjwiki - https://phabricator.wikimedia.org/T167715#3350169 (10Benoit_Rochon) Youppiii. Lol. Thanks a lot Reedy. Is the content from incubator will follow? [22:47:10] RECOVERY - Puppet errors on tools-docker-registry-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:47:16] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1416 is OK: OK: Less than 1.00% above the threshold [0.0] [22:47:20] RECOVERY - Puppet errors on tools-grid-master is OK: OK: Less than 1.00% above the threshold [0.0] [22:47:30] RECOVERY - Puppet errors on tools-worker-1029 is OK: OK: Less than 1.00% above the threshold [0.0] [22:47:36] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [22:47:50] RECOVERY - Puppet errors on tools-exec-1442 is OK: OK: Less than 1.00% above the threshold [0.0] [22:47:57] RECOVERY - Puppet errors on tools-exec-1435 is OK: OK: Less than 1.00% above the threshold [0.0] [22:47:59] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1411 is OK: OK: Less than 1.00% above the threshold [0.0] [22:48:33] RECOVERY - Puppet errors on tools-exec-1405 is OK: OK: Less than 1.00% above the threshold [0.0] [22:48:43] RECOVERY - Puppet errors on tools-worker-1011 is OK: OK: Less than 1.00% above the threshold [0.0] [22:48:43] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1413 is OK: OK: Less than 1.00% above the threshold [0.0] [22:48:57] RECOVERY - Puppet errors on tools-grid-shadow is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:01] RECOVERY - Puppet errors on tools-logs-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:07] RECOVERY - Puppet errors on tools-exec-1410 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:16] 
RECOVERY - Puppet errors on tools-webgrid-lighttpd-1426 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:16] RECOVERY - Puppet errors on tools-exec-1432 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:18] RECOVERY - Puppet errors on tools-puppetmaster-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:23] RECOVERY - Puppet errors on tools-exec-1418 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:27] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1427 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:31] RECOVERY - Puppet errors on tools-exec-1424 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:33] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1420 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:46] 10Labs, 10DBA: Prepare and check storage layer for atjwiki - https://phabricator.wikimedia.org/T167715#3350177 (10Reedy) >>! In T167715#3350169, @Benoit_Rochon wrote: > Is the content from incubator will follow? "we" don't usually do that. Usually the incubator guys do, so MF-Warburg or SPQRobin will do it as... 
[22:49:59] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1421 is OK: OK: Less than 1.00% above the threshold [0.0] [22:50:13] RECOVERY - Puppet errors on tools-exec-1426 is OK: OK: Less than 1.00% above the threshold [0.0] [22:50:27] RECOVERY - Puppet errors on tools-worker-1026 is OK: OK: Less than 1.00% above the threshold [0.0] [22:50:29] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1405 is OK: OK: Less than 1.00% above the threshold [0.0] [22:50:39] RECOVERY - Puppet errors on tools-exec-1421 is OK: OK: Less than 1.00% above the threshold [0.0] [22:50:39] RECOVERY - Puppet errors on tools-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0] [22:50:41] RECOVERY - Puppet errors on tools-exec-gift-trusty-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:51:03] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [22:52:06] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1428 is OK: OK: Less than 1.00% above the threshold [0.0] [22:52:06] RECOVERY - Puppet errors on tools-static-10 is OK: OK: Less than 1.00% above the threshold [0.0] [22:52:16] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0] [22:52:34] RECOVERY - Puppet errors on tools-exec-1415 is OK: OK: Less than 1.00% above the threshold [0.0] [22:52:44] RECOVERY - Puppet errors on tools-k8s-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:52:52] RECOVERY - Puppet errors on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:53:00] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1414 is OK: OK: Less than 1.00% above the threshold [0.0] [22:53:10] RECOVERY - Puppet errors on tools-cron-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:53:54] RECOVERY - Puppet errors on tools-worker-1027 is OK: OK: Less than 1.00% above the threshold [0.0] [22:54:00] RECOVERY - Puppet errors on tools-worker-1028 is OK: OK: Less than 1.00% 
above the threshold [0.0] [22:54:14] RECOVERY - Puppet errors on tools-worker-1005 is OK: OK: Less than 1.00% above the threshold [0.0] [22:54:18] RECOVERY - Puppet errors on tools-docker-builder-05 is OK: OK: Less than 1.00% above the threshold [0.0] [22:54:46] RECOVERY - Puppet errors on tools-elastic-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:54:50] RECOVERY - Puppet errors on tools-redis-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [22:54:54] RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [22:55:20] RECOVERY - Puppet errors on tools-worker-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [22:55:50] RECOVERY - Puppet errors on tools-exec-1419 is OK: OK: Less than 1.00% above the threshold [0.0] [22:56:06] RECOVERY - Puppet errors on tools-exec-1416 is OK: OK: Less than 1.00% above the threshold [0.0] [22:56:37] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1417 is OK: OK: Less than 1.00% above the threshold [0.0] [22:57:03] RECOVERY - Puppet errors on tools-worker-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [22:57:23] RECOVERY - Puppet errors on tools-exec-1402 is OK: OK: Less than 1.00% above the threshold [0.0] [22:57:27] RECOVERY - Puppet errors on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0] [22:57:31] RECOVERY - Puppet errors on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:58:03] RECOVERY - Puppet errors on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0] [22:58:37] RECOVERY - Puppet errors on tools-worker-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [22:58:50] RECOVERY - Puppet errors on tools-proxy-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:59:16] RECOVERY - Puppet errors on tools-worker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [22:59:22] RECOVERY - Puppet errors on tools-exec-1430 is OK: OK: Less than 1.00% above the 
threshold [0.0] [22:59:26] RECOVERY - Puppet errors on tools-elastic-03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:59:38] RECOVERY - Puppet errors on tools-docker-registry-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:59:40] RECOVERY - Puppet errors on tools-exec-1413 is OK: OK: Less than 1.00% above the threshold [0.0] [22:59:44] RECOVERY - Puppet errors on tools-worker-1008 is OK: OK: Less than 1.00% above the threshold [0.0] [23:00:17] RECOVERY - Puppet errors on tools-worker-1023 is OK: OK: Less than 1.00% above the threshold [0.0] [23:00:19] RECOVERY - Puppet errors on tools-prometheus-01 is OK: OK: Less than 1.00% above the threshold [0.0] [23:00:21] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [23:00:21] RECOVERY - Puppet errors on tools-flannel-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:00:41] RECOVERY - Puppet errors on tools-exec-1437 is OK: OK: Less than 1.00% above the threshold [0.0] [23:00:45] RECOVERY - Puppet errors on tools-exec-1425 is OK: OK: Less than 1.00% above the threshold [0.0] [23:00:47] RECOVERY - Puppet errors on tools-exec-1429 is OK: OK: Less than 1.00% above the threshold [0.0] [23:00:49] RECOVERY - Puppet errors on tools-worker-1021 is OK: OK: Less than 1.00% above the threshold [0.0] [23:02:12] RECOVERY - Puppet errors on tools-worker-1013 is OK: OK: Less than 1.00% above the threshold [0.0] [23:02:12] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1425 is OK: OK: Less than 1.00% above the threshold [0.0] [23:02:48] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1424 is OK: OK: Less than 1.00% above the threshold [0.0] [23:02:58] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [23:03:04] RECOVERY - Puppet errors on tools-exec-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [23:03:18] RECOVERY - Puppet errors on tools-flannel-etcd-01 is OK: OK: 
Less than 1.00% above the threshold [0.0] [23:04:04] RECOVERY - Puppet errors on tools-worker-1018 is OK: OK: Less than 1.00% above the threshold [0.0] [23:04:18] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1410 is OK: OK: Less than 1.00% above the threshold [0.0] [23:04:25] RECOVERY - Puppet errors on tools-exec-1411 is OK: OK: Less than 1.00% above the threshold [0.0] [23:04:29] RECOVERY - Puppet errors on tools-proxy-02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:04:53] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0] [23:05:19] RECOVERY - Puppet errors on tools-worker-1006 is OK: OK: Less than 1.00% above the threshold [0.0] [23:05:21] RECOVERY - Puppet errors on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:05:35] RECOVERY - Puppet errors on tools-webgrid-generic-1402 is OK: OK: Less than 1.00% above the threshold [0.0] [23:05:49] RECOVERY - Puppet errors on tools-exec-1431 is OK: OK: Less than 1.00% above the threshold [0.0] [23:05:57] RECOVERY - Puppet errors on tools-exec-1420 is OK: OK: Less than 1.00% above the threshold [0.0] [23:06:15] RECOVERY - Puppet errors on tools-exec-1422 is OK: OK: Less than 1.00% above the threshold [0.0] [23:06:59] RECOVERY - Puppet errors on tools-exec-1439 is OK: OK: Less than 1.00% above the threshold [0.0] [23:07:07] RECOVERY - Puppet errors on tools-worker-1016 is OK: OK: Less than 1.00% above the threshold [0.0] [23:07:09] RECOVERY - Puppet errors on tools-exec-1409 is OK: OK: Less than 1.00% above the threshold [0.0] [23:07:11] RECOVERY - Puppet errors on tools-package-builder-01 is OK: OK: Less than 1.00% above the threshold [0.0] [23:07:19] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1412 is OK: OK: Less than 1.00% above the threshold [0.0] [23:07:41] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1407 is OK: OK: Less than 1.00% above the threshold [0.0] [23:07:57] RECOVERY - Puppet errors on 
tools-webgrid-lighttpd-1418 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:15] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1402 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:19] RECOVERY - Puppet errors on tools-static-11 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:29] RECOVERY - Puppet errors on tools-bastion-05 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:32] RECOVERY - Puppet errors on tools-exec-1436 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:44] RECOVERY - Puppet errors on tools-exec-1406 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:44] RECOVERY - Puppet errors on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:48] RECOVERY - Puppet errors on tools-worker-1010 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:54] RECOVERY - Puppet errors on tools-worker-1019 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:00] RECOVERY - Puppet errors on tools-mail is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:02] RECOVERY - Puppet errors on tools-redis-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:14] RECOVERY - Puppet errors on tools-flannel-etcd-03 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:26] RECOVERY - Puppet errors on tools-worker-1015 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:52] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1409 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:54] RECOVERY - Puppet errors on tools-webgrid-generic-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:58] RECOVERY - Puppet errors on tools-exec-1428 is OK: OK: Less than 1.00% above the threshold [0.0] [23:10:05] RECOVERY - Puppet errors on tools-exec-1440 is OK: OK: Less than 1.00% above the threshold [0.0] [23:10:11] RECOVERY - Puppet errors on tools-elastic-02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:10:17] RECOVERY - Puppet errors 
on tools-webgrid-lighttpd-1406 is OK: OK: Less than 1.00% above the threshold [0.0] [23:10:27] RECOVERY - Puppet errors on tools-worker-1014 is OK: OK: Less than 1.00% above the threshold [0.0] [23:10:31] RECOVERY - Puppet errors on tools-webgrid-generic-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [23:12:25] RECOVERY - Puppet errors on tools-exec-1441 is OK: OK: Less than 1.00% above the threshold [0.0] [23:12:31] RECOVERY - Puppet errors on tools-exec-1417 is OK: OK: Less than 1.00% above the threshold [0.0] [23:12:57] RECOVERY - Puppet errors on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0] [23:13:32] RECOVERY - Puppet errors on tools-k8s-etcd-03 is OK: OK: Less than 1.00% above the threshold [0.0] [23:13:34] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1422 is OK: OK: Less than 1.00% above the threshold [0.0] [23:13:58] RECOVERY - Puppet errors on tools-worker-1009 is OK: OK: Less than 1.00% above the threshold [0.0] [23:14:34] such recovery, so recovered [23:14:40] RECOVERY - Puppet errors on tools-exec-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [23:23:27] PROBLEM - Puppet errors on tools-exec-1433 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [23:58:27] RECOVERY - Puppet errors on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]