[00:37:50] Labs-Kubernetes, Diffusion: Switch to sourcing kubernetes builds from phabricator instead of Gerrit - https://phabricator.wikimedia.org/T142448#2535464 (yuvipanda) [00:41:49] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [01:21:49] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0] [01:29:11] Labs, Labs-Infrastructure, LDAP: LDAP contains two extra incorrect host entries with aRecord=10.68.17.118, one with aRecord=10.68.22.5, and one with aRecord=10.68.16.120 - https://phabricator.wikimedia.org/T134025#2535524 (AlexMonk-WMF) [01:30:47] Labs, Labs-Infrastructure, LDAP: LDAP contains two extra incorrect host entries with aRecord=10.68.17.118, one with aRecord=10.68.22.5, and one with aRecord=10.68.16.120 - https://phabricator.wikimedia.org/T134025#2252921 (AlexMonk-WMF) ```root@shinken-01:/etc/shinken# ldapsearch -x aRecord=10.68.16.... [01:38:04] yuvipanda: do python wheels that are pure python work on any OS as long as it's the same python version? [01:38:28] Labs, LDAP: Many (approx 265) labs instances in LDAP with associatedDomain: i-00000[0-9a-f]{3}\.(\b.*\b)?eqiad\.wmflabs - https://phabricator.wikimedia.org/T142449#2535532 (AlexMonk-WMF) [01:38:29] legoktm I think so [02:11:47] Labs, Tool-Labs: Move all of tool labs to project puppetmaster - https://phabricator.wikimedia.org/T142452#2535590 (yuvipanda) [02:12:13] Labs, Tool-Labs: Move all of tool labs to project puppetmaster - https://phabricator.wikimedia.org/T142452#2535603 (yuvipanda) @andrew @chasemp @valhallasw @scfc any objections to this? [02:15:43] project puppetmasters getting the DB thing yuvipanda? [02:16:10] krenair yeah, I can help figure that out after. 
[02:16:19] nice [02:16:27] krenair ideally by running it with the 'puppetmaster' module rather than the self hosted puppet stuff [02:25:30] yuvipanda, so we'd then stop using the self hosted puppet stuff in labs? [02:26:16] krenair so in my mind, things that are per-project puppetmasters (deployment-prep, tools, integration) should not be using the self hosted puppet [02:26:23] self hosted puppet should be for single node things for testing [02:26:40] okay [02:26:41] how difficult is it to migrate? [02:27:28] depends, I guess. [02:27:37] I assume with either puppetmaster or puppet::self you need to put in some work to get collected resources. Is it much less work with puppetmaster? [02:27:46] I mean, prod already has it [02:27:55] so I suppose using it would give it to us 'for free' in some sense [02:28:17] well [02:28:55] but I'm mostly conjecturing now :) [02:29:15] like role::labs::puppetmaster ? [02:32:01] I hate the 'is_labs_master' thing there [02:32:01] grrr [02:37:48] so that's to have different /etc/puppet/auth.conf files, certcleaner, and a different private repository [02:38:22] krenair yeah, those individual things should've been configurable [02:38:27] use_auth [02:38:29] cert cleaner should be in the role [02:38:41] I'm now reviewing the auth.conf diff [02:39:01] ... It's not very big [02:39:45] prod has "Temporary allow rhodium to compile all the catalogs while testing", labs has lines to allow Horizon to ask about available roles [02:40:08] Shouldn't this be one single template, maybe with a couple of if statements? [02:40:49] I assume labs' require_package('ruby-httpclient') is for the mwyaml hiera backend? 
[03:10:07] !log puppet3-diffs added Krenair to project [03:10:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Puppet3-diffs/SAL, Master [06:40:01] PROBLEM - Puppet run on tools-mail-01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [06:58:18] PROBLEM - Puppet run on tools-webgrid-lighttpd-1411 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [07:20:02] RECOVERY - Puppet run on tools-mail-01 is OK: OK: Less than 1.00% above the threshold [0.0] [07:33:17] RECOVERY - Puppet run on tools-webgrid-lighttpd-1411 is OK: OK: Less than 1.00% above the threshold [0.0] [07:39:55] !log tools.heritage Deployed latest from Git, 768b3ac, 30e33ca, 8d7de41 (T141505) [07:39:57] T141505: ErfogedBot categorisation of Kosovo pictures is wrong - https://phabricator.wikimedia.org/T141505 [07:39:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL, Master [08:22:06] PROBLEM - SSH on tools-webgrid-lighttpd-1202 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:25:21] (PS1) Elukey: Add fake aqs Cassandra user's password [labs/private] - https://gerrit.wikimedia.org/r/303772 (https://phabricator.wikimedia.org/T142073) [09:25:59] (CR) Elukey: [C: 2 V: 2] Add fake aqs Cassandra user's password [labs/private] - https://gerrit.wikimedia.org/r/303772 (https://phabricator.wikimedia.org/T142073) (owner: Elukey) [10:40:10] PROBLEM - Puppet staleness on tools-proxy-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [43200.0] [11:56:58] RECOVERY - SSH on tools-webgrid-lighttpd-1202 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2~wmfprecise2 (protocol 2.0) [12:04:19] Labs, Labs-Infrastructure: Create a new labs flavor available to all project: largedisk - https://phabricator.wikimedia.org/T142166#2525722 (valhallasw) >>! 
In T142166#2535220, @yuvipanda wrote: > Proposed naming convention: > > cX.mY.sZ, > > where X is number of CPU cores, Y is GB of RAM, Z is GB of S... [13:21:56] Labs, Tool-Labs: s51059 is doing inappropiate queries - https://phabricator.wikimedia.org/T142475#2536412 (jcrespo) [13:32:22] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536470 (jcrespo) [13:34:55] Labs, Tool-Labs: s51704 had multiple long-running (~1 hour) concurrent queries before labsdb crashed - https://phabricator.wikimedia.org/T142358#2536475 (jcrespo) [13:35:00] Labs, Tool-Labs, Discovery, Maps: p50380g50921 has 20+ open persistent connections to labsdb1001 & labsdb1003 - https://phabricator.wikimedia.org/T142356#2536476 (jcrespo) [13:35:03] Labs, Tool-Labs, DBA, Tracking: Certain tools users create multiple long running queries that take all memory from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601#2536473 (jcrespo) [14:15:41] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536527 (valhallasw) @Cyberpower678: s51059 is tools.cyberbot, of which you are the maintainer. [14:28:07] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536571 (Cyberpower678) Oh. It's better to not use the DB username when mentioning problems otherwise, I'll think I was subscribed by mistake. That aside I haven't done anything recently with the ext... [14:38:06] Hey Tool Labs, how do I stop a cron job (from tools) from sending me an email every hour? [14:38:25] does the cron job have an output? [14:39:33] jan_drewniak: well option 1 is stop the cron job, option 2 is redirect output from it to null [14:40:19] Hmm, I was just added to the Tools project. 
I think the job is `/usr/bin/jsub -N cron-tools.pagecounts-1 -once -l release=trusty -mem 500m update.sh`, which doesn't look like it has any output [14:42:08] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536579 (jcrespo) So can I block that user account? [14:42:12] jan_drewniak: what is in the email it sends? [14:42:35] chasemp: "Your job 9632258 ("cron-tools.pagecounts-1") has been submitted" [14:43:34] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536584 (Cyberpower678) I'm not sure. I can never remember my DB username, since it's just a bunch of numbers. [14:44:40] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536585 (Cyberpower678) So I am indeed s51059, and IABot is actively using that account, but not the external links table. [14:45:44] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536589 (jcrespo) My fear is, if it is not using that table, but queries are being sent with it, maybe it is compromised? [14:47:16] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536590 (jcrespo) Queries are being received from 10.68.17.155:46093 [14:48:19] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536412 (Platonides) More likely, it uses some library function that performs such queries internally. [14:48:24] ok jan_drewniak I see you are the only maintainer here and it's a simple cron job (which runs on our cron runners) I'll change this to redirect stdout to null [14:48:28] no worries [14:48:29] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536595 (Cyberpower678) Oh wait. 
I do have a bot script that I commissioned over 2 years ago that does use the external links table. But that bot has been in continuous operation since I commissioned... [14:49:05] jan_drewniak: let me know if that doesn't solve it [14:50:12] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536611 (jcrespo) Well, right now, I have to go over all tools doing things incorrectly. I suggest either stop it or rewrite it to page using indexes as mentioned it above. [14:50:51] chasemp: I'm sure it will. Thanks for editing that for me! Out of curiosity (I've just joined tools last week) how would I edit that cron job myself? [14:51:14] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536626 (Cyberpower678) Hmm... I seem to have another problem. My Wikitech password is being rejected. It's a saved password in my vault. [14:51:40] jan_drewniak: ssh in to a bastion as you, become pagecounts, crontab -e [14:51:49] we have trickery that edits it in the right place regardless of which bastion you are on [14:52:34] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536629 (valhallasw) That's cyberbot-exec-01.cyberbot.eqiad.wmflabs, which suggests that these are legitimate queries. A compromise does not sound likely (why would one run externallinks queries?). [14:52:35] ok great, thanks! [14:53:30] didn't know you had to "become" the project [14:53:32] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536630 (Cyberpower678) Indeed it is, and quickly fixed my password problem, hooray for backups. 
:p [14:54:42] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536631 (Cyberpower678) To query as you suggested would require a complete rewrite of the bot script, so the only thing I can do right now is disable it. [14:57:05] chasemp hi, mutante created https://phabricator.wikimedia.org/T142440 [14:57:57] Labs: Request increased quota (floating-IP) for git labs project - https://phabricator.wikimedia.org/T142440#2536633 (chasemp) Is this a long lived thing or a testing phase that we could remove in 90 days or something? [14:58:00] commented [15:00:26] chasemp thanks [15:00:47] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536636 (Cyberpower678) I've disabled the affected scripts. [15:01:25] Labs: Request increased quota (floating-IP) for git labs project - https://phabricator.wikimedia.org/T142440#2536639 (Paladox) @chasemp hi, I think this is a long thing until we move all code reviewing to differential. Since we will be using this to test changes, before they are deployed to production gerrit... [15:02:25] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536640 (jcrespo) I do not think a full rewrite would be needed- a limit seems to be kept on each query. Change that to a PK limit (it should not be more than a few lines of code changed). It is ok to... [15:06:53] Labs, Tool-Labs, DBA, Tracking: Certain tools users create multiple long running queries that take all memory from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601#2536650 (jcrespo) [15:06:55] Labs, Tool-Labs: s51059 is doing unnecessarily slow queries - https://phabricator.wikimedia.org/T142475#2536647 (jcrespo) Open>Resolved a:jcrespo @Cyberpower678 Thank you for your quick response! 
Remember that there was no problem with using the database in the way you did it, I am just conta... [15:11:32] valhallasw`vecto <----is that a special name for some reason? [15:19:36] chasemp i have commented too :) [15:22:08] Labs, Tool-Labs: u3532 is exeuting highly-intensive, innefficient long-running queries on at least labsdb1003, potentially hurting the stability of the system - https://phabricator.wikimedia.org/T142482#2536689 (jcrespo) [15:32:09] Labs, Labs-Infrastructure: Set up some sort of web pages at wmflabs.org or www.wmflabs.org - https://phabricator.wikimedia.org/T38885#418242 (chasemp) >>! In T38885#2530802, @AlexMonk-WMF wrote: > @Andrew, @YuviPanda: How about I CNAME these domains to proxy-eqiad.wmflabs.org (novaproxy-01 in project-pro... [15:33:52] Labs, Tool-Labs: u3532 is exeuting highly-intensive, innefficient long-running queries on at least labsdb1003, potentially hurting the stability of the system - https://phabricator.wikimedia.org/T142482#2536735 (jcrespo) After limiting concurrency to 3, resource usage has gone down a 50-75% (on a 24 core... [15:34:59] Labs, Tool-Labs: u3532 is executing several concurrent, highly-intensive, innefficient long-running queries on at least labsdb1003, potentially hurting the stability of the system - https://phabricator.wikimedia.org/T142482#2536736 (jcrespo) [15:44:52] Labs, Tool-Labs: u3532 is executing several concurrent, highly-intensive, innefficient long-running queries on at least labsdb1003, potentially hurting the stability of the system - https://phabricator.wikimedia.org/T142482#2536809 (jcrespo) [15:44:54] Labs, Tool-Labs, DBA, Tracking: Certain tools users create multiple long running queries that take all memory from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601#2536810 (jcrespo) [15:53:26] jynus: this account is connected via vector, the other one (`cloud) via irccloud. 
I could also just call them valhallasw and valhallasw`, but that's boring ;-) [15:54:49] I do not even know what that is :-? [15:54:54] vector? [15:55:33] valhallasw`vecto, I was pinging you because you probably are aware of the latest issues with labsdb [15:56:03] I was notifying some users with inefficient queries to try to convince them to do them differently [16:02:19] Labs, Tool-Labs: u3532 is executing several concurrent, highly-intensive, innefficient long-running queries on at least labsdb1003, potentially hurting the stability of the system - https://phabricator.wikimedia.org/T142482#2536899 (marcmiquel) I just fixed the problem and now I'm going to code a workrou... [16:03:53] Labs, Tool-Labs, DBA, Tracking: Certain tools users create multiple long running queries that take all memory from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601#2536903 (jcrespo) [16:03:56] Labs, Tool-Labs: u3532 is executing several concurrent, highly-intensive, innefficient long-running queries on at least labsdb1003, potentially hurting the stability of the system - https://phabricator.wikimedia.org/T142482#2536900 (jcrespo) Open>Resolved a:jcrespo I've talked to the user on... [16:04:47] Vector.im [16:05:19] Jynus: yes, thank you for that [16:06:05] do you know any trusted user that would be willing to test the new labsdb servers (5x faster)? [16:06:19] Labs, Patch-For-Review: promethium.wikitextexp.eqiad.wmflabs (10.68.16.2, labs baremetal host) has strange DNS A record result, and missing PTR - https://phabricator.wikimedia.org/T139438#2536905 (AlexMonk-WMF) Much better, though not perfect: ```; <<>> DiG 9.9.5-8-Debian <<>> promethium.wikitextexp.eqia... [16:29:06] Jynus: good question - I'm not sure. 
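[Editor's sketch] The advice given on T142475 earlier in this log ("rewrite it to page using indexes", "Change that to a PK limit") can be illustrated roughly as follows. This is only a sketch of primary-key (keyset) paging, not the bot's actual code: the two-column `externallinks` table, the `el_id`/`el_to` column names, the in-memory SQLite database and both function names are assumptions made so the example is self-contained. The real replicas are not SQLite.

```python
import sqlite3


def iter_rows_by_pk(conn, batch_size=1000):
    """Yield (el_id, el_to) rows in primary-key order, one bounded batch at a time.

    Each query resumes from the last key seen ("WHERE el_id > ?") instead of
    using a growing OFFSET, so every batch is a short index range scan.
    Assumes ids are positive integers (hypothetical schema).
    """
    last_id = 0
    while True:
        batch = conn.execute(
            "SELECT el_id, el_to FROM externallinks "
            "WHERE el_id > ? ORDER BY el_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not batch:
            return
        yield from batch
        last_id = batch[-1][0]  # resume after the last primary key returned


def make_demo_db(n_rows):
    """Build a throwaway in-memory table purely for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE externallinks (el_id INTEGER PRIMARY KEY, el_to TEXT)"
    )
    conn.executemany(
        "INSERT INTO externallinks (el_id, el_to) VALUES (?, ?)",
        [(i, f"http://example.org/{i}") for i in range(1, n_rows + 1)],
    )
    return conn
```

A caller would loop over `iter_rows_by_pk(conn, batch_size=...)` and process rows as they arrive; no single query then has to scan or hold more than one batch.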
[16:40:14] * tom29739 would be willing, but is not sure whether he's "trusted" [16:45:39] Labs: Request increased quota (floating-IP) for git labs project - https://phabricator.wikimedia.org/T142440#2537020 (Dzahn) Yea, would be nice to keep it as long as we use Gerrit. By the way, 5 IPs have just been released recently from the semi-related staging project, we'd just want one of these. [16:46:55] jynus: Magnus might be a good person to talk to about some testing. He's certainly got plenty of different tools with various access patterns to try out. [16:46:57] Jynus: it also depends on what you want tested. Quarry might be an option as it kills queries that take too long [16:47:08] And only runs a few in parallel [16:47:42] it has to be a tool that only reads, doesn't write [16:48:16] but maybe something that can be changed back to the old servers if something is wrong [16:58:28] PROBLEM - Puppet run on tools-exec-1211 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [17:02:24] PROBLEM - Puppet run on tools-exec-1202 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:02:41] PROBLEM - Puppet run on tools-exec-1402 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:02:41] PROBLEM - Puppet run on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:03:05] PROBLEM - Puppet run on tools-webgrid-generic-1404 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:03:23] PROBLEM - Puppet run on tools-exec-1403 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:03:37] PROBLEM - Puppet run on tools-exec-1213 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:04:19] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/VIGNERON was created, changed by VIGNERON link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/VIGNERON edit 
summary: Created page with "{{Tools Access Request |Justification=Formatting and visualisation around Wikidata |Completed=false |User Name=VIGNERON }}" [17:05:07] PROBLEM - Puppet run on tools-worker-1025 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:05:13] PROBLEM - Puppet run on tools-exec-1221 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:05:27] PROBLEM - Puppet run on tools-flannel-etcd-03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:05:45] PROBLEM - Puppet run on tools-merlbot-proxy is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [0.0] [17:05:53] PROBLEM - Puppet run on tools-elastic-03 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:06:09] PROBLEM - Puppet run on tools-exec-1206 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [17:06:47] PROBLEM - Puppet run on tools-webgrid-lighttpd-1405 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:07:13] PROBLEM - Puppet run on tools-exec-1407 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:07:15] PROBLEM - Puppet run on tools-flannel-etcd-01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:07:16] PROBLEM - Puppet run on tools-precise-dev is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [17:07:22] PROBLEM - Puppet run on tools-webgrid-lighttpd-1206 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:07:24] PROBLEM - Puppet run on tools-exec-1210 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:07:26] PROBLEM - Puppet run on tools-worker-1007 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:07:30] PROBLEM - Puppet run on tools-exec-1401 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:07:50] PROBLEM - Puppet run on tools-services-02 is 
CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:07:53] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:08:33] PROBLEM - Puppet run on tools-worker-1023 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:08:33] PROBLEM - Puppet run on tools-webgrid-lighttpd-1403 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:08:39] PROBLEM - Puppet run on tools-checker-02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [17:10:03] PROBLEM - Puppet run on tools-redis-1002 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [17:10:29] PROBLEM - Puppet run on tools-worker-1002 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:11:03] PROBLEM - Puppet run on tools-mail-01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:12:01] PROBLEM - Puppet run on tools-exec-1201 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [17:12:09] PROBLEM - Puppet run on tools-webgrid-generic-1405 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:12:15] PROBLEM - Puppet run on tools-exec-1410 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:12:25] PROBLEM - Puppet run on tools-exec-1214 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:12:30] PROBLEM - Puppet run on tools-prometheus-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:12:36] PROBLEM - Puppet run on tools-exec-1217 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:12:36] PROBLEM - Puppet run on tools-worker-1013 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:12:38] PROBLEM - Puppet run on tools-exec-1406 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:12:44] PROBLEM - Puppet run on 
tools-worker-1001 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:13:00] PROBLEM - Puppet run on tools-exec-1409 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:13:28] PROBLEM - Puppet run on tools-worker-1017 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [17:13:34] PROBLEM - Puppet run on tools-webgrid-lighttpd-1412 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:15:08] PROBLEM - Puppet run on tools-exec-1216 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:15:40] PROBLEM - Puppet run on tools-web-static-01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:15:42] PROBLEM - Puppet run on tools-k8s-master-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:15:42] PROBLEM - Puppet run on tools-bastion-02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [17:15:44] PROBLEM - Puppet run on tools-worker-1018 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [17:15:50] PROBLEM - Puppet run on tools-mail is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:15:51] PROBLEM - Puppet run on tools-elastic-01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [17:16:39] PROBLEM - Puppet run on tools-worker-1009 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [17:16:47] PROBLEM - Puppet run on tools-worker-1012 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [17:16:50] PROBLEM - Puppet run on tools-exec-1208 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:16:53] PROBLEM - Puppet run on tools-webgrid-lighttpd-1205 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:17:04] PROBLEM - Puppet run on tools-webgrid-lighttpd-1409 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] 
[17:17:05] PROBLEM - Puppet run on tools-webgrid-lighttpd-1408 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:17:12] PROBLEM - Puppet run on tools-worker-1016 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:17:14] PROBLEM - Puppet run on tools-exec-gift is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:17:14] PROBLEM - Puppet run on tools-exec-1212 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:17:16] PROBLEM - Puppet run on tools-k8s-etcd-02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:17:20] PROBLEM - Puppet run on tools-webgrid-lighttpd-1410 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [17:17:22] PROBLEM - Puppet run on tools-webgrid-lighttpd-1210 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [17:17:24] PROBLEM - Puppet run on tools-webgrid-lighttpd-1209 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:17:40] PROBLEM - Puppet run on tools-exec-cyberbot is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:18:02] PROBLEM - Puppet run on tools-worker-1011 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:18:14] PROBLEM - Puppet run on tools-logs-01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:18:20] PROBLEM - Puppet run on tools-flannel-etcd-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [17:18:28] PROBLEM - Puppet run on tools-bastion-03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:18:32] PROBLEM - Puppet run on tools-k8s-etcd-01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [17:18:58] PROBLEM - Puppet run on tools-webgrid-generic-1402 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:19:04] PROBLEM - Puppet run on tools-webgrid-lighttpd-1415 
is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:19:22] PROBLEM - Puppet run on tools-grid-shadow is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:19:30] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:21:01] PROBLEM - Puppet run on tools-worker-1020 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:21:24] PROBLEM - Puppet run on tools-cron-01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:21:43] PROBLEM - Puppet run on tools-worker-1021 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:21:53] PROBLEM - Puppet run on tools-worker-1006 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:22:21] PROBLEM - Puppet run on tools-worker-1008 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:22:39] PROBLEM - Puppet run on tools-web-static-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [17:22:43] PROBLEM - Puppet run on tools-exec-1209 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [17:22:46] PROBLEM - Puppet run on tools-worker-1010 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:23:02] PROBLEM - Puppet run on tools-elastic-02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:25:31] There seems to be a puppet failure [17:25:45] yuvipanda chasemp andrewbogott ^^ [17:25:59] paladox: yes, it's fixed [17:26:03] Oh [17:26:06] thanks [17:26:33] andrewbogott could you also comment on https://phabricator.wikimedia.org/T142440 please? 
[17:26:55] paladox: looks like chase has responded already [17:27:05] Oh yep [17:37:39] RECOVERY - Puppet run on tools-webgrid-lighttpd-1413 is OK: OK: Less than 1.00% above the threshold [0.0] [17:38:03] RECOVERY - Puppet run on tools-webgrid-generic-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [17:38:27] RECOVERY - Puppet run on tools-exec-1211 is OK: OK: Less than 1.00% above the threshold [0.0] [17:39:31] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0] [17:39:39] any estimate when labs will be able to create instances/security groups again? [17:40:27] RECOVERY - Puppet run on tools-flannel-etcd-03 is OK: OK: Less than 1.00% above the threshold [0.0] [17:40:46] RECOVERY - Puppet run on tools-merlbot-proxy is OK: OK: Less than 1.00% above the threshold [0.0] [17:41:24] andrewbogott ^^ [17:41:43] the comment above the puppet recovery messages [17:41:47] SMalyshev: later today, ideally [17:41:54] :) [17:41:58] andrewbogott: cool, thanks :) [17:42:22] RECOVERY - Puppet run on tools-exec-1202 is OK: OK: Less than 1.00% above the threshold [0.0] [17:42:30] RECOVERY - Puppet run on tools-exec-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [17:42:40] RECOVERY - Puppet run on tools-exec-1402 is OK: OK: Less than 1.00% above the threshold [0.0] [17:43:24] RECOVERY - Puppet run on tools-exec-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [17:43:36] RECOVERY - Puppet run on tools-exec-1213 is OK: OK: Less than 1.00% above the threshold [0.0] [17:45:08] RECOVERY - Puppet run on tools-worker-1025 is OK: OK: Less than 1.00% above the threshold [0.0] [17:45:10] RECOVERY - Puppet run on tools-exec-1221 is OK: OK: Less than 1.00% above the threshold [0.0] [17:45:54] RECOVERY - Puppet run on tools-elastic-03 is OK: OK: Less than 1.00% above the threshold [0.0] [17:46:02] RECOVERY - Puppet run on tools-mail-01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:46:12] RECOVERY - Puppet run on 
tools-exec-1206 is OK: OK: Less than 1.00% above the threshold [0.0] [17:46:48] RECOVERY - Puppet run on tools-webgrid-lighttpd-1405 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:14] RECOVERY - Puppet run on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:15] RECOVERY - Puppet run on tools-flannel-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:17] RECOVERY - Puppet run on tools-precise-dev is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:23] RECOVERY - Puppet run on tools-webgrid-lighttpd-1206 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:25] RECOVERY - Puppet run on tools-exec-1210 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:29] RECOVERY - Puppet run on tools-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:31] RECOVERY - Puppet run on tools-prometheus-01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:37] RECOVERY - Puppet run on tools-worker-1013 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:37] RECOVERY - Puppet run on tools-exec-1406 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:49] RECOVERY - Puppet run on tools-worker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [17:48:33] RECOVERY - Puppet run on tools-worker-1023 is OK: OK: Less than 1.00% above the threshold [0.0] [17:48:33] RECOVERY - Puppet run on tools-webgrid-lighttpd-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [17:48:39] RECOVERY - Puppet run on tools-checker-02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:50:03] RECOVERY - Puppet run on tools-redis-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [17:50:29] RECOVERY - Puppet run on tools-worker-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [17:51:51] RECOVERY - Puppet run on tools-webgrid-lighttpd-1205 is OK: OK: Less than 1.00% above the threshold [0.0] [17:51:53] RECOVERY - Puppet run on tools-exec-1208 is OK: OK: Less than 
1.00% above the threshold [0.0] [17:52:02] RECOVERY - Puppet run on tools-exec-1201 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:04] RECOVERY - Puppet run on tools-webgrid-lighttpd-1409 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:11] RECOVERY - Puppet run on tools-webgrid-generic-1405 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:12] RECOVERY - Puppet run on tools-worker-1016 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:14] RECOVERY - Puppet run on tools-exec-gift is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:14] RECOVERY - Puppet run on tools-exec-1212 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:15] at least the bot doesn't get klined this time... [17:52:16] RECOVERY - Puppet run on tools-exec-1410 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:22] RECOVERY - Puppet run on tools-webgrid-lighttpd-1210 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:24] RECOVERY - Puppet run on tools-webgrid-lighttpd-1209 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:28] RECOVERY - Puppet run on tools-exec-1214 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:36] RECOVERY - Puppet run on tools-exec-1217 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:52] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:53:00] RECOVERY - Puppet run on tools-exec-1409 is OK: OK: Less than 1.00% above the threshold [0.0] [17:53:14] RECOVERY - Puppet run on tools-logs-01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:53:20] RECOVERY - Puppet run on tools-flannel-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:53:30] RECOVERY - Puppet run on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0] [17:53:34] RECOVERY - Puppet run on tools-webgrid-lighttpd-1412 is OK: OK: Less than 1.00% above the threshold [0.0] [17:53:34] RECOVERY - Puppet run on tools-k8s-etcd-01
is OK: OK: Less than 1.00% above the threshold [0.0] [17:55:08] RECOVERY - Puppet run on tools-exec-1216 is OK: OK: Less than 1.00% above the threshold [0.0] [17:55:40] RECOVERY - Puppet run on tools-web-static-01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:55:42] RECOVERY - Puppet run on tools-bastion-02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:55:43] RECOVERY - Puppet run on tools-k8s-master-01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:55:43] RECOVERY - Puppet run on tools-worker-1018 is OK: OK: Less than 1.00% above the threshold [0.0] [17:55:51] RECOVERY - Puppet run on tools-elastic-01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:55:51] RECOVERY - Puppet run on tools-mail is OK: OK: Less than 1.00% above the threshold [0.0] [17:56:01] RECOVERY - Puppet run on tools-worker-1020 is OK: OK: Less than 1.00% above the threshold [0.0] [17:56:25] RECOVERY - Puppet run on tools-cron-01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:56:37] RECOVERY - Puppet run on tools-worker-1009 is OK: OK: Less than 1.00% above the threshold [0.0] [17:56:47] RECOVERY - Puppet run on tools-worker-1012 is OK: OK: Less than 1.00% above the threshold [0.0] [17:56:55] RECOVERY - Puppet run on tools-worker-1006 is OK: OK: Less than 1.00% above the threshold [0.0] [17:57:07] RECOVERY - Puppet run on tools-webgrid-lighttpd-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [17:57:18] RECOVERY - Puppet run on tools-k8s-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:57:22] RECOVERY - Puppet run on tools-webgrid-lighttpd-1410 is OK: OK: Less than 1.00% above the threshold [0.0] [17:57:38] RECOVERY - Puppet run on tools-web-static-02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:57:41] RECOVERY - Puppet run on tools-exec-cyberbot is OK: OK: Less than 1.00% above the threshold [0.0] [17:57:59] RECOVERY - Puppet run on tools-worker-1011 is OK: OK: Less than 1.00% above the threshold [0.0] 
[17:58:03] RECOVERY - Puppet run on tools-elastic-02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:58:27] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0] [17:59:00] RECOVERY - Puppet run on tools-webgrid-generic-1402 is OK: OK: Less than 1.00% above the threshold [0.0] [17:59:06] RECOVERY - Puppet run on tools-webgrid-lighttpd-1415 is OK: OK: Less than 1.00% above the threshold [0.0] [17:59:23] RECOVERY - Puppet run on tools-grid-shadow is OK: OK: Less than 1.00% above the threshold [0.0] [18:01:40] RECOVERY - Puppet run on tools-worker-1021 is OK: OK: Less than 1.00% above the threshold [0.0] [18:02:20] RECOVERY - Puppet run on tools-worker-1008 is OK: OK: Less than 1.00% above the threshold [0.0] [18:02:42] RECOVERY - Puppet run on tools-exec-1209 is OK: OK: Less than 1.00% above the threshold [0.0] [18:02:48] RECOVERY - Puppet run on tools-worker-1010 is OK: OK: Less than 1.00% above the threshold [0.0] [18:18:18] 06Labs, 10Labs-Infrastructure: Set up some sort of web pages at wmflabs.org or www.wmflabs.org - https://phabricator.wikimedia.org/T38885#2537453 (10Andrew) works for me! 
[18:22:52] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:26:48] PROBLEM - Puppet run on tools-exec-1408 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [18:26:48] PROBLEM - Puppet run on tools-exec-1405 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:27:12] PROBLEM - Puppet run on tools-webgrid-lighttpd-1414 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:27:23] PROBLEM - Puppet run on tools-logs-02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:28:03] PROBLEM - Puppet run on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [18:28:09] o.O [18:28:17] PROBLEM - Puppet run on tools-webgrid-lighttpd-1406 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [18:29:27] PROBLEM - Puppet run on tools-exec-1404 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:30:08] andrewbogott: you again I'm guessing? ^ :) maybe we should silence this for today? 
[18:30:31] yeah, probably worth silencing [18:30:31] I don't know how, off the top of my head [18:30:42] let me do it [18:30:49] I just kill the daemon on shinken-01 [18:31:44] PROBLEM - Puppet run on tools-merlbot-proxy is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0] [18:32:38] PROBLEM - Puppet run on tools-webgrid-generic-1403 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [18:33:40] PROBLEM - Puppet run on tools-exec-1402 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:33:40] PROBLEM - Puppet run on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [18:34:04] PROBLEM - Puppet run on tools-webgrid-generic-1404 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [18:57:02] 10Tool-Labs-tools-Xtools: Xtools API hits error and returns 'maintenance' - https://phabricator.wikimedia.org/T136482#2336643 (10Alfa80) I [[ https://meta.wikimedia.org/wiki/User:Alfa80/XTools/XTools.js | could see that ]] removing the '**&db=**' parameter and replacing it with the project name in **XTools.js**... [19:16:08] 06Labs: Request increased quota (floating-IP) for git labs project - https://phabricator.wikimedia.org/T142440#2537645 (10Paladox) @chasemp would you be able to do this please? [19:27:05] paladox: read the parent task for ^ [19:27:33] ok [19:28:09] Specifically: Requests are processed by the Labs team during the Labs team meeting every Monday (8:30 AM PST) that the meeting is held. This schedule may be affected by holidays, conferences, or other unavailability. Requests can be granted when approved by a quorum of at least two labs team members. [19:28:37] yep and oh. [19:28:39] thanks [19:31:18] 10Tool-Labs-tools-Other: Croptool does not work on php 5.4 - https://phabricator.wikimedia.org/T103059#2537735 (10Danmichaelo) I removed the dependency on mcrypt, so it now works.
[19:31:35] 10Tool-Labs-tools-Other: Croptool does not work on php 5.4 - https://phabricator.wikimedia.org/T103059#2537736 (10Danmichaelo) 05Open>03Resolved [19:35:13] 06Labs, 10Labs-Team-Backlog, 10Tool-Labs: Tool Labs: Enable php5-mcrypt on Trusty - https://phabricator.wikimedia.org/T97857#2537753 (10Danmichaelo) I've learned that mcrypt is abandonware and not really recommended to use, so I've removed the dependency on it in CropTool. If no other tools need it, I guess... [19:35:57] 10Tool-Labs-tools-Other: Croptool does not work on php 5.4 - https://phabricator.wikimedia.org/T103059#2537762 (10Danmichaelo) [19:35:59] 06Labs, 10Tool-Labs, 07Tracking: Packages to be added to toollabs puppet - https://phabricator.wikimedia.org/T55704#2537763 (10Danmichaelo) [19:36:01] 06Labs, 10Labs-Team-Backlog, 10Tool-Labs: Tool Labs: Enable php5-mcrypt on Trusty - https://phabricator.wikimedia.org/T97857#2537761 (10Danmichaelo) 05Open>03declined [19:36:25] danmichaelo: thanks for the update! [19:55:48] (03PS1) 10Awight: Fix extension project path [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/303858 [20:03:37] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/VIGNERON was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=816817 edit summary: [20:24:31] 06Labs: Request increased quota (floating-IP) for git labs project - https://phabricator.wikimedia.org/T142440#2538047 (10Andrew) One IP is definitely fine. 
[20:29:51] 06Labs, 07Tracking: Existing Labs project quota increase requests (Tracking) - https://phabricator.wikimedia.org/T140904#2538072 (10chasemp) [20:29:53] 06Labs: Request increased quota (floating-IP) for git labs project - https://phabricator.wikimedia.org/T142440#2538069 (10chasemp) 05Open>03Resolved a:03chasemp done [20:37:39] 06Labs, 10Tool-Labs: Move all of tool labs to project puppetmaster - https://phabricator.wikimedia.org/T142452#2538091 (10chasemp) I completely think we should consolidate Tools on one puppet master, and I'm good with the in-project master. [20:41:18] 06Labs, 10Labs-Infrastructure: Default source group allowances do not work post Liberty upgrade - https://phabricator.wikimedia.org/T142165#2538114 (10Andrew) The 'no firewalling' issue is now resolved, thanks to kernel downgrades. The actual issue in question is probably this: $ git describe --contains 3466... [20:42:31] 06Labs, 10Labs-Infrastructure: Default source group allowances do not work post Liberty upgrade - https://phabricator.wikimedia.org/T142165#2538118 (10Andrew) Today we tried to replace the normal source group rules in the 'default' service group for tools. Weirdly, the iptables rules were not applied on the l... [20:42:33] 06Labs: Request increased quota (floating-IP) for git labs project - https://phabricator.wikimedia.org/T142440#2538119 (10Paladox) @chasemp and @Andrew thanks. 
[20:58:31] !log git re enabling puppet on gerrit-test3 [20:58:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL, Master [21:17:42] 06Labs: Request increased quota (floating-IP) for git labs project - https://phabricator.wikimedia.org/T142440#2538263 (10Dzahn) [21:20:15] 06Labs, 06Operations: Enable root passwords on Labs VMs - https://phabricator.wikimedia.org/T142216#2538272 (10Andrew) 05Open>03Resolved [21:27:06] 06Labs, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538312 (10Dzahn) [21:27:30] 06Labs, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538312 (10Dzahn) [21:28:08] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538327 (10Dzahn) [21:30:15] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538329 (10Paladox) @chasemp would you be able to do this please? [21:35:47] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538352 (10chasemp) I'm not sure what to do and you guys may be the first to request, but I think you are looking at `gerrit-01.git.wmflabs.org` these days fyi. @... [21:37:59] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538360 (10Paladox) @chasemp I believe gerrit-01.wmflabs.org will work since the phabricator project has phabricator.wmflabs.org. But I'm guessing I may be wrong. [21:40:39] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538365 (10Paladox) Actually gerrit.git.wmflabs.org will do please :).
[21:43:38] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538366 (10Dzahn) Thanks @chasemp,. Yea, .git.wmflabs.org is totally fine of course. Actually we'd go with "gerrit.git.wmflabs.org" then. No number needed. We'll ju... [22:04:37] 06Labs, 06Operations: Enable root passwords on Labs VMs - https://phabricator.wikimedia.org/T142216#2538411 (10Andrew) 05Resolved>03Open [22:04:53] 06Labs, 06Operations: Enable root passwords on Labs VMs - https://phabricator.wikimedia.org/T142216#2527493 (10Andrew) [22:05:57] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538416 (10AlexMonk-WMF) As this is in the `git` project you should be able to get ownership in designate of a `git.wmflabs.org` domain here by asking, and then crea... [22:06:01] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538419 (10Dzahn) I just learned more about this after talking to Krenair. So.. it seems what we really want is just "add subdomain git.wmflabs.org to project git"... [22:08:00] !log ores deployed ores-wmflabs-config:1b1c56d [22:08:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master [22:08:30] !log ores restarted precached service on ores-web-03 [22:08:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master [22:09:05] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538425 (10Paladox) @Krenair or @AlexMonk-WMF I have deleted it now. git.wmflabs.org should now be available. 
[22:09:26] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538428 (10Andrew) I created git.wmflabs.org domain in project 'git' [22:11:07] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538432 (10Paladox) @Andrew thanks :) [22:18:55] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538479 (10AlexMonk-WMF) 05Open>03Resolved a:03Andrew @Dzahn says it shows up for him, calling this resolved. [22:19:20] 06Labs, 10Labs-Infrastructure, 10Gerrit: please associate gerrit-01.wmflabs.org with 208.80.155.149 - https://phabricator.wikimedia.org/T142528#2538485 (10Dzahn) I see it in Horizon -> DNS now and can control the records. thanks! [22:21:41] RECOVERY - Puppet run on tools-k8s-master-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:22:35] RECOVERY - Puppet run on tools-worker-1009 is OK: OK: Less than 1.00% above the threshold [0.0] [22:23:11] RECOVERY - Puppet run on tools-worker-1016 is OK: OK: Less than 1.00% above the threshold [0.0] [22:23:15] RECOVERY - Puppet run on tools-k8s-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:23:59] RECOVERY - Puppet run on tools-worker-1011 is OK: OK: Less than 1.00% above the threshold [0.0] [22:24:17] RECOVERY - Puppet run on tools-flannel-etcd-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:24:25] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:24:30] RECOVERY - Puppet run on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0] [22:24:32] RECOVERY - Puppet run on tools-k8s-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:26:42] RECOVERY - Puppet run on tools-bastion-02 is OK: OK: Less than 1.00% above the threshold [0.0] 
[22:26:44] RECOVERY - Puppet run on tools-worker-1018 is OK: OK: Less than 1.00% above the threshold [0.0] [22:26:54] RECOVERY - Puppet run on tools-elastic-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:27:06] RECOVERY - Puppet run on tools-worker-1020 is OK: OK: Less than 1.00% above the threshold [0.0] [22:27:40] RECOVERY - Puppet run on tools-worker-1021 is OK: OK: Less than 1.00% above the threshold [0.0] [22:27:48] RECOVERY - Puppet run on tools-worker-1012 is OK: OK: Less than 1.00% above the threshold [0.0] [22:27:54] RECOVERY - Puppet run on tools-worker-1006 is OK: OK: Less than 1.00% above the threshold [0.0] [22:28:20] RECOVERY - Puppet run on tools-worker-1008 is OK: OK: Less than 1.00% above the threshold [0.0] [22:29:02] RECOVERY - Puppet run on tools-elastic-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:33:47] RECOVERY - Puppet run on tools-worker-1010 is OK: OK: Less than 1.00% above the threshold [0.0] [22:34:46] 10Tool-Labs-tools-Other: Server admin log should be timezone-aware - https://phabricator.wikimedia.org/T142536#2538569 (10bd808) Source is at https://github.com/bd808/sal. The dates in the Elasticsearch index are in UTC so the client would need to convert from the local timezone to UTC when searching. There is... 
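Editor's note on T142536: bd808's comment says the dates in the Elasticsearch index are UTC, so a search client would have to convert a locally entered time to UTC before querying. The following is only a rough sketch of that conversion, not code from the sal tool. The function name `local_to_utc`, the `utc_offset_hours` parameter, and the input format are assumptions made for illustration; a real client would more likely take the zone from the user's browser or a named time zone.

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(local_text, utc_offset_hours):
    """Convert a local 'YYYY-MM-DD HH:MM' string to a UTC timestamp string.

    Hypothetical helper: utc_offset_hours is the user's offset from UTC,
    for example -7 for PDT. It is an assumption, not part of the sal tool.
    """
    # Attach the user's fixed offset to the naive timestamp they typed.
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime.strptime(local_text, "%Y-%m-%d %H:%M").replace(tzinfo=local_zone)
    # Shift to UTC, which is how the index is said to store its dates.
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A search for 15:34 local time at UTC-7 should query for 22:34 UTC.
print(local_to_utc("2016-08-09 15:34", -7))  # prints 2016-08-09T22:34:00Z
```

A fixed offset keeps the sketch free of time zone database lookups, but it ignores daylight saving changes, so it is only suitable as an illustration of the direction of the conversion.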
[22:35:39] RECOVERY - Puppet run on tools-docker-registry-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:35:55] RECOVERY - Puppet run on tools-worker-1005 is OK: OK: Less than 1.00% above the threshold [0.0] [22:35:59] 10Tool-Labs-tools-Other, 10Adminbot, 10Deployment-Systems: Server admin log should be timezone-aware - https://phabricator.wikimedia.org/T142536#2538573 (10greg) [22:36:32] 06Labs: Don't set instance root passwords if using a local puppetmaster - https://phabricator.wikimedia.org/T142531#2538575 (10Danny_B) [22:37:15] RECOVERY - Puppet run on tools-worker-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [22:37:15] RECOVERY - Puppet run on tools-worker-1015 is OK: OK: Less than 1.00% above the threshold [0.0] [22:37:23] RECOVERY - Puppet run on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0] [22:38:15] RECOVERY - Puppet run on tools-worker-1019 is OK: OK: Less than 1.00% above the threshold [0.0] [22:38:25] 10Tool-Labs-tools-Other: Server admin log should be timezone-aware - https://phabricator.wikimedia.org/T142536#2538593 (10greg) [22:39:13] 10Tool-Labs-tools-Other: Server admin log should be timezone-aware - https://phabricator.wikimedia.org/T142536#2538598 (10bd808) My general theory on date/time data related to servers is that it should always be stored and thought of as UTC values. Things get kind of ugly otherwise with servers and users scatter... 
[22:40:35] RECOVERY - Puppet run on tools-worker-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [22:40:36] RECOVERY - Puppet run on tools-bastion-05 is OK: OK: Less than 1.00% above the threshold [0.0] [22:41:08] RECOVERY - Puppet run on tools-worker-1025 is OK: OK: Less than 1.00% above the threshold [0.0] [22:41:28] RECOVERY - Puppet run on tools-flannel-etcd-03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:41:52] RECOVERY - Puppet run on tools-elastic-03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:42:26] RECOVERY - Puppet run on tools-worker-1014 is OK: OK: Less than 1.00% above the threshold [0.0] [22:43:14] RECOVERY - Puppet run on tools-flannel-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:43:26] RECOVERY - Puppet run on tools-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0] [22:43:45] (03PS1) 10EdouardHue: Importing code [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/303933 [22:48:28] RECOVERY - Puppet run on tools-prometheus-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:48:34] RECOVERY - Puppet run on tools-worker-1013 is OK: OK: Less than 1.00% above the threshold [0.0] [22:48:44] RECOVERY - Puppet run on tools-worker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:35] RECOVERY - Puppet run on tools-worker-1023 is OK: OK: Less than 1.00% above the threshold [0.0] [22:51:29] RECOVERY - Puppet run on tools-worker-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [23:06:16] 06Labs, 10Tool-Labs, 06Community-Tech-Tool-Labs, 10Security-Reviews, 10Striker: Security review of Tool Labs console application - https://phabricator.wikimedia.org/T135784#2538657 (10bd808) >>! In T135784#2487952, @bd808 wrote: > @dpatrick Can we call this closed now, or are there other issues that you... 
[23:13:51] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [23:30:14] 06Labs, 10Tool-Labs, 06Community-Tech-Tool-Labs, 10Diffusion, and 2 others: Create application to manage Diffusion repositories for a Tool Labs project - https://phabricator.wikimedia.org/T133252#2538728 (10dpatrick) [23:30:17] 06Labs, 10Tool-Labs, 06Community-Tech-Tool-Labs, 10Security-Reviews, 10Striker: Security review of Tool Labs console application - https://phabricator.wikimedia.org/T135784#2538724 (10dpatrick) 05Open>03Resolved Yep, you can consider it resolved. Sorry for the delay! [23:37:15] 06Labs, 10Labs-Infrastructure: Set up some sort of web pages at wmflabs.org or www.wmflabs.org - https://phabricator.wikimedia.org/T38885#2538738 (10AlexMonk-WMF) I've set up the www DNS record: ```www.wmflabs.org. 3600 IN CNAME proxy-eqiad.wmflabs.org. proxy-eqiad.wmflabs.org. 86400 IN A 208.80.155.156``` Wil... [23:53:51] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0]