[00:00:53] <zhuyifei1999_>	 Platonides: sorry about that 'see above'. I meant I was demonstrating a difference in the errors to isolate the issue, rather than trying to seek help about my command
[00:02:29] <Platonides>	 yep, sorry
[00:02:51] <Platonides>	 I gave it a quick look, and spotted the obvious
[00:53:22] <shinken-wm>	 PROBLEM - Puppet errors on tools-exec-1430 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[01:03:23] <wikibugs>	 Labs, Tool-Labs, Tool-Labs-tools-Database-Queries: Tool Labs logging vs indexed version returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361809 (MusikAnimal)
[01:03:37] <wikibugs>	 Labs, Tool-Labs, Tool-Labs-tools-Database-Queries: Tool Labs logging vs indexed version returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361822 (MusikAnimal)
[01:06:08] <wikibugs>	 Labs, Tool-Labs, Tool-Labs-tools-Database-Queries: Tool Labs logging vs indexed version returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361823 (MusikAnimal)
[01:09:36] <wikibugs>	 Labs, Tool-Labs, Tool-Labs-tools-Database-Queries: Tool Labs logging vs indexed version returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361809 (MZMcBride) @jcrespo can answer much better than I can, but in my experience, these types of data integrity issues on To...
[01:20:13] <wikibugs>	 Labs, Tool-Labs, Tool-Labs-tools-Database-Queries: Tool Labs logging vs indexed version returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361828 (MusikAnimal) I want to also point out this query took 9 seconds to finish on production.
[02:03:22] <shinken-wm>	 RECOVERY - Puppet errors on tools-exec-1430 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:24:24] <wikibugs>	 Labs, Tool-Labs, DBA: Tool Labs logging vs indexed version returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361851 (zhuyifei1999)
[02:30:21] <wikibugs>	 Labs, Tool-Labs, DBA: Tool Labs logging vs indexed version returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361809 (zhuyifei1999) FWIW, using https://tools.wmflabs.org/tools-info/optimizer.py, the EXPLAIN-s for both queries query is basically the same as (differ a...
[02:38:00] <wikibugs>	 Labs, Tool-Labs, DBA: Tool Labs logging vs indexed version returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361856 (zhuyifei1999) Regarding logging_logindex:  | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | | 1 | SIMPLE | l...
[03:23:28] <wikibugs>	 Labs, Tool-Labs, DBA: Tool Labs logging vs indexed version returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361908 (bd808)
[03:23:30] <wikibugs>	 Labs, DBA, Epic: Labs database replica drift - https://phabricator.wikimedia.org/T138967#3361909 (bd808)
[03:24:50] <wikibugs>	 Labs, DBA: enwiki_p logging vs logging_userindex returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361809 (bd808)
[03:26:54] <wikibugs>	 Labs, DBA, Epic: Labs database replica drift - https://phabricator.wikimedia.org/T138967#3361916 (bd808) Linked {T168349} as a child. The report there is pretty long for pasting into this task.
[03:30:14] <wikibugs>	 Labs, DBA: enwiki_p logging vs logging_userindex returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361928 (zhuyifei1999) Maybe DBAs have better ideas, but this is an optimised-to-4-min query:  {P5596}  The relevant EXPLAIN is: | id | select_type | table | type | possib...
[03:31:18] <zhuyifei1999_>	 bd808: I don't really think it's a drift. it's just mysql grouping mechanism being weird
[03:34:08] <zhuyifei1999_>	 and mysql optimizer is bad (and you can't really write a good optimizer for such a declarative language as SQL anyways)
[03:37:34] <bd808>	 zhuyifei1999_: could be. my default guess for "results different than prod" is to link to that tracking bug
[03:37:50] <zhuyifei1999_>	 k
[03:39:03] <bd808>	 there was a time when I knew lots of sql things, but I've forgotten most of them ;)
[03:39:45] <zhuyifei1999_>	 yeah, sql is madness
[03:39:57] * zhuyifei1999_ goes and writes c
[03:40:04] <Esther>	 I bumped https://phabricator.wikimedia.org/T109179
[03:40:14] <Esther>	 Since literally every time drift comes up, this gets mentioned.
[03:40:18] <Esther>	 And it seems to have stalled.
[03:40:24] <bd808>	 we are actually getting close
[03:40:30] <Esther>	 Nice.
[03:40:46] <Esther>	 I saw some notes about codfw testing, but they're from late 2015 and early 2016.
[03:40:54] <bd808>	 I was talking to Jamie and Manuel about it in an email thread last week
[03:41:17] <Esther>	 There's a separate task somewhere about automating data integrity checks.
[03:41:21] <Esther>	 I think.
[03:41:36] <Esther>	 But at the moment such a system would just tell us what we already know.
[03:42:14] <bd808>	 T140788 is the master task for the new replicas that are using row-based replication
[03:42:14] <stashbot>	 T140788: Labs databases rearchitecture (tracking) - https://phabricator.wikimedia.org/T140788
[03:42:15] <Esther>	 Hmmm, maybe the lack of primary keys was the blocker to row-based replication.
[03:42:22] <bd808>	 yeah
[03:44:37] <bd808>	 the part that we are mostly waiting on is filling up the new replicas with sanitized data -- T153743
[03:44:37] <stashbot>	 T153743: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743
[04:02:15] <wikibugs>	 Labs, DBA: enwiki_p logging vs logging_userindex returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361943 (MusikAnimal) >>! In T168349#3361928, @zhuyifei1999 wrote: > Maybe DBAs have better ideas, but this is an optimised-to-4-min query: >  > P5596  This is amazing. Th...
[05:08:21] <wikibugs>	 Labs, DBA: enwiki_p logging vs logging_userindex returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361969 (zhuyifei1999) >>! In T168349#3361943, @MusikAnimal wrote: > I think something really funky is going on.   The grouping mechanism don't seem to work correctly from...
[05:15:53] <shinken-wm>	 PROBLEM - Puppet errors on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[05:55:52] <shinken-wm>	 RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:35:31] <shinken-wm>	 PROBLEM - Puppet errors on tools-exec-1403 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[06:49:22] <shinken-wm>	 PROBLEM - Puppet errors on tools-exec-1430 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[06:49:23] <wikibugs>	 Labs, Quarry, Community-Wikimetrics, DBA, and 2 others: Evaluate future of wmf puppet module "mysql" - https://phabricator.wikimedia.org/T165625#3362051 (jcrespo)
[07:05:39] <shinken-wm>	 PROBLEM - Puppet errors on tools-exec-1404 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[07:10:30] <shinken-wm>	 RECOVERY - Puppet errors on tools-exec-1403 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:29:20] <shinken-wm>	 RECOVERY - Puppet errors on tools-exec-1430 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:34:43] <wikibugs>	 Tool-Labs-tools-Xtools, Community-Tech-Sprint: Restrict access to users' edit stats unless opted-in - https://phabricator.wikimedia.org/T165401#3362175 (Samwilson) It is live, yes.  It seems to be timing out for users with large numbers of edits. Even loading your example with just the 'general stats' se...
[07:37:24] <weirdo>	 !help
[07:37:24] <wm-bot2>	 weirdo: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[07:40:40] <shinken-wm>	 RECOVERY - Puppet errors on tools-exec-1404 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:41:49] <wikibugs>	 Labs, Labs-Infrastructure, Operations: Puppet CA: virt1000.wikimedia.org' will expire on 2017-08-15 - https://phabricator.wikimedia.org/T168110#3356455 (akosiaris) It's not related to the host. It's the Puppet CA itself as @Andrew says. On a random VM created on Mar 26  ``` sudo openssl x509 -noout -...
[07:44:46] <zhuyifei1999_>	 weirdo: yes?
[07:49:01] <weirdo>	 I got an email saying my article was reviewed
[07:49:08] <weirdo>	 I don't know what that means
[07:49:38] <weirdo>	 I googled it to death
[08:10:19] <zhuyifei1999_>	 weirdo: you mean wikipedia?
[08:11:16] <zhuyifei1999_>	 if that's the case, you might want to ask in #wikipedia-en-help
[08:11:41] <weirdo>	 thanks
[08:13:10] <zhuyifei1999_>	 np
[08:59:51] <wikibugs>	 Tool-Labs-tools-fatameh: URL regexes are too loose - https://phabricator.wikimedia.org/T168363#3362362 (Tarrow)
[09:01:46] <wikibugs>	 Tool-Labs-tools-fatameh: Enable Auth token for non browser session use - https://phabricator.wikimedia.org/T168364#3362375 (Tarrow)
[09:49:08] <wikibugs>	 Labs, DBA: enwiki_p logging vs logging_userindex returning dramatically different results - https://phabricator.wikimedia.org/T168349#3361809 (Marostegui) In which hosts did you do the tests?
[10:02:46] <wikibugs>	 Tool-Labs-tools-Other: Heavy 19-hour quries on labsdb1005 (tools-db) by s51203 at s51203__baglama2_p - https://phabricator.wikimedia.org/T168375#3362604 (jcrespo)
[10:05:06] <wikibugs>	 Tool-Labs-tools-Other: Heavy 19-hour quries on labsdb1005 (tools-db) by s51203 at s51203__baglama2_p - https://phabricator.wikimedia.org/T168375#3362618 (jcrespo)
[10:05:08] <wikibugs>	 Labs, Tool-Labs, DBA, Tracking: Certain tools users create multiple long running queries that take all memory from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601#3362619 (jcrespo)
[10:06:02] <wikibugs>	 Labs, Tool-Labs, DBA, Tracking: Certain tools users create multiple long running queries that take all memory from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601#1830725 (jcrespo)
[10:06:04] <wikibugs>	 Labs, Tool-Labs, DBA: s51053 (tools.jackbot) is abusing resources on labsdbs, throttle his grants - https://phabricator.wikimedia.org/T114559#3362620 (jcrespo)
[10:06:15] <wikibugs>	 Labs, Tool-Labs, DBA: s51053 (tools.jackbot) is abusing resources on labsdbs, throttle his grants - https://phabricator.wikimedia.org/T114559#1699378 (jcrespo)
[10:06:17] <wikibugs>	 Labs, Tool-Labs, DBA, Tracking: Certain tools users create multiple long running queries that take all memory from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601#1830725 (jcrespo)
[10:08:00] <wikibugs>	 Labs, Tool-Labs, DBA: s51053 (tools.jackbot) is abusing resources on labsdbs, throttle his grants - https://phabricator.wikimedia.org/T114559#3362637 (JackPotte) This should be resolved now, do you know a monitoring on which I could check it please?
[10:09:29] <wikibugs>	 Labs, Tool-Labs, DBA: s51053 (tools.jackbot) is abusing resources on labsdbs, throttle his grants - https://phabricator.wikimedia.org/T114559#3362642 (jcrespo) This is resolved, I only edited it because of admin purposes (correct tracking). Sorry for the spam, email is automatic.
[14:04:58] <wikibugs>	 Labs, Labs-Infrastructure, Operations, cloud-services-team (Kanban): Puppet CA: virt1000.wikimedia.org' will expire on 2017-08-15 - https://phabricator.wikimedia.org/T168110#3363428 (Andrew)
[14:53:40] <wikibugs>	 Tool-Labs-tools-fatameh: Enable Auth token for non browser session use - https://phabricator.wikimedia.org/T168364#3362375 (Tarrow) Open>Resolved a:Tarrow
[14:54:36] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad: rack/setup/install labnodepool1002.eqiad.wmnet - https://phabricator.wikimedia.org/T168407#3363615 (RobH)
[15:14:09] <wikibugs>	 Labs, MediaWiki-extensions-OpenStackManager, User-Addshore: Add "GoranSMilovanovic" to labs "bastion" project - https://phabricator.wikimedia.org/T165294#3363708 (Addshore) Open>Resolved a:Addshore
[15:22:33] <wikibugs>	 Labs, DBA, User-bd808, cloud-services-team (Kanban): setup dewiki and wikidatawiki on the labsdb1009, 1010 and 1011 - https://phabricator.wikimedia.org/T168021#3353312 (JAllemandou) Hi @Marostegui I can't connect to `dewiki_p` nor `wikidatawiki_p` on `labsdb-analytics`. Should this task be reopened?
[15:23:20] <wikibugs>	 Labs, DBA, User-bd808, cloud-services-team (Kanban): setup dewiki and wikidatawiki on the labsdb1009, 1010 and 1011 - https://phabricator.wikimedia.org/T168021#3363746 (Marostegui) What errors are you getting?
[15:25:04] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad, and 2 others: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3363765 (Andrew)
[15:25:51] <wikibugs>	 Tool-Labs-tools-Other: Heavy 19-hour quries on labsdb1005 (tools-db) by s51203 at s51203__baglama2_p - https://phabricator.wikimedia.org/T168375#3363766 (Magnus) I have rewritten the query, should work better now I hope
[15:39:40] <wikibugs>	 Labs, DBA, User-bd808, cloud-services-team (Kanban): setup dewiki and wikidatawiki on the labsdb1009, 1010 and 1011 - https://phabricator.wikimedia.org/T168021#3363809 (Marostegui) I have recreated the views, can you try again? if you can show the error you are getting, that would be helpful. Als...
[16:05:33] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad, and 2 others: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3363907 (RobH) a:Cmjohnson>RobH
[16:25:49] <wikibugs>	 Labs, DBA, User-bd808, cloud-services-team (Kanban): setup dewiki and wikidatawiki on the labsdb1009, 1010 and 1011 - https://phabricator.wikimedia.org/T168021#3364008 (JAllemandou) I use a script checking available views  from `information_schema`. For the moment it still tells me `dewiki` and `...
[16:27:52] <robh>	 chasemp or andrewbogott: im taking over the setup task for the new labvirt hosts in eqiad
[16:28:17] <robh>	 but i have questions for the networking (they are currently in 1gbe racks, but have both 1gbe and 10gbe capability)
[16:28:25] <robh>	 and on partitioning, they are 10 * 1.6TB ssds
[16:28:43] <robh>	 andrewbogott: asked for a small raid for the os and a larger for the data
[16:28:44] <robh>	 With hardware raid, you raid the ENTIRE disk, so if you want your OS data on a different raid partition than the data, it has to be split into, at minimum, 2 of the 10 1.6TB SSDs.  That would lose a substantial amount of data to just silo the OS to its own hardware raid.
[16:30:13] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad, and 2 others: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3364014 (RobH) @andrew: These hosts were reviewed and approved for order with 10 * 1.6TB Intel S3510 SSDs.  With hardware raid, you raid the ENTIRE disk,...
[16:34:19] <wikibugs>	 Labs, DBA, User-bd808, cloud-services-team (Kanban): setup dewiki and wikidatawiki on the labsdb1009, 1010 and 1011 - https://phabricator.wikimedia.org/T168021#3364022 (Marostegui) I don't know what that script does but: ``` mysql:root@localhost [information_schema]> select @@hostname; +---------...
[16:34:48] <andrewbogott>	 robh: hm, ok… so I guess we want just one big hardware raid and then we can partition that for os and VMs.
[16:34:53] * andrewbogott braces for a day of partman
[16:34:58] <andrewbogott>	 robh: I'll update accordingly
[16:35:17] <andrewbogott>	 as far as I know the row those are in doesn't have ports for 10Gb so everything is just set up with 1G, these can be the same.
[16:35:45] <robh>	 andrewbogott: ill write the recipe
[16:35:52] <andrewbogott>	 great!
[16:35:57] <robh>	 i just bashed one out for dumpsdata with only 2 live hacks in testing
[16:36:03] <robh>	 ie: got it installed in less than 3 reboots!
[16:36:09] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad, and 2 others: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3364029 (Andrew) > Would that be acceptable?  Yep, sounds great.  Thank you.
[16:36:09] <robh>	 so im due for more partman hell
[16:36:10] <robh>	 ;]
[16:36:33] <andrewbogott>	 3 reboots is a lot less than my last attempt
[16:37:01] <robh>	 well, im also just totally going to steal what i made for dumpsdata and use it here with a little bit of refactoring on mount points
[16:37:10] <robh>	 as i just this second realized they are nearly identical otherwise
[16:37:18] <robh>	 \o/
[16:37:33] <robh>	 also these only have one of the two interfaces hooked up
[16:37:38] <robh>	 where it seems other labvirts have 2 ports
[16:37:44] <robh>	 do these need to have 2 ports bonded for speed?
[16:38:38] <andrewbogott>	 I'm not sure.
[16:38:48] <andrewbogott>	 We definitely want these to be the same as the other labvirts
[16:39:01] <andrewbogott>	 but I would've guessed that they had one port for ssh and one port for VM networking
[16:39:11] <andrewbogott>	 chasemp: do you remember?
[16:39:29] <robh>	 yeah in site.pp it doesnt say
[16:39:36] <robh>	 but it does say labvirts that already exist have     openstack::nova::partition{ '/dev/sdb': } 
[16:39:44] <robh>	 which wont work for these if they have to define that as well
[16:39:47] <robh>	 (since these are hw raid)
[16:39:58] <robh>	 we can also make the new hosts sw raid, but seems wasteful on cpu overhead ;]
[16:40:27] <robh>	 the openstack::nova::partition wont block my installation of the OS, but it seems like it may block you later ;]
[16:40:50] <robh>	 (i hope it can also be defined by a directory structure rather than a partition)
[16:40:59] <robh>	 or we may have to peel off an SSD for it?
[16:42:24] <andrewbogott>	 I think that specifying the partition device is in hiera and differs from labvirt to labvirt
[16:42:31] <andrewbogott>	 so this won't be any different
[16:42:41] <robh>	 cool
[16:42:51] <robh>	 well
[16:42:56] <robh>	 partition device is different
[16:42:56] <andrewbogott>	 e.g. role::labs::openstack::nova::compute::instance_dev: "/dev/mapper/tank-data"
[16:43:01] <robh>	 ohhhh
[16:43:03] <robh>	 ok
[16:43:16] <andrewbogott>	 in puppet/hieradata/hosts/labvirt1013.yaml 
[16:43:17] <robh>	 just odd that site.pp has it for node /^labvirt100[0-9].eqiad.wmnet/ {  but the others are in hiera
[16:43:20] <andrewbogott>	 (for example)
[16:43:21] <chasemp>	 andrewbogott: robh no bonding on the labvirts
[16:43:26] <chasemp>	 sorry I missed the ping
[16:43:30] <robh>	 chasemp: do they need two interfaces?
[16:43:50] <robh>	 right now only 1 is wired for the labs-hosts1-b-eqiad
[16:43:51] <andrewbogott>	 chasemp: don't we have both ports hooked up, though?  One on the lab VM network and one on the labsupport (or maybe normal prod) network?
[16:43:51] <chasemp>	 yes, one in labs-hosts and one is a trunk 
[16:43:59] <robh>	 ok, so eth0 is labs-hosts1-b-eqiad
[16:44:03] <robh>	 and eth1 is trunk?
[16:44:10] <chasemp>	 robh: same scheme as labvirt2003 was in codfw
[16:44:12] <chasemp>	 yeah
[16:44:20] <andrewbogott>	 I would've said the other way around...
[16:44:22] <andrewbogott>	 eth0 trunk
[16:44:29] <andrewbogott>	 but, can we look at an existing one and see?
[16:44:46] <chasemp>	 eth0 is labs-hosts and eth1 is trunk on existing
[16:45:08] <robh>	 yeah
[16:45:09] <chasemp>	 eth1.1102 is the subinterface
[16:45:11] <robh>	 it seems other way around
[16:45:12] <chasemp>	 same for labnet
[16:45:17] <robh>	 ge-5/0/0        up    up   labvirt1014 eth0
[16:45:17] <robh>	 ge-5/0/3        up    up   labvirt1014 eth1
[16:45:29] <robh>	 and eth1 is in labs-instances1-b-eqiad
[16:45:43] <robh>	 unless they are labeled wrong on switch
[16:46:08] <robh>	 but if so then its wrong for labvirt1013 as well
[16:46:14] <robh>	 it has eth1 in labs-instances1-b-eqiad
[16:46:22] <robh>	 which seems odd, since then its primary interface is in 'trunk'
[16:46:29] <robh>	 which i dont see on the switch as a vlan, so its something else
[16:46:59] <robh>	 oh wait, found labs-hosts1-b-eqiad
[16:47:44] <chasemp>	 I'm confident eth0 is in labs-hosts1-b-eqiad, less so eth1 switch side configuration other than to say all logic assumes trunk 
[16:47:58] <chasemp>	 and it should be consistent across labvirts
[16:48:49] <chasemp>	 robh: it seems what I'm saying and what you are saying is the same, I'm not sure what you see as 'other way around'
[16:52:10] * andrewbogott withdraws his opinion
[16:53:20] <robh>	 sorry, bouncer died
[16:53:52] <robh>	 chasemp: sorry about that, but vlans seem sensible now that i stopped transposing them about.  eth0 in labs-hosts1-b-eqiad and eth1 in labs-instances1-b-eqiad
[16:54:15] <chasemp>	 robh: eth1 is a trunk technically and not in any particular vlan but labs-instances1-b-eqiad is on the allowed list
[16:54:28] <chasemp>	 I'm not sure in junos what the effect is if a port is in a VLAN range and yet functioning as a trunk
[16:54:37] <chasemp>	 if that's what seems to have happened in some case
[16:54:40] <robh>	 also its not required for the os install afaik
[16:54:53] <chasemp>	 right eth1 is instances only
[16:54:57] <robh>	 so it seems i can just install these so they are calling in and just not running instances
[16:55:03] <robh>	 then you guys are only blocked on netops
[16:55:06] <robh>	 seems right to me
[16:55:19] <chasemp>	 robh: sure, you can assign to me post install and I'll take care of it 
[16:56:29] <Cyberpower678>	 legoktm: I need a sysadmin, are you around?
[16:57:16] <Cyberpower678>	 legoktm: I would like to rename a user with 75k edits.
[17:01:57] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad, and 2 others: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3364154 (RobH) Ok, further updates.  I'll write the partman recipe and get the OS installation done on these.  However, all of these hosts will need their...
[17:02:50] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad, and 2 others: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3364156 (RobH)
[17:04:40] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad, and 2 others: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3364159 (chasemp) @Cmjohnson @RobH thanks guys, post install assign to me and I'll take care of it.
[17:11:15] <shinken-wm>	 PROBLEM - Puppet errors on tools-bastion-02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:11:49] <robh>	 andrewbogott: actually, it seems the other labvirts1014 use a recipe called labvirt-ssd
[17:12:00] <robh>	 but the sizes are arbitrary and uses swap, i'd like to modify the existing recipe
[17:12:17] <andrewbogott>	 I think that's fine, it wont affect existing boxes anyway.
[17:12:19] <robh>	 but then it has to be understood when those labvirt1010-1014 reinstall they'll repartition slightly
[17:12:21] <robh>	 yeah
[17:12:31] <robh>	 also the ssd recipe uses a mount point of
[17:12:37] <robh>	    /var/lib/nova/instances
[17:12:50] <robh>	 is that what you want as the instance mount point, not srv or something like that?
[17:13:10] <chasemp>	 that's it and that's hardcoded elsewhere to match so it's not easily changed
[17:13:16] <andrewbogott>	 yeah, that's what I want.  At least, that's how everything else is anyway.  We could symlink instead if it worries you :)
[17:13:22] <robh>	 ok, i'll use that, nah!
[17:13:27] <robh>	 i'd just like to remove swap from it
[17:13:36] <robh>	 and move the #  * 92G /  to just 120GB like most things
[17:13:58] <andrewbogott>	 sure
[17:14:07] <robh>	 there was debate on swap, im not sure where it lands on labvirt usage
[17:14:44] <andrewbogott>	 I don't have much opinion, although there's something to be said for having consistency among servers doing the same job.
[17:14:48] <andrewbogott>	 How much swap is in that recipe?
[17:14:53] <robh>	 https://phabricator.wikimedia.org/T156955
[17:14:58] <robh>	 not much
[17:14:59] <robh>	 8gb
[17:15:06] <robh>	 so i can just leave if you prefer
[17:15:26] <robh>	 honestly we can leave the 97gb for / if you like i dont really have a strong preference, i just like to suggest standardization ;]
[17:15:42] <andrewbogott>	 I'd rather you leave it, just so we don't have one more variable.  But standardizing the OS partition is definitely fine
[17:15:51] <robh>	 your raid10 on these hosts is 7.24TB 
[17:15:59] <robh>	 so you have some space
[17:16:05] <andrewbogott>	 sweet
[17:20:17] <robh>	 oh, one more thing
[17:20:23] <robh>	 it seems other labvirts are trusty
[17:20:32] <robh>	 do these need to be as well?  (or can they be jessie?)
[17:20:36] <andrewbogott>	 yeah, these need to be trusty too for now :(
[17:20:38] <robh>	 ok
[17:20:40] <robh>	 wilco
[17:22:48] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-codfw: rack/setup/install labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T167157#3364338 (Papaul) In the process of troubleshooting the pxe boot issue on this system, I setup a test dhcp/dns/tftp server on my laptop and boot the server to it...
[17:28:11] <wikibugs>	 Labs: Deprecate DSA (ssh-dss) SSH keys for Labs users - https://phabricator.wikimedia.org/T168433#3364362 (bd808)
[17:29:20] <wikibugs>	 Labs, DBA, User-bd808, cloud-services-team (Kanban): setup dewiki and wikidatawiki on the labsdb1009, 1010 and 1011 - https://phabricator.wikimedia.org/T168021#3364388 (JAllemandou) I can access the views - Sorry for the false positive. However my script still don't find the DB - I'll need to loo...
[17:30:24] <wikibugs>	 Labs, Labs-Infrastructure: ssh-dss (DSA) keys fail for Labs instances with "debian-9.0-stretch (experimental)" image - https://phabricator.wikimedia.org/T167267#3364398 (bd808) Open>declined Closing in favor of {T168433} after a short discussion with @faidon on irc. Affected users should generate...
[17:31:26] <wikibugs>	 Labs, cloud-services-team (Kanban): Deprecate DSA (ssh-dss) SSH keys for Labs users - https://phabricator.wikimedia.org/T168433#3364362 (bd808)
[17:36:29] <wikibugs>	 Labs: `maintain-meta_p --all-databases` timeout on labsdb1009 contacting uk.wikimedia.org - https://phabricator.wikimedia.org/T168436#3364444 (bd808)
[17:36:44] <wikibugs>	 Labs, cloud-services-team (Kanban): `maintain-meta_p --all-databases` timeout on labsdb1009 contacting uk.wikimedia.org - https://phabricator.wikimedia.org/T168436#3364457 (bd808)
[17:37:32] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-codfw, Patch-For-Review: rack/setup/install labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T167157#3319424 (Dzahn) >>! In T167157#3364338, @Papaul wrote: > Jun 20 17:21:43 install2002 dhcpd[11106]: DHCPDISCOVER from 30:e1:71:63:5e:5c via...
[17:37:45] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-codfw, Patch-For-Review: rack/setup/install labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T167157#3364465 (Papaul) Daniel find out that for 208.80.153.108  reverse lookup = 2001  and forward lookup = 1002   He fixed it and will try inst...
[17:41:17] <wikibugs>	 Labs, Tool-Labs, Tools-Kubernetes: Fix or delete tools-worker-1028 and 29 - https://phabricator.wikimedia.org/T167324#3364484 (yuvipanda) Open>Resolved a:yuvipanda I just deleted these :)
[17:43:48] <shinken-wm>	 PROBLEM - Host tools-worker-1028 is DOWN: CRITICAL - Host Unreachable (10.68.22.23)
[17:45:08] <wikibugs>	 Labs, PAWS, Tool-Labs, Tools-Kubernetes: Consider moving PAWS to its own k8s cluster, rather than using Tools' k8s cluster - https://phabricator.wikimedia.org/T167086#3364539 (yuvipanda) Going to keep it inside tools!
[17:46:16] <shinken-wm>	 RECOVERY - Puppet errors on tools-bastion-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:46:26] <shinken-wm>	 PROBLEM - Host tools-worker-1029 is DOWN: CRITICAL - Host Unreachable (10.68.22.5)
[17:53:08] <wikibugs>	 Labs, Horizon, User-bd808, cloud-services-team (Kanban): Horizon bug: hidden web proxy after deleting instance - https://phabricator.wikimedia.org/T167985#3364576 (mpopov) Open>Resolved @Andrew it works now, thank you!
[18:15:25] <wikibugs>	 cloud-services-team, Operations: Reboots of cloud servers - https://phabricator.wikimedia.org/T168445#3364701 (MoritzMuehlenhoff)
[18:20:48] <marktraceur>	 o/
[18:21:47] <marktraceur>	 I'd like to run some performance analysis on the beta cluster while I do some uploads and thumbnailing, does anyone know where I should look to find information on CPU/memory usage?
[18:25:17] <wikibugs>	 Labs, DBA: enwiki_p logging vs logging_userindex returning dramatically different results - https://phabricator.wikimedia.org/T168349#3364718 (MusikAnimal) >>! In T168349#3362544, @Marostegui wrote: > In which hosts did you do the tests?  Sorry I didn't record this information. I ran `sql enwiki` on `too...
[18:30:01] <wikibugs>	 Tool-Labs-tools-Xtools, Community-Tech-Sprint: Restrict access to users' edit stats unless opted-in - https://phabricator.wikimedia.org/T165401#3364723 (MusikAnimal) >>! In T165401#3362175, @Samwilson wrote: > It seems to be timing out for users with large numbers of edits.   I'm not sure what's going on...
[18:36:52] <Guest57706>	 marktraceur: https://tools.wmflabs.org/nagf/?project=deployment-prep or https://grafana-labs.wikimedia.org/dashboard/db/labs-project-board?var-project=deployment-prep&var-server=All possibly
[18:37:04] <Guest57706>	 chase here apparently irc hates me
[18:37:51] <chasemp>	 test
[18:37:57] <paladox>	 andrewbogott hi, is it normal for horizon create instance page to look like
[18:37:57] <paladox>	 https://phabricator.wikimedia.org/F8491115
[18:38:09] <paladox>	 it is showing variable names instead of images.
[18:38:21] <marktraceur>	 Thanks chasemp 
[18:39:43] <andrewbogott>	 paladox: it's a race condition that pops up sometimes, if you reload the dialog it should switch back to normal
[18:39:52] <paladox>	 Ok
[18:39:53] <paladox>	 thanks
[18:48:50] <paladox>	 andrewbogott i've created a stretch instance called wikistats-tuesdaytaco but trying to ssh in fails with
[18:48:51] <paladox>	 $ ssh wikistats-tuesdaytaco
[18:48:51] <paladox>	 Permission denied (publickey).
[18:48:51] <paladox>	 Killed by signal 1.
[18:49:02] <andrewbogott>	 what project?
[18:49:06] <paladox>	 wikistats
[18:51:08] <wikibugs>	 cloud-services-team, Operations: Reboots of cloud servers - https://phabricator.wikimedia.org/T168445#3364905 (MoritzMuehlenhoff) Updated kernels have been installed (plus the related base libraries/services).
[18:52:23] <wikibugs>	 Labs-project-Extdist: Migrate extdist.wmflabs.org to Debian stretch - https://phabricator.wikimedia.org/T168456#3364906 (Legoktm)
[18:53:16] <andrewbogott>	 paladox: Googling, I see a few other people with a similar error but there's no good explanation other than 'corruption'.  I haven't seen it with any test cases… want to see if it happens twice in a row?
[18:53:27] <paladox>	 Ok
[18:53:38] <paladox>	 i will re create it :)
[18:54:34] <paladox>	 thanks also for googleing :)
[18:54:54] <wikibugs>	 10Labs, 10cloud-services-team (Kanban): `maintain-meta_p --all-databases` timeout on labsdb1009 contacting uk.wikimedia.org - https://phabricator.wikimedia.org/T168436#3364940 (10bd808) Same result on labsdb1010 and labsdb1011.
[18:58:13] <paladox>	 andrewbogott still happens
[18:58:24] <andrewbogott>	 hm, ok
[18:59:28] <paladox>	 which is strange, as doing a direct jessie -> stretch upgrade (by that i mean doing apt-get dist-upgrade) works. Though this is a brand new instance with stretch. Could it somehow be missing the config that tells the instances where our pub key is, which we store in wikitech?
[19:02:28] <paladox>	 works now
[19:02:30] <andrewbogott>	 paladox: I think this one just wasn't ready yet.  Does it work for you now?
[19:02:31] <paladox>	 after doing a reboot
[19:02:35] <paladox>	 Yep
[19:02:39] <paladox>	 thanks
[19:03:02] <wikibugs>	 Labs, Tool-Labs, DBA, Stewards-and-global-tools: Throttling linkwatcher tool user as it is consuming 100% CPU - https://phabricator.wikimedia.org/T121094#1868898 (Luke081515) Any updates on this old ticket?
[19:12:07] <wikibugs>	 Labs, cloud-services-team (Kanban): labmon1001 disk filling up - https://phabricator.wikimedia.org/T168344#3361694 (Luke081515) I would think a year is a first good step.
[19:19:20] <hasharAway>	 andrewbogott: I am back around sorry.
[19:19:33] <andrewbogott>	 hashar: is now a good time?
[19:19:47] <hashar>	 for the nodepool rate ( https://gerrit.wikimedia.org/r/#/c/358601/  ),  maybe the OpenStack API has a rate limit as well ?
[19:19:54] <hashar>	 and yeah that can be done any time for nodepool/ci side
[19:20:08] <hashar>	 nodepool rereads the yaml file automagically
[19:20:24] <hashar>	 that will just make it send requests every 5 seconds instead of 6 secs.   
[19:20:40] <hashar>	 the trouble is really figuring out what might happen on the openstack side :(
[19:22:34] <paladox>	 andrewbogott hi, Krenair in -devtools says he can't write a comment on https://phabricator.wikimedia.org/phame/post/view/56/watroles_returns_in_a_different_place_and_with_a_different_name_and_totally_different_code./ but it seems other users can
[19:22:43] <paladox>	 I can't write a comment on there either.
[19:23:01] <andrewbogott>	 hashar: merging, we will see :)
[19:24:04] <andrewbogott>	 paladox: I feel like we've seen this before but I don't remember what it was.  chasemp, any idea about phab blog comments?
[19:24:45] <paladox>	 andrewbogott "<twentyafterfour>	I believe comments are controlled by the 'edit' policy on the blog"
[19:24:49] <hashar>	 :]
[19:25:29] <andrewbogott>	 Ah, so it's not per-post
[19:25:53] <twentyafterfour>	 I think it's a bit broken, if it's what I think it is then you can't comment on a blog post unless you can edit that blog (editing the blog doesn't affect posting, posting is controlled by the 'blog post' form, it's all rather confusing and stupid)
[19:26:30] <twentyafterfour>	 but since posting is controlled by the form then we can allow editing on the blog and that won't allow just anyone to post to the blog
[19:27:51] <paladox>	 twentyafterfour would you be able to fix the form please?
[19:29:44] <hashar>	 andrewbogott: nodepool caught up and does run a query every 5 seconds :)
[19:29:50] <Krenair>	 it could just be set up the wiki way
[19:42:18] <hashar>	 andrewbogott: looks good to me so far. I will check the impact on CI after a couple days of data
[19:45:55] <wikibugs>	 Labs, cloud-services-team (Kanban): labmon1001 disk filling up - https://phabricator.wikimedia.org/T168344#3361694 (chasemp) >>! In T168344#3365032, @Luke081515 wrote: > I would think a year is a first good step.  +1
[19:51:32] <wikibugs>	 Labs, cloud-services-team (Kanban): `maintain-meta_p --all-databases` timeout on labsdb1009 contacting uk.wikimedia.org - https://phabricator.wikimedia.org/T168436#3365188 (bd808) p:Triage>Low ukwikimedia is in both wikimedia.dblist and closed.dblist, but not in the deleted.dblist which would kee...
[19:52:35] <andrewbogott>	 hashar: sounds good, thanks for sticking around.
[19:54:31] <wikibugs>	 Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure, Nodepool, and 2 others: Lower rate of Nodepool requests to OpenStack API - https://phabricator.wikimedia.org/T167803#3365191 (hashar) Nodepool caught up with the new rate. That would reflect on the graphs:  * //Tasks per minute// h...
[19:55:34] <hashar>	 andrewbogott: if all goes fine on the openstack side, I guess I will ask to lower it slightly again
[19:56:38] <wikibugs>	 Labs-project-Wikistats, Patch-For-Review: Wikistats 2.2 [beta] gives  internal server error 500 for all csv, ssv and xml formats - https://phabricator.wikimedia.org/T165879#3365196 (Dzahn) Nice, i see the subtask is resolved too. cool! ( i should still do the rewrites when i get to it)    Also, Xqt i lik...
[20:07:15] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-codfw, Patch-For-Review: rack/setup/install labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T167157#3365221 (Papaul)
[20:57:46] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-codfw, Patch-For-Review: rack/setup/install labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T167157#3365385 (Papaul)
[21:01:32] <wikibugs>	 Labs, Labs-Infrastructure: Setup wikitech, horizon, and striker on new labweb hardware - https://phabricator.wikimedia.org/T168470#3365393 (bd808)
[21:22:04] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-codfw, Patch-For-Review: rack/setup/install labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T167157#3365462 (Papaul)
[21:26:20] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-codfw, Patch-For-Review: rack/setup/install labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T167157#3365467 (Papaul) @Andrew  this is complete you can take over from here.   Thanks.
[21:30:28] <wikibugs>	 Labs, Patch-For-Review, User-bd808, cloud-services-team (Kanban): `maintain-meta_p --all-databases` timeout on labsdb1009 contacting uk.wikimedia.org - https://phabricator.wikimedia.org/T168436#3365470 (bd808) a:bd808
[21:41:25] <wikibugs>	 Labs-project-Wikistats: numbers in rank.php wrong? - https://phabricator.wikimedia.org/T168474#3365509 (Dzahn)
[21:41:32] <wikibugs>	 Labs-project-Wikistats: numbers in rank.php wrong? - https://phabricator.wikimedia.org/T168474#3365521 (Dzahn) a:Dzahn
[22:10:50] <wikibugs>	 Labs, cloud-services-team, DBA, wikitech.wikimedia.org: move wikitech and labstestwiki to s3 (needs discussion) - https://phabricator.wikimedia.org/T167973#3365576 (bd808)
[22:10:50] <wikibugs>	 Labs, Labs-Infrastructure: Setup wikitech, horizon, and striker on new labweb hardware - https://phabricator.wikimedia.org/T168470#3365575 (bd808)
[22:12:50] <wikibugs>	 Labs, Labs-Infrastructure: Setup wikitech, horizon, and striker on new labweb hardware - https://phabricator.wikimedia.org/T168470#3365393 (bd808)
[22:13:52] <wikibugs>	 Labs, MediaWiki-extensions-OpenStackManager, wikitech.wikimedia.org, MW-1.30-release-notes (WMF-deploy-2017-06-06_(1.30.0-wmf.4)), Patch-For-Review: Remove OpenStackManager from Wikitech - https://phabricator.wikimedia.org/T161553#3365585 (bd808)
[22:13:52] <wikibugs>	 Labs, Operations, wikitech.wikimedia.org, HHVM: Move wikitech (silver) to HHVM - https://phabricator.wikimedia.org/T98813#1278203 (bd808)
[22:14:49] <wikibugs>	 Labs, Operations, wikitech.wikimedia.org, HHVM: Move wikitech (silver) to HHVM - https://phabricator.wikimedia.org/T98813#1278203 (bd808) >>! In T98813#3135116, @greg wrote: > Added T161553 as a subtask per above comments.  I removed OSM deprecation as a blocker. I think we can figure out how to...
[22:19:54] <wikibugs>	 Labs, wikitech.wikimedia.org, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#3365606 (bd808)
[22:21:10] <wikibugs>	 Labs, Striker, Operations, LDAP: Store Wikimedia unified account name (SUL) in LDAP directory - https://phabricator.wikimedia.org/T148048#3365620 (bd808)
[22:21:12] <wikibugs>	 Labs, wikitech.wikimedia.org, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#3145305 (bd808)
[22:33:51] <robh>	 chasemp: heyas, ive installed labvirt101[56] and its odd.  i can ping them from install1002, but i cannot ping or login to them from puppetmaster1001
[22:34:02] <robh>	 and i have to do that to access with new install key, sign puppet keys, etc...
[22:34:14] <robh>	 oddly enough, labvirt1014 (existing labvirt) also doesnt ping from puppetmaster1001
[22:34:26] <robh>	 but does from install1002, so it makes me think its networking related and not related to my new isntalls
[22:34:29] <robh>	 installs even
[22:35:13] <chasemp>	 robh: my guess is that's fw related if it's install box specific
[22:35:23] <robh>	 i mean the install box works
[22:35:25] <robh>	 but puppetmaster doesnt
[22:35:40] <robh>	 ie: os install is done on half of them but i cannot sign the puppet keys or login to them at all
[22:35:57] <robh>	 just wondering if you may be aware of any recent changes to network rules that may explain it
[22:36:48] <chasemp>	 I can't ping labvirt1001.eqiad.wmnet from puppetmaster1001 either
[22:37:00] <robh>	 yeah, but those all call into it for puppet updates
[22:37:07] <robh>	 so yeah, maybe firewall on puppetmaster
[22:37:17] <chasemp>	 no I mean firewall in the core routers
[22:37:36] <robh>	 it would have had to be new since the last install of a labvirt
[22:38:01] <chasemp>	 could the move of this to puppetmaster1001 be new since the last labvirt?
[22:38:55] <robh>	 labvirt1014 was installed last year
[22:38:56] <robh>	 heh
[22:39:23] <chasemp>	 I don't know when puppetmaster1001 came to be
[22:40:46] <robh>	 Ubuntu 14.04.5 LTS auto-installed on Sat Aug 6 15:29:51 UTC 2016. 
[22:40:49] <robh>	 labvirt1014
[22:41:02] <robh>	 puppetmaster1001: Debian GNU/Linux 8 auto-installed on Wed Aug 24 22:45:07 UTC 2016. 
[22:41:09] <robh>	 but that doesnt mean thats when it took over as puppetmaster =P
[22:41:17] <robh>	 either way, post labvirt1014
[22:41:20] <chasemp>	 it seems like labvirt1015-18 have puppet keys pending
[22:41:29] <robh>	 yeah, but dont sign
[22:41:39] <robh>	 cuz we cannot login to them to enable puppet to run after singing
[22:41:43] <robh>	 signing
[22:43:20] <robh>	 and the existing labvirts all call into puppet normal
[22:43:24] <robh>	 so its something blocking ssh and ping
[22:46:41] <robh>	 chasemp: so worst case is now i'll just have to escalate this to netops for them to check out the settings on the routers
[22:46:53] <robh>	 but it seems like the installs ran fine, and the partitions during install seemed ok
[22:47:15] <chasemp>	 so you would do install-console <foo> next on puppetmaster1001
[22:47:19] <chasemp>	 is that right?
[22:47:45] <robh>	 basically install_console is just a wrapper for a new_install ssh key
[22:47:53] <robh>	 that we use for the initial login and to enable puppet and trigger a run
[22:47:57] <chasemp>	 I have the vague memory of something here and andrew asking folks about new installs on labs things
[22:47:58] <chasemp>	 right ok
[22:48:00] <robh>	 puppet is disabled on install
[22:48:13] <robh>	 yeah i also vaguely recall that
[22:48:20] <robh>	 and i thought whatever we did seemed broken
[22:48:29] <chasemp>	 andrewbogott: is there some known issue with new installs on labvirts?  I recall you looking at the problem when iron went away or something
[22:48:30] <robh>	 like i recall he moved the key somewhere else, but i think it was a server that i dont recall
[22:49:00] <robh>	 oh
[22:49:01] <robh>	 iron has it
[22:49:10] <robh>	 i recall now i didnt like this solution
[22:49:12] <robh>	 but it works.
[22:49:31] <robh>	 I think its better to firewall iron off from labs, it has no reason to touch it, and allow puppetmaster1001 ssh
[22:49:33] <chasemp>	 maybe I only thought iron went away but I'm remembering the email thread and general discussion
[22:49:37] <robh>	 but iron is 'ops bastion' so meh
[22:49:54] <robh>	 it no longer houses the private repo or anything
[22:50:00] <robh>	 but seems it still exists, not sure why
[22:50:11] <robh>	 iron is a Bastion host using two factor authentication (bastionhost::twofa)
[22:50:11] <robh>	 iron is a Experimental Yubico two factor authentication bastion (misc)
[22:50:22] <robh>	 oh well, it works around this issue
[22:50:42] <chasemp>	 robh: so you can do the first login from iron is that the deal?
[22:50:46] <robh>	 yep
[22:50:49] <chasemp>	 ok
[22:51:20] <chasemp>	 fuzzy memory for teh win
[22:52:20] <mutante>	 hehe! aha.. so iron solves it
[22:52:25] <mutante>	 gtk
[22:52:45] <robh>	 yeah, i just think its a bad solution
[22:52:53] <robh>	 seems no good reason for iron to talk to this vlan other than this
[22:53:01] <robh>	 while puppetmaster1001 already has to talk to it for its normal duties
[22:53:10] <robh>	 so allowing it additional ping and ssh access seems trivial from there versus iron.
[22:53:18] <robh>	 (does that make sense?)
[22:53:38] <chasemp>	 so I think that the conflation of prod acl and instance acl here is what causes this
[22:53:46] <chasemp>	 as in instances go through the labs-hosts vlan for transit
[22:53:55] <mutante>	 it seems like an ACL that allowed puppetmaster and install servers was changed
[22:53:56] <chasemp>	 and an acl there says no 22 ever to be careful
[22:54:09] <robh>	 oh, iron is public vlan
[22:54:13] <chasemp>	 right
[22:54:18] <mutante>	 aha
[22:54:44] <chasemp>	 I think this can be fixed down the road by separating out instance traffic from host traffic for labvirts totally
[22:54:57] <chasemp>	 or at least that should relieve the paranoia
[22:55:23] <chasemp>	 that's somewhere down the list of wishes
[22:55:47] <mutante>	 or we could suggest to put install-console on install servers
[22:56:14] <mutante>	 unless that is considered insecure
[22:56:21] <mutante>	 i mean officially, with puppet
[22:56:43] <chasemp>	 I would think anything private space prod is going to suffer the same issues
[22:56:52] <chasemp>	 gotta go eat :)
[22:56:58] <chasemp>	 good luck robh and thanks
[22:57:06] <robh>	 welcome
[23:08:08] <wikibugs>	 Tool-Labs-tools-Xtools, Community-Tech-Sprint: Convert xtools intuition to its own repository - https://phabricator.wikimedia.org/T165708#3274486 (kaldari) @Matthewrbowker: Can you explain this task? I believe the Intuition migration guide is about moving message keys out of the Intuition repo (as used t...
[23:08:40] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad, and 2 others: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3365759 (RobH) p:High>Normal
[23:22:23] <wikibugs>	 Tool-Labs-tools-Xtools, Community-Tech-Sprint: Create an XTools logo - https://phabricator.wikimedia.org/T167345#3365815 (kaldari)
[23:23:16] <wikibugs>	 Tool-Labs-tools-Xtools, Community-Tech-Sprint: Fix "Notice: Undefined index: allusers" in Adminstats when the wiki is unreachable - https://phabricator.wikimedia.org/T165707#3365819 (kaldari)
[23:27:39] <andrewbogott>	 robh: sorry for the delay in responding… those servers need to be set up from iron and not from the puppetmaster
[23:27:46] <andrewbogott>	 due to how their vlan is set up, I believe.
[23:27:55] <andrewbogott>	 oh, you're there already, great
[23:28:02] <andrewbogott>	 Sorry I wasn't around to save you the trouble earlier :(
[23:30:19] <wikibugs>	 Tool-Labs-tools-Xtools, Community-Tech-Sprint: Planning for Xtools beta - https://phabricator.wikimedia.org/T167217#3365834 (kaldari)
[23:30:37] <wikibugs>	 Tool-Labs-tools-Xtools, Community-Tech-Sprint: Planning for Xtools beta - https://phabricator.wikimedia.org/T167217#3321093 (kaldari)
[23:35:15] <wikibugs>	 Labs, cloud-services-team (Kanban): Rename labs-admin mailing list to cloud-admin - https://phabricator.wikimedia.org/T167155#3365859 (bd808)
[23:37:22] <wikibugs>	 Labs, User-bd808, cloud-services-team (Kanban): Consult with technical community on Cloud Services rebranding plan - https://phabricator.wikimedia.org/T165094#3365868 (bd808) Open>Resolved The on-wiki plan documentation has been updated based on feedback received from the consultation. See {T...
[23:38:15] <wikibugs>	 Labs, Horizon, MediaWiki-Vagrant, Patch-For-Review, and 2 others: Create MediaWiki Vagrant role for local devlopment of Horizon customizations - https://phabricator.wikimedia.org/T166006#3365875 (bd808) Open>Resolved Always more work to do, but the initial Horizon role is functional.
[23:38:29] <wikibugs>	 Labs, Labs-Infrastructure, Operations, ops-eqiad, and 2 others: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3365877 (RobH) a:RobH>Cmjohnson Chris:  Please wire up eth1 on these systems and label their ports on the switch.  Then you or I can take a look a...
[23:49:18] <wikibugs>	 Labs, Horizon, Patch-For-Review, cloud-services-team (Kanban): Fix watroles to work with new Puppet storage backend for Labs - https://phabricator.wikimedia.org/T151522#3365920 (bd808) Open>Resolved Live at https://tools.wmflabs.org/openstack-browser/puppetclass/. I also setup a redirect...
[23:49:53] <wikibugs>	 Labs, Labs-Infrastructure, Patch-For-Review, cloud-services-team (Kanban): Horizon puppet roles not cleared when instance is deleted - https://phabricator.wikimedia.org/T147878#3365922 (bd808) Open>Resolved