[00:27:09] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[00:33:42] PROBLEM - Puppet errors on tools-exec-1434 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[00:50:34] can somebody help me with mysql + grafana?
[00:53:16] Sagan: maybe. what are you trying to accomplish?
[00:55:20] bd808: I'm not sure if grafana does that on its own. I want to query a simple value, COUNT(value), and display the value in a time graph
[00:56:05] or do I need to create a kind of logging table and then query that one first?
[00:59:55] you want to use grafana to show data based on a mysql query?
[01:00:16] yeah, that's possible since 4.5
[01:00:19] they offer that
[01:00:26] it works for single values nicely
[01:01:22] to do a time series you would need to somehow store the point-in-time values
[01:01:48] ok, so it's not possible to just tell grafana a value and it would log the values at the time itself
[01:02:00] not that I'm aware of, no
[01:02:16] ok, thanks :)
[01:02:40] usually we would set up something that takes a measurement and then saves that to graphite or prometheus
[01:03:03] then use grafana to show the time series data from that source
[01:37:09] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:03:58] PROBLEM - Puppet errors on tools-worker-1020 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[02:08:42] RECOVERY - Puppet errors on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:15:40] PROBLEM - Puppet errors on tools-exec-1415 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[02:28:09] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:34:40] PROBLEM - Puppet errors on tools-exec-1434 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[02:43:59] RECOVERY - Puppet errors on tools-worker-1020 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:55:41] RECOVERY - Puppet errors on tools-exec-1415 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:03:08] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:24:10] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[03:39:42] RECOVERY - Puppet errors on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:49:11] PROBLEM - Puppet errors on tools-exec-1405 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[04:04:09] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:14:12] RECOVERY - Puppet errors on tools-exec-1405 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:00:12] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[05:15:05] (PS10) BryanDavis: Add rewritten crontab in Python [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[05:15:07] (PS1) BryanDavis: Convert list-user-databases to python3 [labs/toollabs] - https://gerrit.wikimedia.org/r/381385
[05:15:09] (PS1) BryanDavis: Convert jsub to python3 [labs/toollabs] - https://gerrit.wikimedia.org/r/381386
[05:23:06] (CR) BryanDavis: [C: -1] Add rewritten crontab in Python (2 comments) [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[05:56:22] Toolforge, Outreachy (Round-15): Outreachy - webservice microtask for Mridu_Bhatnagar - https://phabricator.wikimedia.org/T176018#3645368 (Mridu_Bhatnagar) Hi, @bd808, @madhuvishy, @Andrew, @srishakatux, I have completed the microtask assigned as a part of outreachy round 15 project. Improvement in th...
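Note on the 01:01-01:03 grafana exchange: the advice there is to take a measurement periodically and save it to graphite or prometheus, then graph that source. A minimal sketch of the "save it to graphite" step follows, assuming a Carbon listener that accepts Graphite's plaintext protocol (one "path value timestamp" line per data point, default port 2003). The host name, metric path and the value 42 are placeholders, not anything named in the channel.

```python
import socket
import time


def format_graphite_line(path, value, timestamp=None):
    """Build one data point in Graphite's plaintext format: path, value, epoch seconds."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_measurement(path, value, host="graphite.example.org", port=2003):
    """Send one data point to a Carbon plaintext listener.

    host is a placeholder (assumption); 2003 is Carbon's default plaintext port.
    """
    line = format_graphite_line(path, value)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical value, e.g. the result of a SELECT COUNT(value) query run from cron.
print(format_graphite_line("tools.mytool.row_count", 42, timestamp=1506470400), end="")
```

Running something like send_measurement() from a scheduled job would build up the point-in-time history that a single MySQL query cannot provide on its own.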
[06:06:25] (CR) jerkins-bot: [V: -1] Convert jsub to python3 [labs/toollabs] - https://gerrit.wikimedia.org/r/381386 (owner: BryanDavis)
[06:30:26] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1418 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[07:05:09] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:10:28] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1418 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:26:09] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[08:06:10] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:06:11] (PS1) Lokal Profil: Correct monument_article matching in th_th [labs/tools/heritage] - https://gerrit.wikimedia.org/r/381405 (https://phabricator.wikimedia.org/T176712)
[08:12:31] PROBLEM - Puppet errors on tools-exec-1423 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[08:21:50] (PS2) Lokal Profil: Correct monument_article matching in th_th and add url base [labs/tools/heritage] - https://gerrit.wikimedia.org/r/381405 (https://phabricator.wikimedia.org/T176712)
[08:27:10] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[08:31:11] cloud-services-team, Community-Tech, DBA, Security: Create core ip_changes view for replicas - https://phabricator.wikimedia.org/T173891#3645512 (Marostegui) I can see the table on the replicas, so what is pending is #cloud-services-team to create the view
[08:47:33] RECOVERY - Puppet errors on tools-exec-1423 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:00:02] (CR) Jean-Frédéric: [C: 2] Correct monument_article matching in th_th and add url base [labs/tools/heritage] - https://gerrit.wikimedia.org/r/381405 (https://phabricator.wikimedia.org/T176712) (owner: Lokal Profil)
[09:02:26] (Merged) jenkins-bot: Correct monument_article matching in th_th and add url base [labs/tools/heritage] - https://gerrit.wikimedia.org/r/381405 (https://phabricator.wikimedia.org/T176712) (owner: Lokal Profil)
[09:05:19] (CR) jenkins-bot: Correct monument_article matching in th_th and add url base [labs/tools/heritage] - https://gerrit.wikimedia.org/r/381405 (https://phabricator.wikimedia.org/T176712) (owner: Lokal Profil)
[09:18:02] hello! I need help deploying my tool on toolforge. I have access to my tool's service and wish to use the python webservice. According to the documentation, I have to place my project in $HOME/www/--- but no such www folder exists in my home directory!
[09:19:47] Is the path to the 'www' folder somewhere else? And are the paths $HOME and $home/my-account-name different?
[09:48:47] Cloud-Services, Continuous-Integration-Infrastructure (shipyard), Graphite: Grafana reports ALL docker mounts in a spammy way - https://phabricator.wikimedia.org/T177052#3645705 (Addshore)
[11:02:51] Hello! I am using kubernetes-webservice with python3. My tool-webservice displays 502 Bad Gateway
[11:03:03] please help me!
[11:09:53] I made some changes but it still shows 502 Bad Gateway!
[11:47:46] PROBLEM - Puppet errors on tools-exec-1416 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[12:07:07] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:16:08] Cloud-Services, Continuous-Integration-Infrastructure (shipyard), Graphite: Grafana reports ALL docker mounts in a spammy way - https://phabricator.wikimedia.org/T177052#3646024 (hashar) For {T1075} @fgiunchedi did a tweak to disable reporting diskI/O from partitions: ``` lang=ruby,name=modules/diam...
[12:17:04] Cloud-Services, Continuous-Integration-Infrastructure (shipyard), Graphite: Grafana reports ALL docker mounts in a spammy way - https://phabricator.wikimedia.org/T177052#3646030 (Addshore) That sounds like a good solution.
[12:17:08] Cloud-Services, Continuous-Integration-Infrastructure (shipyard), Graphite, User-Addshore: Grafana reports ALL docker mounts in a spammy way - https://phabricator.wikimedia.org/T177052#3646031 (Addshore)
[12:22:45] RECOVERY - Puppet errors on tools-exec-1416 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:54:55] (PS1) Hashar: Add /.tox to .gitignore [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/381435
[13:12:39] Cloud-Services, Wikidata, Patch-For-Review, User-Ladsgroup, Wikidata-Sprint: Open view for term_full_entity_id in wb_terms table in labs - https://phabricator.wikimedia.org/T167114#3646244 (Ladsgroup) Okay, the population is done and we can pick this up now. One thing to note that, in the fil...
[13:16:32] bd808: sorry I think I forgot about that crontab rewrite...
[13:32:46] (PS1) Lokal Profil: Drop non-url registrant_url [labs/tools/heritage] - https://gerrit.wikimedia.org/r/381441
[13:36:30] (CR) Jean-Frédéric: [C: 2] Drop non-url registrant_url [labs/tools/heritage] - https://gerrit.wikimedia.org/r/381441 (owner: Lokal Profil)
[13:38:16] (Merged) jenkins-bot: Drop non-url registrant_url [labs/tools/heritage] - https://gerrit.wikimedia.org/r/381441 (owner: Lokal Profil)
[13:39:46] (CR) jenkins-bot: Drop non-url registrant_url [labs/tools/heritage] - https://gerrit.wikimedia.org/r/381441 (owner: Lokal Profil)
[13:55:32] cloud-services-team (Kanban): lots of cloud-local puppetmasters broken - https://phabricator.wikimedia.org/T176645#3646385 (Andrew) Open>Resolved There are only a few left that are broken, and I've emailed all the owners.
[14:40:10] Cloud-VPS (Project-requests), User-Zppix: Request creation of Zppix-Wiki-AI VPS project - https://phabricator.wikimedia.org/T175846#3646498 (Zppix) Open>Resolved Closing as I have found another workaround with using ubuntu vm
[14:47:12] Cloud-VPS, cloud-services-team (Kanban), Continuous-Integration-Infrastructure, Nodepool, and 2 others: rabbitmq: Consume and log messages sent to notifications.error - https://phabricator.wikimedia.org/T175029#3646514 (Andrew) p:Triage>Normal
[14:49:28] Cloud-VPS, cloud-services-team (Kanban), Patch-For-Review: Set good availability-zone defaults for nova users - https://phabricator.wikimedia.org/T170447#3646516 (Andrew) Open>Resolved An equivalent to 375941 was merged as part of a larger refactor, and I'm pretty sure this is adequate for th...
[15:09:35] zhuyifei1999_: no worries. I had sort of forgotten about it too until I started touching other scripts in that package. :)
[15:09:46] Cloud-Services, Cloud-VPS, Operations, Patch-For-Review, Wikimedia-Incident: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#3646634 (Andrew) Open>Resolved This is as fixed as it's going to be. Any time there's a designate outage I...
[15:14:38] bd808: is there a list of commands to issue with sql? I mean, I saw some use sql --write; a list of -- is there anywhere?
[15:15:00] sql --help ?
[15:15:12] I don't know of a --write flag in it
[15:15:24] ie sql --write metawiki logs in as wikiadmin@db[number]
[15:15:37] oh... that's the production `sql` script which is completely different
[15:15:49] ah, okay :D
[15:16:10] wikiadmin != root
[15:16:12] right?
[15:16:16] on a production host the `--write` option picks a master server instead of a replica
[15:16:33] wikiadmin is the production database "owner"
[15:16:59] so wikiadmin can do everything
[15:17:00] that account can do schema changes and similar privileged operations
[15:17:50] oki
[15:19:08] If I had a time machine... /usr/bin/sql would be named /usr/bin/toolsql or something
[15:19:49] but I guess if I had a time machine I'd be on an island in a hammock instead of here doing this stuff
[15:20:03] I was about to say that
[15:22:17] bd808: cleaning wikitech again from that harasser
[15:22:20] :|
[15:22:27] :/
[15:25:33] Cloud-Services, cloud-services-team (Kanban), wikitech.wikimedia.org, Operations: Set up external DNS record for wikitech-static - https://phabricator.wikimedia.org/T164290#3646694 (Andrew) a:Andrew>None
[15:25:51] cloud-services-team (FY2017-18), Goal: Program 1 Outcome 4: VPS hosting - https://phabricator.wikimedia.org/T166396#3646696 (Andrew)
[15:25:53] cloud-services-team (FY2017-18), Goal, Patch-For-Review: Define a metric to track OpenStack system availability - https://phabricator.wikimedia.org/T167556#3646695 (Andrew) Open>Resolved
[15:29:31] Toolforge, Outreachy (Round-15): Outreachy - webservice microtask for Mridu_Bhatnagar - https://phabricator.wikimedia.org/T176018#3646698 (Mridu_Bhatnagar)
[15:29:56] Cloud-Services, cloud-services-team (Kanban), DBA: Prepare and check storage layer for hi.wikivoyage - https://phabricator.wikimedia.org/T173027#3516428 (bd808) I performed the other steps from https://wikitech.wikimedia.org/wiki/Add_a_wiki#Cloud_Services : * `/usr/local/sbin/maintain-meta_p --datab...
[15:32:11] https://wikitech.wikimedia.org/wiki/Special:UserRights/WiktCAPT <-- remove 'shell' here?
[15:32:26] or that'd remove him from all instances/tools?
[15:33:38] tabbycat: it would be a no-op. I removed them from the tools project.
[15:33:55] and from bastion
[15:34:04] shell is kind of outdated I think
[15:34:16] None of the ssh code cares
[15:34:43] it used to add you to the bastion project but that has changed on the backend
[15:35:25] bd808: do you know if vagrant's debian package will work on ubuntu?
[15:36:03] sure it will. It's a virtual machine
[15:36:26] it works on OSX, Windows, Ubuntu, Debian, RedHat, ...
[15:36:45] what would I download from vagrant if I'm running an ubuntu vm? bd808
[15:36:46] oh... you mean the .deb?
[15:38:32] I'm on the step to download vagrant from vagrantup; what do I download for it to work on an ubuntu system, bd808?
[15:44:38] Zppix: you need some vagrant package, either downloaded from vagrantup or from apt if one is already available there
[15:45:02] I think mw-vagrant works with 1.8+
[16:04:43] cloud-services-team: New replica hosts are dramatically slower than labsdb - https://phabricator.wikimedia.org/T177096#3646922 (MusikAnimal)
[16:07:13] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3646959 (bd808)
[16:37:34] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3647040 (Marostegui) p:Triage>Normal So these are the two query plans: ``` mysql:root@localhost [enwiki]> select @@hostname; +------------+ | @@h...
[16:40:00] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3647048 (Marostegui) I just realised that index is NOT on tables.sql, so it must have been added on the old labs server for some reason, so that is why i...
[16:48:28] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3646922 (bd808) Explain via https://quarry.wmflabs.org/query/21882 |id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra| | -- | -- | -- |...
[16:52:03] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3647075 (bd808) >>! In T177096#3647048, @Marostegui wrote: > I just realised that index is NOT on tables.sql, so it must have been added on the old labs...
[16:53:53] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3647077 (MusikAnimal) >>! In T177096#3647075, @bd808 wrote: > I think we should have all prod indices, but that it is also reasonable to have additional...
[16:56:33] cloud-services-team (FY2017-18), Developer-Relations, Goal: Program 4 Outcome 1: improve documentation - https://phabricator.wikimedia.org/T166401#3647095 (bd808)
[16:56:35] cloud-services-team (FY2017-18), Goal: Plan contract documentation work - https://phabricator.wikimedia.org/T168484#3647093 (bd808) Open>declined We are going to try a slightly different direction for this. Rather than an external contractor we are going to use existing Foundation staff in the in...
[16:56:58] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3646922 (jcrespo) > but that it is also reasonable to have additional indices that make non-MediaWiki queries more performant. I don't disagree, we can...
[17:03:14] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3647122 (Marostegui) >>! In T177096#3647096, @jcrespo wrote: >> but that it is also reasonable to have additional indices that make non-MediaWiki queries...
[17:44:59] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3647248 (Marostegui) So, thinking about it. Should we: 1) Create the index to unblock @MusikAnimal 2) Think a way of documenting, puppetizing or someth...
[17:52:41] Cloud-VPS: DNS resolution chosing IPv6 addrs on hosts with only link-local IPv6 addresses - https://phabricator.wikimedia.org/T176891#3640404 (chasemp) Let's start with "Disable IPv6 entirely on the VM using /etc/sysctl.conf" and see how it goes? Let's revert https://gerrit.wikimedia.org/r/#/c/380318/ too t...
[18:11:43] Toolforge: Catchpoint tests failing under Toolforge availability product - https://phabricator.wikimedia.org/T177103#3647298 (chasemp)
[18:11:49] Toolforge: Catchpoint tests failing under Toolforge availability product - https://phabricator.wikimedia.org/T177103#3647310 (chasemp) p:Triage>Normal
[18:12:35] Toolforge: Catchpoint tests failing under Toolforge availability product - https://phabricator.wikimedia.org/T177103#3647298 (chasemp) @madhuvishy is it possible the labsdb* failing tests relates back to the rewrite for account handling? I'm wondering if these checks use creds that got clobbered.
[18:13:28] Toolforge: Catchpoint tests failing under Toolforge availability product - https://phabricator.wikimedia.org/T177103#3647313 (chasemp) I deactivated the failing tests for now until we debug them.
[18:14:39] Cloud-VPS: DNS resolution chosing IPv6 addrs on hosts with only link-local IPv6 addresses - https://phabricator.wikimedia.org/T176891#3647314 (bd808) Should we limit the disable to just the 2 tools-static hosts?
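Note on T176891: the 17:52 comment proposes disabling IPv6 on the VM via /etc/sysctl.conf. A rough sketch of what such a fragment usually looks like on Linux is below, rendered from Python so it can be checked. The two keys are standard Linux sysctl names; whether the actual change used exactly these keys, or applied them per host, is not stated in the log and is an assumption here.

```python
# Standard Linux sysctl keys that turn IPv6 off for all current interfaces
# and for interfaces created later. Choosing these two is an assumption;
# the log only says "disable IPv6 entirely on the VM using /etc/sysctl.conf".
SYSCTL_KEYS = (
    "net.ipv6.conf.all.disable_ipv6",
    "net.ipv6.conf.default.disable_ipv6",
)


def render_disable_ipv6(keys=SYSCTL_KEYS):
    """Return sysctl.conf lines that set each key to 1 (IPv6 disabled)."""
    return "".join(f"{key} = 1\n" for key in keys)


print(render_disable_ipv6(), end="")
```

The rendered lines would go into /etc/sysctl.conf (or a file under /etc/sysctl.d/) and take effect after `sysctl -p` or a reboot.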
[18:15:52] Cloud-VPS: DNS resolution chosing IPv6 addrs on hosts with only link-local IPv6 addresses - https://phabricator.wikimedia.org/T176891#3647329 (chasemp) I'm kind of thinking at least all of Toolforge would make sense but let's see if it has the effect we think in this case before we worry about the logistics...
[18:16:07] Cloud-VPS: DNS resolution chosing IPv6 addrs on hosts with only link-local IPv6 addresses - https://phabricator.wikimedia.org/T176891#3647330 (chasemp) p:Triage>Normal
[18:19:32] Toolforge: Catchpoint tests failing under Toolforge availability product - https://phabricator.wikimedia.org/T177103#3647332 (chasemp) a:madhuvishy
[18:19:45] Data-Services, DBA: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3647333 (bd808) I'd be ok with stalling the index fix for a few days if we can get something properly designed and built relatively quickly. I would be g...
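Note on T177096: the thread compares query plans between hosts and concludes that an index present on the old server is missing on the new replicas. The effect can be reproduced in miniature as sketched below. This swaps in SQLite's EXPLAIN QUERY PLAN for the MariaDB EXPLAIN used in the ticket, and the table, column and index names are made up for illustration; the log does not say which index was involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for a wiki table queried by a tool.
conn.execute("CREATE TABLE demo_log (id INTEGER PRIMARY KEY, actor_name TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO demo_log (actor_name, action) VALUES (?, ?)",
    [(f"user{i % 50}", "edit") for i in range(1000)],
)

QUERY = "SELECT COUNT(*) FROM demo_log WHERE actor_name = ?"


def query_plan(connection):
    """Return the plan steps SQLite reports for QUERY as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("user7",)).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan_without_index = query_plan(conn)   # full table scan
conn.execute("CREATE INDEX actor_name_idx ON demo_log (actor_name)")
plan_with_index = query_plan(conn)      # lookup through the new index

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The same query goes from scanning every row to an index search once the index exists, which is the kind of difference the two query plans in the ticket were showing.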
[18:21:04] PROBLEM - Puppet errors on tools-exec-1413 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[18:22:20] !log rcm Neon: Running update
[18:22:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL
[19:01:05] RECOVERY - Puppet errors on tools-exec-1413 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:38:30] Cloud-VPS (Project-requests), User-Zppix: Request creation of Zppix-Wiki-AI VPS project - https://phabricator.wikimedia.org/T175846#3647506 (Aklapper) Resolved>declined Nothing [[ https://www.mediawiki.org/wiki/Bug_management/Bug_report_life_cycle| resolved here ]]
[19:42:32] (PS1) Hashar: Support python 3.5 test, bump beautifulsoup4 [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/381492
[19:42:59] (CR) jerkins-bot: [V: -1] Support python 3.5 test, bump beautifulsoup4 [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/381492 (owner: Hashar)
[19:49:46] !log tools migration tools-clushmaster-01 to labvirt1015
[19:49:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[19:51:50] PROBLEM - Host tools-clushmaster-01 is DOWN: CRITICAL - Host Unreachable (10.68.18.81)
[19:56:23] RECOVERY - Host tools-clushmaster-01 is UP: PING OK - Packet loss = 0%, RTA = 1.50 ms
[19:58:34] what's the process to ask for a quota increase on a given cloud project?
[19:59:14] (if any) bd808 ^
[19:59:35] (PS2) Hashar: Support python 3.5 test, bump beautifulsoup4 [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/381492
[20:09:13] gilles: https://phabricator.wikimedia.org/project/view/2880/
[20:09:25] Zppix: thank you
[20:09:29] np gilles
[20:11:17] !log phabricator puppet-phabricator & phabricator (phab-01.wmflabs.org) will be going down soon for a reboot.
[20:11:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Phabricator/SAL
[20:13:50] !log git Performing a rolling restart of all hosts.
[20:13:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL
[20:16:00] !log git All hosts back up, and operational. No reported errors.
[20:16:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL
[20:21:26] paladox: mind if I shut down gerrit-test3.git for a few minutes? I need to migrate it to a different host
[20:21:33] yep
[20:21:56] paladox, 'yep' you mind, or 'yep' go ahead?
[20:22:04] yep go ahead :)
[20:22:21] thanks
[20:22:39] you're welcome :)
[20:22:41] andrewbogott: are you doing all .git hosts?
[20:22:52] !log git gerrit-test3 going down for maint.
[20:22:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL
[20:22:56] Zppix: nope, just that one for now
[20:23:06] (evacuating a virt host)
[20:23:08] andrewbogott: ok :)
[20:25:10] SMalyshev: I need to clear out a virt host and need to migrate wdqs-deploy to a new host. can I do that now, or do you need to notify/warn people?
[20:25:40] andrewbogott: if you do need to do gerrit-mysql please hold off because I will have to notify people, just fyi
[20:26:46] for the moment this is all I care about: https://phabricator.wikimedia.org/P6060
[20:30:55] ebernhar|lunch: mind if I cause some downtime on search-jessie.search.eqiad.wmflabs?
[20:38:04] Cloud-VPS, cloud-services-team (Kanban), Continuous-Integration-Infrastructure, Nodepool, and 2 others: figure out if nodepool is overwhelming rabbitmq and/or nova - https://phabricator.wikimedia.org/T170492#3647602 (Andrew)
[20:38:07] Cloud-VPS, cloud-services-team (Kanban), Continuous-Integration-Infrastructure, Nodepool, and 2 others: rabbitmq: Consume and log messages sent to notifications.error - https://phabricator.wikimedia.org/T175029#3647601 (Andrew) Open>Resolved
[20:38:09] andrewbogott: is that just downtime or any change?
[20:38:28] SMalyshev: just downtime + a reboot
[20:38:36] andrewbogott: no problem then, go ahead
[20:38:55] just ping me when it's done so I ensure it came up cleanly (which should happen but you know...)
[20:39:33] great, thanks
[20:40:14] !log wikidata-query migrating wdqs-deploy to a labvirt1015, rebooting
[20:40:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikidata-query/SAL
[20:41:20] SMalyshev: looks like there's quite a bit on that box, might take a few hours to finish migrating
[20:41:34] Currently predicting 2:30
[20:41:51] I might not be around when it's done but it should start up again all on its own, so you can have a look then
[20:42:03] ok
[20:42:17] yeah that one has a lot of data there
[20:42:27] andrewbogott: why the migration?
[20:43:14] Zppix: We had a couple of servers with a particular kernel version (4.4.0-83) fail to come up after a reboot.
[20:43:37] labvirt1016 is running that kernel. I don't know for sure that it's cursed but just to be on the safe side I'm going to rebuild with a kernel we've had more experience with
[20:45:47] cloud-services-team (FY2017-18), Goal, Patch-For-Review, User-bd808: Perform core Cloud Services rebranding - https://phabricator.wikimedia.org/T168480#3647642 (Quiddity)
[20:45:49] Cloud-Services, Design: Update Cloud Services logo+wordmark - https://phabricator.wikimedia.org/T174094#3647639 (Quiddity) Open>Resolved a:Quiddity I've updated all the relevant images. I used Semi-bold Montserrat for the wording, to make it legible at small sizes. I kept the logo at the or...
[20:46:04] cloud-services-team (FY2017-18), Goal, Patch-For-Review, User-bd808: Perform core Cloud Services rebranding - https://phabricator.wikimedia.org/T168480#3365842 (Quiddity)
[20:46:33] cloud-services-team (FY2017-18), Goal, User-bd808: Perform core Cloud Services rebranding - https://phabricator.wikimedia.org/T168480#3365855 (Quiddity)
[20:50:37] cloud-services-team (Kanban), Patch-For-Review: Replace kernel and reboot labvirt1015, 1016, 1017, 1018 - https://phabricator.wikimedia.org/T176044#3647662 (Andrew) I've rebuilt labvirt1015, 1017 and 1018 (and the labtestvirts) with 4.4.0-81. So now all of our virt nodes are running that kernel except f...
[21:01:22] cloud-services-team (Kanban): CamelCase vs. VPS instance naming - https://phabricator.wikimedia.org/T176757#3647688 (Andrew) p:Triage>Normal
[21:04:45] andrewbogott: yes that's no problem
[21:05:01] ebernhardson: great, thanks
[21:23:17] Tools: Error retrieving token: mwoauthdatastore-request-token-not-found on commonshelper - https://phabricator.wikimedia.org/T177116#3647720 (Zoranzoki21)
[21:24:16] Tools: Error retrieving token: mwoauthdatastore-request-token-not-found for magnus tools - https://phabricator.wikimedia.org/T177116#3647732 (Zoranzoki21)
[21:35:18] Tool-stewardbots, WorkType-NewFunctionality: Let stewardbot display temporary rightchanges too - https://phabricator.wikimedia.org/T164204#3647759 (Melos)
[22:12:45] cloud-services-team (FY2017-18), Goal, User-bd808: Perform core Cloud Services rebranding - https://phabricator.wikimedia.org/T168480#3647837 (bd808)
[22:21:10] Cloud-VPS, MediaWiki-Vagrant: Wikimedia SMTP server does not work with Labs-Vagrant - https://phabricator.wikimedia.org/T117391#3647866 (bd808)
[22:34:58] PROBLEM - Puppet errors on tools-worker-1020 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[22:35:41] cloud-services-team (FY2017-18), Goal: Program 1 Outcome 4: VPS hosting - https://phabricator.wikimedia.org/T166396#3647885 (bd808)
[22:35:43] Cloud-Services, cloud-services-team (FY2017-18), Goal: Setup wikitech, horizon, and striker on new labweb hardware - https://phabricator.wikimedia.org/T168470#3647884 (bd808)
[22:38:41] Cloud-Services, cloud-services-team (FY2017-18), Documentation, Goal: Form a WMCS Documentation Special Interest Group - https://phabricator.wikimedia.org/T177123#3647887 (bd808)
[22:38:57] Cloud-Services, cloud-services-team (FY2017-18), Documentation, Goal: Form a WMCS Documentation Special Interest Group - https://phabricator.wikimedia.org/T177123#3647899 (bd808)
[22:38:59] cloud-services-team (FY2017-18), Developer-Relations, Goal: Program 4 Outcome 1: improve documentation - https://phabricator.wikimedia.org/T166401#3647900 (bd808)
[22:43:13] cloud-services-team (FY2017-18), Goal: mprove "My first Flask OAuth tool" tutorial until it can be used as an example of a "good" tutorial - https://phabricator.wikimedia.org/T177124#3647906 (bd808)
[22:43:24] cloud-services-team (FY2017-18), Goal: Improve "My first Flask OAuth tool" tutorial until it can be used as an example of a "good" tutorial - https://phabricator.wikimedia.org/T177124#3647906 (bd808)
[22:45:23] Data-Services, cloud-services-team (FY2017-18), DBA, Goal: Decommission labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T142807#3647923 (bd808)
[22:45:25] Data-Services, cloud-services-team (FY2017-18), Goal: Program 7 Outcome 3: data services - https://phabricator.wikimedia.org/T166402#3647922 (bd808)
[22:47:59] Data-Services, cloud-services-team (FY2017-18), DBA, Goal: Migrate all users to new Wiki Replica cluster and decommission old hardware - https://phabricator.wikimedia.org/T142807#3647943 (bd808)
[22:58:46] (PS3) Lokal Profil: Harvest coordinate template and drop lat, lon [labs/tools/heritage] - https://gerrit.wikimedia.org/r/380939 (https://phabricator.wikimedia.org/T176845)
[23:00:05] Tools: SVGCheck tool unresponsive - https://phabricator.wikimedia.org/T177125#3647949 (Offnfopt)
[23:00:11] Toolforge, cloud-services-team (FY2017-18), Goal: 2017 Toolforge user survey - https://phabricator.wikimedia.org/T177126#3647961 (bd808)
[23:00:53] Toolforge, cloud-services-team (FY2017-18), Goal: 2017 Toolforge user survey - https://phabricator.wikimedia.org/T177126#3647976 (bd808)
[23:00:54] cloud-services-team (FY2017-18), Goal: Program 10 Outcome 1: PaaS is easy to use - https://phabricator.wikimedia.org/T166403#3647975 (bd808)
[23:01:48] Tools: SVGCheck tool unresponsive - https://phabricator.wikimedia.org/T177125#3647977 (Offnfopt)
[23:03:06] Cloud-Services, Toolforge: Unconfirm account email addresses for Wikitech accounts that bounced during 2016 survey mailings - https://phabricator.wikimedia.org/T149824#3647979 (bd808)
[23:03:08] Cloud-Services, Toolforge, MediaWiki-extensions-WikimediaMaintenance: Make maintance script for sending annual survey emails - https://phabricator.wikimedia.org/T148783#3647980 (bd808)
[23:03:10] Toolforge, cloud-services-team (FY2017-18), Goal: 2017 Toolforge user survey - https://phabricator.wikimedia.org/T177126#3647978 (bd808)
[23:03:15] SMalyshev: migration is done now, thanks
[23:06:08] Cloud-Services, cloud-services-team (FY2017-18), Documentation, Goal: Form a WMCS Documentation Special Interest Group - https://phabricator.wikimedia.org/T177123#3647986 (bd808) p:Triage>Normal a:srodlund
[23:06:31] cloud-services-team (FY2017-18), Goal: Improve "My first Flask OAuth tool" tutorial until it can be used as an example of a "good" tutorial - https://phabricator.wikimedia.org/T177124#3647988 (bd808) a:srodlund
[23:07:38] Cloud-Services, cloud-services-team (FY2017-18), Goal: Setup wikitech, horizon, and striker on new labweb hardware - https://phabricator.wikimedia.org/T168470#3647989 (bd808) p:Triage>Normal a:Andrew
[23:08:46] Tools: Error retrieving token: mwoauthdatastore-request-token-not-found for magnus tools - https://phabricator.wikimedia.org/T177116#3647992 (Aklapper) Please see T174730#3571842 and also provide a link to "magnus tools for commons".
[23:11:16] Toolforge, cloud-services-team (FY2017-18), Goal: Promote Toolforge Tools and their maintainers within Wikimedia communities - https://phabricator.wikimedia.org/T177127#3647994 (bd808)
[23:11:52] cloud-services-team (FY2017-18), Goal: Program 10 Outcome 3: Outreach - https://phabricator.wikimedia.org/T166406#3648008 (bd808)
[23:11:54] Toolforge, cloud-services-team (FY2017-18), Goal: Promote Toolforge Tools and their maintainers within Wikimedia communities - https://phabricator.wikimedia.org/T177127#3647994 (bd808)
[23:12:36] Tools, cloud-services-team, Community-Liaisons (Oct-Dec 2017): Find and promote tools and their authors - https://phabricator.wikimedia.org/T176677#3648012 (bd808)
[23:12:38] Toolforge, cloud-services-team (FY2017-18), Goal: Promote Toolforge Tools and their maintainers within Wikimedia communities - https://phabricator.wikimedia.org/T177127#3647994 (bd808)
[23:13:16] Toolforge, Tools, cloud-services-team (FY2017-18), Community-Liaisons (Oct-Dec 2017): Find and promote tools and their authors - https://phabricator.wikimedia.org/T176677#3633534 (bd808)
[23:14:57] RECOVERY - Puppet errors on tools-worker-1020 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:15:46] andrewbogott: thanks!
[23:28:59] andrewbogott: hey hey! Is it known that labsdb1001's switch port is negotiating at 100M (and not 1G)?
[23:40:32] Toolforge, cloud-services-team (FY2017-18), Goal: 2017 Toolforge user survey - https://phabricator.wikimedia.org/T177126#3648040 (bd808) p:Triage>Normal