[00:22:40] 10Cloud-VPS, 10cloud-services-team (Kanban): Create banner image for Wikimedia Cloud VPS - https://phabricator.wikimedia.org/T177442#3692574 (10Quiddity) Folder/directory locations: Old: https://tools-static.wmflabs.org/static/logos/powered-by-tool-labs.png New: https://tools-static.wmflabs.org/toolforge/banne... [00:23:03] 10cloud-services-team (FY2017-18), 10Goal, 10Patch-For-Review, 10User-bd808: Perform core Cloud Services rebranding - https://phabricator.wikimedia.org/T168480#3692575 (10Quiddity) [00:46:32] (03PS3) 10Paladox: Gerrit: Replace certificates with tokens for its-phabricator [labs/private] - 10https://gerrit.wikimedia.org/r/384902 (https://phabricator.wikimedia.org/T178385) [00:48:29] 10Tool-fatameh: Fails to make items where journal is unclear because of duplicate ISSN - https://phabricator.wikimedia.org/T168362#3362348 (10Daniel_Mietchen) I would suggest to start the items without a "published in" statement, and to build a tool/ Wikidata game/ Mix'n match catalog for reconciling those misma... [02:02:06] 10Data-Services, 10DBA, 10Epic: Labs database replica drift - https://phabricator.wikimedia.org/T138967#3692624 (10bd808) [02:09:38] 10Data-Services, 10DBA, 10Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#3692628 (10bd808) [02:09:41] 10Cloud-Services, 10DBA, 10Operations, 10Tracking: Database replication problems - production and labs (tracking) - https://phabricator.wikimedia.org/T50930#3692629 (10bd808) [02:09:45] 10Data-Services, 10DBA, 10Epic: Labs database replica drift - https://phabricator.wikimedia.org/T138967#3692625 (10bd808) 05Open>03Resolved a:03jcrespo The main description here and https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database/Replica_drift have been updated to reflect our confidence tha... 
[02:17:18] 10Data-Services, 10DBA, 10Epic: Labs database replica drift - https://phabricator.wikimedia.org/T138967#3692637 (10bd808) [02:17:20] 10Data-Services: Data missing from labs replica of enwiki.imagelinks - https://phabricator.wikimedia.org/T172567#3692633 (10bd808) 05Open>03declined Use `enwiki.{analytics,web}.db.svc.eqiad.wmflabs` instead of `enwiki.labsdb`. See also {T142807} [02:22:12] 10Cloud-Services, 10DBA, 10Operations, 10Tracking: Database replication problems - production and labs (tracking) - https://phabricator.wikimedia.org/T50930#3692639 (10bd808) [02:25:46] 10Data-Services, 10DBA, 10Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#3692641 (10bd808) [02:29:20] 10Data-Services, 10DBA, 10Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#3692646 (10bd808) 05Open>03declined a:03jcrespo Tracking task deprecated. Please file tasks tagged as #data-services instead. [02:34:45] 10Data-Services, 10cloud-services-team (Kanban), 10Community-Tech, 10DBA, 10Security: Create core ip_changes view for replicas - https://phabricator.wikimedia.org/T173891#3692653 (10bd808) 05Open>03Resolved a:03Andrew @andrew merged my config patch and ran the script to create the views. ``` $ sql... [02:52:54] 10Data-Services, 10cloud-services-team (Kanban), 10DBA, 10User-bd808: Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3692662 (10bd808) >>! In T177223#3651125, @jcrespo wrote: > There is already 5 related things that, even nothing to do with this,... 
[02:54:22] 10Data-Services, 10cloud-services-team (Kanban), 10DBA, 10User-bd808: Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3692664 (10bd808) From {rOPUP2b2050646943e3f7dd03370692b90aba6fdda669} we also now have the `modules/role/files/labs/db/views/extr... [03:00:12] 10Data-Services, 10cloud-services-team (Kanban), 10DBA, 10User-bd808: Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3692668 (10bd808) Deeper inspection by me is currently blocked by {T178128}, but my intent is to run a tool like [[https://dev.mys... [03:00:31] 10Data-Services, 10cloud-services-team (Kanban), 10DBA, 10User-bd808: Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3692670 (10bd808) 05Open>03stalled [03:00:33] 10Data-Services, 10DBA, 10Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3692671 (10bd808) [06:21:39] 10Cloud-Services, 10Outreachy (Round-15): Proposal: Improvements for the Toolforge 'webservice' command - https://phabricator.wikimedia.org/T177603#3664370 (10srishakatux) @Sowjanyavemuri Hello, I have reached out to Outreachy organizers, and I'm trying to confirm your eligibility based on your responses. I ho... [07:45:14] 10Toolforge: Node.js on gridengine - https://phabricator.wikimedia.org/T166830#3692905 (10edsu) Ok, thanks for the details, this helps a lot. So it sounds like the way to move forward is to start using Kubernetes which I wanted to learn about anyway, and now I have a reason. 
[07:46:04] 10Toolforge: Node.js on gridengine - https://phabricator.wikimedia.org/T166830#3692906 (10edsu) 05Open>03Resolved [09:14:52] 10cloud-services-team (Kanban), 10Analytics: Remove logging from labs for schema https://meta.wikimedia.org/wiki/Schema:CommandInvocation - https://phabricator.wikimedia.org/T166712#3693022 (10elukey) @Nuria Both tables right? ``` +-------------------------------------+ | Tables_in_log (CommandInvocation%) |... [09:15:07] 10cloud-services-team (Kanban), 10Analytics, 10User-Elukey: Remove logging from labs for schema https://meta.wikimedia.org/wiki/Schema:CommandInvocation - https://phabricator.wikimedia.org/T166712#3693023 (10elukey) [09:19:26] 10Horizon, 10User-Addshore: Applied puppet classes not appearing in horizon for integration-slave-docker-c2-m4-d40-1005.integration.eqiad.wmflabs - https://phabricator.wikimedia.org/T178409#3693038 (10Addshore) [09:31:21] 10Tools: Long running queries from pltools unlikely to finish - https://phabricator.wikimedia.org/T178459#3693048 (10jcrespo) [10:19:05] 10Data-Services, 10cloud-services-team (Kanban), 10Community-Tech, 10DBA, 10Security: Create core ip_changes view for replicas - https://phabricator.wikimedia.org/T173891#3693202 (10IKhitron) [[https://www.mediawiki.org/wiki/Manual:Ip_changes_table|Documentation]]. [13:17:05] 10Toolforge: k8s nodes sometimes getting bad token value from hiera - https://phabricator.wikimedia.org/T177944#3693743 (10chasemp) tldr from IRC. When the every 30m cron is triggered to update the labs/private repo on the Toolforge master (or probably any project specific master) it seems rebase sets aside the... [13:32:15] 10Cloud-VPS (Project-requests): Request creation of reading-lists VPS project - https://phabricator.wikimedia.org/T178110#3693771 (10chasemp) 05Open>03Resolved Created with @tgr as project admin. 
[13:34:39] Cloud-VPS (Quota-requests): Request increased quota for mwstake Cloud VPS project - https://phabricator.wikimedia.org/T178012#3693774 (chasemp) Open>Resolved You now have 1 floating IP :) [13:40:27] Cloud-VPS (Quota-requests): Request increased quota for cyberbot Cloud VPS project - https://phabricator.wikimedia.org/T178332#3693816 (chasemp) Open>Resolved I added one large instance worth of quota to the existing resources: > | Name | RAM | Disk | VCPU > | m1.large | 8192 | 80 |... [14:02:24] Toolforge, cloud-services-team (Kanban): Elasticsearch credential request for strephit - https://phabricator.wikimedia.org/T178310#3693974 (chasemp) Open>Resolved a:chasemp @Kiailandi the credentials should be in the Tools home at /data/project/strephit/.elasticsearch.ini [14:04:17] !log tools add strephit creds to elasticsearch per T178310 [14:04:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [14:04:22] T178310: Elasticsearch credential request for strephit - https://phabricator.wikimedia.org/T178310 [14:10:03] Cloud-VPS (Quota-requests): Request increased quota for cyberbot Cloud VPS project - https://phabricator.wikimedia.org/T178332#3693987 (Cyberpower678) I have now spawned a large instance named "cyberbot-db-01". Thank you for the increase. [14:10:47] Cloud-VPS (Quota-requests): Request increased quota for mwstake Cloud VPS project - https://phabricator.wikimedia.org/T178012#3693988 (CCicalese_WMF) Thank you! 
[14:29:30] Technical Advice IRC meeting starting in 30 minutes in channel #wikimedia-tech, hosts: @addshore & @Christoph_Jauera_(WMDE) - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting [14:55:35] Technical Advice IRC meeting starting now in channel #wikimedia-tech, hosts: @addshore & @Christoph_Jauera_(WMDE) - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting [15:03:05] 10Cloud-Services, 10VPS-project-Phabricator, 10Operations, 10Patch-For-Review: spam from phabricator in labs - https://phabricator.wikimedia.org/T166322#3694113 (10Dzahn) exim-ganglia stats can/should be removed from everything, which will also close this. also, i haven't received any of this in a long ti... [15:03:11] 10Cloud-Services, 10VPS-project-Phabricator, 10Operations, 10Patch-For-Review: spam from phabricator in labs - https://phabricator.wikimedia.org/T166322#3694114 (10Dzahn) p:05High>03Normal [15:03:41] 10Cloud-Services, 10VPS-project-Phabricator, 10Operations, 10Patch-For-Review: spam from phabricator in labs - https://phabricator.wikimedia.org/T166322#3292437 (10Dzahn) a:03Dzahn [15:06:34] 10Toolforge, 10Mail, 10monitoring: Monitor mail system in Graphite - https://phabricator.wikimedia.org/T71072#3694149 (10Dzahn) [15:08:15] 10Cloud-Services, 10VPS-project-Phabricator, 10Operations, 10Patch-For-Review: spam from phabricator in labs - https://phabricator.wikimedia.org/T166322#3694164 (10Paladox) @Dzahn yep, I’ve applied them manually on the puppet master and there somewhere in the git log. [15:08:25] 10Toolforge, 10Mail, 10monitoring: Monitor mail system in Graphite - https://phabricator.wikimedia.org/T71072#724852 (10Dzahn) >>! In T71072#724872, @scfc wrote: > (Gerrit change #143111 added a collector for exim, but I don't see it at http://graphite.wmflabs.org/ under tools-mail.) https://graphite-labs.w... 
[15:08:44] 10Cloud-Services, 10Mail, 10monitoring: Monitor mail system in Graphite - https://phabricator.wikimedia.org/T71072#3694168 (10Dzahn) [15:08:57] 10Toolforge, 10Mail, 10monitoring: Monitor mail system in Graphite - https://phabricator.wikimedia.org/T71072#724852 (10Dzahn) [15:14:24] 10cloud-services-team (Kanban), 10DC-Ops, 10Operations, 10ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3694183 (10Cmjohnson) @bd808 I swapped the CPU's to see if the error follows the CPU. The replacement that I put in there was refurbished so there is a possibility it was ba... [15:33:36] bd808: are you you awake? [15:33:49] possibly [15:34:10] I may be lucid dreaming the work I am doing now [15:34:17] HAHA [15:34:40] So I spawned my new DB instance. [15:34:44] 10cloud-services-team (FY2017-18), 10Operations, 10Puppet, 10User-Joe: Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3694242 (10herron) [15:34:57] I'm questioning if 80GB is enough though. [15:35:57] But the actual question is do I need to sudo apt-get install mysql? or what do I need to do to get this server going? [15:36:27] Your current usage on tools.labsdb is 49.6 GB. You should have some headroom, but not tons [15:37:00] I'd feel more comfortable with 120GB, but okay [15:37:49] If you get everything up and running and migrated over we can talk about how to get you more disk. Disk is our most over subscribed resource today so we try to be pretty conservative handing it out [15:38:49] bd808: well we could steal some from IABot's exec node. It only needs the CPU and RAM for the most part. [15:39:13] For the "where do I start" question... the answer is in large part that you need to decide what you are aiming for in the long term. [15:39:46] bd808: pm? [15:39:49] I would recommend that you investigate how to make everything setup via Puppet so that you can rebuild the service easily. 
[15:40:01] Okay [15:40:12] There are existing Puppet roles for setting up MySQL/MariaDB servers [15:40:24] I haven't really used them myself [15:40:25] * Cyberpower678 knows nothing about puppet [15:40:48] This is largely my concern about you migrating to your own VMs for this [15:40:54] 10Toolforge, 10cloud-services-team (Kanban): Elasticsearch credential request for strephit - https://phabricator.wikimedia.org/T178310#3694263 (10Kiailandi) >>! In T178310#3693974, @chasemp wrote: > @Kiailandi the credentials should be in the Tools home at /data/project/strephit/.elasticsearch.ini Thank you v... [15:41:05] bd808: I can learn. :p [15:41:12] We will be able to give a small amount of help, but really only a small amount [15:41:31] Projects are largely on their own to keep things up and running [15:42:16] we try very hard to make sure the infrastructure is stable, but can't take ownership or responsibility of the VMs that run on that infrastructure [15:42:47] 10cloud-services-team: Update VPS puppetmasters to 4.8 or newer - https://phabricator.wikimedia.org/T178508#3694268 (10Andrew) [15:43:13] bd808: I get that, so I'm willing to learn what's needed to do that. My other 2 instances are stable. [15:43:36] It's good experience/ [15:44:13] 10cloud-services-team: Update VPS puppetmasters to 4.8 or newer - https://phabricator.wikimedia.org/T178508#3694295 (10Andrew) The future parser has few complaints, so we're ready to move on to actual upgrade testing. http://puppet-compiler-tools.wmflabs.org/puppet-compiler-tools/666/index-future.html http://pu... [15:45:26] * bd808 looks to see if there is a basic database server role [15:45:53] 10cloud-services-team: Upgrade puppetmaster on toolsbeta and test - https://phabricator.wikimedia.org/T178510#3694297 (10Andrew) [15:46:21] 10cloud-services-team (Kanban): Update VPS puppetmasters to 4.8 or newer - https://phabricator.wikimedia.org/T178508#3694268 (10Andrew) [15:48:05] there is a "role::mariadb" and "role::simplelamp". 
The latter might be easier to start with. The mariadb roles look to be pretty production server specific. [15:49:35] I would very much recommend trying to find a co-maintainer to work with and maybe explicitly look for someone with reasonable Linux system admin experience [15:49:53] Linux is fun [15:49:59] :p [15:53:48] cloud-services-team: Upgrade puppetmaster on toolsbeta and test - https://phabricator.wikimedia.org/T178510#3694318 (Andrew) cc: everyone who has been active in the toolsbeta project Two things: 1) Is anyone doing anything particularly active/current in toolsbeta that will be disrupted if this upgrade goe... [15:59:09] bd808: it appears simplelamp is not compatible with the latest images according to mutante [15:59:35] somebody should fix that :) [16:00:10] or work on a basic mariadb role which might be more broadly useful [16:02:18] * Cyberpower678 's head is spinning going over the puppet manual. [16:03:04] * Cyberpower678 prefers hands-on learning [16:04:16] Tools: Long running queries from pltools unlikely to finish - https://phabricator.wikimedia.org/T178459#3694344 (jcrespo) The user did not respond but the queries kept retrying, probably unattended, I am going to kill them until I get his/her attention. [16:06:06] So I don't waste any more of bd808's time, can someone who regularly uses puppet educate me? [16:06:24] Puppet is apparently something I should understand. [16:07:42] Cyberpower678: I can answer questions, but I cannot train you [16:08:53] jynus: for now I just need basic information. I just spawned a new instance so I can migrate my DB off the shared tools DB host [16:09:10] bd808 suggested using puppet. [16:09:14] ah [16:09:16] yes [16:09:27] we have some generic classes for that [16:09:39] the mysql one is outdated because no one uses it [16:09:47] but it is the canonical one [16:09:52] I tried to learn from the docs, but all of it seems to talk about installing puppet. 
[16:09:54] on production we use one called mariadb [16:10:06] He suggested "role::mariadb" and "role::simplelamp" [16:10:14] but it installs wmf-specific packages [16:10:17] yeah [16:10:29] role::mariadb is the one we use in production [16:10:40] The former appears to be production specific and not for what I want, and the latter is apparently not supported anymore. [16:10:43] or on labsdb, for example (more or less) [16:10:57] technically it is not production specific [16:11:08] but it has some weirdness [16:11:40] to be fair, I do not think it is tested outside wmf production, but maybe you can try it :-) [16:12:10] quarry still uses its own database on quarry-main-01, without any redundancy or regularly scheduled backups [16:12:25] * zhuyifei1999_ fears that instance may die any time [16:12:45] backups of this DB are important. [16:13:06] jynus: worst case I end up respawning the node. [16:13:10] Cyberpower678: https://github.com/wikimedia/puppet/blob/production/modules/quarry/manifests/database.pp [16:13:22] Cyberpower678: in reality it is general-purpose [16:13:22] you might want to apply something similar [16:13:29] it just has some odd defaults [16:13:36] (at least it works on cloud) [16:13:42] for example, it does not start mysql by default, etc. [16:13:59] but that is configurable [16:14:04] jynus: that is odd. Why would mysql not be started by default? [16:14:32] Cyberpower678: when there are other pieces of orchestration, other than puppet, handling that [16:14:48] Oh [16:14:50] so we just use puppet for install + static configuration [16:15:02] the rest is not done on puppet [16:15:06] so I guess I need to know how to apply the role? [16:15:08] e.g. 
if a production host is down [16:15:18] we already have monitoring [16:15:55] Cyberpower678: I can tell you about puppet and this role in particular [16:16:03] Okay [16:16:24] I do not know much about puppet setup in toolforge/virtual instances [16:17:03] but basically, something similar to role::mariadb would be what you want [16:17:11] Indeed [16:17:44] the good thing is that if you install our packages, I can help you [16:17:52] if you install random packages, I cannot [16:18:07] but obviously, no one else uses ours [16:18:17] so up to you [16:18:22] ok [16:18:23] there is the mysql module too [16:18:40] which is the generic one but it is outdated [16:18:49] So the big question is: How do I install packages? [16:18:52] I do not think it works on stretch [16:18:57] puppet will do that for you [16:19:12] So how do I tell puppet to install and configure it? [16:19:17] that is the main reason to use puppet: automatic installation and configuration :-) [16:19:50] :p [16:20:04] I see the point, but the vital information is how do I set it up? [16:20:10] Cyberpower678: https://wikitech.wikimedia.org/wiki/Help:Instances#Puppet_information [16:20:23] this is as much as I know, as I said, I am not a VPS user [16:20:30] I would ask around how other people do it [16:20:39] Cyberpower678: i just do this: require_package('mariadb-server') [16:20:47] when i want a local db [16:20:51] on cloud VPS [16:21:11] and then i have "if stretch, then use require_package('php7.0-mysql')" [16:21:16] for PHP to talk to it [16:21:17] jynus: thank you [16:21:47] mutante: in the command line? [16:21:58] Cyberpower678: no, in my puppet class [16:22:10] which i apply on the instance in Horizon [16:22:23] Now we are getting somewhere. 
[16:22:28] but the whole "install mysql" part is that one line, basically [16:22:37] well, mariadb [16:23:28] Cyberpower678: example code: repo: operations/puppet puppet/modules/wikistats/manifests/db.pp [16:23:31] mutante: you say role::simplelamp is not compatible with the newest Debian images? [16:23:41] i use that on a VPS [16:23:45] You're trying to install mysql? [16:23:48] it includes setting some grants [16:23:51] Correct [16:23:51] on the first run [16:23:55] i recommend mariadb [16:23:57] not mysql. [16:24:06] due to dist issues, i.e. stretch [16:24:07] paladox may help you, he is knowledgeable on these issues [16:24:14] :) [16:24:31] Cyberpower678: i dont know if role::simplelamp still works, i use my own [16:24:43] Okay [16:24:48] but i wouldnt be surprised if it's broken in stretch, yea [16:24:51] use mediawiki vagrant [16:24:54] if you want ldap [16:24:56] no, dont [16:25:02] :) [16:25:04] it's much easier and bd808 showed me how to do it :) [16:25:27] So when setting up the classes I go to horizon, select the instance I want to setup, go to the "Puppet Configuration" tab, and click "Edit" for other classes? [16:25:48] you click edit if you want to add classes not listed in the all classes tab [16:26:21] you only need edit if you are using a custom puppet master and are applying a new module not merged in operations/puppet [16:28:06] Cyberpower678: sorry it is a bit convoluted, but it will save time in the long term [16:28:20] So what should I install? I'm getting so many different suggestions here. :-). I want to move my DB from tools-db onto my own instance so it can continue to run. Ideally I would like to somehow have backups in case of instance failures. [16:28:22] new installs or upgrades will be immediate [16:28:51] Cyberpower678 apt-get install mariadb-server-10.1 [16:28:54] if you're on jessie [16:29:00] Cyberpower678: basically you have to write your own role.. 
but it's not as horrible as it sounds :) [16:29:09] not much code needed to get that package installed [16:29:14] I'm on stretch [16:29:41] ah [16:29:46] that should work too [16:29:58] My other 2 instances are still trusty [16:30:15] * Cyberpower678 needs to upgrade those at some point. [16:30:19] doing apt-get install mariadb-server-10.1 should work :) [16:30:35] Cyberpower678: let's upgrade them from Ubuntu to Debian and then i help you write a custom role [16:30:38] you can then copy your dbs over from tools to that one. [16:30:41] you just apply it.. and BOOM [16:30:52] (BOOM in a good way :) [16:31:20] * Cyberpower678 envisions a mushroom cloud [16:31:33] lol [16:31:41] please let puppet do it and dont just run apt-get install :) [16:31:48] mutante: thank you. I did the apt-get install mariadb-server-10.1 [16:31:50] you will like it later when you do the next one and the next one [16:31:53] * paladox uses a custom puppet role https://github.com/wikimedia/labs-icinga2 [16:33:15] mutante: apt-get install mariadb-server-10.1 has finished, now what? [16:35:51] mutante: when upgrading the instances, will I need to reinstall everything? [16:35:59] Cyberpower678: well.. my recommendation was to NOT manually install it [16:36:08] yes, because you installed it manually [16:36:22] use a puppet role to avoid having to reinstall things :) [16:36:27] Oh paladox suggested it. [16:36:29] which role should he be installing mutante? [16:36:35] I thought you suggested that. [16:36:36] the one we are writing now :) [16:36:42] oh i see [16:37:06] Cyberpower678: what do you need installed? is there a list of requirements or so? [16:37:35] mutante: for the two exec nodes needing upgrading? [16:37:45] Cyberpower678: i dont know at all what nodes you are working on [16:37:59] i just know from "i need a db" [16:38:13] mutante: the one needing the DB, is using stretch [16:38:14] so do these nodes have any existing roles applied to them? [16:38:31] I haven't ever used puppet until now. 
[16:38:33] So now [16:38:35] *no [16:38:57] so if you had a brand new Debian install now , with nothing on it [16:39:09] what would you have to manually do to get a node like that [16:39:10] so that is the great thing about puppet- on reinstall, it would do all the work for you [16:39:33] then i'll help you "translate" manual to puppet [16:39:48] and like jynus said, if you do that ONCE you can then reuse it in the future and never have to repeat it [16:40:12] if you setup 2 nodes that are the same, also, you only have to work once [16:40:21] we just need something like "list of packages installed" "list of files created" , "content of the files" [16:40:35] mutante: I'm confused with your question. It's a clean install of stretch and I just did the apt-get install of mariadb on it. [16:40:42] if you ask for help, showing the puppet manifest is the only thing you need to do [16:40:54] Cyberpower678: the question is about your goal. when are you 'ready'? [16:41:08] Cyberpower678: what else would you need besides the DB? [16:41:25] mutante: well this instance only houses the DB. [16:41:33] It's a dedicated DB VM [16:41:50] mutante: for context we asked him to setup dedicated resources [16:41:54] The actual bot lives elsewhere. [16:42:09] because much was being eaten on the tools-db [16:42:18] aha, got it. i didnt have the context yet [16:42:28] so he just needs the db, to start it, a user and that's all [16:42:38] mutante: I'm glad we have that cleared up? [16:42:39] maybe some cron doing dumps [16:42:39] ok, i suggest we create something like "role simpledb" [16:43:02] we start by just letting it install that package [16:43:07] and later it can be extended when needed [16:43:20] with crons .. :) [16:43:23] mutante: ok [16:43:53] jynus: if you ever make a cron script that does the backups, could you also cc me? [16:43:57] (for quarry) [16:44:43] zhuyifei1999_: it seems like you and Cyberpower678 could help each other :-) [16:44:53] well maybe. 
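A minimal sketch of the "cron doing dumps" idea mentioned above, written in Python. It only builds the shell pipeline a cron job could run; it does not execute anything. The backup directory and database name are hypothetical placeholders, and it uses mysqldump (shipped with MariaDB) rather than the mydumper tool or the WMF production scripts.

```python
import datetime
import shlex
from pathlib import Path

# Hypothetical values for illustration; neither appears in the log.
BACKUP_DIR = Path("/srv/backups")
DATABASE = "cyberbot"


def dump_path(backup_dir: Path, database: str, when: datetime.datetime) -> Path:
    """Return a timestamped target such as cyberbot-20171018T164500.sql.gz."""
    return backup_dir / f"{database}-{when:%Y%m%dT%H%M%S}.sql.gz"


def dump_command(database: str, target: Path) -> str:
    """Build a shell pipeline that dumps one database and compresses it.

    --single-transaction gives a consistent snapshot of InnoDB tables
    without holding table locks for the whole dump.
    """
    return (
        f"mysqldump --single-transaction {shlex.quote(database)}"
        f" | gzip > {shlex.quote(str(target))}"
    )


if __name__ == "__main__":
    now = datetime.datetime.now()
    # Print the command instead of running it, so the sketch has no side effects.
    print(dump_command(DATABASE, dump_path(BACKUP_DIR, DATABASE, now)))
```

Embedding a sortable timestamp in the file name keeps each run's output separate and makes it simple to find the newest dump.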
[16:45:02] :D [16:45:04] quarry is puppetized by yuvi [16:45:10] I need to learn the ropes first. [16:45:22] zhuyifei1999_: it is not that I do not want to share my backup scripts [16:45:25] they are already shared [16:45:40] o.O where? [16:45:49] but they are a 6-month ongoing project heavily customized for wmf production [16:45:57] uh :( [16:46:05] so probably too much for what you need [16:46:15] yeah probably [16:46:16] Cyberpower678: hold on , i will upload something [16:46:28] creating a cron + mydumper script is relatively easy [16:46:38] mutante: okay [16:47:03] is there a good name for your project? [16:47:12] what labs project does this run it [16:47:16] in [16:47:34] The DB instance is currently named cyberbot-db-01 [16:47:37] it's for a bot, right [16:47:45] ok [16:48:01] jynus: I'll write a script to operations/puppet module quarry and add you as reviewer [16:48:12] changes mind that it should be a role for cyberbot specifically rather than a generic "simpledb" [16:48:27] starts that [16:48:31] mutante: yes. The bot instances are cyberbot-exec-01 and cyberbot-exec-iabot-01 [16:49:03] so you have cyberbot::exec and cyberbot::db so to speak [16:49:15] essentially [16:49:28] zhuyifei1999_: happy to review [16:49:32] what packages do you install on an "exec" node? [16:49:39] ok thanks [16:49:41] there are already some things done, for you to look up [16:49:48] let me search them for you [16:51:11] btw chasemp or bd808 can I just backup quarry data to /data/project on NFS? [16:51:22] mutante: on the exec node, I have PHP stuff installed on it. [16:51:27] the query results are already stored on NFS [16:51:40] Cyberpower678: i assume php-mysql then ? 
[16:51:41] With common extensions [16:51:42] zhuyifei1999_: for what I know, the official position is that right now no space is provided for backups [16:51:44] zhuyifei1999_: that would be probably not the best idea, esp for long term storage [16:51:53] and nfs is the least reliable option [16:51:57] PHP's mysqli extension [16:52:26] Cyberpower678: could you paste the output of "dpkg -l | grep php" on an exec node for me [16:53:09] zhuyifei1999_: can you make a task? A few questions come to mind: how large, how often, how long to keep, what are you trying to recover from [16:53:37] zhuyifei1999_: this is one https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/backup/templates/mysql-predump.erb [16:53:39] cloud-services-team (Kanban), Analytics, User-Elukey: Remove logging from labs for schema https://meta.wikimedia.org/wiki/Schema:CommandInvocation - https://phabricator.wikimedia.org/T166712#3694457 (Nuria) Yes, pinging @bd808 that those two will be deleted soon. [16:53:50] chasemp: what would be your alternative? 
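The questions above (how large, how often, how long to keep) amount to a retention policy. A minimal Python sketch of the "how long to keep" part follows; it assumes dump file names embed a sortable timestamp, which is an illustrative convention and not something stated in the log, and it only selects files without deleting anything.

```python
from pathlib import Path
from typing import List


def dumps_to_delete(files: List[Path], keep: int) -> List[Path]:
    """Return the dumps that fall outside the newest `keep` files.

    Assumes each file name embeds a sortable timestamp, so sorting by
    name orders the dumps from oldest to newest.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    ordered = sorted(files, key=lambda p: p.name)
    # A slice of [:-0] would be empty, so handle keep == 0 explicitly.
    return ordered[:-keep] if keep else ordered
```

Keeping the selection separate from the actual removal makes the policy easy to check before any file is touched.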
[16:53:52] NFS storage is far and away disproportionately expensive for anything like static backup files in a few senses [16:54:39] zhuyifei1999_: we don't really have an off-prem user backup solution atm but if it were me, a pure user in cloud land, it depends on how large the backup is [16:54:46] zhuyifei1999_: this is another https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/role/files/mariadb/dump_shards.sh [16:55:00] you want to be able to keep it off-cloud or at least potentially on multiple labvirts [16:55:35] and if it's not that large it's possible dumps could be used for generation as a large drop off location but keeping it there isn't very valuable for backup purposes depending on what you want to survive [16:55:42] what kind of events I mean [16:55:53] it’s something like 100MB per backup [16:55:54] mutante: https://pastebin.com/JQRwxsp9 [16:55:56] but it's all contextual [16:56:50] zhuyifei1999_: if that's the case keeping a copy on an instance intra-cloud for backups isn't crazy to me, that gets you n+1 for labvirt redundancy at least [16:57:20] a service for users to drop off files as a backup solution is on our brains but it's expensive and time consuming and hasn't happened yet [16:57:49] I would like to offer a simple rsync.net style service at some point here [16:58:26] hmm [17:00:17] As a general rule, if something isn't operationally involved in the running of a Tool then it shouldn't be on Toolforge NFS really, /home is another use case that we squish in there because we can and it hasn't burned us badly enough yet to revise [17:01:03] because all NFS storage is really 3x and has to be dealt with in realtime for replication and the bigger it gets the slower and more complicated [17:01:28] Cyberpower678: https://gerrit.wikimedia.org/r/#/c/385006/ [17:01:35] and all NFS access is throttled and managed so no matter what you do to use it for something totally static like a backup file it's a bad idea for the 
overall system [17:01:42] paladox: this ^ :) [17:01:50] 10cloud-services-team (Kanban), 10Analytics, 10User-Elukey: Remove logging from labs for schema https://meta.wikimedia.org/wiki/Schema:CommandInvocation - https://phabricator.wikimedia.org/T166712#3694500 (10bd808) >>! In T166712#3694457, @Nuria wrote: > Yes, pinging @bd808 that those two will be deleted soo... [17:01:51] thanks :) [17:02:57] zhuyifei1999_: if you want to make a small instance I can verify it comes up on a distinct labvirt for labvirt redundancy of the backup, that's as good or better than every other idea I have at the moment. Having a place to stash it totally off-cloud is obviously best. [17:04:45] (sorry was away) [17:06:03] I’m wondering, what about the current query results? [17:06:22] they are stored on the NFS right now [17:06:33] as sqlite databases [17:06:41] that's all legacy I imagine [17:07:18] For a long time NFS was abused as a one size fits all solution and it crumbled badly and often [17:07:36] and there remnants of that still around but it's not a good path [17:07:40] I’ll look into how much data it has (dozens of gigabytes I imagine) [17:07:54] ok [17:11:07] zhuyifei1999_: I'm happy to help, this has been a thorn in our side for a long time and yuvi spent a lot of effort removing these kinds of use cases from NFS so it's somewhat endearing that quarry is in this mode still :) [17:12:20] k, will make a task once I get time for it, probably tonight [17:29:49] 10Data-Services: Make user_email_authenticated status visible on labs - https://phabricator.wikimedia.org/T70876#3694593 (10Bawolff) Could we maybe just have a new separate view that contains user_id, user_name and user_is_emailable where user_is_emailable is a boolean field that checks that disablemail is not o... [17:38:54] could we raise the quota for Cyberpower temporarily? 
it would be great if he can _first_ create new stretch instances to replace trusty instances and then delete them [17:39:10] i mean the number of VPSes in the cyberbot project [17:39:28] much nicer than having to upgrade them in-place [17:40:13] he just got a quota bump to make the new db server [17:40:39] that would be unrelated to replacing the exec nodes though [17:40:46] which are trusty afaik [17:40:56] that can't possibly be in use yet operationally, so if he needs to rolling replace the exec nodes that space could be used [17:41:29] but yes we would also grant a temporary bump for replatforming. just needs a phab task [17:41:31] i just advised "first make the new ones, test the puppet role, once you are happy delete the old ones" [17:41:42] ok [17:41:49] and a timeline :) [17:45:22] bd808: since I'm heavily working on instances if done now, I can work it now. [17:46:03] 10Tools: Long running queries from pltools unlikely to finish - https://phabricator.wikimedia.org/T178459#3694628 (10Pasleim) I can not use EXPLAIN: ``` ERROR 1345 (HY000): EXPLAIN/SHOW can not be issued; lacking privileges for underlying table ``` The first query I could rewrite to SPARQL and execute now on que... [17:53:45] 10Quarry: Make backups of Quarry's main database on quarry-main-01 - https://phabricator.wikimedia.org/T178519#3694650 (10zhuyifei1999) [17:59:17] 10cloud-services-team (Kanban), 10DC-Ops, 10Operations, 10ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3694673 (10bd808) >>! In T171473#3694183, @Cmjohnson wrote: > Let's monitor and see if the error persists. We probably need to find a way to put load on this system rather t... [18:00:03] 10Quarry: Find somewhere else (not NFS) to store Quarry's resultsets - https://phabricator.wikimedia.org/T178520#3694676 (10zhuyifei1999) [18:00:12] chasemp: ^ two tasks [18:00:44] zhuyifei1999_: can you cc me?
[18:00:57] yeah I did [18:01:09] ok thanks :) I don't see wikibugs so I wasn't sure [18:03:20] 10Quarry: Find somewhere else (not NFS) to store Quarry's resultsets - https://phabricator.wikimedia.org/T178520#3694690 (10zhuyifei1999) The query runners must somehow store the results to somewhere the web server can access. Celery does not support sending large tables as a result of a job; doing so would floo... [18:05:25] 10Tools: Long running queries from pltools unlikely to finish - https://phabricator.wikimedia.org/T178459#3693048 (10zhuyifei1999) >>! In T178459#3694628, @Pasleim wrote: > I can not use EXPLAIN: How to execute `EXPLAIN` is described in T50875#2845764. Alternatively [[http://quarry.wmflabs.org/|Quarry]] can pro... [18:37:01] 10Quarry: Make backups of Quarry's main database on quarry-main-01 - https://phabricator.wikimedia.org/T178519#3694769 (10zhuyifei1999) I'm wondering, would it make sense for a separate instance to connect to quarry-main-01 to do the backups to its instance local storage? Or the backup process should be done on... [18:41:30] 10cloud-services-team: Upgrade puppetmaster on toolsbeta and test - https://phabricator.wikimedia.org/T178510#3694774 (10valhallasw) The puppet compilers can be deleted now that you got everything working on another host. Given that there isn't much of a grid left (I vaguely remember the regular exec nodes did n... [19:15:22] 10cloud-services-team (Kanban), 10DC-Ops, 10Operations, 10ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3694836 (10chasemp) @Andrew scheduled 20 instances to this server and 4 think they came up and the rest failed. ```2017-10-18 18:13:49.714 2530 ERROR nova.compute.manager... [19:43:40] 10cloud-services-team (Kanban), 10DC-Ops, 10Operations, 10ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3694964 (10Andrew) I think the VM creation failure was a (mostly? completely?) unrelated issue. 
I've rescheduled some actually running VMs there, and will see how they do. [21:36:14] !log tools stop basebot -- it is going crazy and spamming email w/ failing to log to error.log. Need to figure out how to notify but it's clearly in a failure loop. [21:36:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [21:38:09] the only member of basebot is 'base:x:2785:500:Base:/home/base:/bin/bash' [21:41:12] left a talk page message at https://wikitech.wikimedia.org/wiki/User_talk:Base [21:54:37] (03PS1) 10Awight: Add more ORES repos [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/385103 [22:12:20] bd808: got some time? [23:12:46] chasemp, bd808: So I switched to using labsdb-analytics.eqiad.wmnet but some wiki report of mine requires joining with a sibling database table. Is that supported? [23:14:05] Oh, here: https://lists.wikimedia.org/pipermail/labs-l/2017-September/005074.html [23:14:33] > These new servers will not allow users to create their own databases/tables co-located with the replicated content. [23:16:25] sql has a --cluster argument now, okay. [23:21:23] Esther: unfortunately we have not been able to find a good solution for the use case of user maintained tables yet. [23:22:04] there has always been a warning that those tables might not be available if the backend servers are switched for maintenance, etc [23:22:32] with the new setup that we have we expect those kinds of moves to be more common for maintenance, load balancing, etc [23:23:22] so it seems like a better solution to remove the feature rather than have people build more things that require it and then get broken by normal system maintenance [23:23:48] the tools.labsdb server will remain available for user created content [23:24:24] so in theory things can be rewritten to do joins in application space rather than in the db directly [23:24:49] this does mean code changes of course and those may be difficult for some use cases [23:25:00] Yeah...
is a giant SQLite database okay? [23:25:17] Let's say 5GB or something. [23:25:43] the performance will be horrible and it will steal NFS bandwidth from all other tools on the same exec node [23:26:09] if you can stuff the data in sqlite could you not use tools.labsdb? [23:26:09] https://en.wikipedia.org/wiki/Wikipedia:List_of_Wikipedians_by_article_count/Configuration is the report. [23:26:23] Hmmm, maybe, yeah. [23:26:52] s51334__enwiki_first_page_revisions_p is the current MySQL database, I guess. [23:27:22] *nod* [23:28:05] Ah, right, the loader is already using two cursors/connections. [23:28:14] But that damn JOIN there to actually generate the report... [23:28:53] JOIN with a GROUP BY, yeah... [23:29:12] reading now, but I think the difference would be that you would open 2 database cursors: one for the revisions_p data and one for the replica data. Then instead of the join you would do SELECTs with batches of WHERE rev_id in (... list of things you fetched from revisions_p) [23:29:32] And store the data somewhere as it fetches it for every user, I guess. [23:29:40] And then order by and truncate. [23:29:50] and then you would I guess have to do the group by in ram with a hashtable or something [23:29:57] Yeah. [23:30:06] Or set a threshold and only store > X results? Idk. [23:30:28] yeah, I'd have to read more closely to figure out what's really going on [23:30:43] It's basically just a dumb JOIN. I tried to keep it simple. [23:31:13] At least it lasted more than two years. That's nice. [23:31:18] :) [23:32:12] I really don't like breaking things for people, but cutting this off seems better in the long run than people trying to deal with data appearing and disappearing [23:32:28] Yeah, I suppose. [23:34:50] bd808: If I make a database on tools.db.svc.eqiad.wmflabs, should I continue to use the s51334__ prefix? [23:35:00] (Is that automatic, I forget?) [23:35:25] 51334(tools.mzmcbride) is where it's coming from, I guess. 
[23:35:31] it's not automatic, but you will only be able to create dbs that start with your mysql user id and the __ [23:35:39] Cool. [23:36:11] Moving the database and updating it is easy enough. Replicating the JOIN behavior, eh. Will need to think on it a bit. [23:36:26] Could maybe use a temp table in MySQL as a store.
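For reference, a minimal sketch of the application-space join discussed above: one cursor reads rev_ids from the user-maintained table, a second cursor looks them up on the replica in batches with WHERE rev_id IN (...), and the GROUP BY is done in RAM with a hashtable before ordering and truncating. Everything concrete here is an assumption for illustration only: the table and column names (first_page_revisions, revision, rev_user_text) are hypothetical, and two in-memory sqlite3 databases stand in for the tools database and the wiki replica so that the sketch runs on its own.

```python
import sqlite3
from collections import Counter
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def article_counts(user_conn, replica_conn, batch_size=500, limit=10):
    """Replace a cross-database JOIN + GROUP BY with batched lookups."""
    counts = Counter()
    # Cursor 1: rev_ids from the user-maintained table (hypothetical schema).
    rev_ids = (row[0] for row in user_conn.execute(
        "SELECT rev_id FROM first_page_revisions"))
    for chunk in batched(rev_ids, batch_size):
        placeholders = ",".join("?" * len(chunk))
        # Cursor 2: fetch the matching rows from the replica, one batch at a time.
        rows = replica_conn.execute(
            "SELECT rev_user_text FROM revision "
            f"WHERE rev_id IN ({placeholders})", chunk)
        # GROUP BY in RAM: the Counter is the hashtable.
        counts.update(name for (name,) in rows)
    # ORDER BY count DESC, then truncate.
    return counts.most_common(limit)


def make_demo_connections():
    """Build two separate in-memory databases with made-up demo rows."""
    user_conn = sqlite3.connect(":memory:")
    user_conn.execute("CREATE TABLE first_page_revisions (rev_id INTEGER)")
    user_conn.executemany(
        "INSERT INTO first_page_revisions VALUES (?)",
        [(i,) for i in range(1, 6)])
    replica_conn = sqlite3.connect(":memory:")
    replica_conn.execute(
        "CREATE TABLE revision (rev_id INTEGER, rev_user_text TEXT)")
    replica_conn.executemany(
        "INSERT INTO revision VALUES (?, ?)",
        [(1, "A"), (2, "B"), (3, "A"), (4, "A"), (5, "B"), (6, "C")])
    return user_conn, replica_conn


if __name__ == "__main__":
    user_conn, replica_conn = make_demo_connections()
    for name, count in article_counts(user_conn, replica_conn, batch_size=2):
        print(name, count)
```

Memory use grows with the number of distinct users rather than the number of revisions, which is why the threshold idea mentioned above ("only store > X results") may not be needed; against a real MariaDB replica the placeholder style and a sensible batch size would depend on the client library in use.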