[06:40:55] serviceops, Analytics, EventBus: helmfile apply with values.yaml file change did not deploy new k8s pods - https://phabricator.wikimedia.org/T228700 (fsero) the main issue is in notifying changes to the deployment object department, not in helmfile. helmfile is AFAICT working as intended. In the eve...
[08:42:20] <_joe_> akosiaris: do we want the TPM in new servers that could become k8s nodes?
[08:42:36] <_joe_> I don't think we will realistically have anything in place to use it
[08:44:45] I don't think that TPM requirement is of any use currently
[08:44:49] so we can remove it
[08:45:16] the only CRE that has some use for it is rkt, but it's not fully OCI compliant yet
[08:52:57] <_joe_> yeah and I don't like its chances given the 3 levels of acquisitions coreos went through
[08:56:08] serviceops, Operations, Core Platform Team Workboards (Green): Keys from MediaWiki Redis Instances - https://phabricator.wikimedia.org/T228703 (Joe)
[08:59:22] serviceops, Operations, PHP 7.2 support, Performance-Team (Radar), and 2 others: PHP 7 corruption during deployment (was: PHP 7 fatals on mw1262) - https://phabricator.wikimedia.org/T224491 (Joe) >>! In T224491#5354568, @Krinkle wrote: > Logstash query for the error in question: > >
serviceops, Operations: Migrate pool counters to Stretch/Buster - https://phabricator.wikimedia.org/T224572 (akosiaris) poolcounter1004 has just been added
[09:40:13] jijiki: the various mw hosts with puppet disabled since yesterday are still WIP?
[09:41:53] I wondered the same but I'd say yes volans
[09:42:32] not sure if only for precaution or in need of an update, but better to wait for Effie
[10:06:51] serviceops, Wikibase-Termbox-Iteration-20, Wikidata-Termbox-Iteration-19, Patch-For-Review: Create termbox release for test.wikidata.org - https://phabricator.wikimedia.org/T226814 (fsero) Open→Resolved This has been deployed via the DNS artifact previously discussed.
` fsero@deploy1001:...
[10:06:56] serviceops, Operations, Wikidata, Wikidata-Termbox-Hike, and 4 others: New Service Request: Wikidata Termbox SSR - https://phabricator.wikimedia.org/T212189 (fsero)
[10:10:01] fsero: woo! Thanks :)
[10:16:05] serviceops, Wikibase-Termbox-Iteration-20, Wikidata-Termbox-Iteration-19, Patch-For-Review: Create termbox release for test.wikidata.org - https://phabricator.wikimedia.org/T226814 (WMDE-leszek) Thanks @fsero!
[11:02:23] volans: elukey tx, it is a precaution, I will enable it in a bit
[11:06:30] I was about to yesterday, and then decided against it
[11:07:06] ack, no prob
[11:08:05] serviceops, Scap, PHP 7.2 support, Patch-For-Review, and 3 others: Enhance MediaWiki deployments for support of php7.x - https://phabricator.wikimedia.org/T224857 (Joe) >>! In T224857#5354823, @thcipriani wrote: >>>! In T224857#5352559, @Joe wrote: >> @thcipriani do you need more information from...
[11:54:21] serviceops, Scap, PHP 7.2 support, Patch-For-Review, and 3 others: Enhance MediaWiki deployments for support of php7.x - https://phabricator.wikimedia.org/T224857 (greg) a:thcipriani
[12:15:17] serviceops, Analytics, EventBus, Patch-For-Review: helmfile apply with values.yaml file change did not deploy new k8s pods - https://phabricator.wikimedia.org/T228700 (akosiaris) I think the issue is on the stream-config.yaml file, not the config.yaml template. Using `.Files.Get` means the file i...
[12:15:26] serviceops, Analytics, EventBus, Patch-For-Review: helmfile apply with values.yaml file change did not deploy new k8s pods - https://phabricator.wikimedia.org/T228700 (akosiaris) p:Triage→Normal
[12:25:15] serviceops, Operations, Core Platform Team Workboards (Green): Keys from MediaWiki Redis Instances - https://phabricator.wikimedia.org/T228703 (WDoranWMF) p:Triage→High
[12:49:04] serviceops, Analytics, EventBus, Patch-For-Review: helmfile apply with values.yaml file change did not deploy new k8s pods - https://phabricator.wikimedia.org/T228700 (Ottomata) Thanks so much you two! I'll take some of what Fabian wrote and add it to my EventGate docs too.
[12:55:34] serviceops, Operations, Core Platform Team Backlog (Watching / External), Core Platform Team Workboards (Clinic Duty Team), and 4 others: Use PHP7 to run all async jobs - https://phabricator.wikimedia.org/T219148 (jijiki)
[13:09:30] hello people, I am planning to deploy async replication for mcrouter with https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/525053/ today/tomorrow to the mw canaries if people are ok
[13:09:48] (already tested on one app server and one api appserver by me and Aaron)
[13:12:54] sounds good, ping me to monitor as well when you do
[13:13:21] sure
[13:53:12] serviceops, Operations, Patch-For-Review: Migrate pool counters to Stretch/Buster - https://phabricator.wikimedia.org/T224572 (MoritzMuehlenhoff) a:MoritzMuehlenhoff
[14:33:19] serviceops, Operations, Wikimedia-General-or-Unknown, Patch-For-Review: Remove pear/mail packages from WMF MW app servers - https://phabricator.wikimedia.org/T195364 (Joe) @Tgr do you see any reason not to uninstall those packages? I will for now just remove them from puppet, and uninstall them o...
[14:38:37] serviceops, Operations, Wikimedia-General-or-Unknown, Patch-For-Review: Remove pear/mail packages from WMF MW app servers - https://phabricator.wikimedia.org/T195364 (Tgr) >>! In T195364#5357382, @Joe wrote: > @Tgr do you see any reason not to uninstall those packages? I will for now just remove...
[16:01:32] serviceops, Operations, Core Platform Team Workboards (Green): Keys from MediaWiki Redis Instances - https://phabricator.wikimedia.org/T228703 (jijiki) Open→Resolved @holger.knust I copied a gzipped dump to a server you have access to, please reopen when you need a newer one :)
[16:01:59] jijiki: i am told that you may know about redis and if we can use it for other stuff
[16:03:55] who on earth is spreading rumors
[16:04:13] a little bird told me
[16:04:17] I am deeply deeply appalled
[16:04:42] * jijiki puts on the sales suit
[16:05:01] so we have rdb1005 and rdb1009 on eqiad
[16:05:27] they are masters, each having a slave (1005-1006, 1009-1010)
[16:06:01] they don't get replicated to codfw
[16:06:23] the services we have now using them are not that demanding
[16:06:44] so for now we can afford rebooting them without anything happening
[16:07:07] each server has 5 separate redis instances
[16:07:17] I think that is the tldr
[16:07:41] iirc this https://wikitech.wikimedia.org/wiki/Redis
[16:07:44] is up to date
[16:08:54] * jijiki will be around on and off keyboard
[16:08:59] aha
[16:08:59] tell me what you need
[16:09:01] okay thank you :)
[16:09:14] we need just access to a redis cluster for netbox at some point
[16:09:19] soonish
[16:09:33] I will ping you after dinner for details
[16:09:46] cool thanks :)
[16:09:48] tx
[17:52:54] chaomodus: can you tell me now?
[17:53:41] jijiki: okay so netbox will require access to a redis in order to store caching and do task queuing
[17:53:47] - in the next version
[17:54:04] ok so that means that if we say reboot redis
[17:54:24] how will netbox react ?
[17:54:40] that's a good question!
[17:54:48] I guess we'll find out :)
[17:55:00] ahah well i guess the basic question is "can we use this for that"
[17:55:19] ofc
[17:55:25] okay cool
[17:55:33] "how do we use this for that" presumably it's in cluster mode?
[17:55:39] the stuff I am asking are for my own admin overhead:)
[17:55:45] no cluster mode
[17:55:55] Oh really?
[17:56:06] yeap, they are two separate masters
[17:57:39] interesting
[17:57:40] now changeprop that's using those servers
[17:57:59] it is doing it via nutcracker, and nutcracker takes care of sharding
[17:58:10] but we are trying to get out of it
[17:59:12] but hangon since there are ephemeral data there
[17:59:19] that are replicated to a slave anyway
[17:59:43] and netbox exists in one dc only (iirc)
[17:59:53] for now (tm)
[17:59:54] isnt that enough?
[17:59:56] oh
[18:00:19] actually very shortly it'll exist in two, but we're sort of deferring the active/active setup
[18:00:28] so nbd if the data isn't shared across dc
[18:01:54] alright
[18:02:41] you can grep in the puppet repo and find a db that is not in use
[18:02:45] urm
[18:02:49] I meant an instance
[18:04:27] i'm not quite sure i understand
[18:05:45] there are 5 instances on each master
[18:05:56] on different ports
[18:06:03] ahh i see
[18:06:27] :)
[18:10:39] looks like only 6379 and 6382 are used
[18:12:04] just those 2?
[18:12:06] ok good
[18:12:15] damn I am not selling enough
[18:12:45] aha
[18:13:10] Hmm
[18:13:19] the redis page implies a lot more usage
[18:13:47] it used to be busier I believe
[18:14:40] so is the general pattern to pick one of the two servers, and pick an unallocated installation and use it however?
[18:16:37] yeah, since this is not a very busy service
[18:17:03] kay cool
[18:17:09] this should be enough for now
[18:19:22] there was a typo for the hostnames in the servers list :)
[18:21:57] on the wiki ?
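(Editor's sketch, not part of the log.) The pattern agreed above, picking one of the two masters and one instance port nobody else uses, could look roughly like this in Python. The port range 6378-6382 is an assumption: the log only says there are 5 instances on different ports and that 6379 and 6382 are taken. The helper names, the bare hostname and the placeholder password are likewise hypothetical.

```python
from typing import Iterable, List, Optional
from urllib.parse import quote

# ASSUMPTION: the log only says "5 instances ... on different ports".
# This range is a guess for illustration; check the puppet repo for the real list.
ASSUMED_INSTANCE_PORTS = range(6378, 6383)


def unused_ports(used: Iterable[int],
                 all_ports: Iterable[int] = ASSUMED_INSTANCE_PORTS) -> List[int]:
    """Return the instance ports that no existing service claims, sorted."""
    return sorted(set(all_ports) - set(used))


def redis_url(host: str, port: int,
              password: Optional[str] = None, db: int = 0) -> str:
    """Build a redis:// URL for a single standalone (non-cluster) instance."""
    # Percent-encode the password so characters like '/' or '@' stay valid in a URL.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"redis://{auth}{host}:{port}/{db}"


if __name__ == "__main__":
    # 6379 and 6382 are the ports reported as in use in the log.
    candidates = unused_ports(used={6379, 6382})
    print(candidates)
    # "rdb1005" is one of the two masters named in the log; the password is a placeholder.
    print(redis_url("rdb1005", candidates[0], password="CHANGEME"))
```

Since the servers are two independent masters rather than a cluster, a plain single-instance URL is all a client needs here; nothing cluster-aware is involved.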
[18:22:02] yah i fixed it
[18:22:05] tx tx
[18:22:14] it had rdb1005 listed as the secondary for rdb1005
[18:23:17] inception
[18:23:40] so how do i define authentication bits for these
[18:24:29] the quickest way is to ssh to one and check the password
[18:24:41] the slower way is to find it in the secrets repo
[18:25:18] unless thats not what you are asking
[18:25:41] heh I'm asking do we define new auth tokens for our use or use existing ones or what
[18:27:30] as is
[18:28:32] - so use the existing ones
[18:28:48] yeap
[18:28:55] kay cool that's straight forward
[18:29:44] I think i have enough to proceed, thanks much for your help!
[18:31:09] haha thanks for shopping
[18:33:43] shop smart, shop ServiceOpsMart
[18:33:55] smoart
[18:35:10] hahaha
[22:45:05] Hello, I've recently merged a new chart to https://gerrit.wikimedia.org/g/operations/deployment-charts, and I was wondering what I should do to get it packaged/available on https://releases.wikimedia.org/charts/?
[22:50:08] longma: paraphrasing something akosiari.s wrote on IRC from a while back: helm package ; helm repo index . ; git add -*.tgz index.yaml ; git commit ; push for review
[22:50:40] i just checked the git log on the releases server
[22:50:48] and the latest change is from Tyler on June 17
[22:51:13] cant confirm a newer merge ?
[22:52:02] the git history seems a bit messed up
[22:52:45] authordate vs commitdate
[22:52:49] git log --format=fuller
[22:52:52] https://gerrit.wikimedia.org/r/q/project:operations%252Fdeployment-charts+status:merged
[22:53:10] ah okay
[22:53:19] I merged my commit from the 17th today
[22:53:46] i see. and that is already on the servers because somebody (not puppet) pulled
[22:54:49] that's a good question: I did not pull
[22:54:52] maybe there's a cron?
[22:55:27] that would be odd.
then we could just use "ensure => latest"
[22:55:33] hmm
[22:55:49] i saw the puppet code only ensures present
[22:56:07] # Puppet Name: git_pull_charts
[22:56:21] */1 * * * * cd /srv/deployment-charts && /usr/bin/git pull >/dev/null 2>&1
[22:57:07] ok, but puppet would be much simpler. maybe it's just because we didn't want to wait for next run
[22:58:11] mayhaps
[23:02:32] serviceops, Scap, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201907): Deploy scap 3.11.1-1 - https://phabricator.wikimedia.org/T228482 (Dzahn) Open→Resolved 22:36 mutante: rolling out scap 3.11.1-1 on mw-eqiad servers 22:14 mutante: continuing rollout o...
[23:02:34] serviceops, Scap, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201907): 'scap pull' stopped working on appservers ? - https://phabricator.wikimedia.org/T228328 (Dzahn)
[23:03:42] serviceops, Scap, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201907): 'scap pull' stopped working on appservers ? - https://phabricator.wikimedia.org/T228328 (Dzahn) Open→Resolved deployed globally in the subtask. should be resolved now.
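(Editor's sketch, not part of the log.) The packaging steps paraphrased at 22:50:08 leave out the chart name, so the following is only one plausible reading of them, written as a small Python helper. It assumes the chart directory name matches the chart name (so the archive is called `<name>-<version>.tgz`) and that it is run from the repository root; the function names, the example path and the commit message are hypothetical. The final "push for review" step is left to whatever review tooling the repository uses.

```python
import subprocess
from pathlib import Path
from typing import List


def packaging_commands(chart_dir: str) -> List[List[str]]:
    """Return, in order, the helm/git commands for packaging one chart."""
    # ASSUMPTION: the directory name equals the chart name in Chart.yaml.
    name = Path(chart_dir).name
    return [
        ["helm", "package", chart_dir],          # writes <name>-<version>.tgz into the cwd
        ["helm", "repo", "index", "."],          # regenerates index.yaml for the cwd
        # git expands the glob itself (pathspec), so no shell is needed here.
        ["git", "add", f"{name}-*.tgz", "index.yaml"],
        ["git", "commit", "-m", f"Package {name} chart"],  # hypothetical message
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("+", " ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical chart path; dry run only, so nothing is executed.
    run_all(packaging_commands("charts/mychart"))
```

Defaulting to a dry run keeps the sketch safe to try on a machine without helm installed; passing `dry_run=False` would actually run the commands.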