[05:26:08] serviceops, Operations, Core Platform Team Backlog (Later), Services (next): Migrate node-based services in production to node10 - https://phabricator.wikimedia.org/T210704 (KartikMistry)
[07:13:50] serviceops, Operations, WMDE-QWERTY-Team, wikidiff2, and 3 others: Deploy Wikidiff2 version 1.8.2 with the timeout issue fixed - https://phabricator.wikimedia.org/T223391 (jijiki) a:Joe→jijiki
[07:15:02] serviceops, Operations, WMDE-QWERTY-Team, wikidiff2, and 3 others: Deploy Wikidiff2 version 1.8.2 with the timeout issue fixed - https://phabricator.wikimedia.org/T223391 (jijiki) @awight @jkroll should we rollout all hosts?
[08:21:01] serviceops, Operations, observability: Gather metrics on request status codes, latencies from the MediaWiki appservers - https://phabricator.wikimedia.org/T226815 (fgiunchedi) Indeed I think mtail to extract metrics from apache logs is the best way we have, I'm not aware of apache exposing more of it...
[08:34:56] serviceops, Operations, WMDE-QWERTY-Team, wikidiff2, and 3 others: Deploy Wikidiff2 version 1.8.2 with the timeout issue fixed - https://phabricator.wikimedia.org/T223391 (Joe) >>! In T223391#5299032, @jijiki wrote: > @awight @jkroll should we rollout to all hosts? I think we should, I didn't ge...
[08:36:28] serviceops, Operations, WMDE-QWERTY-Team, wikidiff2, and 3 others: Deploy Wikidiff2 version 1.8.2 with the timeout issue fixed - https://phabricator.wikimedia.org/T223391 (WMDE-Fisch) >>! In T223391#5299264, @Joe wrote: >>>! In T223391#5299032, @jijiki wrote: >> @awight @jkroll should we rollout...
[09:59:12] akosiaris: _joe_ do you mind if I move a couple of tasks I would like to look into this week
[09:59:21] to the doing or up next column ?
[09:59:39] <_joe_> jijiki: well in theory we should pick from "up next"
[09:59:43] I want to have another go at the socket error saga
[10:01:51] serviceops, MediaWiki-Logging, Operations, Wikimedia-Logstash, and 8 others: Port mediawiki/php/wmerrors to PHP7 and deploy - https://phabricator.wikimedia.org/T187147 (Joe) Talking with @Krinkle I realized that the reason why I saw that ugly error message is because the endpoint doesn't initiali...
[10:02:01] also, should I create the documentation tracking task ?
[10:02:16] <_joe_> jijiki: ok, but I'd prefer you to focus on jobs and crons :)
[10:02:18] based on what we discussed last week
[10:02:31] <_joe_> yeah I guess so
[15:02:07] serviceops, Operations, WMDE-QWERTY-Team, wikidiff2, and 3 others: Deploy Wikidiff2 version 1.8.2 with the timeout issue fixed - https://phabricator.wikimedia.org/T223391 (jijiki) Open→Resolved All service restarts should be complete in a few hours, please reopen if there are any issues :)
[18:00:30] o/ I am told that since i will be needing Redis when I upgrade Netbox, that I should ask here about if/how to use the redis cluster for that (if not nbd)
[18:50:31] cdanis: so to be clear "kask_build_info" ?
[18:50:39] urandom: +1
[18:50:46] kk
[18:50:54] I pushed that as patch 2, but wanted to be sure
[18:51:23] the standard for Prometheus nowadays is that everything should be suffixed with a 'unit'... and _info is as close as you can get to _seconds or _bytes in this context
[18:51:43] (and sorry for not thinking of this to begin with :)
[18:53:13] nope, that's cool
[18:53:31] I like what this provides
[18:59:50] cdanis: oh crap, I didn't push it!
[18:59:55] smh
[20:36:59] serviceops, Continuous-Integration-Infrastructure, Operations, Release-Engineering-Team-TODO (201907): contint1001 store docker images on separate partition or disk - https://phabricator.wikimedia.org/T207707 (Dzahn) > RAID1 over the new disks /dev/sdc and /dev/sdd apt-get install parted parted...
[21:22:00] serviceops, Continuous-Integration-Infrastructure, Operations, Release-Engineering-Team-TODO (201907): contint1001 store docker images on separate partition or disk - https://phabricator.wikimedia.org/T207707 (Dzahn) >>! In T207707#5271473, @hashar wrote: > * a LVM volume group pvcreate /dev/md3...
[21:22:55] serviceops, Continuous-Integration-Infrastructure, Operations, Release-Engineering-Team-TODO (201907): contint1001 store docker images on separate partition or disk - https://phabricator.wikimedia.org/T207707 (Dzahn) ` root@contint1001:/mnt/docker# df -h Filesystem Size...
[21:34:42] serviceops, Continuous-Integration-Infrastructure, Operations, Release-Engineering-Team-TODO (201907): contint1001 store docker images on separate partition or disk - https://phabricator.wikimedia.org/T207707 (Dzahn) a:Dzahn→hashar
[21:36:15] serviceops, Continuous-Integration-Infrastructure, Operations, Release-Engineering-Team-TODO (201907): contint1001 store docker images on separate partition or disk - https://phabricator.wikimedia.org/T207707 (Dzahn) Hi Hashar, at this point i think it makes sense to assign back to you to chec...
[21:44:01] serviceops, Operations, Thumbor, ops-eqiad, User-jijiki: (OoW) thumbor1004 memory errors - https://phabricator.wikimedia.org/T215411 (wiki_willy)
[22:04:10] serviceops, Continuous-Integration-Infrastructure, Operations, Release-Engineering-Team-TODO (201907): contint1001 store docker images on separate partition or disk - https://phabricator.wikimedia.org/T207707 (thcipriani) >>! In T207707#5302044, @Dzahn wrote: > Hi Hashar, at this point i think it...