[12:47:07] serviceops, Operations: High APCu fragmentation can impact server performance - https://phabricator.wikimedia.org/T240205 (jijiki)
[12:56:15] serviceops, Operations: High APCu fragmentation can impact server performance - https://phabricator.wikimedia.org/T240205 (jijiki) a:jijiki
[13:56:37] serviceops, Operations: Reimage all mediawiki servers - https://phabricator.wikimedia.org/T239054 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin1001.eqiad.wmnet for hosts: ` ['mw2270.codfw.wmnet', 'mw2269.codfw.wmnet', 'mw2268.codfw.wmnet'] ` The log can be found in `/var/log/...
[14:19:07] serviceops, Operations, SRE-swift-storage, Patch-For-Review, and 2 others: Swift object servers become briefly unresponsive on a regular basis - https://phabricator.wikimedia.org/T226373 (fgiunchedi) re: ats and client timeouts and retries, yes ats does retry on origin timeout as it seems. Otherw...
[14:37:00] hello!
[14:37:04] https://grafana-labs.wikimedia.org/d/000000317/memcache-slabs?orgId=1&var-datasource=Beta%20Prometheus&var-cluster=misc&var-instance=deployment-memc08&var-slab=All
[14:37:44] the "1.5.x" tab shows new slab metrics for 1.5.x from the new exporter (need to upload it to apt but works in labs)
[14:38:09] this in theory closes the last action item to prepare memcached for buster
[14:38:27] modulo of course more testing with prod traffic about slab settings etc..
[14:40:47] serviceops, Operations, Patch-For-Review, Performance-Team (Radar), and 2 others: Upgrade memcached for Debian Stretch/Buster - https://phabricator.wikimedia.org/T213089 (elukey) Finally after a long back and forth with upstream I was able to have my pull request merged. Built the new exporter an...
[14:41:40] all reported --^
[14:42:06] this means that the new memcached nodes for the gutter pool will be buster-ready
[14:43:10] one question about naming: we'll get three new nodes for codfw and 3 for eqida
[14:43:13] *eqiad
[14:43:35] those will be different from regular mcXXXX nodes (256G of ram, 10G network, etc..)
[14:43:59] at the moment the idea is to name them mc103[7-9] and mc203[7-9]
[14:44:10] so not using a separate naming convention
[14:44:17] please tell me if this is ok or not :)
[14:44:39] Cc: effie, _joe_ --^
[14:46:58] <_joe_> elukey: I think we should use a different naming just to troll dcops
[14:49:33] _joe_ let's do it, something like mc-gutter-pool-from-facebook100[1-3]
[14:50:43] too long :-P
[14:52:57] <_joe_> gut1001
[15:32:35] sorry elukey, I was at lunch
[15:33:13] as far as naming goes
[15:33:21] I think we should use something to distinguish them
[15:33:42] I already find it very very hard that we have special mw servers
[15:33:50] (scap proxies, memcached proxies)
[15:34:13] +1 on that
[15:34:17] so mc-gp1001 would be great
[15:34:33] I am ok with any solution
[15:34:58] let's just find an agreement soon so we can add the info to the tasks
[15:35:50] the tasks which I should have written/updated last week
[15:35:52] * effie sighs
[16:00:35] <_joe_> serviceops people: do we want to have the meeting today?
[16:00:52] <_joe_> Effie will be late, daniel is off, and reuven and I are working together
[16:00:59] <_joe_> I say we ditch it
[16:01:03] do get an SRE meeting update
[16:01:34] <_joe_> I am adding the relevant things I did
[16:01:54] serviceops, Operations: Reimage all mediawiki servers - https://phabricator.wikimedia.org/T239054 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2270.codfw.wmnet', 'mw2269.codfw.wmnet', 'mw2268.codfw.wmnet'] ` and were **ALL** successful.
[16:14:30] I have added my stuff on the SRE doc as well
[16:20:04] effie: no no the tasks about rack/setup/deploy etc.. :)
[16:30:29] Ah I wondered why I am waiting in there all alooooone
[16:30:43] I updated the pad earlier in any case
[16:30:56] I am done
[16:31:02] are we having the meeting?
[16:31:11] oh
[16:31:23] I finished my other meeting on time though :D
[16:31:40] whatever alex and I are in it
[16:31:43] as you wish [in Greek]
[16:31:54] lol
[16:32:00] so it's not just me
[16:32:12] of course not [in Greek]
[16:32:43] <_joe_> Sorry I'm switching location now
[16:34:12] <_joe_> You can have the meeting without me
[16:34:31] <_joe_> I did put my notes in
[16:35:56] we are having the meeting in greek :p hahaha
[16:36:47] <_joe_> Maybe ping rlazarus...
[16:38:02] you are not together?
[16:38:36] <_joe_> We were but I'm going home
[16:42:27] oh
[16:42:38] then he is going back to the hotel I reckon
[17:13:43] ahh sorry, didn't see this until too late
[17:14:38] ah well
[17:14:44] just scribble your stuff into the pad I guess
[17:14:58] serviceops, MediaWiki-General, Security-Team, Performance-Team (Radar), Security: Create a tmp directory just for MediaWiki - https://phabricator.wikimedia.org/T179901 (chasemp) p:Triage→Normal
[17:15:35] will do
[17:34:43] rlazarus: welcome to the european timezones :p
[17:35:08] every meeting is late here :D
[17:38:04] haha
[17:38:23] I'm lucky to be on the US east coast instead of west, I don't have to wake up early every day
[17:39:21] east coast is a really good place to be for working here tbh
[17:39:22] literally, your timezone is the nest
[17:39:28] best*
[17:40:26] <_joe_> cdanis: I think coastal brazil is the best
[17:40:32] <_joe_> for a series of reasons
[17:40:46] ah sure
[17:40:50] that's a nice timezone
[17:40:57] and it's much better environs than Greenland
[18:01:15] * rlazarus off
[18:30:45] serviceops, Analytics-Kanban, Better Use Of Data, Event-Platform, and 8 others: Set up eventgate-logging-external in production - https://phabricator.wikimedia.org/T236386 (Ottomata) @jlinehan thoughts? I'm considering moving forward with intake-{analytics,logging}.
[20:15:31] serviceops, Analytics-Kanban, Better Use Of Data, Event-Platform, and 8 others: Set up eventgate-logging-external in production - https://phabricator.wikimedia.org/T236386 (Ottomata) Hey also, before I go through with this; is there any issue with CORS here? If we go with a separate (non wikimed...