[00:01:21] PROBLEM - Logstash Elasticsearch indexing errors #o11y on alert1001 is CRITICAL: 8.083 ge 8 https://wikitech.wikimedia.org/wiki/Logstash%23Indexing_errors https://logstash.wikimedia.org/goto/3283cc1372b7df18f26128163125cf45 https://grafana.wikimedia.org/dashboard/db/logstash
[00:09:26] (PS1) Legoktm: Add enwiki20 "Option A" fixed logos [mediawiki-config] - https://gerrit.wikimedia.org/r/657927 (https://phabricator.wikimedia.org/T272526)
[00:09:36] Urbanecm: ^ can you take a look?
[00:11:16] Looking
[00:12:27] (PS1) Legoktm: Switch enwiki to use enwiki20 "Option A" logo variant [mediawiki-config] - https://gerrit.wikimedia.org/r/657929 (https://phabricator.wikimedia.org/T272526)
[00:13:16] looks like this https://usercontent.irccloud-cdn.com/file/fTncvQTS/image.png
[00:14:03] looks quite small in timeless, but I'm not a regular user, so legoktm might be able to tell better https://usercontent.irccloud-cdn.com/file/fjh2GPUQ/image.png
[00:14:38] are you using a hdpi screen?
[00:15:51] I don't think so legoktm, though...is there a way to check that?
[00:16:32] which logo variant gets served to you, 1.5x, 2x or normal?
[00:17:18] Normal
[00:18:23] so no then :)
[00:19:01] the 2x logo variant looks good to me in Vector/Timeless
[00:19:09] great
[00:19:25] and MB too
[00:19:47] I assume the 1.5x will be good in the middle too
[00:19:54] all set then?
[00:19:58] i think so
[00:20:10] 1.5x is not sized correctly.
[00:20:22] oh, hi Isarra
[00:20:27] * legoktm hugs Isarra
[00:20:46] Hello I'm not awake.
[00:20:49] Isarra: what's wrong with it?
[00:21:20] Uuuuh actually the file seems fine.
[00:21:42] 125 * 1.5 = 187.5, rounded up to 188
[00:21:50] It's the hardcoded 135px width for the 1.5x + logos that... is wrong?
[00:21:53] * legoktm pours soup on Isarra
[00:22:05] for some reason the Wikipedia logo is 125px not 135px
[00:22:16] So the logo RL... module/feature thing is what's messing it up.
[00:22:23] https://en.wikipedia.org/static/images/project-logos/enwiki.png is 135px though
[00:22:31] It just assumes nominal size for all logos is 135px and sets the size to that.
[00:22:47] https://commons.wikimedia.org/wiki/File:Wikipedia-logo-v2-wordmark.svg is 125px
[00:22:51] ughhh
[00:22:56] Which usually makes them too small because they're often rather wider in reality, but in this case is making it too big because nominal size is 125px here.
[00:22:58] should I switch them to 135px based then?
[00:23:01] ok
[00:23:40] Well, you could also add a note that that assumption is bloody dumb and needs to die in a fire and we should just be setting the nominal size in wgLogos like we do for wordmarks... >.>
[00:23:59] Because no random assigned size is going to make sense in all cases anyway.
[00:24:18] Like in monobook it's just fixed by someone overriding it to 100px somewhere else... >.>
[00:24:43] that's....going to be hard to fix in the longer term
[00:25:11] this is the same SVG, but with 135px 1x PNG https://usercontent.irccloud-cdn.com/file/H2OsOmXX/image.png
[00:25:50] what do you think, Isarra / legoktm ?
[00:25:59] it appears...bigger, which may or may not be an issue :D
[00:26:58] the new 2x looks fine locally for me
[00:27:05] As long as it fits it's probably fine? :P
[00:27:11] no non-square logos, ever again
[00:28:06] tested Timeless/MonoBook too, looks good
[00:28:11] let me recompress and update the patch
[00:29:00] So, let's go forward?
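The variant sizing discussed above can be sketched in a few lines of Python. This is only an illustration, not MediaWiki's code: `variant_widths` is a hypothetical helper, and the rounding mode is an assumption (the log rounds 187.5 up to 188 for the 125px base; rounding down is assumed here for the 135px base).

```python
import math

def variant_widths(nominal_px, scales=(1, 1.5, 2), rounding=math.floor):
    """Return the pixel width of each logo variant (1x, 1.5x, 2x).

    nominal_px is the assumed 1x width; rounding decides how a
    fractional width such as 187.5 becomes a whole pixel count.
    """
    return [rounding(nominal_px * scale) for scale in scales]

# 135px nominal width, rounding down (assumed)
print(variant_widths(135))                      # [135, 202, 270]
# 125px nominal width, rounding up as in "125 * 1.5 = 187.5, rounded up to 188"
print(variant_widths(125, rounding=math.ceil))  # [125, 188, 250]
```

A 1.5x file cut for a 125px base (188px) would not match a skin that assumes a 135px base (about 202px), which seems to be the mismatch described in the conversation.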
[00:29:50] mhm
[00:31:06] (PS2) Legoktm: Add enwiki20 "Option A" fixed logos [mediawiki-config] - https://gerrit.wikimedia.org/r/657927 (https://phabricator.wikimedia.org/T272526)
[00:31:08] (PS2) Legoktm: Switch enwiki to use enwiki20 "Option A" logo variant [mediawiki-config] - https://gerrit.wikimedia.org/r/657929 (https://phabricator.wikimedia.org/T272526)
[00:31:25] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:32:01] \o/
[00:32:20] 135, 202, 270 now
[00:32:32] from https://www.mediawiki.org/wiki/Manual:$wgLogos so I didn't do math on my own
[00:33:05] (CR) Legoktm: [C: +2] Add enwiki20 "Option A" fixed logos [mediawiki-config] - https://gerrit.wikimedia.org/r/657927 (https://phabricator.wikimedia.org/T272526) (owner: Legoktm)
[00:33:09] (y)
[00:34:15] (Merged) jenkins-bot: Add enwiki20 "Option A" fixed logos [mediawiki-config] - https://gerrit.wikimedia.org/r/657927 (https://phabricator.wikimedia.org/T272526) (owner: Legoktm)
[00:36:22] !log legoktm@deploy1001 Synchronized static/images/project-logos/: Add enwiki20 "Option A" fixed logos (T272526) (duration: 00m 59s)
[00:36:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:36:28] T272526: Change enwiki logo to "Option A" until February 4 - https://phabricator.wikimedia.org/T272526
[00:37:07] \o/
[00:37:18] https://en.wikipedia.org/static/images/project-logos/enwiki20a.png https://en.wikipedia.org/static/images/project-logos/enwiki20a-1.5x.png https://en.wikipedia.org/static/images/project-logos/enwiki20a-2x.png
[00:37:55] URLs work for me
[00:38:23] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:38:38] (CR) Legoktm: [C: +2] Switch enwiki to use enwiki20 "Option A" logo variant [mediawiki-config] - https://gerrit.wikimedia.org/r/657929 (https://phabricator.wikimedia.org/T272526) (owner: Legoktm)
[00:39:33] * legoktm twiddles thumbs
[00:40:03] (Merged) jenkins-bot: Switch enwiki to use enwiki20 "Option A" logo variant [mediawiki-config] - https://gerrit.wikimedia.org/r/657929 (https://phabricator.wikimedia.org/T272526) (owner: Legoktm)
[00:41:00] Urbanecm: Isarra: Earwig: staged on mwdebug1002
[00:41:54] works for me legoktm
[00:42:18] Urbanecm: are you ready to remove the CSS too?
[00:42:28] legoktm: yes, edit is prepared, just waiting for your scap :)
[00:45:20] sync started
[00:45:24] Urbanecm: ^
[00:45:47] https://en.wikipedia.org/w/index.php?title=MediaWiki:Common.css&diff=1002131729&oldid=1002046127
[00:46:14] !log legoktm@deploy1001 Synchronized wmf-config/InitialiseSettings.php: Switch enwiki to use enwiki20 "Option A" logo variant (T272526) (duration: 00m 57s)
[00:46:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:46:18] T272526: Change enwiki logo to "Option A" until February 4 - https://phabricator.wikimedia.org/T272526
[00:48:16] seems to work for me legoktm
[00:48:57] awesome
[00:49:02] thanks for all of your help
[00:49:09] couldn't have done it without you
[00:50:28] also major respect to everyone who processes logo requests, this was honestly a nightmare
[00:51:12] Isarra: thanks to you too :)
[00:51:32] thanks for your help too, legoktm :)
[00:51:33] thank you Urbanecm, legoktm <3 sincere apologies for the trouble
[00:51:42] Thank you for looking into all the skins and things!
[00:51:45] Y'all are awesome.
[00:51:48] I will now pass out.
[00:51:59] * Mz7 salutes Isarra
[00:52:16] I'm glad my badly explained ranting managed to contain actually useful information.
>.>
[01:00:01] is this the right time to complain that the "Over One Billion Edits" text seems illegibly small? :P
[01:02:08] (PS1) Mstyles: bump memory for flink processes [deployment-charts] - https://gerrit.wikimedia.org/r/657941
[01:11:45] * legoktm slaps Earwig
[01:12:35] Earwig: not sure about illegibly, but it definitely is too small. But I just copied it from the other logo, didn't really have the skills to fix it
[01:13:22] yeah, I don't think we need to mess with it further tonight
[01:13:58] there were complaints on VPR that "over" is grammatically wrong (should be "more than") or that "one" should be "1"...
[01:15:29] lol
[01:15:49] the SVG is on Commons, feel free to edit it
[01:21:26] The smalltext should use the same metrics relative to the large caps as the wikipedia wordmark itself does.
[01:21:41] Instead of the first letters being so much bigger. That would make it much more legible.
[01:30:45] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[01:36:17] RECOVERY - Check systemd state on pki1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[01:37:23] and legoktm's next project will be making logos easier >.> <.<
[01:37:33] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [01:39:18] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw1413.eqiad.wmnet [01:39:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:40:21] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw1268.eqiad.wmnet [01:40:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:41:13] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw2334.codfw.wmnet [01:41:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:41:33] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw2330.codfw.wmnet [01:41:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:42:57] PROBLEM - Check systemd state on pki1001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [01:43:10] !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw1268.eqiad.wmnet [01:43:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:44:19] !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw1413.eqiad.wmnet [01:44:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:44:51] RECOVERY - mediawiki-installation DSH group on mw1268 is OK: OK https://wikitech.wikimedia.org/wiki/Monitoring/check_dsh_groups [01:47:07] RECOVERY - mediawiki-installation DSH group on mw1413 is OK: OK https://wikitech.wikimedia.org/wiki/Monitoring/check_dsh_groups [01:47:51] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [01:48:39] !log reset user email for Davey2010 [01:48:41] Logged the 
message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:50:07] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [01:50:24] !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw2334.codfw.wmnet [01:50:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:50:37] !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw2330.codfw.wmnet [01:50:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:51:29] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw2328.codfw.wmnet [01:51:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:51:42] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw2332.codfw.wmnet [01:51:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:52:48] !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw2328.codfw.wmnet [01:52:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:52:58] !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw2332.codfw.wmnet [01:53:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:54:25] RECOVERY - mediawiki-installation DSH group on mw2332 is OK: OK https://wikitech.wikimedia.org/wiki/Monitoring/check_dsh_groups [01:54:37] RECOVERY - mediawiki-installation DSH group on mw2328 is OK: OK https://wikitech.wikimedia.org/wiki/Monitoring/check_dsh_groups [01:54:37] RECOVERY - mediawiki-installation DSH group on mw2334 is OK: OK https://wikitech.wikimedia.org/wiki/Monitoring/check_dsh_groups [02:13:05] RECOVERY - mediawiki-installation DSH group on mw2330 is OK: OK https://wikitech.wikimedia.org/wiki/Monitoring/check_dsh_groups [02:31:37] 
RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [02:38:31] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [02:47:59] RECOVERY - Check systemd state on pki2001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [02:54:45] PROBLEM - Check systemd state on pki2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [03:17:57] PROBLEM - WDQS SPARQL on wdqs1013 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook [03:20:17] RECOVERY - WDQS SPARQL on wdqs1013 is OK: HTTP OK: HTTP/1.1 200 OK - 691 bytes in 4.074 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook [03:27:19] PROBLEM - WDQS SPARQL on wdqs1013 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook [03:31:09] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [03:38:05] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [03:57:10] (re `WDQS SPARQL`) ^ Not sure why it's flapping specifically, but this is a known issue with monitoring following a change of how we deploy the WDQS gui; there's no impact on the actual availability of the service. 
I'm suppressing these alerts until next week when we can roll out a fix [04:02:43] Ah, the flapping itself would be caused by `wdqs1013`'s blazegraph instance getting deadlocked or similar. Restarting blazegraph on `wdqs1013`: [04:03:00] !log Restarted `wdqs-blazegraph` on `wdqs1013`: `sudo systemctl restart wdqs-blazegraph` [04:03:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:03:41] PROBLEM - Query Service HTTP Port on wdqs1013 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 380 bytes in 0.001 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service [04:05:29] !log Depooled `wdqs1013` (it has ~50 mins of lag to catch up on, and also the bad gateway above) [04:05:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:05:59] RECOVERY - Query Service HTTP Port on wdqs1013 is OK: HTTP OK: HTTP/1.1 200 OK - 448 bytes in 0.020 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service [04:06:11] RECOVERY - WDQS SPARQL on wdqs1013 is OK: HTTP OK: HTTP/1.1 200 OK - 689 bytes in 1.071 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook [04:18:01] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [04:20:13] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [04:30:09] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [04:36:53] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [05:17:11] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [05:19:27] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [05:32:01] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [05:34:21] 10SRE, 10Security-Team, 10Wikimedia-Mailing-lists: Upgrade GNU Mailman from 2.1 to Mailman3 - https://phabricator.wikimedia.org/T52864 (10Ladsgroup) [05:34:30] 10SRE, 10Wikimedia-Mailing-lists, 10User-Ladsgroup: Setup Mailman3 in Cloud VPS - https://phabricator.wikimedia.org/T258365 (10Ladsgroup) 05Open→03Resolved This is done. [05:39:03] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. 
https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [06:03:37] 10SRE, 10fundraising-tech-ops, 10netops, 10Patch-For-Review: Manage frack switches with Netbox - https://phabricator.wikimedia.org/T268802 (10ayounsi) 05Resolved→03Open a:05Dwisehaupt→03None Thanks for taking care of that. Re-opening as the scope of this task is larger than the Puppet change. [06:31:01] (03PS1) 10Ladsgroup: mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) [06:31:29] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [06:38:11] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [07:11:01] (03PS2) 10Ladsgroup: mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) [07:11:29] (03CR) 10jerkins-bot: [V: 04-1] mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) (owner: 10Ladsgroup) [07:12:37] (03PS3) 10Ladsgroup: mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) [07:13:04] (03CR) 10jerkins-bot: [V: 04-1] mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) (owner: 10Ladsgroup) [07:16:44] (03PS4) 10Ladsgroup: mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) [07:17:12] (03CR) 10jerkins-bot: [V: 04-1] mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) (owner: 10Ladsgroup) [07:19:31] (03PS5) 
10Ladsgroup: mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) [07:20:06] (03PS6) 10Ladsgroup: mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) [07:27:06] (03CR) 10Ladsgroup: "It works with this patch being cherry-picked on the puppetmaster: https://mailman-puppet.wmcloud.org/mailman3/favicon.ico" [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) (owner: 10Ladsgroup) [07:30:06] 10SRE, 10Wikimedia-Mailing-lists, 10Patch-For-Review: Puppetize mailman3 web and hyperkitty (mailman archiver) - https://phabricator.wikimedia.org/T256542 (10Ladsgroup) With the above patch being cherry-picked on the puppetmaster the web is accessible publicly: https://mailman-puppet.wmcloud.org/mailman3/fav... [07:30:23] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [07:37:09] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [07:40:50] (03PS1) 10Ladsgroup: mailman3: Fix python package for mysql [puppet] - 10https://gerrit.wikimedia.org/r/657952 (https://phabricator.wikimedia.org/T256542) [08:00:41] (03PS7) 10Ladsgroup: mailman3: Start apache2 for web [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) [08:08:50] 10SRE, 10Wikimedia-Mailing-lists, 10Patch-For-Review: Puppetize mailman3 web and hyperkitty (mailman archiver) - https://phabricator.wikimedia.org/T256542 (10Ladsgroup) With the patch above and some work, it works just fine now: https://mailman-puppet.wmcloud.org/ The archiver doesn't work yet since it's not... 
[08:14:42] (03CR) 10Ladsgroup: "In PS7, I removed the enforcing https:// bit since it doesn't work with the cloud's web proxy. Once I get to setting acme chief and all th" [puppet] - 10https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) (owner: 10Ladsgroup) [08:28:18] (03PS1) 10Legoktm: Drop obsolete requirements.txt and setup.py [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657954 [08:28:20] (03PS1) 10Legoktm: Split $wmgSiteLogo{1,1_5,2}x to a separate logos.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657955 [08:28:22] (03PS1) 10Legoktm: Add script to mostly automate logo management [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657956 (https://phabricator.wikimedia.org/T98640) [08:31:25] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [08:38:35] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [08:42:14] (03CR) 10ArielGlenn: [C: 03+1] snapshot: Switch require_package -> ensure_packages [puppet] - 10https://gerrit.wikimedia.org/r/657916 (https://phabricator.wikimedia.org/T266479) (owner: 10Legoktm) [08:50:11] the registry2002 flapping is me, I'll try to figure out what's wrong on Monday. 
it's not critical at all [09:20:04] (03PS1) 10Ladsgroup: lvs: Migrate hiera() to lookup() and set datatypes [puppet] - 10https://gerrit.wikimedia.org/r/657958 (https://phabricator.wikimedia.org/T209953) [09:22:45] (03CR) 10Ladsgroup: "PCC: https://puppet-compiler.wmflabs.org/compiler1003/27636/" [puppet] - 10https://gerrit.wikimedia.org/r/657958 (https://phabricator.wikimedia.org/T209953) (owner: 10Ladsgroup) [09:31:21] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [09:38:09] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [10:09:47] RECOVERY - HTTPS-wmfusercontent on phab.wmfusercontent.org is OK: SSL OK - Certificate *.wikipedia.org valid until 2021-04-16 09:01:44 +0000 (expires in 82 days) https://phabricator.wikimedia.org/tag/phabricator/ [10:10:49] RECOVERY - HTTPS-planet on en.planet.wikimedia.org is OK: SSL OK - Certificate *.wikipedia.org valid until 2021-04-16 09:01:44 +0000 (expires in 82 days) https://wikitech.wikimedia.org/wiki/Planet.wikimedia.org [10:30:21] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [10:37:25] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [11:31:43] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [11:38:31] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. 
https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [12:32:03] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [12:38:49] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [12:41:02] (03PS1) 10Evrifaessa: Add localized Wikivoyage wordmark for the mobile view of Turkish Wikivoyage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657971 (https://phabricator.wikimedia.org/T272776) [13:31:07] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [13:37:53] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [13:49:23] RECOVERY - Check systemd state on pki2001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [13:56:09] PROBLEM - Check systemd state on pki2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [14:32:13] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [14:39:01] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. 
https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [15:08:41] 10SRE, 10Wikimedia-Mailing-lists, 10Privacy, 10Security, 10User-Josve05a: Show listadmins on main page - https://phabricator.wikimedia.org/T272778 (10Ciell) [15:31:07] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [15:37:45] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [15:57:38] (03PS1) 10Evrifaessa: Defining wgSitename for trwikivoyage. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657992 (https://phabricator.wikimedia.org/T272779) [16:03:55] (03PS1) 10Evrifaessa: Enable SandboxLink on Turkish Wikivoyage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657993 (https://phabricator.wikimedia.org/T272780) [16:10:50] (03PS1) 10Evrifaessa: Add Turkish 'Powered by MediaWiki' and 'A Wikimedia project' icons for Turkish Wikivoyage. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657994 (https://phabricator.wikimedia.org/T272781) [16:11:58] (03Abandoned) 10Evrifaessa: Add Turkish 'Powered by MediaWiki' and 'A Wikimedia project' icons for Turkish Wikivoyage. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657994 (https://phabricator.wikimedia.org/T272781) (owner: 10Evrifaessa) [16:12:55] (03Restored) 10Evrifaessa: Add Turkish 'Powered by MediaWiki' and 'A Wikimedia project' icons for Turkish Wikivoyage. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657994 (https://phabricator.wikimedia.org/T272781) (owner: 10Evrifaessa) [16:13:41] (03PS2) 10Evrifaessa: Add Turkish 'Powered by MediaWiki' and 'A Wikimedia project' icons for Turkish Wikivoyage. 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/657994 (https://phabricator.wikimedia.org/T272781) [16:14:14] (03PS3) 10Evrifaessa: Add Turkish 'Powered by MediaWiki' and 'A Wikimedia project' icons for Turkish Wikivoyage. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657994 (https://phabricator.wikimedia.org/T272781) [16:23:18] (03PS1) 10Evrifaessa: Add namespace aliases to Turkish Wikivoyage. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657995 (https://phabricator.wikimedia.org/T272782) [16:29:44] (03PS1) 10Evrifaessa: Set $wgCategoryCollation = uca-tr on trwikivoyage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657997 (https://phabricator.wikimedia.org/T272783) [16:31:23] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [16:38:13] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [16:51:30] (03PS1) 10Evrifaessa: Resize the logo of Turkish Wikivoyage. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/657998 (https://phabricator.wikimedia.org/T272784) [16:59:02] 10SRE, 10DNS, 10Mail, 10Traffic: ITS request to update SPF & DNS Records for Trust & Safety - https://phabricator.wikimedia.org/T272750 (10Reedy) Current SPF is in https://raw.githubusercontent.com/wikimedia/operations-dns/b34bfdb0b90ad250d31137eb228e3421c9bafd4c/templates/wikimedia.org ` ; SPF txt and rr... [17:03:29] 10SRE, 10Wikimedia-Mailing-lists, 10Privacy, 10Security, 10User-Josve05a: Show listadmins on main page - https://phabricator.wikimedia.org/T272778 (10Aklapper) This sounds unrelated to the parent task. 
[17:03:35] 10SRE, 10Wikimedia-Mailing-lists, 10Privacy, 10Security, 10User-Josve05a: Show listadmins on main page - https://phabricator.wikimedia.org/T272778 (10Aklapper) [17:03:37] 10SRE, 10Wikimedia-Mailing-lists, 10Privacy, 10Security, 10User-Josve05a: Stop storing Mailman passwords in plain text - https://phabricator.wikimedia.org/T181803 (10Aklapper) [17:12:58] (03CR) 10David Caro: [C: 03+2] wmcs.backup.images: Fix full backup creation [puppet] - 10https://gerrit.wikimedia.org/r/657399 (https://phabricator.wikimedia.org/T272510) (owner: 10David Caro) [17:19:41] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [17:21:57] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [17:30:57] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [17:37:43] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [17:45:16] (03PS1) 10David Caro: wmcs.backup: fix missing image_name parameter [puppet] - 10https://gerrit.wikimedia.org/r/658000 [17:47:01] (03CR) 10David Caro: [C: 03+2] wmcs.backup: fix missing image_name parameter [puppet] - 10https://gerrit.wikimedia.org/r/658000 (owner: 10David Caro) [17:52:43] (03CR) 10Jforrester: "We just last year deleted the last Python in the repo and removed the CI config for it. Oy. 
;-)" [mediawiki-config] - https://gerrit.wikimedia.org/r/657956 (https://phabricator.wikimedia.org/T98640) (owner: Legoktm) [17:55:00] (CR) Jforrester: Split $wmgSiteLogo{1,1_5,2}x to a separate logos.php (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/657955 (owner: Legoktm) [18:14:51] (CR) RhinosF1: [C: +1] "But note to the deployer to not forget to run https://www.mediawiki.org/wiki/Manual:UpdateCollation.php after deployment" [mediawiki-config] - https://gerrit.wikimedia.org/r/657997 (https://phabricator.wikimedia.org/T272783) (owner: Evrifaessa) [18:31:31] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [18:38:15] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [19:32:11] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [19:39:03] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [20:30:53] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [20:37:37] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [21:31:47] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [21:38:37] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [22:08:53] SRE, Wikimedia-Mailing-lists: Show listadmins on main page - https://phabricator.wikimedia.org/T272778 (Legoktm) >>! In T272778#6771332, @Aklapper wrote: >> I would prefer to have the usernames of list admins on the main page again. > > Hi, as you wrote "again", when was this the case? Does "main page"... [22:12:35] (CR) Urbanecm: [C: +1] "sounds good" [mediawiki-config] - https://gerrit.wikimedia.org/r/657993 (https://phabricator.wikimedia.org/T272780) (owner: Evrifaessa) [22:12:51] (CR) Urbanecm: [C: +1] "looks good" [mediawiki-config] - https://gerrit.wikimedia.org/r/657992 (https://phabricator.wikimedia.org/T272779) (owner: Evrifaessa) [22:13:16] (CR) Urbanecm: [C: +1] "lgtm" [mediawiki-config] - https://gerrit.wikimedia.org/r/657994 (https://phabricator.wikimedia.org/T272781) (owner: Evrifaessa) [22:21:33] !log volker-e@deploy1001 Started deploy [design/style-guide@63e39e7]: Deploy design/style-guide: 63e39e7 “Components”: Amend button groups states SVG font stack (#427) [22:21:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:21:39] !log volker-e@deploy1001 Finished deploy [design/style-guide@63e39e7]: Deploy design/style-guide: 63e39e7 “Components”: Amend button groups states SVG font stack (#427) (duration: 00m 06s) [22:21:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:25:17] (CR) Legoktm: "> Patch Set 1:" [mediawiki-config] - https://gerrit.wikimedia.org/r/657956
(https://phabricator.wikimedia.org/T98640) (owner: Legoktm) [22:28:16] (CR) Legoktm: Split $wmgSiteLogo{1,1_5,2}x to a separate logos.php (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/657955 (owner: Legoktm) [22:30:33] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [22:37:29] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [23:18:21] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [23:20:37] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [23:32:11] RECOVERY - Check systemd state on registry2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [23:39:03] PROBLEM - Check systemd state on registry2002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state