[00:03:02] <wikibugs>	 (PS3) Bstorm: k8s: Set default requests for the new cluster [software/tools-webservice] - https://gerrit.wikimedia.org/r/563592 (https://phabricator.wikimedia.org/T236202)
[00:06:24] <wikibugs>	 Operations, Packaging, serviceops: package requirements for upgrading deployment_servers to buster - https://phabricator.wikimedia.org/T242480 (Dzahn)
[00:06:29] <wikibugs>	 (CR) jerkins-bot: [V: -1] k8s: Set default requests for the new cluster [software/tools-webservice] - https://gerrit.wikimedia.org/r/563592 (https://phabricator.wikimedia.org/T236202) (owner: Bstorm)
[00:08:13] <icinga-wm>	 PROBLEM - Postgres Replication Lag on maps2001 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 85337992 and 5 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:09:45] <wikibugs>	 (PS1) Dzahn: devtools (cloud): switch deployment server to deploy1002 [puppet] - https://gerrit.wikimedia.org/r/563616
[00:10:03] <icinga-wm>	 RECOVERY - Postgres Replication Lag on maps2001 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 0 and 108 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:10:14] <wikibugs>	 (CR) jerkins-bot: [V: -1] devtools (cloud): switch deployment server to deploy1002 [puppet] - https://gerrit.wikimedia.org/r/563616 (owner: Dzahn)
[00:10:19] <wikibugs>	 (PS2) Dzahn: devtools (cloud): switch deployment server to deploy1002 [puppet] - https://gerrit.wikimedia.org/r/563616
[00:10:24] <wikibugs>	 (CR) Dzahn: [C: +2] devtools (cloud): switch deployment server to deploy1002 [puppet] - https://gerrit.wikimedia.org/r/563616 (owner: Dzahn)
[00:20:53] <wikibugs>	 (PS1) Dzahn: devtools: add Hiera values for a deployment_server in cloud [puppet] - https://gerrit.wikimedia.org/r/563618
[00:22:46] <wikibugs>	 (PS1) Dzahn: codesearch: fix dependency cycle with git::clone [puppet] - https://gerrit.wikimedia.org/r/563619
[00:26:19] <wikibugs>	 (CR) Dzahn: [C: +2] codesearch: fix dependency cycle with git::clone [puppet] - https://gerrit.wikimedia.org/r/563619 (owner: Dzahn)
[00:26:29] <wikibugs>	 (PS2) Dzahn: codesearch: fix dependency cycle with git::clone [puppet] - https://gerrit.wikimedia.org/r/563619
[00:35:02] <wikibugs>	 (CR) Dzahn: "@Legoktm After the follow-ups in the topic branch https://gerrit.wikimedia.org/r/q/topic:%22codesearch%22+(status:open%20OR%20status:merge" [puppet] - https://gerrit.wikimedia.org/r/563114 (https://phabricator.wikimedia.org/T242319) (owner: Legoktm)
[00:39:38] <wikibugs>	 (PS4) Bstorm: k8s: Set default requests for the new cluster [software/tools-webservice] - https://gerrit.wikimedia.org/r/563592 (https://phabricator.wikimedia.org/T236202)
[00:45:33] <wikibugs>	 (CR) Bstorm: "Most of this is just implementing the hilarious effort to talk out a needed code change in IRC.  However, I did make some adjustments to t" (3 comments) [software/tools-webservice] - https://gerrit.wikimedia.org/r/563592 (https://phabricator.wikimedia.org/T236202) (owner: Bstorm)
[00:48:11] <wikibugs>	 (CR) Bstorm: [C: +2] k8s: Set default requests for the new cluster [software/tools-webservice] - https://gerrit.wikimedia.org/r/563592 (https://phabricator.wikimedia.org/T236202) (owner: Bstorm)
[00:51:25] <wikibugs>	 (CR) Bstorm: "Actually running this produces an error:" [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/563610 (owner: Bstorm)
[00:51:54] <wikibugs>	 (CR) Bstorm: "> Patch Set 1:" [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/563610 (owner: Bstorm)
[01:04:59] <wikibugs>	 (PS2) Dzahn: devtools: add Hiera values for a deployment_server in cloud [puppet] - https://gerrit.wikimedia.org/r/563618
[01:18:31] <wikibugs>	 Operations, Wikimedia-General-or-Unknown, serviceops, Performance-Team (Radar), Wikimedia-Incident: Investigate recurrent GET latency spikes on MediaWiki appservers (Oct 2019) - https://phabricator.wikimedia.org/T235872 (Krinkle)
[01:20:23] <icinga-wm>	 PROBLEM - Check systemd state on ms-be1037 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[01:22:38] <wikibugs>	 Operations, Performance-Team, Traffic, Patch-For-Review, Wikimedia-Incident: 15% response start regression as of 2019-11-11 (Varnish->ATS) - https://phabricator.wikimedia.org/T238494 (Krinkle)
[01:27:11] <wikibugs>	 Operations, Wikimedia-General-or-Unknown, serviceops, Performance-Team (Radar), Wikimedia-Incident: Investigate recurrent GET latency spikes on MediaWiki appservers (Oct 2019) - https://phabricator.wikimedia.org/T235872 (Krinkle) It is not clear to me whether this is an actual issue or not....
[01:29:10] <wikibugs>	 (PS1) Bstorm: k8s: Don't restart all k8s machinery to reboot a basic webservice [software/tools-webservice] - https://gerrit.wikimedia.org/r/563624 (https://phabricator.wikimedia.org/T228499)
[01:45:39] <icinga-wm>	 RECOVERY - Check systemd state on ms-be1037 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[01:49:27] <wikibugs>	 (PS10) Bstorm: Make Kubernetes the default backend and warn when guessing [software/tools-webservice] - https://gerrit.wikimedia.org/r/443190 (https://phabricator.wikimedia.org/T154504) (owner: Nehajha)
[01:59:46] <wikibugs>	 Operations, Internet-Archive, Wikimedia-Portals: www.wikipedia.org/robots.txt should not be a redirect - https://phabricator.wikimedia.org/T242500 (Krinkle)
[02:00:01] <wikibugs>	 Operations, Wikimedia-Portals: www.wikipedia.org/robots.txt should not be a redirect - https://phabricator.wikimedia.org/T242500 (Krinkle)
[02:50:41] <icinga-wm>	 PROBLEM - IPv6 ping to esams on ripe-atlas-esams IPv6 is CRITICAL: CRITICAL - failed 223 probes of 509 (alerts on 35) - https://atlas.ripe.net/measurements/23449938/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[02:56:29] <icinga-wm>	 RECOVERY - IPv6 ping to esams on ripe-atlas-esams IPv6 is OK: OK - failed 32 probes of 509 (alerts on 35) - https://atlas.ripe.net/measurements/23449938/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[03:24:17] <wikibugs>	 Operations, Wikimedia-Portals, Regression: www.wikipedia.org/robots.txt should not be a redirect - https://phabricator.wikimedia.org/T242500 (DannyS712)
[03:26:46] <icinga-wm>	 PROBLEM - puppet last run on ms-be1035 is CRITICAL: CRITICAL: Puppet last ran 6 hours ago https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun
[03:58:47] <wikibugs>	 (CR) Legoktm: "> Patch Set 6:" [puppet] - https://gerrit.wikimedia.org/r/563114 (https://phabricator.wikimedia.org/T242319) (owner: Legoktm)
[04:00:59] <wikibugs>	 (PS1) Bstorm: Revert "Add busybox to buster and stretch images" [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/563632
[04:02:19] <wikibugs>	 (PS2) Bstorm: Revert "Add busybox to buster and stretch images" [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/563632
[04:04:24] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Peachey88)
[04:05:02] <wikibugs>	 (PS1) Legoktm: codesearch: Install docker-ce from thirdparty/kubeadm-k8s component [puppet] - https://gerrit.wikimedia.org/r/563633
[04:06:04] <wikibugs>	 (CR) Bstorm: [C: +2] Revert "Add busybox to buster and stretch images" [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/563632 (owner: Bstorm)
[04:06:37] <wikibugs>	 (CR) Legoktm: codesearch: Install docker-ce from thirdparty/kubeadm-k8s component (1 comment) [puppet] - https://gerrit.wikimedia.org/r/563633 (owner: Legoktm)
[05:34:45] <logmsgbot>	 !log volker-e@deploy1001 Started deploy [design/style-guide@6a44c69]: Deploy design/style-guide:
[05:34:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:34:52] <logmsgbot>	 !log volker-e@deploy1001 Finished deploy [design/style-guide@6a44c69]: Deploy design/style-guide:  (duration: 00m 08s)
[05:34:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:51:57] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Marostegui) What if you try to configure the network manually rather than using DHCP, does it fail too?  If you send me the MAC address...
[06:08:43] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Papaul) @Marostegui the request is not getting to the DHCP server
[06:10:42] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Marostegui) And setting the IP, GW etc manually doesn't work either?
[06:13:15] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Papaul) I didn't try that but you can try with es2024 or es2021.
[07:08:23] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Marostegui) When reseting es2021 via IDRAC I saw this and after a couple of powercycles the host booted up: ` !!!! X64 Exception Type -...
[07:35:02] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Marostegui) I have tried es2024 and setting its IP manually and it seems that it indeed cannot reach the network. However, I do see the...
[07:51:57] <icinga-wm>	 PROBLEM - Check systemd state on ms-be1037 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[07:55:33] <icinga-wm>	 PROBLEM - Check systemd state on ms-be1037 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[08:38:11] <wikibugs>	 Operations, Phabricator, Traffic, serviceops, Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)): Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (mmodell) >>! In T238593#5792383, @akosiaris wrote: >> I really can'...
[08:46:01] <icinga-wm>	 RECOVERY - Check systemd state on ms-be1037 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[09:22:13] <wikibugs>	 (PS1) Elukey: Add spark encryption option to Hadoop test's yarn configuration [puppet] - https://gerrit.wikimedia.org/r/563651 (https://phabricator.wikimedia.org/T240934)
[09:23:22] <wikibugs>	 (CR) Elukey: [C: +2] Add spark encryption option to Hadoop test's yarn configuration [puppet] - https://gerrit.wikimedia.org/r/563651 (https://phabricator.wikimedia.org/T240934) (owner: Elukey)
[09:37:37] <icinga-wm>	 PROBLEM - MediaWiki exceptions and fatals per minute on icinga1001 is CRITICAL: cluster=logstash job=statsd_exporter level=ERROR site=eqiad https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=2&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops
[09:41:18] <icinga-wm>	 RECOVERY - MediaWiki exceptions and fatals per minute on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=2&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops
[09:57:31] <icinga-wm>	 PROBLEM - MediaWiki exceptions and fatals per minute on icinga1001 is CRITICAL: cluster=logstash job=statsd_exporter level=ERROR site=eqiad https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=2&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops
[10:01:05] <icinga-wm>	 RECOVERY - MediaWiki exceptions and fatals per minute on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=2&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops
[10:04:54] <wikibugs>	 (PS1) DannyS712: Deploy partial blocks on enwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/563653 (https://phabricator.wikimedia.org/T218626)
[10:05:18] <wikibugs>	 (PS2) DannyS712: Deploy partial blocks on enwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/563653 (https://phabricator.wikimedia.org/T218626)
[12:12:04] <wikibugs>	 (CR) Nuria: "Is there documentation on how to make use of these?" [deployment-charts] - https://gerrit.wikimedia.org/r/562623 (https://phabricator.wikimedia.org/T240985) (owner: Ottomata)
[12:43:57] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on appserver in eqiad on icinga1001 is CRITICAL: cluster=appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad+prometheus/ops&var-cluster=appserver&var-method=GET
[14:25:19] <icinga-wm>	 PROBLEM - Disk space on ms-be1039 is CRITICAL: DISK CRITICAL - /srv/swift-storage/sdd1 is not accessible: Input/output error https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=ms-be1039&var-datasource=eqiad+prometheus/ops
[14:47:31] <icinga-wm>	 PROBLEM - HP RAID on ms-be1039 is CRITICAL: CRITICAL: Slot 3: Failed: 1I:1:6 - OK: 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 1I:1:5, 1I:1:7, 1I:1:8, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, 2I:4:1, 2I:4:2 - Controller: OK - Battery/Capacitor: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering
[14:47:36] <icinga-wm>	 ACKNOWLEDGEMENT - HP RAID on ms-be1039 is CRITICAL: CRITICAL: Slot 3: Failed: 1I:1:6 - OK: 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 1I:1:5, 1I:1:7, 1I:1:8, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, 2I:4:1, 2I:4:2 - Controller: OK - Battery/Capacitor: OK nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T242511 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering
[14:47:39] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on ms-be1039 - https://phabricator.wikimedia.org/T242511 (ops-monitoring-bot)
[15:14:04] <icinga-wm>	 PROBLEM - Device not healthy -SMART- on ms-be1039 is CRITICAL: cluster=swift device=cciss,13 instance=ms-be1039:9100 job=node site=eqiad https://wikitech.wikimedia.org/wiki/SMART%23Alerts https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=ms-be1039&var-datasource=eqiad+prometheus/ops
[16:34:25] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Papaul) @Marostegui  ` papaul@asw-a-codfw> show interfaces ge-6/0/12 descriptions     Interface       Admin Link Description ge-6/0/12...
[16:59:58] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Marostegui) >>! In T242481#5794821, @Papaul wrote: > @Marostegui  > ` > papaul@asw-a-codfw> show interfaces ge-6/0/12 descriptions...
[17:07:46] <icinga-wm>	 PROBLEM - Postgres Replication Lag on maps2003 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 95137280 and 13 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[17:08:16] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Papaul) @Marostegui  ` Logical          Vlan          TAG     MAC         STP         Logical           Tagging  interface        membe...
[17:09:08] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Marostegui) I just saw this: ` [35393.835622] tg3 0000:01:00.0 eno3: Link is up at 1000 Mbps, full duplex [35393.835634] tg3 0000:01:00...
[17:09:32] <icinga-wm>	 RECOVERY - Postgres Replication Lag on maps2003 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 7464 and 108 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[17:09:51] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Papaul) @Marostegui  switch didn't learn any MAC address on that interface   ` papaul@asw-a-codfw> show ethernet-switching table interf...
[17:22:02] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Marostegui) You are not able to see any MAC addresses for es2024?
[17:31:26] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Papaul) correct, on the switch side
[17:33:04] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Marostegui) I have unloaded and then loaded again `bnxt_en` kernel module and I can see the main iface disappearing and then coming bac...
[17:40:46] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on appserver in eqiad on icinga1001 is CRITICAL: cluster=appserver code=205 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad+prometheus/ops&var-cluster=appserver&var-method=GET
[17:49:46] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on appserver in eqiad on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad+prometheus/ops&var-cluster=appserver&var-method=GET
[17:56:54] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Papaul) @Marostegui  i double check the switch configuration for both es2020 and es2024 and the DNS files from https://gerrit.wikimedia...
[18:09:24] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Papaul) @Marostegui  I will focus more on troubleshooting this on the NIC level on Monday.  Since the 1GB and 10GB interfaces are on th...
[18:10:26] <icinga-wm>	 PROBLEM - Postgres Replication Lag on maps1003 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 87862328 and 12 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[18:12:18] <icinga-wm>	 RECOVERY - Postgres Replication Lag on maps1003 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 88424 and 71 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[18:20:43] <wikibugs>	 Operations, ops-codfw, DBA: Missing Network drivers from Stretch and Buster installer for BRCM 2P 1G BT + 2P 10G SFP NDC - https://phabricator.wikimedia.org/T242481 (Marostegui) Sounds good @papaul! Thank you a lot. Have a good one!!
[19:01:27] <wikibugs>	 Operations, Wikimedia-Mailing-lists: Allow Cloud mailing list to be indexed - https://phabricator.wikimedia.org/T242520 (Peachey88)
[19:02:47] <wikibugs>	 Operations, ops-eqiad, SRE-swift-storage: Degraded RAID on ms-be1039 - https://phabricator.wikimedia.org/T242511 (Peachey88)
[19:58:35] <wikibugs>	 Operations, Wikimedia-Mailing-lists: Allow Cloud mailing list to be indexed - https://phabricator.wikimedia.org/T242520 (RhinosF1) Has been done before in T193572 for other lists.  Should be as simple as replacing https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/mailma...
[20:07:31] <wikibugs>	 (PS1) RhinosF1: Add wikimedia cloud mailing list to mailman’s robots.txt [puppet] - https://gerrit.wikimedia.org/r/563684 (https://phabricator.wikimedia.org/T242520)
[20:09:18] <icinga-wm>	 PROBLEM - Postgres Replication Lag on maps1001 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 38409448 and 3 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[20:11:06] <icinga-wm>	 RECOVERY - Postgres Replication Lag on maps1001 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 175368 and 72 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[20:12:05] <wikibugs>	 (PS2) RhinosF1: Add wikimedia cloud mailing list to mailman’s robots.txt [puppet] - https://gerrit.wikimedia.org/r/563684 (https://phabricator.wikimedia.org/T242520)
[20:13:36] <wikibugs>	 Operations, Wikimedia-Mailing-lists, Patch-For-Review, User-RhinosF1: Allow Cloud mailing list to be indexed - https://phabricator.wikimedia.org/T242520 (RhinosF1) a:RhinosF1 >>! In T242520#5794935, @gerritbot wrote: > Change 563684 had a related patch set uploaded (by RhinosF1; owner: Rhinos...
[20:14:35] <wikibugs>	 Operations, Wikimedia-Mailing-lists, Patch-For-Review, User-RhinosF1: Allow Cloud mailing list to be indexed - https://phabricator.wikimedia.org/T242520 (RhinosF1) p:Triage→Normal
[20:16:55] <wikibugs>	 Operations, Beta-Cluster-Infrastructure, Lexicographical data, Wikidata, User-DannyS712: PHP fatal error on beta cluster - https://phabricator.wikimedia.org/T242188 (Cutmuetia1998)
[20:18:38] <wikibugs>	 Operations, Beta-Cluster-Infrastructure, Lexicographical data, Wikidata, User-DannyS712: PHP fatal error on beta cluster - https://phabricator.wikimedia.org/T242188 (RhinosF1)
[21:13:42] <wikibugs>	 (CR) Ammarpad: "recheck" [puppet] - https://gerrit.wikimedia.org/r/563684 (https://phabricator.wikimedia.org/T242520) (owner: RhinosF1)
[21:44:40] <wikibugs>	 Operations, Beta-Cluster-Infrastructure, Release-Engineering-Team, Scap, serviceops: On beta, scap can't clear opcache on some mw servers - https://phabricator.wikimedia.org/T237033 (hashar) Those settings are for the Puppet roles. Given roles are solely for production, on WMCS the hiera look...
[21:47:52] <wikibugs>	 (CR) RhinosF1: "Recheck what?" [puppet] - https://gerrit.wikimedia.org/r/563684 (https://phabricator.wikimedia.org/T242520) (owner: RhinosF1)
[21:51:19] <wikibugs>	 Operations, Wikimedia-Mailing-lists, Patch-For-Review, User-RhinosF1: Allow Cloud mailing list to be indexed - https://phabricator.wikimedia.org/T242520 (MarcoAurelio) I would prefer not to go forward with this. I don't feel like opening the gate even more to spammers and URL/email address grabbe...
[22:00:18] <wikibugs>	 Operations, Wikimedia-Mailing-lists, Patch-For-Review, User-RhinosF1: Allow Cloud mailing list to be indexed - https://phabricator.wikimedia.org/T242520 (RhinosF1) >>! In T242520#5795026, @MarcoAurelio wrote: > I would prefer not to go forward with this. I don't feel like opening the gate even mo...
[22:00:49] <wikibugs>	 Operations, Wikimedia-Mailing-lists, Patch-For-Review, User-RhinosF1: Allow Cloud mailing list to be indexed - https://phabricator.wikimedia.org/T242520 (RoySmith) I'm fine with an internal search tool instead of external search engines, but the mailing list really does need to be indexed in some...
[22:41:34] <wikibugs>	 Operations, Wikimedia-Mailing-lists, Patch-For-Review, User-RhinosF1: Allow Cloud mailing list to be indexed - https://phabricator.wikimedia.org/T242520 (Platonides) As an external solution, it could be added to an external mailing list archiver such as [[ https://marc.info/?q=about#Add | marc ]]...
[23:11:35] <wikibugs>	 Operations, Wikimedia-Mailing-lists, Patch-For-Review, User-RhinosF1: Allow Cloud mailing list to be indexed - https://phabricator.wikimedia.org/T242520 (Bawolff) Note that the archives do replace @ signs with "at" for whatever good that does. Personally I think ease of finding answers to old que...
[23:12:55] <wikibugs>	 (CR) Brian Wolff: "> Recheck what?" [puppet] - https://gerrit.wikimedia.org/r/563684 (https://phabricator.wikimedia.org/T242520) (owner: RhinosF1)