[00:15:52] <icinga-wm>	 RECOVERY - rpki grafana alert on icinga1001 is OK: OK: RPKI ( https://grafana.wikimedia.org/d/UwUa77GZk/rpki ) is not alerting. https://wikitech.wikimedia.org/wiki/RPKI%23Grafana_alerts https://grafana.wikimedia.org/d/UwUa77GZk/
[02:38:00] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=atlas_exporter site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[02:40:08] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[03:29:40] <wikibugs>	 (PS1) Alex Monk: Allow configuration of AddressFamily used for DNS validation [software/acme-chief] - https://gerrit.wikimedia.org/r/574221 (https://phabricator.wikimedia.org/T245937)
[03:31:37] <wikibugs>	 (CR) jerkins-bot: [V: -1] Allow configuration of AddressFamily used for DNS validation [software/acme-chief] - https://gerrit.wikimedia.org/r/574221 (https://phabricator.wikimedia.org/T245937) (owner: Alex Monk)
[03:31:52] <wikibugs>	 (PS2) Alex Monk: Allow configuration of AddressFamily used for DNS validation [software/acme-chief] - https://gerrit.wikimedia.org/r/574221 (https://phabricator.wikimedia.org/T245937)
[03:34:38] <wikibugs>	 (CR) jerkins-bot: [V: -1] Allow configuration of AddressFamily used for DNS validation [software/acme-chief] - https://gerrit.wikimedia.org/r/574221 (https://phabricator.wikimedia.org/T245937) (owner: Alex Monk)
[03:36:29] <wikibugs>	 (PS3) Alex Monk: Allow configuration of AddressFamily used for DNS validation [software/acme-chief] - https://gerrit.wikimedia.org/r/574221 (https://phabricator.wikimedia.org/T245937)
[08:47:42] <icinga-wm>	 PROBLEM - rpki grafana alert on icinga1001 is CRITICAL: CRITICAL: RPKI ( https://grafana.wikimedia.org/d/UwUa77GZk/rpki ) is alerting: RRDP status alert. https://wikitech.wikimedia.org/wiki/RPKI%23Grafana_alerts https://grafana.wikimedia.org/d/UwUa77GZk/
[13:13:24] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_cxserver_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[13:15:32] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[13:17:53] <wikibugs>	 (CR) Urbanecm: [C: +1] "LGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/574184 (https://phabricator.wikimedia.org/T245911) (owner: MarcoAurelio)
[14:32:14] <icinga-wm>	 PROBLEM - OSPF status on cr2-codfw is CRITICAL: OSPFv2: 4/5 UP : OSPFv3: 4/5 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[14:32:24] <icinga-wm>	 PROBLEM - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 240, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[15:01:08] <icinga-wm>	 PROBLEM - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is CRITICAL: /v2/suggest/source/{title}/{to} (Suggest a source title to use for translation) timed out before a response was received https://wikitech.wikimedia.org/wiki/CX
[15:03:10] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (bad URL) timed out before a response was received: /api (Ensure Zotero is working) is WARNING: Test Ensure Zotero is working responds with unexpected value at path [0]/itemType = webpage https://wikitech.wikimedia.org/wiki/Citoid
[15:05:20] <icinga-wm>	 RECOVERY - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/CX
[15:05:55] <wikibugs>	 (PS1) Dapete: Partially revert changes to improve support for extra_args [software/tools-webservice] - https://gerrit.wikimedia.org/r/574236 (https://phabricator.wikimedia.org/T244894)
[15:07:26] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Citoid
[15:08:22] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_cxserver_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[15:08:41] <wikibugs>	 Operations, SRE-Access-Requests: Requesting access to stat1007 for jmorgan - https://phabricator.wikimedia.org/T244785 (Nuria) Hello, approving on my end. @jbond you need to provide an ssh key, without it (regardless of approvals we cannot give you access)
[15:08:54] <icinga-wm>	 PROBLEM - restbase endpoints health on restbase1022 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[15:08:54] <icinga-wm>	 PROBLEM - restbase endpoints health on restbase-dev1004 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[15:10:30] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[15:10:56] <icinga-wm>	 RECOVERY - restbase endpoints health on restbase1022 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[15:11:00] <icinga-wm>	 RECOVERY - restbase endpoints health on restbase-dev1004 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[15:11:12] <icinga-wm>	 PROBLEM - wikifeeds eqiad on wikifeeds.svc.eqiad.wmnet is CRITICAL: /{domain}/v1/media/image/featured/{year}/{month}/{day} (retrieve featured image data for April 29, 2016) is CRITICAL: Test retrieve featured image data for April 29, 2016 returned the unexpected status 504 (expecting: 200) https://wikitech.wikimedia.org/wiki/Wikifeeds
[15:13:16] <icinga-wm>	 RECOVERY - wikifeeds eqiad on wikifeeds.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Wikifeeds
[15:13:52] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (bad URL) timed out before a response was received https://wikitech.wikimedia.org/wiki/Citoid
[15:15:54] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Citoid
[15:17:20] <Urbanecm>	 !log mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user='𐰇𐱅𐰚𐰤' /home/urbanecm/T245950 (T245950)
[15:17:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:17:26] <stashbot>	 T245950: Server side upload for 𐰇𐱅𐰚𐰤 - https://phabricator.wikimedia.org/T245950
[15:57:10] <wikibugs>	 (PS2) Dapete: Partially revert changes to improve support for extra_args [software/tools-webservice] - https://gerrit.wikimedia.org/r/574236 (https://phabricator.wikimedia.org/T244894)
[16:04:32] <wikibugs>	 Operations, SRE-Access-Requests: Requesting access to stat1007 for jmorgan - https://phabricator.wikimedia.org/T244785 (RhinosF1) >>! In T244785#5910410, @Nuria wrote: > Hello, approving on my end. @jbond you need to provide an ssh key, without it (regardless of approvals we cannot give you access)  Don’...
[16:08:38] <icinga-wm>	 PROBLEM - restbase endpoints health on restbase-dev1004 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[16:10:42] <icinga-wm>	 RECOVERY - restbase endpoints health on restbase-dev1004 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[16:16:20] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_citoid_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:18:28] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:24:50] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_cxserver_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:26:58] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:27:29] <wikibugs>	 Operations, SRE-Access-Requests: Requesting access to stat1007 for jmorgan - https://phabricator.wikimedia.org/T244785 (Nuria) Ahem, yes, Indeed! @Capt_Swing , please be so kind to provide ssh key
[16:31:14] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_citoid_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:33:22] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:37:36] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_citoid_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:39:42] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:41:17] <wikibugs>	 Operations, SRE-Access-Requests: Requesting access to stat1007 for jmorgan - https://phabricator.wikimedia.org/T244785 (Nuria) And second mistake, i did not realize that @Capt_Swing is SO ORGANIZED that the link above points to a wiki with ssh key. so ya, ready to go.
[16:46:50] <wikibugs>	 Operations, netops, cloud-services-team (Kanban): CloudVPS: enable BGP in the neutron transport network - https://phabricator.wikimedia.org/T245606 (Krenair) I've been reading the linked proposal and noticed this: "the internal flat network CIDR. This is 172.16.0.0/21 in eqiad1 and **172.16.128.0/24...
[16:48:27] <wikibugs>	 Operations, SRE-Access-Requests: Requesting access to stat1007 for jmorgan - https://phabricator.wikimedia.org/T244785 (elukey) @Capt_Swing is there any reason why you stated stat1007 on the request? I am asking since people tend to cluster on it and it is a little crowded now (resources are often exhaus...
[16:51:09] <wikibugs>	 (PS1) RhinosF1: Add jmorgan to analytics-privatedata-users [puppet] - https://gerrit.wikimedia.org/r/574240
[16:52:18] <elukey>	 !log powercycle mw1372 - no mgmt console, no ssh
[16:52:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:54:04] <wikibugs>	 Operations, serviceops: (Need By Dec 20) rack/setup/install mw13[49-84].eqiad.wmnet - https://phabricator.wikimedia.org/T236437 (elukey) As FYI I just powercycled mw1372 since it was frozen (no ssh, no mgmt serial console usable, etc..).
[16:54:46] <icinga-wm>	 RECOVERY - Host mw1372 is UP: PING OK - Packet loss = 0%, RTA = 0.33 ms
[16:55:57] <wikibugs>	 (PS2) RhinosF1: Add jmorgan to analytics-privatedata-users [puppet] - https://gerrit.wikimedia.org/r/574240
[16:56:54] <wikibugs>	 (PS3) RhinosF1: Admin: Add jmorgan to analytics-privatedata-users [puppet] - https://gerrit.wikimedia.org/r/574240 (https://phabricator.wikimedia.org/T244785)
[16:58:24] <wikibugs>	 (Abandoned) RhinosF1: Admin: Add jmorgan to analytics-privatedata-users [puppet] - https://gerrit.wikimedia.org/r/574240 (https://phabricator.wikimedia.org/T244785) (owner: RhinosF1)
[16:58:56] <wikibugs>	 (Restored) RhinosF1: Admin: Add jmorgan to analytics-privatedata-users [puppet] - https://gerrit.wikimedia.org/r/574240 (https://phabricator.wikimedia.org/T244785) (owner: RhinosF1)
[16:59:31] <wikibugs>	 (CR) Alex Monk: "Note: This feels really weird. Is it possible that our boxes are just misconfigured and should not have IPv6 enabled at all, and that appl" [software/acme-chief] - https://gerrit.wikimedia.org/r/574221 (https://phabricator.wikimedia.org/T245937) (owner: Alex Monk)
[16:59:43] <wikibugs>	 Operations, SRE-Access-Requests: Requesting access to stat1007 for jmorgan - https://phabricator.wikimedia.org/T244785 (RhinosF1) If I can read the history right, ssh key should already be in data.yaml so https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/574240/ should work.
[17:03:57] <wikibugs>	 Operations, SRE-Access-Requests: Requesting access to stat1007 for jmorgan - https://phabricator.wikimedia.org/T244785 (RhinosF1)
[17:41:00] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (Ensure Zotero is working) timed out before a response was received https://wikitech.wikimedia.org/wiki/Citoid
[17:43:04] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Citoid
[17:45:36] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_citoid_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[17:47:46] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[17:48:15] <wikibugs>	 Operations, SRE-Access-Requests, User-RhinosF1: Requesting access to stat1007 for jmorgan - https://phabricator.wikimedia.org/T244785 (RhinosF1)
[18:49:14] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_cxserver_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[18:51:22] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[19:42:22] <icinga-wm>	 PROBLEM - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is CRITICAL: /v2/page/{sourcelanguage}/{targetlanguage}/{title}{/revision} (Translate enwiki protected page) timed out before a response was received https://wikitech.wikimedia.org/wiki/CX
[19:46:26] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_maps_esams site=esams https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[19:48:32] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[19:48:36] <icinga-wm>	 RECOVERY - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/CX
[19:54:56] <icinga-wm>	 PROBLEM - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is CRITICAL: /v2/suggest/source/{title}/{to} (Suggest a source title to use for translation) is CRITICAL: Test Suggest a source title to use for translation returned the unexpected status 504 (expecting: 200) https://wikitech.wikimedia.org/wiki/CX
[19:59:10] <icinga-wm>	 RECOVERY - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/CX
[20:05:34] <icinga-wm>	 RECOVERY - rpki grafana alert on icinga1001 is OK: OK: RPKI ( https://grafana.wikimedia.org/d/UwUa77GZk/rpki ) is not alerting. https://wikitech.wikimedia.org/wiki/RPKI%23Grafana_alerts https://grafana.wikimedia.org/d/UwUa77GZk/
[20:27:11] <wikibugs>	 (CR) Vgutierrez: Allow configuration of AddressFamily used for DNS validation (2 comments) [software/acme-chief] - https://gerrit.wikimedia.org/r/574221 (https://phabricator.wikimedia.org/T245937) (owner: Alex Monk)
[20:35:12] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[20:39:26] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[21:09:53] <wikibugs>	 (CR) Krinkle: "Reviewed diffConfig job output, LGTM." [mediawiki-config] - https://gerrit.wikimedia.org/r/570379 (https://phabricator.wikimedia.org/T232140) (owner: Jforrester)
[22:42:32] <icinga-wm>	 PROBLEM - configured eth on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_eth
[22:42:36] <icinga-wm>	 PROBLEM - Check whether ferm is active by checking the default input chain on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_ferm
[22:42:38] <icinga-wm>	 PROBLEM - MD RAID on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering
[22:42:40] <icinga-wm>	 PROBLEM - puppet last run on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun
[22:43:08] <icinga-wm>	 PROBLEM - DPKG on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/dpkg
[22:43:28] <icinga-wm>	 PROBLEM - dhclient process on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_dhclient
[22:43:34] <icinga-wm>	 PROBLEM - Check size of conntrack table on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack
[22:43:56] <icinga-wm>	 PROBLEM - Check systemd state on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[22:44:12] <icinga-wm>	 PROBLEM - Disk space on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=stat1007&var-datasource=eqiad+prometheus/ops
[22:47:24] <icinga-wm>	 RECOVERY - DPKG on stat1007 is OK: All packages OK https://wikitech.wikimedia.org/wiki/Monitoring/dpkg
[22:47:44] <icinga-wm>	 RECOVERY - dhclient process on stat1007 is OK: PROCS OK: 0 processes with command name dhclient https://wikitech.wikimedia.org/wiki/Monitoring/check_dhclient
[22:47:50] <icinga-wm>	 RECOVERY - Check size of conntrack table on stat1007 is OK: OK: nf_conntrack is 0 % full https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack
[22:48:28] <icinga-wm>	 RECOVERY - Disk space on stat1007 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=stat1007&var-datasource=eqiad+prometheus/ops
[22:48:48] <icinga-wm>	 RECOVERY - puppet last run on stat1007 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun
[22:48:56] <icinga-wm>	 RECOVERY - configured eth on stat1007 is OK: OK - interfaces up https://wikitech.wikimedia.org/wiki/Monitoring/check_eth
[22:49:00] <icinga-wm>	 RECOVERY - Check whether ferm is active by checking the default input chain on stat1007 is OK: OK ferm input default policy is set https://wikitech.wikimedia.org/wiki/Monitoring/check_ferm
[22:49:02] <icinga-wm>	 RECOVERY - MD RAID on stat1007 is OK: OK: Active: 8, Working: 8, Failed: 0, Spare: 0 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering
[23:09:28] <icinga-wm>	 RECOVERY - Check systemd state on stat1007 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state