[00:07:00] <wikibugs>	 (CR) Paladox: phabricator: write my.cnf for db access into each admin home dir (1 comment) [puppet] - https://gerrit.wikimedia.org/r/551268 (https://phabricator.wikimedia.org/T238425) (owner: Dzahn)
[00:14:05] <icinga-wm>	 PROBLEM - Varnish traffic drop between 30min ago and now at esams on icinga1001 is CRITICAL: 42.78 le 60 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[00:14:35] <icinga-wm>	 PROBLEM - Varnish traffic drop between 30min ago and now at eqsin on icinga1001 is CRITICAL: 25.67 le 60 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[00:15:49] <icinga-wm>	 RECOVERY - Varnish traffic drop between 30min ago and now at esams on icinga1001 is OK: (C)60 le (W)70 le 74.88 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[00:16:19] <icinga-wm>	 RECOVERY - Varnish traffic drop between 30min ago and now at eqsin on icinga1001 is OK: (C)60 le (W)70 le 116.1 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[00:26:09] <wikibugs>	 Operations, SRE-Access-Requests: Requesting access to LogStash for rxy - https://phabricator.wikimedia.org/T239494 (Rxy) p:Triage→Normal
[00:27:17] <wikibugs>	 Operations, SRE-Access-Requests: Requesting access to LogStash for rxy - https://phabricator.wikimedia.org/T239494 (Rxy) p:Normal→Triage
[00:43:25] <icinga-wm>	 PROBLEM - Logstash Elasticsearch indexing errors on icinga1001 is CRITICAL: 0.9917 ge 0.5 https://wikitech.wikimedia.org/wiki/Logstash%23Indexing_errors https://logstash.wikimedia.org/goto/1cee1f1b5d4e6c5e06edb3353a2a4b83 https://grafana.wikimedia.org/dashboard/db/logstash
[00:46:55] <icinga-wm>	 RECOVERY - Logstash Elasticsearch indexing errors on icinga1001 is OK: (C)0.5 ge (W)0.1 ge 0.0375 https://wikitech.wikimedia.org/wiki/Logstash%23Indexing_errors https://logstash.wikimedia.org/goto/1cee1f1b5d4e6c5e06edb3353a2a4b83 https://grafana.wikimedia.org/dashboard/db/logstash
[01:11:00] <wikibugs>	 (PS3) DannyS712: Enable partial blocks on eswiki and scowiki [mediawiki-config] - https://gerrit.wikimedia.org/r/553431 (https://phabricator.wikimedia.org/T239370)
[01:21:50] <wikibugs>	 (PS1) Zoranzoki21: Add throttle rule for cawiki workshop on 2019-12-02 [mediawiki-config] - https://gerrit.wikimedia.org/r/553787 (https://phabricator.wikimedia.org/T239465)
[04:09:23] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=bacula site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[04:37:23] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[06:13:59] <icinga-wm>	 PROBLEM - Host cp3057 is DOWN: PING CRITICAL - Packet loss = 100%
[07:02:03] <wikibugs>	 Operations, Traffic: servers freeze across the caching cluster - https://phabricator.wikimedia.org/T238305 (Marostegui) ` 06:13:59 <+icinga-wm> PROBLEM - Host cp3057 is DOWN: PING CRITICAL - Packet loss = 100 `  Could be another case of R440 going down?
[07:28:41] <wikibugs>	 Operations, Traffic: cp3057 crashed - https://phabricator.wikimedia.org/T239502 (Vgutierrez)
[07:30:32] <vgutierrez>	 !log depool and powercycle cp3057 - T239502
[07:30:38] <logmsgbot>	 !log vgutierrez@puppetmaster1001 conftool action : set/pooled=no; selector: name=cp3057.esams.wmnet
[07:30:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:30:39] <stashbot>	 T239502: cp3057 crashed - https://phabricator.wikimedia.org/T239502
[07:30:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:34:11] <icinga-wm>	 RECOVERY - Host cp3057 is UP: PING OK - Packet loss = 0%, RTA = 83.39 ms
[07:38:01] <wikibugs>	 Operations, Traffic: cp3057 crashed - https://phabricator.wikimedia.org/T239502 (Vgutierrez) Open→Resolved p:Triage→Normal a:Vgutierrez As with other occurrences of T238305, nothing on the console, nothing on the SEL and nothing weird on the logs.
[07:38:03] <wikibugs>	 Operations, Traffic: servers freeze across the caching cluster - https://phabricator.wikimedia.org/T238305 (Vgutierrez)
[07:38:21] <wikibugs>	 Operations, Traffic: servers freeze across the caching cluster - https://phabricator.wikimedia.org/T238305 (Vgutierrez)
[07:40:03] <wikibugs>	 Operations, Traffic: servers freeze across the caching cluster - https://phabricator.wikimedia.org/T238305 (Vgutierrez) >>! In T238305#5690539, @BBlack wrote: > It was observed earlier in the traffic meeting that we're fairly certain that none of our R440 hosts have had this problem more than once, so th...
[07:40:22] <vgutierrez>	 !log repooling cp3057 - T239502
[07:40:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:40:28] <stashbot>	 T239502: cp3057 crashed - https://phabricator.wikimedia.org/T239502
[12:06:27] <icinga-wm>	 PROBLEM - Varnish traffic drop between 30min ago and now at eqiad on icinga1001 is CRITICAL: 50.01 le 60 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[12:36:05] <icinga-wm>	 RECOVERY - Varnish traffic drop between 30min ago and now at eqiad on icinga1001 is OK: (C)60 le (W)70 le 85.61 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[14:01:38] <wikibugs>	 (PS1) Alex Monk: deployment-prep: Replace stretch poolcounter with a buster one [mediawiki-config] - https://gerrit.wikimedia.org/r/553806
[14:18:09] <icinga-wm>	 PROBLEM - mobileapps endpoints health on scb2005 is CRITICAL: /{domain}/v1/page/media/{title} (Get media in test page) timed out before a response was received: /{domain}/v1/page/definition/{title} (retrieve en-wiktionary definitions for cat) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/mobileapps
[14:19:47] <icinga-wm>	 RECOVERY - mobileapps endpoints health on scb2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/mobileapps
[14:27:55] <icinga-wm>	 PROBLEM - Varnish traffic drop between 30min ago and now at ulsfo on icinga1001 is CRITICAL: 57 le 60 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[14:27:55] <icinga-wm>	 PROBLEM - Varnish traffic drop between 30min ago and now at eqiad on icinga1001 is CRITICAL: 57.49 le 60 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[14:28:25] <icinga-wm>	 PROBLEM - Varnish traffic drop between 30min ago and now at eqsin on icinga1001 is CRITICAL: 52.15 le 60 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[14:29:39] <icinga-wm>	 RECOVERY - Varnish traffic drop between 30min ago and now at ulsfo on icinga1001 is OK: (C)60 le (W)70 le 103.5 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[14:29:39] <icinga-wm>	 RECOVERY - Varnish traffic drop between 30min ago and now at eqiad on icinga1001 is OK: (C)60 le (W)70 le 104 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[14:30:09] <icinga-wm>	 RECOVERY - Varnish traffic drop between 30min ago and now at eqsin on icinga1001 is OK: (C)60 le (W)70 le 92.8 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[14:30:39] <Krenair>	 looks like traffic randomly increased 30m ago and then went back down
[14:33:15] <Krenair>	 looks like GETs to text caches
[14:33:50] <Krenair>	 13:55-13:59ish
[15:47:27] <Urbanecm>	 !log Reset email of SUL user Hayk.arabaget (T239462)
[15:47:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:47:33] <stashbot>	 T239462: Reset the password of user Hayk.arabaget - https://phabricator.wikimedia.org/T239462
[17:34:05] <icinga-wm>	 PROBLEM - Logstash Elasticsearch indexing errors on icinga1001 is CRITICAL: 1.729 ge 0.5 https://wikitech.wikimedia.org/wiki/Logstash%23Indexing_errors https://logstash.wikimedia.org/goto/1cee1f1b5d4e6c5e06edb3353a2a4b83 https://grafana.wikimedia.org/dashboard/db/logstash
[17:37:33] <icinga-wm>	 RECOVERY - Logstash Elasticsearch indexing errors on icinga1001 is OK: (C)0.5 ge (W)0.1 ge 0.02083 https://wikitech.wikimedia.org/wiki/Logstash%23Indexing_errors https://logstash.wikimedia.org/goto/1cee1f1b5d4e6c5e06edb3353a2a4b83 https://grafana.wikimedia.org/dashboard/db/logstash
[17:57:31] <wikibugs>	 Operations, ops-codfw, DC-Ops: Move kafka200[123] to logstash202[012] - https://phabricator.wikimedia.org/T235125 (Volans) Resolved→Open Re-opening as the DNS name of the interfaces attached to those hosts have not been modified in Netbox. Things like: ` IP address: 10.193.1.23/16  Parent: lo...
[18:03:39] <wikibugs>	 Operations, ops-codfw, SRE-swift-storage, decommission, User-fgiunchedi: decom ms-be201[345] - https://phabricator.wikimedia.org/T221068 (Volans) Resolved→Open `ms-be2013` and `ms-be2014` are marked as `Decommissioning` in Netbox, if they were unracked their status should be changed t...
[18:21:57] <icinga-wm>	 PROBLEM - Wikitech-static main page has content on labweb1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static
[18:21:57] <icinga-wm>	 PROBLEM - Wikitech-static main page has content on labweb1002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static
[18:22:19] <icinga-wm>	 PROBLEM - Wikitech-static main page has content on cloudweb2001-dev is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1866 bytes in 0.122 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[18:23:33] <icinga-wm>	 RECOVERY - Wikitech-static main page has content on labweb1001 is OK: HTTP OK: HTTP/1.1 200 OK - 28426 bytes in 0.254 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[18:23:35] <icinga-wm>	 RECOVERY - Wikitech-static main page has content on labweb1002 is OK: HTTP OK: HTTP/1.1 200 OK - 28426 bytes in 0.857 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[18:24:03] <icinga-wm>	 RECOVERY - Wikitech-static main page has content on cloudweb2001-dev is OK: HTTP OK: HTTP/1.1 200 OK - 28547 bytes in 0.335 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[18:25:45] <icinga-wm>	 RECOVERY - Wikitech and wt-static content in sync on cloudweb2001-dev is OK: wikitech-static OK - wikitech and wikitech-static in sync (85946 200000s) https://wikitech.wikimedia.org/wiki/Wikitech-static
[18:30:01] <wikibugs>	 Operations, ops-codfw, decommission: Decommission db2061.codfw.wmnet - https://phabricator.wikimedia.org/T238526 (Volans) Resolved→Open Netbox status is currently `Decommissioning`, if the host has been unracked it should be `Offline`.
[18:30:03] <wikibugs>	 Operations, DBA: Decommission db2043-db2070 - https://phabricator.wikimedia.org/T228258 (Volans)
[18:32:08] <wikibugs>	 Operations, ops-codfw: Decommission old mw2231/WMF6435 replaced with WMF6403 - https://phabricator.wikimedia.org/T232126 (Volans) Resolved→Open It seems that Netbox's ip address has not been updated and still reports `graphite2002` in the DNS name, see https://netbox.wikimedia.org/ipam/ip-address...
[18:40:53] <wikibugs>	 Operations, ops-codfw, decommission: Decommission db2061.codfw.wmnet - https://phabricator.wikimedia.org/T238526 (Papaul) Open→Resolved
[18:40:55] <wikibugs>	 Operations, DBA: Decommission db2043-db2070 - https://phabricator.wikimedia.org/T228258 (Papaul)
[18:44:20] <wikibugs>	 Operations, ops-codfw, SRE-swift-storage, decommission, User-fgiunchedi: decom ms-be201[345] - https://phabricator.wikimedia.org/T221068 (Papaul) Open→Resolved
[18:48:16] <wikibugs>	 (PS1) Jhedden: tools: add qdisc node collector to tools bastion [puppet] - https://gerrit.wikimedia.org/r/553815
[18:50:09] <wikibugs>	 (CR) Jhedden: "Hoping this will provide more information on the recent bastion load utilization" [puppet] - https://gerrit.wikimedia.org/r/553815 (owner: Jhedden)
[18:52:52] <wikibugs>	 Operations, ops-codfw: Decommission old mw2231/WMF6435 replaced with WMF6403 - https://phabricator.wikimedia.org/T232126 (Papaul) @Volans do we have a template in place when a server name changes from X to Y  to also  update DNS in netbox. The last steps that I recalled were  1- change switch port descri...
[18:58:31] <wikibugs>	 Operations, ops-codfw: Decommission old mw2231/WMF6435 replaced with WMF6403 - https://phabricator.wikimedia.org/T232126 (Volans) @Papaul given we're setting the DNS name of the ip address in Netbox, that one too needs to be updated, see the links above: ` IP: 10.193.2.251/16 Assignment: mw2231 (mgmt) DN...
[18:59:55] <icinga-wm>	 PROBLEM - Logstash Elasticsearch indexing errors on icinga1001 is CRITICAL: 1.113 ge 0.5 https://wikitech.wikimedia.org/wiki/Logstash%23Indexing_errors https://logstash.wikimedia.org/goto/1cee1f1b5d4e6c5e06edb3353a2a4b83 https://grafana.wikimedia.org/dashboard/db/logstash
[19:03:25] <icinga-wm>	 RECOVERY - Logstash Elasticsearch indexing errors on icinga1001 is OK: (C)0.5 ge (W)0.1 ge 0 https://wikitech.wikimedia.org/wiki/Logstash%23Indexing_errors https://logstash.wikimedia.org/goto/1cee1f1b5d4e6c5e06edb3353a2a4b83 https://grafana.wikimedia.org/dashboard/db/logstash
[19:09:49] <wikibugs>	 Operations, ops-codfw: Decommission old mw2231/WMF6435 replaced with WMF6403 - https://phabricator.wikimedia.org/T232126 (Papaul) @volans confirm that 10.193.1.118 is indeed the mgmt IP address for mw2231 and  10.193.2.251 is no longer in use.  Thanks
[20:10:43] <icinga-wm>	 RECOVERY - Wikitech and wt-static content in sync on labweb1002 is OK: wikitech-static OK - wikitech and wikitech-static in sync (85946 200000s) https://wikitech.wikimedia.org/wiki/Wikitech-static
[20:10:45] <icinga-wm>	 RECOVERY - Wikitech and wt-static content in sync on labweb1001 is OK: wikitech-static OK - wikitech and wikitech-static in sync (85946 200000s) https://wikitech.wikimedia.org/wiki/Wikitech-static
[20:26:51] <icinga-wm>	 PROBLEM - cassandra-b SSL 10.192.48.122:7001 on restbase2017 is CRITICAL: SSL CRITICAL - failed to connect or SSL handshake:Connection refused https://phabricator.wikimedia.org/T120662
[20:26:55] <icinga-wm>	 PROBLEM - cassandra-b CQL 10.192.48.122:9042 on restbase2017 is CRITICAL: connect to address 10.192.48.122 and port 9042: Connection refused https://phabricator.wikimedia.org/T93886
[20:27:03] <icinga-wm>	 PROBLEM - cassandra-b service on restbase2017 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
[20:27:09] <icinga-wm>	 PROBLEM - cassandra-b CQL 10.192.32.111:9042 on restbase2016 is CRITICAL: connect to address 10.192.32.111 and port 9042: Connection refused https://phabricator.wikimedia.org/T93886
[20:27:17] <icinga-wm>	 PROBLEM - cassandra-b SSL 10.192.32.111:7001 on restbase2016 is CRITICAL: SSL CRITICAL - failed to connect or SSL handshake:Connection refused https://phabricator.wikimedia.org/T120662
[20:27:21] <icinga-wm>	 PROBLEM - cassandra-c CQL 10.192.32.154:9042 on restbase2011 is CRITICAL: connect to address 10.192.32.154 and port 9042: Connection refused https://phabricator.wikimedia.org/T93886
[20:27:23] <icinga-wm>	 PROBLEM - Check systemd state on restbase2017 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:27:27] <icinga-wm>	 PROBLEM - Check systemd state on restbase2016 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:27:37] <icinga-wm>	 PROBLEM - cassandra-c SSL 10.192.32.154:7001 on restbase2011 is CRITICAL: SSL CRITICAL - failed to connect or SSL handshake:Connection refused https://phabricator.wikimedia.org/T120662
[20:27:45] <icinga-wm>	 PROBLEM - cassandra-c service on restbase2011 is CRITICAL: CRITICAL - Expecting active but unit cassandra-c is failed https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
[20:28:09] <icinga-wm>	 PROBLEM - Check systemd state on restbase2011 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:28:13] <icinga-wm>	 PROBLEM - cassandra-b service on restbase2016 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
[20:29:11] <icinga-wm>	 RECOVERY - Check systemd state on restbase2016 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:29:57] <icinga-wm>	 RECOVERY - cassandra-b service on restbase2016 is OK: OK - cassandra-b is active https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
[20:30:41] <icinga-wm>	 RECOVERY - cassandra-b CQL 10.192.32.111:9042 on restbase2016 is OK: TCP OK - 0.036 second response time on 10.192.32.111 port 9042 https://phabricator.wikimedia.org/T93886
[20:30:47] <icinga-wm>	 RECOVERY - cassandra-b SSL 10.192.32.111:7001 on restbase2016 is OK: SSL OK - Certificate restbase2016-b valid until 2020-11-29 09:26:15 +0000 (expires in 364 days) https://phabricator.wikimedia.org/T120662
[20:41:05] <icinga-wm>	 RECOVERY - cassandra-b service on restbase2017 is OK: OK - cassandra-b is active https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
[20:41:23] <icinga-wm>	 RECOVERY - Check systemd state on restbase2017 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:42:35] <icinga-wm>	 RECOVERY - cassandra-b SSL 10.192.48.122:7001 on restbase2017 is OK: SSL OK - Certificate restbase2017-b valid until 2020-11-29 09:26:18 +0000 (expires in 364 days) https://phabricator.wikimedia.org/T120662
[20:42:41] <icinga-wm>	 RECOVERY - cassandra-b CQL 10.192.48.122:9042 on restbase2017 is OK: TCP OK - 0.036 second response time on 10.192.48.122 port 9042 https://phabricator.wikimedia.org/T93886
[20:47:01] <icinga-wm>	 RECOVERY - cassandra-c service on restbase2011 is OK: OK - cassandra-c is active https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
[20:47:23] <icinga-wm>	 RECOVERY - Check systemd state on restbase2011 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:48:17] <icinga-wm>	 RECOVERY - cassandra-c CQL 10.192.32.154:9042 on restbase2011 is OK: TCP OK - 0.039 second response time on 10.192.32.154 port 9042 https://phabricator.wikimedia.org/T93886
[20:48:35] <icinga-wm>	 RECOVERY - cassandra-c SSL 10.192.32.154:7001 on restbase2011 is OK: SSL OK - Certificate restbase2011-c valid until 2020-06-24 13:02:00 +0000 (expires in 206 days) https://phabricator.wikimedia.org/T120662
[22:30:13] <icinga-wm>	 PROBLEM - Disk space on netflow2001 is CRITICAL: DISK CRITICAL - free space: / 302 MB (3% inode=91%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=netflow2001&var-datasource=codfw+prometheus/ops