[00:04:38] PROBLEM - Widespread puppet agent failures on alert1001 is CRITICAL: 0.01009 ge 0.01 https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/yOxVDGvWk/puppet
[00:14:47] (PS1) Ladsgroup: Add IntelliJ files to .gitignore [debs/pybal] - https://gerrit.wikimedia.org/r/644036
[00:30:16] PROBLEM - Postgres Replication Lag on maps1003 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 36694688 and 342 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:32:02] RECOVERY - Postgres Replication Lag on maps1003 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 927120 and 0 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:37:16] PROBLEM - Postgres Replication Lag on maps1003 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 45172632 and 3 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:37:32] PROBLEM - Postgres Replication Lag on maps2005 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 59533832 and 446 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:37:38] PROBLEM - Postgres Replication Lag on maps2006 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 120543416 and 452 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:37:46] PROBLEM - Postgres Replication Lag on maps2008 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 96234296 and 460 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:39:02] RECOVERY - Postgres Replication Lag on maps1003 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 693688 and 0 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:39:16] RECOVERY - Postgres Replication Lag on maps2005 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 652512 and 550 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:39:22] RECOVERY - Postgres Replication Lag on maps2006 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 1737288 and 556 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:39:30] RECOVERY - Postgres Replication Lag on maps2008 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 820216 and 0 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[01:49:23] (PS1) Ladsgroup: [WIP] Start migrating pybal to python3 [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319)
[02:22:23] (PS1) Reedy: Remove REL1_34 from $wgExtDistSnapshotRefs [mediawiki-config] - https://gerrit.wikimedia.org/r/644045 (https://phabricator.wikimedia.org/T268931)
[02:26:41] (PS1) Ladsgroup: [DNM] Test if tests are being ran [debs/pybal] - https://gerrit.wikimedia.org/r/644046
[02:27:39] (CR) jerkins-bot: [V: -1] [DNM] Test if tests are being ran [debs/pybal] - https://gerrit.wikimedia.org/r/644046 (owner: Ladsgroup)
[02:28:34] (Abandoned) Ladsgroup: [DNM] Test if tests are being ran [debs/pybal] - https://gerrit.wikimedia.org/r/644046 (owner: Ladsgroup)
[02:33:36] (PS2) Ladsgroup: [WIP] Start migrating pybal to python3 [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319)
[02:33:54] (CR) jerkins-bot: [V: -1] [WIP] Start migrating pybal to python3 [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319) (owner: Ladsgroup)
[02:40:08] (PS3) Ladsgroup: [WIP] Start migrating pybal to python3 [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319)
[03:00:32] (PS4) Ladsgroup: [WIP] Start migrating pybal to python3 [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319)
[03:01:03] (CR) jerkins-bot: [V: -1] [WIP] Start migrating pybal to python3 [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319) (owner: Ladsgroup)
[03:16:48] PROBLEM - Number of messages locally queued by purged for processing on cp3058 is CRITICAL: cluster=cache_text instance=cp3058 job=purged layer=backend site=esams https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3058
[03:16:52] PROBLEM - Number of messages locally queued by purged for processing on cp3064 is CRITICAL: cluster=cache_text instance=cp3064 job=purged layer=backend site=esams https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3064
[03:17:14] PROBLEM - Number of messages locally queued by purged for processing on cp2029 is CRITICAL: cluster=cache_text instance=cp2029 job=purged layer=backend site=codfw https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2029
[03:17:14] PROBLEM - Number of messages locally queued by purged for processing on cp5007 is CRITICAL: cluster=cache_text instance=cp5007 job=purged layer=backend site=eqsin https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5007
[03:17:20] PROBLEM - Number of messages locally queued by purged for processing on cp3062 is CRITICAL: cluster=cache_text instance=cp3062 job=purged layer=backend site=esams https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3062
[03:17:30] PROBLEM - Number of messages locally queued by purged for processing on cp3060 is CRITICAL: cluster=cache_text instance=cp3060 job=purged layer=backend site=esams https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3060
[03:17:32] PROBLEM - Number of messages locally queued by purged for processing on cp5009 is CRITICAL: cluster=cache_text instance=cp5009 job=purged layer=backend site=eqsin https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5009
[03:17:34] PROBLEM - Number of messages locally queued by purged for processing on cp3056 is CRITICAL: cluster=cache_text instance=cp3056 job=purged layer=backend site=esams https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3056
[03:17:38] PROBLEM - Number of messages locally queued by purged for processing on cp1075 is CRITICAL: cluster=cache_text instance=cp1075 job=purged layer=backend site=eqiad https://wikitech.wikimedia.org/wiki/Purged%23Alerts
https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1075 [03:17:38] PROBLEM - Number of messages locally queued by purged for processing on cp4030 is CRITICAL: cluster=cache_text instance=cp4030 job=purged layer=backend site=ulsfo https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4030 [03:17:42] PROBLEM - Number of messages locally queued by purged for processing on cp4028 is CRITICAL: cluster=cache_text instance=cp4028 job=purged layer=backend site=ulsfo https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4028 [03:17:46] PROBLEM - Number of messages locally queued by purged for processing on cp1087 is CRITICAL: cluster=cache_text instance=cp1087 job=purged layer=backend site=eqiad https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1087 [03:17:48] PROBLEM - Number of messages locally queued by purged for processing on cp2027 is CRITICAL: cluster=cache_text instance=cp2027 job=purged layer=backend site=codfw https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2027 [03:17:56] PROBLEM - Number of messages locally queued by purged for processing on cp4032 is CRITICAL: cluster=cache_text instance=cp4032 job=purged layer=backend site=ulsfo https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4032 [03:17:56] PROBLEM - Number of messages locally queued by purged for processing on cp4029 is CRITICAL: cluster=cache_text instance=cp4029 job=purged layer=backend site=ulsfo https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4029 [03:17:56] PROBLEM - Number of messages locally queued by purged for processing on cp1089 is CRITICAL: cluster=cache_text instance=cp1089 job=purged layer=backend site=eqiad https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1089 [03:17:58] PROBLEM - Number of messages locally queued by purged for processing on cp4027 is CRITICAL: cluster=cache_text instance=cp4027 job=purged layer=backend site=ulsfo https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4027 [03:18:08] PROBLEM - Number of messages locally queued by purged for processing on cp5011 is CRITICAL: cluster=cache_text instance=cp5011 job=purged layer=backend site=eqsin https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5011 [03:18:18] PROBLEM - Number of messages locally queued by purged for processing on cp2033 is CRITICAL: cluster=cache_text instance=cp2033 job=purged layer=backend site=codfw https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2033 [03:18:22] PROBLEM - Number of messages locally queued by purged for processing on cp5008 is CRITICAL: cluster=cache_text 
instance=cp5008 job=purged layer=backend site=eqsin https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5008
[03:18:32] PROBLEM - Number of messages locally queued by purged for processing on cp1077 is CRITICAL: cluster=cache_text instance=cp1077 job=purged layer=backend site=eqiad https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1077
[03:18:34] PROBLEM - Number of messages locally queued by purged for processing on cp4031 is CRITICAL: cluster=cache_text instance=cp4031 job=purged layer=backend site=ulsfo https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4031
[03:18:38] PROBLEM - Number of messages locally queued by purged for processing on cp3054 is CRITICAL: cluster=cache_text instance=cp3054 job=purged layer=backend site=esams https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3054
[03:18:42] PROBLEM - Number of messages locally queued by purged for processing on cp2035 is CRITICAL: cluster=cache_text instance=cp2035 job=purged layer=backend site=codfw https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2035
[03:18:46] PROBLEM - Number of messages locally queued by purged for processing on cp1083 is CRITICAL: cluster=cache_text instance=cp1083 job=purged layer=backend site=eqiad https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1083
[03:18:56] PROBLEM - Number of messages locally queued by purged for processing on cp1079 is CRITICAL: cluster=cache_text instance=cp1079 job=purged layer=backend site=eqiad https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1079
[03:29:24] PROBLEM - Number of messages locally queued by purged for processing on cp5007 is CRITICAL: cluster=cache_text instance=cp5007 job=purged layer=backend site=eqsin https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5007
[03:32:04] Operations, MediaWiki-extensions-Score, Security-Team, Wikimedia-General-or-Unknown, and 2 others: Extension:Score / Lilypond is disabled on all wikis - https://phabricator.wikimedia.org/T257066 (Ycrusoe) Hi, I'm one among a presumably significant group of people around the world trying to learn...
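Editor's note: the purged alerts in the lines that follow ("Time elapsed since the last kafka event processed by purged") all use the same two-threshold pattern, visible in the alert text itself: critical when the elapsed time (presumably seconds) exceeds 5000, warning above 3000, which is why recoveries read like "(C)5000 gt (W)3000 gt 2749". The sketch below is only an illustration of how such a check classifies a value; the names and sample data are made up and do not come from the actual Icinga/Prometheus check definitions.

    # Hypothetical illustration of the two-threshold pattern seen in the purged
    # alerts; only the 3000/5000 thresholds are taken from the log lines.
    WARN = 3000   # (W) threshold
    CRIT = 5000   # (C) threshold

    def classify(seconds_since_last_event: float) -> str:
        """Map a staleness value onto the Icinga-style state shown in the log."""
        if seconds_since_last_event > CRIT:
            return "CRITICAL"
        if seconds_since_last_event > WARN:
            return "WARNING"
        return "OK"

    # Sample values echoing the log: 5.681e+04 fires CRITICAL, 2749 is a recovery.
    for value in (5.681e04, 2749):
        print(f"{value} -> {classify(value)}")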
[03:41:00] PROBLEM - Time elapsed since the last kafka event processed by purged on cp1075 is CRITICAL: 5.681e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1075 [03:43:22] PROBLEM - Time elapsed since the last kafka event processed by purged on cp1077 is CRITICAL: 5.95e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1077 [03:43:50] PROBLEM - Time elapsed since the last kafka event processed by purged on cp1083 is CRITICAL: 5.374e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1083 [03:45:34] PROBLEM - Time elapsed since the last kafka event processed by purged on cp1087 is CRITICAL: 4.973e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1087 [03:46:40] PROBLEM - Time elapsed since the last kafka event processed by purged on cp1089 is CRITICAL: 5.035e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1089 [03:48:12] PROBLEM - Time elapsed since the last kafka event processed by purged on cp3064 is CRITICAL: 5.308e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3064 [03:49:08] PROBLEM - Time elapsed since the last kafka event processed by purged on cp2027 is CRITICAL: 5.201e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2027 [03:50:02] PROBLEM - Time elapsed since the last kafka event processed by purged on cp1079 is CRITICAL: 3.857e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1079 [03:52:12] PROBLEM - Time elapsed since the last kafka event processed by purged on cp3060 is CRITICAL: 4.998e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3060 [03:52:16] PROBLEM - Number of messages locally queued by purged for processing on cp3050 is CRITICAL: cluster=cache_text instance=cp3050 job=purged layer=backend site=esams https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3050 [03:55:02] PROBLEM - Time elapsed since the last kafka event processed by purged on cp3054 is CRITICAL: 5.143e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3054 [03:57:21] wonder what's going on [03:57:24] PROBLEM - Time elapsed since the last kafka event processed by purged on cp3062 is CRITICAL: 4.999e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3062 [03:57:26] PROBLEM - Time elapsed since the last kafka event processed by purged on cp5008 is CRITICAL: 
5.352e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5008 [03:57:54] PROBLEM - Time elapsed since the last kafka event processed by purged on cp3058 is CRITICAL: 5.08e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3058 [03:59:04] PROBLEM - Time elapsed since the last kafka event processed by purged on cp2029 is CRITICAL: 5.408e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2029 [03:59:08] PROBLEM - Time elapsed since the last kafka event processed by purged on cp5009 is CRITICAL: 5.268e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5009 [03:59:22] PROBLEM - Time elapsed since the last kafka event processed by purged on cp2033 is CRITICAL: 5.539e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2033 [04:01:18] PROBLEM - Time elapsed since the last kafka event processed by purged on cp4029 is CRITICAL: 7.253e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4029 [04:02:18] PROBLEM - Time elapsed since the last kafka event processed by purged on cp2035 is CRITICAL: 4.853e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2035 [04:03:22] PROBLEM - Time elapsed since the last kafka event processed by purged on cp5011 is CRITICAL: 4.87e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5011 [04:10:44] PROBLEM - Time elapsed since the last kafka event processed by purged on cp3056 is CRITICAL: 4.414e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3056 [04:13:08] RECOVERY - Number of messages locally queued by purged for processing on cp3050 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3050 [04:17:38] PROBLEM - Time elapsed since the last kafka event processed by purged on cp4030 is CRITICAL: 4.505e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4030 [04:23:58] PROBLEM - Time elapsed since the last kafka event processed by purged on cp4032 is CRITICAL: 8998 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4032 [04:24:34] PROBLEM - Time elapsed since the last kafka event processed by purged on cp4031 is CRITICAL: 1.82e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4031 [04:27:02] PROBLEM - Number of messages locally queued by purged for processing on cp3050 is CRITICAL: cluster=cache_text instance=cp3050 job=purged layer=backend site=esams https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3050 [04:34:22] PROBLEM - Time elapsed since the last kafka event processed by purged on cp4027 is CRITICAL: 7.055e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4027 [04:35:30] PROBLEM - Time elapsed since the last kafka event processed by purged on cp4028 is CRITICAL: 1.055e+05 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4028 [04:36:24] PROBLEM - Time elapsed since the last kafka event processed by purged on cp5007 is CRITICAL: 5.581e+04 gt 5000 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5007 [04:40:54] RECOVERY - Number of messages locally queued by purged for processing on cp3050 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3050 [04:46:50] RECOVERY - Time elapsed since the last kafka event processed by purged on cp5007 is OK: (C)5000 gt (W)3000 gt 2749 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5007 [04:49:59] RECOVERY - Time elapsed since the last kafka event processed by purged on cp4027 is OK: (C)5000 gt (W)3000 gt 70.04 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4027 [04:50:36] RECOVERY - Time elapsed since the last kafka event processed by purged on cp4031 is OK: (C)5000 gt (W)3000 gt 61.34 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4031 [04:51:08] RECOVERY - Time elapsed since the last kafka event processed by purged on cp4028 is OK: (C)5000 gt (W)3000 gt 69.37 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4028 [04:51:46] RECOVERY - Time elapsed since the last kafka event processed by purged on cp4032 is OK: (C)5000 gt (W)3000 gt 1480 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4032 [04:52:22] RECOVERY - Time elapsed since the last kafka event processed by purged on cp4030 is OK: (C)5000 gt (W)3000 gt 73.15 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4030 [04:52:28] RECOVERY - Time elapsed since the last kafka event processed by purged on cp3056 is OK: (C)5000 gt (W)3000 gt 261.6 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3056 [04:53:00] RECOVERY - Number of messages locally queued by purged for processing on cp5007 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5007 [04:53:16] RECOVERY - Number of messages locally queued by purged for processing on cp4028 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4028 [04:53:28] RECOVERY - Number of messages locally queued by purged for processing on cp4032 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4032 [04:53:28] RECOVERY - Number of messages locally queued by purged for processing on cp4027 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4027 [04:54:04] RECOVERY - Number of messages locally queued by purged for processing on cp4031 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4031 [04:54:54] RECOVERY - Number of messages locally queued by purged for processing on cp4030 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4030 [04:58:18] RECOVERY - Number of messages locally queued by purged for processing on cp3056 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3056 [05:02:32] RECOVERY - Time elapsed since the last kafka event processed by purged on cp5011 is OK: (C)5000 gt (W)3000 gt 406.1 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5011 [05:03:08] RECOVERY - Time elapsed since the last kafka event processed by purged on cp2035 is OK: (C)5000 gt (W)3000 gt 34.73 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2035 [05:03:52] RECOVERY - Time elapsed since the last kafka event processed by purged on cp4029 is OK: (C)5000 gt (W)3000 gt 87.45 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4029 [05:04:16] RECOVERY - Number of messages locally queued by purged for processing on cp5011 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5011 [05:04:36] RECOVERY - Number of messages locally queued by purged for processing on cp2035 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2035 [05:05:06] RECOVERY - Time elapsed since the last kafka event processed by purged on cp2029 is OK: (C)5000 gt (W)3000 gt 33.06 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2029 [05:05:36] RECOVERY - Number of messages locally queued by purged for processing on cp4029 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=ulsfo+prometheus/ops&var-instance=cp4029 [05:07:02] RECOVERY - Time elapsed since the last kafka event processed by purged on cp5009 is OK: (C)5000 gt (W)3000 gt 283.9 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5009 [05:07:08] RECOVERY - Time elapsed since the last kafka event processed by purged on cp2033 is OK: (C)5000 gt (W)3000 gt 41.33 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2033 [05:08:18] RECOVERY - Number of messages locally queued by purged for processing on cp2029 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2029 [05:08:38] RECOVERY - Time elapsed since the last kafka event processed by purged on cp3062 is OK: (C)5000 gt (W)3000 gt 224.3 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3062 [05:08:52] RECOVERY - Time elapsed since the last kafka event processed by purged on cp5008 is OK: (C)5000 gt (W)3000 gt 236.5 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5008 [05:08:52] RECOVERY - Number of messages locally queued by purged for processing on cp5009 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5009 [05:09:08] RECOVERY - Time elapsed since the last kafka event processed by purged on cp3058 is OK: (C)5000 gt (W)3000 gt 298 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3058 [05:09:24] RECOVERY - Number of messages locally queued by purged for processing on cp2033 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2033 [05:09:42] RECOVERY - Number of messages locally queued by purged for processing on cp5008 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqsin+prometheus/ops&var-instance=cp5008 [05:10:14] RECOVERY - Number of messages locally queued by purged for processing on cp3062 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3062 [05:11:26] RECOVERY - Number of messages locally queued by purged for processing on cp3058 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3058 [05:11:32] RECOVERY - Time elapsed since the last kafka event processed by purged on cp3054 is OK: (C)5000 gt (W)3000 gt 207.4 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3054 [05:13:16] RECOVERY - Number of messages locally queued by purged for processing on cp3054 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3054 [05:15:06] RECOVERY - Time elapsed since the last kafka event processed by purged on cp1079 is OK: (C)5000 gt (W)3000 gt 236.1 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1079 [05:15:34] RECOVERY - Time elapsed since the last kafka event processed by purged on cp3060 is OK: (C)5000 gt (W)3000 gt 233.9 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3060 [05:15:56] RECOVERY - Time elapsed since the last kafka event processed by purged on cp2027 is OK: (C)5000 gt (W)3000 gt 152.9 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2027 [05:16:52] RECOVERY - Number of messages locally queued by purged for processing on cp1079 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1079 [05:17:18] RECOVERY - Number of messages locally queued by purged for processing on cp3060 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3060 [05:19:18] RECOVERY - Number of messages locally queued by purged for processing on cp2027 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=codfw+prometheus/ops&var-instance=cp2027 [05:20:16] RECOVERY - Time elapsed since the last kafka event processed by purged on cp3064 is OK: (C)5000 gt (W)3000 gt 200.4 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3064 [05:22:04] RECOVERY - Time elapsed since the last kafka event processed by purged on cp1089 is OK: (C)5000 gt (W)3000 gt 78.26 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1089 [05:22:44] RECOVERY - Time elapsed since the last kafka event processed by purged on cp1087 is OK: (C)5000 gt (W)3000 gt 117.3 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1087 [05:23:38] RECOVERY - Number of messages locally queued by purged for processing on cp3064 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=esams+prometheus/ops&var-instance=cp3064 [05:24:24] RECOVERY - Number of messages locally queued by purged for processing on cp1087 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1087 [05:24:34] RECOVERY - Number of messages locally queued by purged for processing on cp1089 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1089
[05:27:24] RECOVERY - Time elapsed since the last kafka event processed by purged on cp1077 is OK: (C)5000 gt (W)3000 gt 112.1 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1077
[05:27:54] RECOVERY - Time elapsed since the last kafka event processed by purged on cp1083 is OK: (C)5000 gt (W)3000 gt 187.3 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1083
[05:30:20] RECOVERY - Number of messages locally queued by purged for processing on cp1077 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1077
[05:30:36] RECOVERY - Number of messages locally queued by purged for processing on cp1083 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1083
[05:32:02] RECOVERY - Time elapsed since the last kafka event processed by purged on cp1075 is OK: (C)5000 gt (W)3000 gt 92.57 https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1075
[05:34:40] RECOVERY - Number of messages locally queued by purged for processing on cp1075 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Purged%23Alerts https://grafana.wikimedia.org/dashboard/db/purged?var-datasource=eqiad+prometheus/ops&var-instance=cp1075
[06:49:05] (CR) Xqt: [WIP] Start migrating pybal to python3 (3 comments) [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319) (owner: Ladsgroup)
[08:00:04] Deploy window No deploys all day! See Deployments/Emergencies if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20201129T0800)
[09:18:40] PROBLEM - snapshot of x1 in codfw on alert1001 is CRITICAL: snapshot for x1 at codfw taken more than 3 days ago: Most recent backup 2020-11-26 08:45:26 https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[09:39:46] Operations, Analytics: Backport kafkacat 1.6.0 from bullseye to buster-backports or buster-wikimedia - https://phabricator.wikimedia.org/T268936 (elukey)
[09:48:24] PROBLEM - Host an-presto1004 is DOWN: PING CRITICAL - Packet loss = 100%
[09:49:24] this is me --^ Checking what's happening, the host was in the d-i for some reason
[09:54:00] RECOVERY - Host an-presto1004 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms
[09:56:57] I'll check tomorrow, the host doesn't recognize the disks, now it is in d-i (so no real recovery)
[10:01:39] (PS5) Ladsgroup: [WIP] Start migrating pybal to python3 [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319)
[10:02:08] (CR) jerkins-bot: [V: -1] [WIP] Start migrating pybal to python3 [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319) (owner: Ladsgroup)
[10:25:46] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=swagger_check_citoid_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[10:27:30] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[10:40:59] (PS1) Ladsgroup: Move tests to a proper directory structure [debs/pybal] - https://gerrit.wikimedia.org/r/644050
[10:44:10] (PS2) Ladsgroup: Move tests to a proper directory structure [debs/pybal] - https://gerrit.wikimedia.org/r/644050
[11:16:22] (PS3) Ladsgroup: Move tests to a proper directory structure [debs/pybal] - https://gerrit.wikimedia.org/r/644050
[11:17:15] (CR) jerkins-bot: [V: -1] Move tests to a proper directory structure [debs/pybal] - https://gerrit.wikimedia.org/r/644050 (owner: Ladsgroup)
[11:18:38] (PS4) Ladsgroup: Move tests to a proper directory structure [debs/pybal] - https://gerrit.wikimedia.org/r/644050
[11:25:59] (CR) Ladsgroup: [WIP] Start migrating pybal to python3 (3 comments) [debs/pybal] - https://gerrit.wikimedia.org/r/644041 (https://phabricator.wikimedia.org/T200319) (owner: Ladsgroup)
[12:12:35] Operations, SRE-Access-Requests, Patch-For-Review: Publish Wikibase tarball releases on releases.wikimedia.org - https://phabricator.wikimedia.org/T268818 (Aklapper)
[12:18:42] PROBLEM - snapshot of s1 in codfw on alert1001 is CRITICAL: snapshot for s1 at codfw taken more than 3 days ago: Most recent backup 2020-11-26 11:57:44 https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[12:27:44] PROBLEM - Postgres Replication Lag on maps2001 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 127649904 and 4 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:27:56] PROBLEM - Postgres Replication Lag on maps2005 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 139030440 and 9 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:28:02] PROBLEM - Postgres Replication Lag on maps2006 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 1340240576 and 292 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:28:28] PROBLEM - Postgres Replication Lag on maps2008 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 188441152 and 11 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:32:50] PROBLEM - Postgres Replication Lag on maps2003 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 835593184 and 47 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:34:56] RECOVERY - Postgres Replication Lag on maps2005 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 84000 and 28 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:35:02] RECOVERY - Postgres Replication Lag on maps2006 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 0 and 34 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:35:26] RECOVERY - Postgres Replication Lag on maps2008 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 5304 and 58 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:36:20] RECOVERY - Postgres Replication Lag on maps2003 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 74632 and 113 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:36:28] RECOVERY - Postgres Replication Lag on maps2001 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 26488 and 121 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[12:49:42] PROBLEM - snapshot of s8 in codfw on alert1001 is CRITICAL: snapshot for s8 at codfw taken more than 3 days ago: Most recent backup 2020-11-26 12:15:28 https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[14:52:20] PROBLEM - snapshot of s2 in codfw on alert1001 is CRITICAL: snapshot for s2 at codfw taken more than 3 days ago: Most recent backup 2020-11-26 14:34:21 https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[15:15:14] (PS1) Vlad.shapik: Expiration date: OAuth 2.0 access tokens have effectively infinite expiration date [mediawiki-config] - https://gerrit.wikimedia.org/r/644056 (https://phabricator.wikimedia.org/T265075)
[15:18:03] (CR) Vlad.shapik: "Have a look, please." [mediawiki-config] - https://gerrit.wikimedia.org/r/644056 (https://phabricator.wikimedia.org/T265075) (owner: Vlad.shapik)
[15:52:02] PROBLEM - tilerator on maps2009 is CRITICAL: connect to address 10.192.16.107 and port 6534: Connection refused https://wikitech.wikimedia.org/wiki/Services/Monitoring/tilerator
[15:52:32] PROBLEM - Check systemd state on maps2009 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[16:24:36] PROBLEM - snapshot of s4 in codfw on alert1001 is CRITICAL: snapshot for s4 at codfw taken more than 3 days ago: Most recent backup 2020-11-26 15:59:28 https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[16:45:51] Operations, Gerrit-Privilege-Requests, LDAP-Access-Requests: Offboard Pablo-WMDE from WMF systems - https://phabricator.wikimedia.org/T268946 (WMDE-leszek)
[16:52:52] (PS1) Andrew Bogott: Keystone: turn off INFO-level logging [puppet] - https://gerrit.wikimedia.org/r/644063 (https://phabricator.wikimedia.org/T268175)
[16:53:57] (CR) Andrew Bogott: [C: +2] Keystone: turn off INFO-level logging [puppet] - https://gerrit.wikimedia.org/r/644063 (https://phabricator.wikimedia.org/T268175) (owner: Andrew Bogott)
[17:00:34] (PS1) Andrew Bogott: designate: set log levels to recommended upstream defaults [puppet] - https://gerrit.wikimedia.org/r/644064 (https://phabricator.wikimedia.org/T268175)
[17:01:20] (CR) Andrew Bogott: [C: +2] designate: set log levels to recommended upstream defaults [puppet] - https://gerrit.wikimedia.org/r/644064 (https://phabricator.wikimedia.org/T268175) (owner: Andrew Bogott)
[17:30:20] PROBLEM - tilerator on maps2010 is CRITICAL: connect to address 10.192.48.166 and port 6534: Connection refused https://wikitech.wikimedia.org/wiki/Services/Monitoring/tilerator
[17:31:28] PROBLEM - Check systemd state on maps2010 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[17:56:50] PROBLEM - snapshot of s7 in codfw on alert1001 is CRITICAL: snapshot for s7 at codfw taken more than 3 days ago: Most recent backup 2020-11-26 17:28:34 https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[18:09:48] (PS1) Andrew Bogott: OpenStack Glance: further attempt to quiet down logging a bit [puppet] - https://gerrit.wikimedia.org/r/644066 (https://phabricator.wikimedia.org/T268175)
[18:11:05] (CR) Andrew Bogott: [C: +2] OpenStack Glance: further attempt to quiet down logging a bit [puppet] - https://gerrit.wikimedia.org/r/644066 (https://phabricator.wikimedia.org/T268175) (owner: Andrew Bogott)
[18:20:34] PROBLEM - Host an-presto1004 is DOWN: PING CRITICAL - Packet loss = 100%
[18:25:32] this is me --^
[18:28:56] Operations, ops-eqiad: an-presto1004 shows only the NIC in the boot list - https://phabricator.wikimedia.org/T268951 (elukey)
[18:29:12] Operations, ops-eqiad, Analytics: an-presto1004 shows only the NIC in the boot list - https://phabricator.wikimedia.org/T268951 (elukey)
[18:29:57] ACKNOWLEDGEMENT - SSH on an-presto1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds Elukey T268951 https://wikitech.wikimedia.org/wiki/SSH/monitoring
[18:29:57] ACKNOWLEDGEMENT - Host an-presto1004 is DOWN: PING CRITICAL - Packet loss = 100% Elukey T268951
[18:58:16] PROBLEM - snapshot of s5 in codfw on alert1001 is CRITICAL: snapshot for s5 at codfw taken more than 3 days ago: Most recent backup 2020-11-26 18:46:32 https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[19:00:50] Operations, Gerrit-Privilege-Requests, LDAP-Access-Requests: Offboard Pablo-WMDE from WMF systems - https://phabricator.wikimedia.org/T268946 (Aklapper) Thanks for filing this! I archived also https://phabricator.wikimedia.org/tag/user-pablo-wmde/ , wondering what to do with open tasks having no othe...
[19:01:02] Operations, Gerrit-Privilege-Requests, LDAP-Access-Requests: Offboard Pablo-WMDE from WMF systems - https://phabricator.wikimedia.org/T268946 (Aklapper)
[19:47:16] PROBLEM - Widespread puppet agent failures on alert1001 is CRITICAL: 0.01009 ge 0.01 https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/yOxVDGvWk/puppet
[20:14:46] (PS1) Urbanecm: Enable RelatedArticles on ptwikinews [mediawiki-config] - https://gerrit.wikimedia.org/r/644070 (https://phabricator.wikimedia.org/T268945)
[20:23:00] Operations, MediaWiki-extensions-Score, Security-Team, Wikimedia-General-or-Unknown, and 2 others: Extension:Score / Lilypond is disabled on all wikis - https://phabricator.wikimedia.org/T257066 (Ankry) >>! In T257066#6654054, @Ycrusoe wrote: > Hi, > > I'm one among a presumably significant grou...
[20:30:22] PROBLEM - snapshot of s6 in codfw on alert1001 is CRITICAL: snapshot for s6 at codfw taken more than 3 days ago: Most recent backup 2020-11-26 20:24:11 https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[21:01:06] PROBLEM - snapshot of s3 in codfw on alert1001 is CRITICAL: snapshot for s3 at codfw taken more than 3 days ago: Most recent backup 2020-11-26 20:39:27 https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
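Editor's note: the recurring "snapshot of sN in codfw" alerts throughout this log all encode the same freshness rule: the newest backup for a database section must be less than three days old. The sketch below only recomputes that age check using the timestamp from the s3 alert above and the log date of 2020-11-29 (taken from the deploy-calendar link earlier); the real check lives in the WMF database-backup tooling and is not reproduced here.

    # Hedged illustration of the "taken more than 3 days ago" condition; the
    # dates come from the s3 alert line above, nothing else is from the tooling.
    from datetime import datetime, timedelta

    MAX_AGE = timedelta(days=3)

    last_backup = datetime(2020, 11, 26, 20, 39, 27)  # "Most recent backup 2020-11-26 20:39:27"
    now = datetime(2020, 11, 29, 21, 1, 6)            # alert fired at [21:01:06] on 2020-11-29

    age = now - last_backup
    state = "CRITICAL" if age > MAX_AGE else "OK"
    print(f"backup age: {age} -> {state}")  # just over 3 days, hence the alert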