[02:00:47] (PS4) Faidon Liambotis: Add "accounting" report [software/netbox-reports] - https://gerrit.wikimedia.org/r/506663 [02:07:55] PROBLEM - Postgres Replication Lag on maps1003 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 210371904 and 15 seconds [02:08:35] Operations, ops-codfw: scs-a1-codfw: update serial in netbox - https://phabricator.wikimedia.org/T221984 (faidon) [02:10:31] PROBLEM - Postgres Replication Lag on maps1003 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 213751000 and 12 seconds [02:14:29] PROBLEM - Postgres Replication Lag on maps1003 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 188318176 and 9 seconds [02:15:47] RECOVERY - Postgres Replication Lag on maps1003 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 13736 and 20 seconds [02:23:32] (PS1) DannyS712: Add namespace aliases on zhwiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/506892 (https://phabricator.wikimedia.org/T222024) [02:27:29] (PS2) DannyS712: Add namespace aliases on zhwiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/506892 (https://phabricator.wikimedia.org/T222024) [03:05:19] PROBLEM - puppet last run on wtp1040 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [03:05:26] Yes [03:23:39] PROBLEM - puppet last run on db1069 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [03:31:51] RECOVERY - puppet last run on wtp1040 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [03:34:15] PROBLEM - puppet last run on mw1346 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz] [03:36:27] PROBLEM - puppet last run on mw2266 is CRITICAL: CRITICAL: Puppet has 2 failures.
Last run 5 minutes ago with 2 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIP2-City.mmdb.gz],File[/usr/share/GeoIP/GeoIP2-City.mmdb.test] [03:36:31] PROBLEM - puppet last run on mw1309 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 5 minutes ago with 2 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIP2-City.mmdb.gz],File[/usr/share/GeoIP/GeoIP2-City.mmdb.test] [03:50:09] RECOVERY - puppet last run on db1069 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [03:52:27] RECOVERY - HP RAID on ms-be1037 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4 - Controller: OK - Battery/Capacitor: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering [04:02:57] RECOVERY - puppet last run on mw2266 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [04:03:03] RECOVERY - puppet last run on mw1309 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [04:06:03] RECOVERY - puppet last run on mw1346 is OK: OK: Puppet is currently enabled, last run 5 minutes ago with 0 failures [04:17:13] PROBLEM - puppet last run on db1065 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [04:43:43] RECOVERY - puppet last run on db1065 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:30:21] "PHP fatal error: [05:30:21] entire web request took longer than 60 seconds and timed out" [05:30:37] this is a new error message [05:42:11] PROBLEM - Mediawiki Cirrussearch update rate - codfw on icinga1001 is CRITICAL: CRITICAL: 30.00% of data under the critical threshold [50.0] https://grafana.wikimedia.org/d/JLK3I_siz/elasticsearch-indexing?panelId=44&fullscreen&orgId=1 [05:43:29] PROBLEM - puppet last run on dbproxy1006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [05:43:53] PROBLEM - Mediawiki Cirrussearch update rate - eqiad on icinga1001 is CRITICAL: CRITICAL: 30.00% of data under the critical threshold [50.0] https://grafana.wikimedia.org/d/JLK3I_siz/elasticsearch-indexing?panelId=44&fullscreen&orgId=1 [05:46:01] PROBLEM - puppet last run on mw1271 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [05:50:05] PROBLEM - puppet last run on analytics1047 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [05:55:07] PROBLEM - Check systemd state on ms-be1015 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. 
[05:57:43] RECOVERY - Check systemd state on ms-be1015 is OK: OK - running: The system is fully operational [06:07:03] RECOVERY - Mediawiki Cirrussearch update rate - codfw on icinga1001 is OK: OK: Less than 1.00% under the threshold [80.0] https://grafana.wikimedia.org/d/JLK3I_siz/elasticsearch-indexing?panelId=44&fullscreen&orgId=1 [06:07:25] RECOVERY - Mediawiki Cirrussearch update rate - eqiad on icinga1001 is OK: OK: Less than 1.00% under the threshold [80.0] https://grafana.wikimedia.org/d/JLK3I_siz/elasticsearch-indexing?panelId=44&fullscreen&orgId=1 [06:09:59] RECOVERY - puppet last run on dbproxy1006 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:16:35] RECOVERY - puppet last run on analytics1047 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [06:17:49] RECOVERY - puppet last run on mw1271 is OK: OK: Puppet is currently enabled, last run 5 minutes ago with 0 failures [06:30:47] PROBLEM - puppet last run on mw2285 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [06:57:13] RECOVERY - puppet last run on mw2285 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [08:38:21] PROBLEM - Check systemd state on ms-be1014 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [09:02:01] PROBLEM - puppet last run on db1067 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [09:10:59] RECOVERY - Check systemd state on ms-be1014 is OK: OK - running: The system is fully operational [09:28:27] RECOVERY - puppet last run on db1067 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [10:10:30] (PS1) Revi: Change kr.wikimedia.org destination [puppet] - https://gerrit.wikimedia.org/r/506895 (https://phabricator.wikimedia.org/T222033) [10:11:33] (CR) Revi: [C: -1] "Please ping `revi` on #wm-operations before merging this because I need to move the page on Meta." [puppet] - https://gerrit.wikimedia.org/r/506895 (https://phabricator.wikimedia.org/T222033) (owner: Revi) [12:45:45] (CR) Reedy: [C: -1] "You need to compile it too" [puppet] - https://gerrit.wikimedia.org/r/506895 (https://phabricator.wikimedia.org/T222033) (owner: Revi) [12:47:04] Reedy: how do I do that? :-) [12:47:15] revi: IIRC, there's a ruby script to run [12:47:33] I'll look around tomorrow [12:47:54] unless it's been made part of the puppet side... [12:48:22] https://github.com/wikimedia/puppet/blob/95e34bdf87b89480115a1cc068549988f6ee9fd6/modules/mediawiki/manifests/web/prod_sites.pp#L5 [12:48:26] It has, nvm then [12:48:40] (CR) Reedy: "Or not anymore, apparently" [puppet] - https://gerrit.wikimedia.org/r/506895 (https://phabricator.wikimedia.org/T222033) (owner: Revi) [13:47:07] PROBLEM - puppet last run on mw1228 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [14:03:07] PROBLEM - puppet last run on labpuppetmaster1002 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues [14:13:39] RECOVERY - puppet last run on mw1228 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [14:29:37] RECOVERY - puppet last run on labpuppetmaster1002 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [14:35:21] (CR) Jforrester: [C: +2] Provide a temporary trwiki logo marking two years of censorship [mediawiki-config] - https://gerrit.wikimedia.org/r/506849 (owner: Jforrester) [14:36:25] (Merged) jenkins-bot: Provide a temporary trwiki logo marking two years of censorship [mediawiki-config] - https://gerrit.wikimedia.org/r/506849 (owner: Jforrester) [14:38:58] (CR) jenkins-bot: Provide a temporary trwiki logo marking two years of censorship [mediawiki-config] - https://gerrit.wikimedia.org/r/506849 (owner: Jforrester) [14:40:08] (CR) Jforrester: [C: +2] Provide a temporary trwiki logo marking two years of censorship [mediawiki-config] - https://gerrit.wikimedia.org/r/506849 (owner: Jforrester) [14:44:18] !log jforrester@deploy1001 Synchronized static/images/project-logos/trwiki-2x.png: trwiki: Update logo for 2 year anniversary, part I (duration: 00m 55s) [14:44:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:45:32] !log jforrester@deploy1001 Synchronized static/images/project-logos/trwiki-1.5x.png: trwiki: Update logo for 2 year anniversary, part II (duration: 00m 53s) [14:45:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:47:48] !log jforrester@deploy1001 Synchronized static/images/project-logos/trwiki.png: trwiki: Update logo for 2 year anniversary, part III (duration: 00m 53s) [14:47:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:53:16] !log Manually purged the trwiki logos from Varnish as part of updating them for 2 year anniversary.
[14:53:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:55:38] !log Updated trwiki's MediaWiki:Common.css to not over-ride the logo. [14:55:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:58:33] (PS1) Jforrester: Revert "Provide a temporary trwiki logo marking two years of censorship" [mediawiki-config] - https://gerrit.wikimedia.org/r/506903 [14:58:44] (CR) Jforrester: [C: -1] "Not yet." [mediawiki-config] - https://gerrit.wikimedia.org/r/506903 (owner: Jforrester) [15:53:55] PROBLEM - Check systemd state on ms-be1015 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [15:57:51] RECOVERY - Check systemd state on ms-be1015 is OK: OK - running: The system is fully operational [16:57:27] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:00:47] PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (Ensure Zotero is working) timed out before a response was received: /api (Scrapes sample page) timed out before a response was received https://wikitech.wikimedia.org/wiki/Citoid [17:01:59] RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Citoid [17:03:13] PROBLEM - Host cp3037 is DOWN: PING CRITICAL - Packet loss = 100% [17:08:27] PROBLEM - Host cp3037.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [17:10:39] PROBLEM - IPsec on cp1080 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 [17:10:43] PROBLEM - IPsec on cp2020 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:10:57] PROBLEM - IPsec on cp1090 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 [17:10:57] PROBLEM - IPsec on cp1088 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 [17:10:59] PROBLEM - IPsec on cp2002 is
CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:10:59] PROBLEM - IPsec on cp2008 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:10:59] PROBLEM - IPsec on cp2017 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:10:59] PROBLEM - IPsec on cp2005 is CRITICAL: Strongswan CRITICAL - ok: 60 connecting: cp3037_v4 not-conn: cp3037_v6 [17:10:59] PROBLEM - IPsec on cp2018 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:11:03] PROBLEM - IPsec on cp2022 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:11:19] PROBLEM - IPsec on cp1078 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 [17:11:19] PROBLEM - IPsec on cp1082 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 [17:11:21] PROBLEM - IPsec on cp1086 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 [17:11:23] PROBLEM - IPsec on cp2011 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:11:23] PROBLEM - IPsec on cp2024 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:11:33] PROBLEM - IPsec on cp1084 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 [17:11:33] PROBLEM - IPsec on cp1076 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 [17:11:49] PROBLEM - IPsec on cp2026 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:11:49] PROBLEM - IPsec on cp2025 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:11:49] PROBLEM - IPsec on cp2014 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 [17:23:57] RECOVERY - puppet last run on neon is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [17:46:03] !log Depooling cp3037 [17:46:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:46:32] !log 
jiji@cumin1001 conftool action : set/pooled=no; selector: name=cp3037.esams.wmnet [17:46:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:52:11] Operations, Traffic: cp3037 is currently unreachable - https://phabricator.wikimedia.org/T222041 (Vgutierrez) [17:52:47] Operations, Traffic: cp3037 is currently unreachable - https://phabricator.wikimedia.org/T222041 (Vgutierrez) p:Triage→Normal [17:54:27] ACKNOWLEDGEMENT - IPsec on cp1076 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:27] ACKNOWLEDGEMENT - IPsec on cp1078 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:27] ACKNOWLEDGEMENT - IPsec on cp1080 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:27] ACKNOWLEDGEMENT - IPsec on cp1082 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:27] ACKNOWLEDGEMENT - IPsec on cp1084 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:27] ACKNOWLEDGEMENT - IPsec on cp1086 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:27] ACKNOWLEDGEMENT - IPsec on cp1088 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:28] ACKNOWLEDGEMENT - IPsec on cp1090 is CRITICAL: Strongswan CRITICAL - ok: 68 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:28] ACKNOWLEDGEMENT - IPsec on cp2002 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:29] ACKNOWLEDGEMENT - IPsec on cp2005 is CRITICAL: Strongswan CRITICAL - ok: 60 connecting: cp3037_v4 not-conn: cp3037_v6 Effie Mouzeli
cp3037 is down - T222041 [17:54:29] ACKNOWLEDGEMENT - IPsec on cp2008 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:30] ACKNOWLEDGEMENT - IPsec on cp2011 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:30] ACKNOWLEDGEMENT - IPsec on cp2014 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:31] ACKNOWLEDGEMENT - IPsec on cp2017 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:31] ACKNOWLEDGEMENT - IPsec on cp2018 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:32] ACKNOWLEDGEMENT - IPsec on cp2020 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:32] ACKNOWLEDGEMENT - IPsec on cp2022 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:33] ACKNOWLEDGEMENT - IPsec on cp2024 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:33] ACKNOWLEDGEMENT - IPsec on cp2025 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:54:34] ACKNOWLEDGEMENT - IPsec on cp2026 is CRITICAL: Strongswan CRITICAL - ok: 60 not-conn: cp3037_v4, cp3037_v6 Effie Mouzeli cp3037 is down - T222041 [17:55:32] ACKNOWLEDGEMENT - Host cp3037 is DOWN: PING CRITICAL - Packet loss = 100% Effie Mouzeli cp3037 is down - T222041 [17:55:32] ACKNOWLEDGEMENT - Host cp3037.mgmt is DOWN: PING CRITICAL - Packet loss = 100% Effie Mouzeli cp3037 is down - T222041 [18:37:11] PROBLEM - Memory correctable errors -EDAC- on kafka1023 is CRITICAL: 4.001 ge 4 
https://grafana.wikimedia.org/dashboard/db/host-overview?orgId=1&var-server=kafka1023&var-datasource=eqiad+prometheus/ops [18:37:13] PROBLEM - High CPU load on API appserver on mw1281 is CRITICAL: CRITICAL - load average: 62.99, 27.72, 19.99 [18:38:31] RECOVERY - High CPU load on API appserver on mw1281 is OK: OK - load average: 21.67, 22.70, 18.83 [19:22:20] Operations, ops-eqiad, RESTBase, Core Platform Team Backlog (Watching / External), and 2 others: rack/setup/install restbase10[19-27].eqiad.wmnet - https://phabricator.wikimedia.org/T219404 (mobrovac) @Cmjohnson any movement on this? Do you have an ETA on when the machines will be installed? [19:59:55] PROBLEM - puppet last run on oresrdb1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [20:26:25] RECOVERY - puppet last run on oresrdb1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [21:57:25] RECOVERY - HP RAID on ms-be1030 is OK: OK: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4 - Controller: OK - Battery/Capacitor: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering [22:42:13] PROBLEM - exim queue on mx1001 is CRITICAL: CRITICAL: 3167 mails in exim queue.