[00:19:54] PROBLEM - puppet last run on mw2082 is CRITICAL: CRITICAL: puppet fail
[00:45:53] RECOVERY - puppet last run on mw2082 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[00:53:22] PROBLEM - puppet last run on mw2152 is CRITICAL: CRITICAL: puppet fail
[01:20:24] RECOVERY - puppet last run on mw2152 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[01:42:08] PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [1000.0]
[01:45:29] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [1000.0]
[02:13:47] RECOVERY - Ulsfo HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[02:15:27] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[02:22:53] !log mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 09m 47s)
[02:22:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:29:49] !log l10nupdate@tin ResourceLoader cache refresh completed at Fri Dec 25 02:29:49 UTC 2015 (duration 6m 56s)
[02:29:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:51:29] operations, Performance-Team: jobrunner memory leaks - https://phabricator.wikimedia.org/T122069#1903886 (Krinkle) >>! In T122069#1895995, @aaron wrote: > The first change only affected JobChron. From http://graphite.wikimedia.org/render/?width=1887&height=960&_salt=1450730658.495&target=jobrunner.memory.*...
[03:37:29] (PS2) KartikMistry: CX: Use config.yaml to read registry [puppet] - https://gerrit.wikimedia.org/r/260575
[04:04:00] (PS3) KartikMistry: CX: Use config.yaml to read registry [puppet] - https://gerrit.wikimedia.org/r/260575
[04:11:52] PROBLEM - Incoming network saturation on labstore1003 is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [100000000.0]
[04:14:32] PROBLEM - puppet last run on ms-be2007 is CRITICAL: CRITICAL: puppet fail
[04:37:26] RECOVERY - Incoming network saturation on labstore1003 is OK: OK: Less than 10.00% above the threshold [75000000.0]
[04:42:17] RECOVERY - puppet last run on ms-be2007 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:30:01] PROBLEM - puppet last run on einsteinium is CRITICAL: CRITICAL: puppet fail
[06:30:40] PROBLEM - puppet last run on ms-fe1004 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:31] PROBLEM - puppet last run on db2055 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:40] PROBLEM - puppet last run on db2060 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:40] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:51] PROBLEM - puppet last run on holmium is CRITICAL: CRITICAL: Puppet has 2 failures
[06:32:35] PROBLEM - puppet last run on mw2129 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:43] PROBLEM - puppet last run on mw2207 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:54] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:03] PROBLEM - puppet last run on mw2158 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:23] PROBLEM - puppet last run on analytics1021 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:53] PROBLEM - puppet last run on wtp2008 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:36:53] PROBLEM - puppet last run on mw2126 is CRITICAL: CRITICAL: Puppet has 1 failures
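
(Editor's note: the "HTTP 5xx reqs/min" and "Incoming network saturation" alerts above come from Icinga checks that query Graphite and go critical when some fraction of recent datapoints exceeds a threshold. Below is a minimal sketch of that logic; the Graphite render API with format=json is standard, but the metric name and trigger fraction here are hypothetical, not the production check's configuration.)

```python
# Sketch of a percent-over-threshold Graphite check, in the style of the
# "37.50% of data above the critical threshold [1000.0]" alerts above.
import json
import urllib.request

GRAPHITE = "http://graphite1001"   # host name taken from the log
METRIC = "reqstats.5xx"            # hypothetical metric name
CRITICAL_VALUE = 1000.0            # threshold quoted in the alert text
CRITICAL_PERCENT = 30.0            # assumed fraction that triggers CRITICAL

def check_5xx(metric: str = METRIC) -> str:
    url = f"{GRAPHITE}/render?target={metric}&from=-10min&format=json"
    with urllib.request.urlopen(url) as resp:
        series = json.load(resp)
    # Each series is {"target": ..., "datapoints": [[value, timestamp], ...]}.
    points = [v for s in series for v, _ in s["datapoints"] if v is not None]
    if not points:
        return "UNKNOWN: no datapoints"
    over = 100.0 * sum(1 for v in points if v > CRITICAL_VALUE) / len(points)
    if over >= CRITICAL_PERCENT:
        return f"CRITICAL: {over:.2f}% of data above the critical threshold [{CRITICAL_VALUE}]"
    return f"OK: Less than {CRITICAL_PERCENT:.2f}% above the threshold [{CRITICAL_VALUE}]"
```
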
[06:38:33] PROBLEM - puppet last run on bast4001 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:55:35] RECOVERY - puppet last run on db2060 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[06:55:54] RECOVERY - puppet last run on ms-fe1004 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[06:56:03] RECOVERY - puppet last run on holmium is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[06:56:44] RECOVERY - puppet last run on einsteinium is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[06:57:04] RECOVERY - puppet last run on mw2129 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[06:57:14] RECOVERY - puppet last run on wtp2008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:14] RECOVERY - puppet last run on mw2207 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:15] RECOVERY - puppet last run on db2055 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:34] RECOVERY - puppet last run on mw1135 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[06:57:34] RECOVERY - puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:14] RECOVERY - puppet last run on mw2158 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:14] RECOVERY - puppet last run on mw2126 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:34] RECOVERY - puppet last run on analytics1021 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:03:10] RECOVERY - puppet last run on bast4001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:35:53] PROBLEM - puppet last run on mw2166 is CRITICAL: CRITICAL: puppet fail
[07:59:14] RECOVERY - puppet last run on mw2166 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[08:40:56] PROBLEM - Host cp4007 is DOWN: PING CRITICAL - Packet loss = 100%
[08:43:36] PROBLEM - IPsec on cp1049 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:43:36] PROBLEM - IPsec on cp1074 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:43:36] PROBLEM - IPsec on cp1099 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:43:45] PROBLEM - IPsec on cp1061 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:43:45] PROBLEM - IPsec on kafka1020 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp4007_v4, cp4007_v6
[08:43:47] PROBLEM - IPsec on kafka1014 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp4007_v4, cp4007_v6
[08:44:06] PROBLEM - IPsec on cp1063 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:44:06] PROBLEM - IPsec on cp1073 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:44:06] PROBLEM - IPsec on cp1050 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:44:17] PROBLEM - IPsec on cp1062 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:44:26] PROBLEM - IPsec on kafka1013 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp4007_v4, cp4007_v6
[08:44:35] PROBLEM - IPsec on cp1048 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:44:36] PROBLEM - IPsec on cp1051 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
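
(Editor's note: the IPsec alerts above fire on every peer of cp4007 once the host goes down: each caching and Kafka host reports the cp4007 Strongswan tunnels as not connected. A rough sketch of such a check follows, assuming `ipsec status` output where established security associations appear by connection name with an ESTABLISHED state; the exact parsing, and counting connections rather than ESP child SAs as the real plugin does, are simplifications.)

```python
# Sketch of a Strongswan connectivity check in the spirit of
# "Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6".
import subprocess

EXPECTED_CONNS = ["cp4007_v4", "cp4007_v6"]  # connection names from the log

def check_ipsec(expected: list[str]) -> str:
    out = subprocess.run(["ipsec", "status"],
                         capture_output=True, text=True).stdout
    # Established SAs print as e.g. "cp4007_v4[57]: ESTABLISHED 2 hours ago, ...";
    # take the connection name before the bracketed SA number.
    established = {
        line.split("[")[0].strip()
        for line in out.splitlines()
        if "ESTABLISHED" in line
    }
    not_conn = [c for c in expected if c not in established]
    if not_conn:
        return f"Strongswan CRITICAL - ok: {len(established)} not-conn: {', '.join(not_conn)}"
    return f"Strongswan OK - {len(established)} ESP OK"
```
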
[08:44:55] PROBLEM - IPsec on cp1064 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:44:56] PROBLEM - IPsec on cp1072 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[08:44:56] PROBLEM - IPsec on kafka1012 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp4007_v4, cp4007_v6
[08:44:56] PROBLEM - IPsec on kafka1022 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp4007_v4, cp4007_v6
[08:44:56] PROBLEM - IPsec on kafka1018 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp4007_v4, cp4007_v6
[08:45:16] PROBLEM - IPsec on cp1071 is CRITICAL: Strongswan CRITICAL - ok: 58 not-conn: cp4007_v4, cp4007_v6
[09:15:01] cp4007 is unresponsive to ssh, ping, management console, trying to powercycle now
[09:15:16] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 662
[09:17:18] !log powercycling cp4007 (unresponsive to ssh, ping, serial console)
[09:17:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[09:19:16] RECOVERY - Host cp4007 is UP: PING OK - Packet loss = 0%, RTA = 79.72 ms
[09:19:26] RECOVERY - IPsec on cp1061 is OK: Strongswan OK - 60 ESP OK
[09:19:37] RECOVERY - IPsec on cp1071 is OK: Strongswan OK - 60 ESP OK
[09:19:38] RECOVERY - IPsec on cp1063 is OK: Strongswan OK - 60 ESP OK
[09:19:38] RECOVERY - IPsec on cp1062 is OK: Strongswan OK - 60 ESP OK
[09:19:56] RECOVERY - IPsec on kafka1013 is OK: Strongswan OK - 166 ESP OK
[09:19:56] RECOVERY - IPsec on cp1049 is OK: Strongswan OK - 60 ESP OK
[09:19:57] RECOVERY - IPsec on cp1074 is OK: Strongswan OK - 60 ESP OK
[09:19:57] RECOVERY - IPsec on cp1073 is OK: Strongswan OK - 60 ESP OK
[09:20:06] RECOVERY - IPsec on cp1048 is OK: Strongswan OK - 60 ESP OK
[09:20:17] RECOVERY - IPsec on kafka1014 is OK: Strongswan OK - 166 ESP OK
[09:20:17] RECOVERY - IPsec on kafka1020 is OK: Strongswan OK - 166 ESP OK
[09:20:17] RECOVERY - IPsec on cp1099 is OK: Strongswan OK - 60 ESP OK
[09:20:17] RECOVERY - IPsec on cp1051 is OK: Strongswan OK - 60 ESP OK
[09:20:27] RECOVERY - IPsec on cp1050 is OK: Strongswan OK - 60 ESP OK
[09:20:57] RECOVERY - IPsec on cp1064 is OK: Strongswan OK - 60 ESP OK
[09:21:06] RECOVERY - IPsec on cp1072 is OK: Strongswan OK - 60 ESP OK
[09:21:07] RECOVERY - IPsec on kafka1022 is OK: Strongswan OK - 166 ESP OK
[09:21:07] RECOVERY - IPsec on kafka1012 is OK: Strongswan OK - 166 ESP OK
[09:21:07] RECOVERY - IPsec on kafka1018 is OK: Strongswan OK - 166 ESP OK
[09:26:47] operations, Performance-Team: jobrunner memory leaks - https://phabricator.wikimedia.org/T122069#1903975 (jcrespo) I do not know if the test is over (but puppet is still disabled). If that is not the case, mw1015 has just frozen again right now (presumably because of OOM).
[09:27:28] PROBLEM - configured eth on mw1015 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:27:37] PROBLEM - dhclient process on mw1015 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:27:38] PROBLEM - nutcracker port on mw1015 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:27:47] PROBLEM - RAID on mw1015 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:27:57] PROBLEM - nutcracker process on mw1015 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:28:07] PROBLEM - Disk space on mw1015 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
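
(Editor's note: the db1008 check_mysql alerts above are driven by replica health from SHOW SLAVE STATUS. A minimal sketch of that side of the check follows, assuming the PyMySQL client and made-up thresholds; the Slave_IO_Running, Slave_SQL_Running and Seconds_Behind_Master columns are standard MySQL.)

```python
# Sketch of a slave-lag probe in the style of the
# "SLOW_SLAVE CRITICAL: ... Seconds Behind Master: 662" alerts above.
import pymysql

CRIT_LAG = 600  # assumed critical lag threshold, in seconds

def check_slave(host: str) -> str:
    conn = pymysql.connect(host=host, user="nagios", password="...",
                           cursorclass=pymysql.cursors.DictCursor)
    try:
        with conn.cursor() as cur:
            cur.execute("SHOW SLAVE STATUS")
            status = cur.fetchone()
    finally:
        conn.close()
    if status is None:
        return "OK: not a replica"
    io, sql = status["Slave_IO_Running"], status["Slave_SQL_Running"]
    lag = status["Seconds_Behind_Master"]  # None when replication is broken
    if io != "Yes" or sql != "Yes":
        return f"CRITICAL: Slave IO: {io} Slave SQL: {sql}"
    if lag is None or lag >= CRIT_LAG:
        return (f"SLOW_SLAVE CRITICAL: Slave IO: {io} Slave SQL: {sql} "
                f"Seconds Behind Master: {lag}")
    return f"OK: Seconds Behind Master: {lag}"
```
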
[09:28:17] PROBLEM - SSH on mw1015 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:28:36] PROBLEM - salt-minion processes on mw1015 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:28:57] PROBLEM - DPKG on mw1015 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:29:26] RECOVERY - configured eth on mw1015 is OK: OK - interfaces up
[09:29:27] RECOVERY - dhclient process on mw1015 is OK: PROCS OK: 0 processes with command name dhclient
[09:29:36] RECOVERY - nutcracker port on mw1015 is OK: TCP OK - 0.000 second response time on port 11212
[09:29:37] RECOVERY - RAID on mw1015 is OK: OK: no RAID installed
[09:29:56] RECOVERY - nutcracker process on mw1015 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[09:30:06] RECOVERY - Disk space on mw1015 is OK: DISK OK
[09:30:16] RECOVERY - check_mysql on db1008 is OK: Uptime: 320003 Threads: 119 Questions: 14916576 Slow queries: 4026 Opens: 20467 Flush tables: 2 Open tables: 409 Queries per second avg: 46.613 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[09:30:16] RECOVERY - SSH on mw1015 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.3 (protocol 2.0)
[09:30:28] RECOVERY - salt-minion processes on mw1015 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[09:30:57] RECOVERY - DPKG on mw1015 is OK: All packages OK
[09:45:19] PROBLEM - Host cp3010 is DOWN: PING CRITICAL - Packet loss = 100%
[09:48:20] PROBLEM - IPsec on kafka1018 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp3010_v4, cp3010_v6
[09:48:30] PROBLEM - IPsec on cp1053 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp3010_v4, cp3010_v6
[09:48:50] PROBLEM - IPsec on cp1066 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp3010_v4, cp3010_v6
[09:48:51] PROBLEM - IPsec on cp1054 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp3010_v4, cp3010_v6
[09:49:01] PROBLEM - IPsec on cp1052 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp3010_v4, cp3010_v6
[09:49:01] PROBLEM - IPsec on cp1065 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp3010_v4, cp3010_v6
[09:49:10] PROBLEM - IPsec on kafka1013 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp3010_v4, cp3010_v6
[09:50:13] PROBLEM - IPsec on cp1067 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp3010_v4, cp3010_v6
[09:50:14] PROBLEM - IPsec on kafka1022 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp3010_v4, cp3010_v6
[09:50:14] PROBLEM - IPsec on kafka1012 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp3010_v4, cp3010_v6
[09:50:15] PROBLEM - IPsec on kafka1014 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp3010_v4, cp3010_v6
[09:50:15] PROBLEM - IPsec on kafka1020 is CRITICAL: Strongswan CRITICAL - ok: 164 not-conn: cp3010_v4, cp3010_v6
[09:50:29] PROBLEM - IPsec on cp1068 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp3010_v4, cp3010_v6
[09:50:30] PROBLEM - IPsec on cp1055 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp3010_v4, cp3010_v6
[09:57:31] PROBLEM - NTP on cp4007 is CRITICAL: NTP CRITICAL: Offset unknown
[10:25:17] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 682
[10:31:24] (CR) Muehlenhoff: "Puppet compiler bails on this, but I'm not sure why (it looks ok to me)?" [puppet] - https://gerrit.wikimedia.org/r/260926 (https://phabricator.wikimedia.org/T122396) (owner: Faidon Liambotis)
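
(Editor's note: the "NTP CRITICAL: Offset unknown" on cp4007 above is expected right after a power cycle; ntpd has not resynced yet, so the plugin cannot compute an offset, and the check recovers at 14:50 with an offset of about 44 ms. Below is a sketch of an offset probe using the third-party ntplib package, which is an assumption; judging by the message format the production plugin is a stock check_ntp-style check.)

```python
# Sketch of an NTP offset probe like the "NTP on cp4007" check above.
import ntplib

MAX_OFFSET = 0.5  # assumed critical offset, in seconds

def check_ntp(host: str) -> str:
    try:
        response = ntplib.NTPClient().request(host, version=3, timeout=5)
    except (ntplib.NTPException, OSError):
        # No answer or no sync yet: the plugin cannot compute an offset.
        return "NTP CRITICAL: Offset unknown"
    offset = response.offset  # local clock offset vs. the server, in seconds
    if abs(offset) > MAX_OFFSET:
        return f"NTP CRITICAL: Offset {offset} secs"
    return f"NTP OK: Offset {offset} secs"
```
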
[10:50:14] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 699
[10:51:14] !log powercycle cp3010
[10:51:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[10:53:54] RECOVERY - Host cp3010 is UP: PING OK - Packet loss = 0%, RTA = 89.32 ms
[10:54:14] RECOVERY - IPsec on cp1055 is OK: Strongswan OK - 58 ESP OK
[10:54:14] RECOVERY - IPsec on cp1068 is OK: Strongswan OK - 58 ESP OK
[10:54:25] RECOVERY - IPsec on kafka1020 is OK: Strongswan OK - 166 ESP OK
[10:54:35] RECOVERY - IPsec on cp1065 is OK: Strongswan OK - 58 ESP OK
[10:54:35] RECOVERY - IPsec on kafka1012 is OK: Strongswan OK - 166 ESP OK
[10:54:45] RECOVERY - IPsec on cp1054 is OK: Strongswan OK - 58 ESP OK
[10:54:55] RECOVERY - IPsec on kafka1022 is OK: Strongswan OK - 166 ESP OK
[10:55:04] RECOVERY - IPsec on kafka1013 is OK: Strongswan OK - 166 ESP OK
[10:55:05] RECOVERY - IPsec on cp1053 is OK: Strongswan OK - 58 ESP OK
[10:55:05] RECOVERY - IPsec on kafka1018 is OK: Strongswan OK - 166 ESP OK
[10:55:06] RECOVERY - IPsec on cp1066 is OK: Strongswan OK - 58 ESP OK
[10:55:14] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 875
[10:55:24] RECOVERY - IPsec on cp1052 is OK: Strongswan OK - 58 ESP OK
[10:55:35] RECOVERY - IPsec on cp1067 is OK: Strongswan OK - 58 ESP OK
[10:55:45] RECOVERY - IPsec on kafka1014 is OK: Strongswan OK - 166 ESP OK
[11:00:14] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1176
[11:05:08] operations, DBA, Tracking: Migrate MySQLs to use ROW-based replication (tracking) - https://phabricator.wikimedia.org/T109179#1904049 (jcrespo)
[11:05:13] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1476
[11:10:13] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 770
[11:20:13] RECOVERY - check_mysql on db1008 is OK: Uptime: 326602 Threads: 117 Questions: 15035994 Slow queries: 4207 Opens: 21586 Flush tables: 2 Open tables: 410 Queries per second avg: 46.037 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[13:19:20] !log setting db2018's binlog_format as MIXED
[13:19:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[13:22:20] operations, DBA, MediaWiki-Special-pages, Performance: Batch update of special pages creates slave lag on s3 over WAN - https://phabricator.wikimedia.org/T122429#1904079 (jcrespo) I cannot say for sure if it is the Special pages or wbc_entity_usage updates, one of the two: {F3146353}
[13:22:46] operations, DBA, MediaWiki-Special-pages, Performance: Batch update of special pages creates slave lag on s3 over WAN - https://phabricator.wikimedia.org/T122429#1904081 (jcrespo) Setting db2018 as MIXED temporarily to see if that helps.
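
(Editor's note: jcrespo's switch of db2018 to binlog_format=MIXED targets the slave lag tracked in T122429: with ROW replication every row touched by a batch update is written to the binlog and shipped over the WAN to the remote datacenter, while MIXED falls back to statement-based logging where that is safe, shrinking the replication stream. A sketch of the change via PyMySQL follows; the client and credentials are assumptions, since the !log entry does not say how it was applied.)

```python
# Sketch of the binlog_format switch applied to db2018.
import pymysql

def set_binlog_format(host: str, fmt: str = "MIXED") -> str:
    assert fmt in ("ROW", "MIXED", "STATEMENT")
    conn = pymysql.connect(host=host, user="root", password="...")
    try:
        with conn.cursor() as cur:
            # ROW logs every changed row; batch updates touching millions of
            # rows can generate gigabytes of binlog that must cross the WAN.
            # MIXED logs the statement itself where that is deterministic.
            cur.execute("SET GLOBAL binlog_format = %s", (fmt,))
            cur.execute("SELECT @@GLOBAL.binlog_format")
            return cur.fetchone()[0]  # verify the new value
    finally:
        conn.close()
```

Note that SET GLOBAL only affects sessions started afterwards, which is one reason such a change is labeled temporary and verified against the ongoing lag graphs.
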
[13:27:13] operations, DBA, MediaWiki-Special-pages, Performance: Batch updated create slave lag on s3 over WAN - https://phabricator.wikimedia.org/T122429#1904082 (jcrespo)
[13:32:34] operations, DBA, MediaWiki-Special-pages, Performance: Batch updates create slave lag on s3 over WAN - https://phabricator.wikimedia.org/T122429#1904087 (jcrespo)
[13:32:43] (CR) Faidon Liambotis: [C: -1] "Good catch. I know why: inline_template() returns a string, not an array, so "each" in the erb fails. Either I need to convert that into p" [puppet] - https://gerrit.wikimedia.org/r/260926 (https://phabricator.wikimedia.org/T122396) (owner: Faidon Liambotis)
[14:09:17] PROBLEM - puppet last run on mw1043 is CRITICAL: CRITICAL: Puppet has 1 failures
[14:34:10] RECOVERY - puppet last run on mw1043 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[14:50:30] RECOVERY - NTP on cp4007 is OK: NTP OK: Offset 0.04382479191 secs
[15:50:17] !log testing new mariadb packages on db2070
[15:50:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[15:51:15] Merry Christmas, jynus
[15:51:38] merry christmas, Coren
[15:52:26] and a prosperous New Year
[16:57:40] (PS1) Luke081515: dewikibooks: Set $wgRestrictDisplayTitle to false [mediawiki-config] - https://gerrit.wikimedia.org/r/260964 (https://phabricator.wikimedia.org/T122433)
[20:09:11] PROBLEM - puppet last run on cp1067 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:35:52] RECOVERY - puppet last run on cp1067 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[21:24:02] PROBLEM - Disk space on restbase1003 is CRITICAL: DISK CRITICAL - free space: /var 111195 MB (3% inode=99%)
[22:31:59] Any phab admin here? It's urgent
[22:33:10] Can solve it on my own
[23:03:27] PROBLEM - puppet last run on mw2146 is CRITICAL: CRITICAL: Puppet has 1 failures
[23:29:06] RECOVERY - puppet last run on mw2146 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[23:56:08] PROBLEM - puppet last run on mw2107 is CRITICAL: CRITICAL: Puppet has 1 failures
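
(Editor's note: Faidon's [C: -1] self-review above pinpoints why Muehlenhoff's puppet-compiler run failed: inline_template() returns a string, not an array, so iterating it with each inside the ERB fails; his list of fix options is cut off mid-sentence in the log. The same pitfall, sketched here in Python terms with hypothetical names since the actual change is Puppet/ERB:)

```python
# A helper renders a comma-joined *string*; the caller iterates it
# expecting list elements and gets single characters instead.
def render_hosts(hosts: list[str]) -> str:
    # Analogous to inline_template(): the return type is str, not list.
    return ",".join(hosts)

rendered = render_hosts(["cp1052", "cp1053", "cp1054"])

for item in rendered:
    pass  # iterates "c", "p", "1", ... one character at a time: the bug

# The fix mirrors the Puppet-side options: split the string back into a
# list before iterating, or never flatten it to a string in the first place.
for host in rendered.split(","):
    pass  # "cp1052", "cp1053", "cp1054" as intended
```
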