[02:18:28] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1536s
[02:24:08] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1876s
[02:28:18] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s
[02:33:58] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s
[02:46:38] PROBLEM - Puppet freshness on brewster is CRITICAL: Puppet has not run in the last 10 hours
[02:53:38] PROBLEM - Puppet freshness on sodium is CRITICAL: Puppet has not run in the last 10 hours
[04:15:47] PROBLEM - Disk space on srv219 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=60%): /var/lib/ureadahead/debugfs 0 MB (0% inode=60%):
[04:18:57] PROBLEM - Disk space on srv220 is CRITICAL: DISK CRITICAL - free space: / 111 MB (1% inode=60%): /var/lib/ureadahead/debugfs 111 MB (1% inode=60%):
[04:20:37] RECOVERY - Disk space on es1004 is OK: DISK OK
[04:20:57] RECOVERY - MySQL disk space on es1004 is OK: DISK OK
[04:44:17] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No
[04:45:37] RECOVERY - Disk space on srv219 is OK: DISK OK
[04:48:57] RECOVERY - Disk space on srv220 is OK: DISK OK
[09:11:56] PROBLEM - Disk space on srv223 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=60%): /var/lib/ureadahead/debugfs 0 MB (0% inode=60%):
[09:22:09] RECOVERY - Disk space on srv223 is OK: DISK OK
[09:54:20] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 456098 MB (3% inode=99%):
[09:59:56] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 432021 MB (3% inode=99%):
[10:32:34] RECOVERY - MySQL slave status on es1004 is OK: OK:
[12:07:45] New patchset: Hashar; "testswarm: fix mw.org hostname" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1923
[12:38:36] !log Running puppet on freshly installed sodium
[12:38:37] Logged the message, Master
[12:38:43] RECOVERY - Puppet freshness on sodium is OK: puppet ran at Mon Jan 16 12:38:25 UTC 2012
[12:39:43] RECOVERY - HTTP on sodium is OK: HTTP OK HTTP/1.0 200 OK - 3827 bytes in 0.116 seconds
[12:40:05] err: /Stage[main]/Nrpe::Packages/File[/etc/nagios/nrpe_local.cfg]/ensure: change from absent to file failed: Could not set 'file on ensure: No such file or directory - /etc/nagios/nrpe_local.cfg.puppettmp_2766 at /var/lib/git/operations/puppet/manifests/nrpe.pp:20
[12:45:21] err: /Stage[main]/Nrpe::Service/Service[nagios-nrpe-server]: Failed to call refresh: Could not stop Service[nagios-nrpe-server]: Execution of '/etc/init.d/nagios-nrpe-server stop' returned 1: at /var/lib/git/operations/puppet/manifests/nrpe.pp:29
[12:50:15] New patchset: Mark Bergsma; "Fix nrpe.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1924
[12:50:35] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1924
[12:50:36] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1924
[12:57:07] PROBLEM - Puppet freshness on brewster is CRITICAL: Puppet has not run in the last 10 hours
[13:00:50] RECOVERY - mailman on sodium is OK: PROCS OK: 10 processes with args mailman
[13:02:36] RECOVERY - spamassassin on sodium is OK: PROCS OK: 4 processes with args spamd
[13:07:35] RECOVERY - DPKG on sodium is OK: All packages OK
[13:16:15] New patchset: Mark Bergsma; "Disable grub2's hidden timeout, so we'll know to press ESC (RT#1914)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1925
[13:16:30] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1925
[13:18:15] RECOVERY - RAID on sodium is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[13:19:55] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1925
[13:19:55] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1925
[13:21:11] RECOVERY - HTTPS on sodium is OK: OK - Certificate will expire on 08/22/2015 22:23.
[13:23:36] PROBLEM - RAID on nfs1 is CRITICAL: Connection refused by host
[13:23:39] PROBLEM - Disk space on mw1084 is CRITICAL: Connection refused by host
[13:23:47] PROBLEM - RAID on db1015 is CRITICAL: Connection refused by host
[13:23:48] PROBLEM - Disk space on mw18 is CRITICAL: Connection refused by host
[13:23:55] PROBLEM - Disk space on mw1092 is CRITICAL: Connection refused by host
[13:23:55] PROBLEM - Disk space on mw1011 is CRITICAL: Connection refused by host
[13:24:43] PROBLEM - DPKG on db1044 is CRITICAL: Connection refused by host
[13:24:43] PROBLEM - RAID on db1044 is CRITICAL: Connection refused by host
[13:25:23] PROBLEM - Disk space on mw1059 is CRITICAL: Connection refused by host
[13:26:03] PROBLEM - Disk space on db1022 is CRITICAL: Connection refused by host
[13:26:03] PROBLEM - DPKG on srv257 is CRITICAL: Connection refused by host
[13:26:03] PROBLEM - Disk space on db1016 is CRITICAL: Connection refused by host
[13:26:13] PROBLEM - MySQL disk space on db1015 is CRITICAL: Connection refused by host
[13:26:23] PROBLEM - Disk space on locke is CRITICAL: Connection refused by host
[13:26:45] PROBLEM - Disk space on db1044 is CRITICAL: Connection refused by host
[13:27:03] PROBLEM - DPKG on mw67 is CRITICAL: Connection refused by host
[13:27:13] PROBLEM - DPKG on mw1134 is CRITICAL: Connection refused by host
[13:28:15] PROBLEM - DPKG on srv222 is CRITICAL: Connection refused by host
[13:28:38] PROBLEM - Disk space on mw1101 is CRITICAL: Connection refused by host
[13:28:38] PROBLEM - Disk space on mw1134 is CRITICAL: Connection refused by host
[13:29:04] PROBLEM - Disk space on virt2 is CRITICAL: Connection refused by host
[13:29:23] PROBLEM - RAID on mw67 is CRITICAL: Connection refused by host
[13:29:23] PROBLEM - DPKG on mw1101 is CRITICAL: Connection refused by host
[13:31:11] PROBLEM - Disk space on db1041 is CRITICAL: Connection refused by host
[13:31:43] PROBLEM - Disk space on ms1002 is CRITICAL: Connection refused by host
[13:31:53] PROBLEM - RAID on db51 is CRITICAL: Connection refused by host
[13:32:15] PROBLEM - RAID on srv225 is CRITICAL: Connection refused by host
[13:32:25] PROBLEM - mobile traffic loggers on cp1044 is CRITICAL: Connection refused by host
[13:32:25] RECOVERY - RAID on srv278 is OK: OK: no RAID installed
[13:32:47] PROBLEM - RAID on srv227 is CRITICAL: Connection refused by host
[13:32:56] PROBLEM - RAID on cp1042 is CRITICAL: Connection refused by host
[13:33:16] PROBLEM - DPKG on mw1044 is CRITICAL: Connection refused by host
[13:33:16] PROBLEM - RAID on mw1104 is CRITICAL: Connection refused by host
[13:33:16] PROBLEM - DPKG on mw1073 is CRITICAL: Connection refused by host
[13:33:16] PROBLEM - DPKG on es1002 is CRITICAL: Connection refused by host
[13:33:26] PROBLEM - DPKG on srv227 is CRITICAL: Connection refused by host
[13:33:26] PROBLEM - RAID on mw1129 is CRITICAL: Connection refused by host
[13:34:52] PROBLEM - Disk space on mw69 is CRITICAL: Connection refused by host
[13:35:39] PROBLEM - DPKG on snapshot2 is CRITICAL: Connection refused by host
[13:35:57] PROBLEM - Disk space on db1048 is CRITICAL: Connection refused by host
[13:35:57] PROBLEM - RAID on srv253 is CRITICAL: Connection refused by host
[13:35:58] RECOVERY - RAID on nfs1 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[13:36:15] RECOVERY - Disk space on mw1084 is OK: DISK OK
[13:36:15] RECOVERY - RAID on db1015 is OK: OK: State is Optimal, checked 2 logical device(s)
[13:36:29] PROBLEM - DPKG on db11 is CRITICAL: Connection refused by host
[13:36:29] RECOVERY - Disk space on mw1092 is OK: DISK OK
[13:36:29] RECOVERY - Disk space on mw1011 is OK: DISK OK
[13:36:40] RECOVERY - Disk space on mw18 is OK: DISK OK
[13:36:49] PROBLEM - Disk space on aluminium is CRITICAL: Connection refused by host
[13:36:49] RECOVERY - DPKG on db1044 is OK: All packages OK
[13:36:49] PROBLEM - RAID on srv200 is CRITICAL: Connection refused by host
[13:36:49] RECOVERY - RAID on db1044 is OK: OK: State is Optimal, checked 2 logical device(s)
[13:37:19] PROBLEM - RAID on db1040 is CRITICAL: Connection refused by host
[13:37:19] PROBLEM - MySQL disk space on db1020 is CRITICAL: Connection refused by host
[13:37:39] PROBLEM - DPKG on virt3 is CRITICAL: Connection refused by host
[13:37:39] RECOVERY - Disk space on mw1059 is OK: DISK OK
[13:38:01] RECOVERY - Disk space on db1022 is OK: DISK OK
[13:38:11] PROBLEM - DPKG on db1040 is CRITICAL: Connection refused by host
[13:38:11] RECOVERY - DPKG on srv257 is OK: All packages OK
[13:38:23] RECOVERY - Disk space on db1016 is OK: DISK OK
[13:38:33] RECOVERY - Disk space on locke is OK: DISK OK
[13:38:41] RECOVERY - MySQL disk space on db1015 is OK: DISK OK
[13:38:50] RECOVERY - Disk space on db1044 is OK: DISK OK
[13:39:10] RECOVERY - DPKG on mw67 is OK: All packages OK
[13:39:10] PROBLEM - DPKG on srv285 is CRITICAL: Connection refused by host
[13:39:19] RECOVERY - DPKG on mw1134 is OK: All packages OK
[13:39:42] PROBLEM - DPKG on mw70 is CRITICAL: Connection refused by host
[13:39:45] PROBLEM - mobile traffic loggers on cp1043 is CRITICAL: Connection refused by host
[13:40:00] PROBLEM - DPKG on mw1126 is CRITICAL: Connection refused by host
[13:40:00] PROBLEM - DPKG on db34 is CRITICAL: Connection refused by host
[13:40:19] PROBLEM - RAID on cp1043 is CRITICAL: Connection refused by host
[13:40:31] PROBLEM - RAID on mw1035 is CRITICAL: Connection refused by host
[13:40:31] PROBLEM - Disk space on virt3 is CRITICAL: Connection refused by host
[13:40:31] RECOVERY - DPKG on srv222 is OK: All packages OK
[13:40:31] PROBLEM - RAID on mw1029 is CRITICAL: Connection refused by host
[13:40:53] PROBLEM - DPKG on srv251 is CRITICAL: Connection refused by host
[13:40:53] RECOVERY - Disk space on mw1134 is OK: DISK OK
[13:40:53] RECOVERY - Disk space on mw1101 is OK: DISK OK
[13:41:11] RECOVERY - Disk space on virt2 is OK: DISK OK
[13:41:41] PROBLEM - RAID on mw1015 is CRITICAL: Connection refused by host
[13:41:41] RECOVERY - RAID on mw67 is OK: OK: no RAID installed
[13:42:11] RECOVERY - DPKG on mw1101 is OK: All packages OK
[13:42:24] PROBLEM - RAID on db1019 is CRITICAL: Connection refused by host
[13:43:04] PROBLEM - Disk space on mw8 is CRITICAL: Connection refused by host
[13:43:04] PROBLEM - Disk space on srv246 is CRITICAL: Connection refused by host
[13:43:14] PROBLEM - DPKG on srv215 is CRITICAL: Connection refused by host
[13:43:14] PROBLEM - RAID on db45 is CRITICAL: Connection refused by host
[13:43:14] PROBLEM - RAID on mw1067 is CRITICAL: Connection refused by host
[13:43:14] PROBLEM - Disk space on mw1159 is CRITICAL: Connection refused by host
[13:43:24] PROBLEM - DPKG on mw1118 is CRITICAL: Connection refused by host
[13:43:24] PROBLEM - MySQL disk space on db1033 is CRITICAL: Connection refused by host
[13:43:24] RECOVERY - Disk space on db1041 is OK: DISK OK
[13:43:24] PROBLEM - DPKG on es2 is CRITICAL: Connection refused by host
[13:43:37] PROBLEM - Disk space on db45 is CRITICAL: Connection refused by host
[13:44:07] RECOVERY - RAID on db51 is OK: OK: State is Optimal, checked 2 logical device(s)
[13:44:25] PROBLEM - Disk space on es2 is CRITICAL: Connection refused by host
[13:44:25] PROBLEM - DPKG on stafford is CRITICAL: Connection refused by host
[13:44:35] RECOVERY - RAID on srv225 is OK: OK: no RAID installed
[13:44:54] PROBLEM - DPKG on db25 is CRITICAL: Connection refused by host
[13:45:13] RECOVERY - RAID on srv227 is OK: OK: no RAID installed
[13:45:25] PROBLEM - RAID on srv262 is CRITICAL: Connection refused by host
[13:45:25] PROBLEM - RAID on db1003 is CRITICAL: Connection refused by host
[13:45:25] PROBLEM - DPKG on mw1072 is CRITICAL: Connection refused by host
[13:45:25] PROBLEM - DPKG on mw1042 is CRITICAL: Connection refused by host
[13:45:38] RECOVERY - RAID on cp1042 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[13:45:56] RECOVERY - DPKG on mw1044 is OK: All packages OK
[13:45:56] RECOVERY - DPKG on mw1073 is OK: All packages OK
[13:45:56] RECOVERY - RAID on mw1104 is OK: OK: no RAID installed
[13:45:56] RECOVERY - DPKG on es1002 is OK: All packages OK
[13:46:05] RECOVERY - DPKG on srv227 is OK: All packages OK
[13:46:06] RECOVERY - RAID on mw1129 is OK: OK: no RAID installed
[13:46:44] PROBLEM - RAID on srv221 is CRITICAL: Connection refused by host
[13:47:05] PROBLEM - RAID on mw1158 is CRITICAL: Connection refused by host
[13:47:14] RECOVERY - Disk space on mw69 is OK: DISK OK
[13:47:36] RECOVERY - DPKG on snapshot2 is OK: All packages OK
[13:47:54] PROBLEM - RAID on srv214 is CRITICAL: Connection refused by host
[13:47:55] RECOVERY - Disk space on db1048 is OK: DISK OK
[13:47:55] PROBLEM - Disk space on srv187 is CRITICAL: Connection refused by host
[13:47:55] RECOVERY - RAID on srv253 is OK: OK: no RAID installed
[13:48:04] PROBLEM - Disk space on mw1090 is CRITICAL: Connection refused by host
[13:48:04] PROBLEM - MySQL disk space on db1035 is CRITICAL: Connection refused by host
[13:48:14] RECOVERY - DPKG on db11 is OK: All packages OK
[13:48:14] PROBLEM - Disk space on mw1070 is CRITICAL: Connection refused by host
[13:48:24] RECOVERY - RAID on srv200 is OK: OK: no RAID installed
[13:48:24] RECOVERY - Disk space on aluminium is OK: DISK OK
[13:48:54] PROBLEM - Disk space on mw1075 is CRITICAL: Connection refused by host
[13:49:06] RECOVERY - MySQL disk space on db1020 is OK: DISK OK
[13:49:08] PROBLEM - DPKG on srv193 is CRITICAL: Connection refused by host
[13:49:15] PROBLEM - DPKG on srv220 is CRITICAL: Connection refused by host
[13:49:15] RECOVERY - RAID on db1040 is OK: OK: State is Optimal, checked 2 logical device(s)
[13:49:16] PROBLEM - Disk space on mw1060 is CRITICAL: Connection refused by host
[13:49:25] RECOVERY - DPKG on virt3 is OK: All packages OK
[13:49:25] PROBLEM - RAID on srv276 is CRITICAL: Connection refused by host
[13:49:34] PROBLEM - DPKG on srv276 is CRITICAL: Connection refused by host
[13:49:34] PROBLEM - DPKG on db1002 is CRITICAL: Connection refused by host
[13:49:34] PROBLEM - MySQL disk space on db1038 is CRITICAL: Connection refused by host
[13:49:34] PROBLEM - Disk space on srv220 is CRITICAL: Connection refused by host
[13:49:34] PROBLEM - Disk space on srv294 is CRITICAL: Connection refused by host
[13:49:35] PROBLEM - DPKG on srv243 is CRITICAL: Connection refused by host
[13:50:55] RECOVERY - DPKG on db1040 is OK: All packages OK
[13:50:56] PROBLEM - DPKG on db1004 is CRITICAL: Connection refused by host
[13:51:05] PROBLEM - RAID on ms1002 is CRITICAL: Connection refused by host
[13:51:47] PROBLEM - DPKG on db1026 is CRITICAL: Connection refused by host
[13:51:47] PROBLEM - Disk space on mw1048 is CRITICAL: Connection refused by host
[13:51:58] PROBLEM - RAID on emery is CRITICAL: Connection refused by host
[13:51:58] PROBLEM - MySQL disk space on db1002 is CRITICAL: Connection refused by host
[13:52:16] RECOVERY - DPKG on srv285 is OK: All packages OK
[13:52:22] PROBLEM - DPKG on mw1147 is CRITICAL: Connection refused by host
[13:52:39] PROBLEM - Disk space on db1026 is CRITICAL: Connection refused by host
[13:52:49] RECOVERY - mobile traffic loggers on cp1043 is OK: PROCS OK: 2 processes with command name varnishncsa
[13:52:49] RECOVERY - DPKG on mw70 is OK: All packages OK
[13:53:01] RECOVERY - DPKG on mw1126 is OK: All packages OK
[13:53:04] RECOVERY - DPKG on db34 is OK: All packages OK
[13:53:18] RECOVERY - RAID on cp1043 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[13:53:18] RECOVERY - RAID on mw1035 is OK: OK: no RAID installed
[13:53:18] RECOVERY - Disk space on virt3 is OK: DISK OK
[13:53:18] RECOVERY - RAID on mw1029 is OK: OK: no RAID installed
[13:53:28] RECOVERY - DPKG on srv251 is OK: All packages OK
[13:53:38] PROBLEM - DPKG on mw1141 is CRITICAL: Connection refused by host
[13:53:48] PROBLEM - Disk space on srv243 is CRITICAL: Connection refused by host
[13:54:08] PROBLEM - RAID on db1006 is CRITICAL: Connection refused by host
[13:54:08] RECOVERY - RAID on mw1015 is OK: OK: no RAID installed
[13:54:48] PROBLEM - MySQL disk space on db1026 is CRITICAL: Connection refused by host
[13:54:48] PROBLEM - Disk space on mw1147 is CRITICAL: Connection refused by host
[13:54:58] PROBLEM - Disk space on mw1141 is CRITICAL: Connection refused by host
[13:55:19] PROBLEM - Disk space on db1017 is CRITICAL: Connection refused by host
[13:55:21] RECOVERY - RAID on db1019 is OK: OK: State is Optimal, checked 2 logical device(s)
[13:55:21] RECOVERY - Disk space on mw8 is OK: DISK OK
[13:55:21] RECOVERY - Disk space on srv246 is OK: DISK OK
[13:55:21] PROBLEM - Disk space on srv193 is CRITICAL: Connection refused by host
[13:55:43] PROBLEM - Disk space on emery is CRITICAL: Connection refused by host
[13:55:43] RECOVERY - DPKG on srv215 is OK: All packages OK
[13:55:43] RECOVERY - RAID on db45 is OK: OK: State is Optimal, checked 2 logical device(s)
[13:55:43] RECOVERY - Disk space on mw1159 is OK: DISK OK
[13:55:43] RECOVERY - RAID on mw1067 is OK: OK: no RAID installed
[13:55:43] RECOVERY - DPKG on mw1118 is OK: All packages OK
[13:55:55] RECOVERY - MySQL disk space on db1033 is OK: DISK OK
[13:55:55] RECOVERY - DPKG on es2 is OK: All packages OK
[13:56:24] RECOVERY - Disk space on db45 is OK: DISK OK
[13:57:01] RECOVERY - DPKG on stafford is OK: All packages OK
[13:57:02] RECOVERY - Disk space on es2 is OK: DISK OK
[13:57:21] RECOVERY - DPKG on db25 is OK: All packages OK
[13:57:41] RECOVERY - RAID on srv262 is OK: OK: no RAID installed
[13:57:41] RECOVERY - DPKG on mw1072 is OK: All packages OK
[13:57:41] RECOVERY - RAID on db1003 is OK: OK: State is Optimal, checked 2 logical device(s)
[13:57:52] RECOVERY - DPKG on mw1042 is OK: All packages OK
[13:59:01] RECOVERY - RAID on srv221 is OK: OK: no RAID installed
[13:59:31] RECOVERY - RAID on mw1158 is OK: OK: no RAID installed
[14:00:01] RECOVERY - RAID on srv214 is OK: OK: no RAID installed
[14:00:13] RECOVERY - Disk space on srv187 is OK: DISK OK
[14:00:14] RECOVERY - Disk space on mw1090 is OK: DISK OK
[14:00:32] RECOVERY - MySQL disk space on db1035 is OK: DISK OK
[14:00:46] RECOVERY - Disk space on mw1070 is OK: DISK OK
[14:01:31] RECOVERY - Disk space on mw1075 is OK: DISK OK
[14:01:51] RECOVERY - DPKG on srv193 is OK: All packages OK
[14:01:51] RECOVERY - DPKG on srv220 is OK: All packages OK
[14:02:11] RECOVERY - RAID on srv276 is OK: OK: no RAID installed
[14:02:11] RECOVERY - DPKG on srv276 is OK: All packages OK
[14:02:11] RECOVERY - DPKG on db1002 is OK: All packages OK
[14:02:11] RECOVERY - MySQL disk space on db1038 is OK: DISK OK
[14:02:11] RECOVERY - Disk space on srv220 is OK: DISK OK
[14:02:11] RECOVERY - Disk space on srv294 is OK: DISK OK
[14:02:21] RECOVERY - DPKG on srv243 is OK: All packages OK
[14:02:33] RECOVERY - Disk space on mw1060 is OK: DISK OK
[14:02:44] RECOVERY - DPKG on db1004 is OK: All packages OK
[14:02:52] RECOVERY - RAID on ms1002 is OK: OK: State is Optimal, checked 2 logical device(s)
[14:03:23] RECOVERY - Disk space on mw1048 is OK: DISK OK
[14:03:23] RECOVERY - DPKG on db1026 is OK: All packages OK
[14:03:23] RECOVERY - RAID on emery is OK: OK: Active: 2, Working: 2, Failed: 0, Spare: 0
[14:03:23] RECOVERY - MySQL disk space on db1002 is OK: DISK OK
[14:03:41] RECOVERY - DPKG on mw1147 is OK: All packages OK
[14:04:01] RECOVERY - Disk space on db1026 is OK: DISK OK
[14:05:13] RECOVERY - DPKG on mw1141 is OK: All packages OK
[14:05:15] RECOVERY - Disk space on srv243 is OK: DISK OK
[14:05:42] RECOVERY - RAID on db1006 is OK: OK: State is Optimal, checked 2 logical device(s)
[14:06:11] RECOVERY - Disk space on mw1147 is OK: DISK OK
[14:06:12] RECOVERY - Disk space on mw1141 is OK: DISK OK
[14:06:22] RECOVERY - MySQL disk space on db1026 is OK: DISK OK
[14:06:22] RECOVERY - Disk space on db1017 is OK: DISK OK
[14:06:36] RECOVERY - Disk space on srv193 is OK: DISK OK
[14:07:03] RECOVERY - Disk space on emery is OK: DISK OK
[14:35:25] New review: Dzahn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/1923
[14:35:26] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1923
[15:55:37] !log shutting down srv199 for main board replacement
[15:55:39] Logged the message, Master
[15:56:05] RECOVERY - Squid on brewster is OK: TCP OK - 0.010 second response time on port 8080
[16:31:05] PROBLEM - Disk space on srv219 is CRITICAL: DISK CRITICAL - free space: / 271 MB (3% inode=60%): /var/lib/ureadahead/debugfs 271 MB (3% inode=60%):
[16:41:05] RECOVERY - Disk space on srv219 is OK: DISK OK
[17:02:20] PROBLEM - Disk space on srv222 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=60%): /var/lib/ureadahead/debugfs 0 MB (0% inode=60%):
[17:03:21] RECOVERY - Disk space on srv222 is OK: DISK OK
[21:43:31] New patchset: Hashar; "testswarm: job to wipe clients idling" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1926
[23:06:56] PROBLEM - Puppet freshness on brewster is CRITICAL: Puppet has not run in the last 10 hours