[01:54:46] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours
[01:54:46] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours
[02:21:56] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1610s
[02:25:26] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1820s
[02:44:56] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 17s
[03:00:16] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 5s
[03:14:16] PROBLEM - Disk space on srv221 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=64%): /var/lib/ureadahead/debugfs 0 MB (0% inode=64%):
[03:25:56] RECOVERY - Disk space on srv221 is OK: DISK OK
[07:07:53] RECOVERY - mysqld processes on db35 is OK: PROCS OK: 1 process with command name mysqld
[07:21:23] PROBLEM - MySQL Slave Delay on db35 is CRITICAL: CRIT replication delay 94001 seconds
[07:27:53] PROBLEM - LVS Lucene on search-pool2.svc.pmtpa.wmnet is CRITICAL: Connection timed out
[07:58:59] RECOVERY - LVS Lucene on search-pool2.svc.pmtpa.wmnet is OK: TCP OK - 0.001 second response time on port 8123
[08:17:39] PROBLEM - Lucene on search6 is CRITICAL: Connection timed out
[08:35:39] PROBLEM - LVS Lucene on search-pool2.svc.pmtpa.wmnet is CRITICAL: Connection timed out
[09:09:49] RECOVERY - MySQL Slave Delay on db35 is OK: OK replication delay 0 seconds
[12:06:19] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours
[12:06:19] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours
[12:48:02] PROBLEM - LVS Lucene on search-pool2.svc.pmtpa.wmnet is CRITICAL: Connection timed out
[12:55:00] hi. re LVS Lucene pages: i restarted lsearchd on search6, last time that stopped it.. but besides that..hmm..shrug
[12:55:22] RECOVERY - Lucene on search6 is OK: TCP OK - 0.002 second response time on port 8123
[13:10:50] !log stopped and started lsearchd once again on search6
[13:10:52] Logged the message, Master
[13:13:02] RECOVERY - LVS Lucene on search-pool2.svc.pmtpa.wmnet is OK: TCP OK - 0.002 second response time on port 8123
[13:13:12] yay
[13:49:56] PROBLEM - LVS Lucene on search-pool2.svc.pmtpa.wmnet is CRITICAL: Connection timed out
[14:01:46] RECOVERY - LVS Lucene on search-pool2.svc.pmtpa.wmnet is OK: TCP OK - 0.003 second response time on port 8123
[14:07:38] New review: Dzahn; "did not look at the details yet, but +1 for sudoers.d. " [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2283
[15:39:14] PROBLEM - LVS Lucene on search-pool2.svc.pmtpa.wmnet is CRITICAL: Connection timed out
[15:51:40] mutante: are you around?
[15:51:47] I see you kicked search earlier today
[15:56:07] RECOVERY - LVS Lucene on search-pool2.svc.pmtpa.wmnet is OK: TCP OK - 0.003 second response time on port 8123
[16:34:07] PROBLEM - LVS Lucene on search-pool2.svc.pmtpa.wmnet is CRITICAL: Connection timed out
[16:38:49] !log running alter table add column on db1017 enwiki.revision to benchmark
[16:38:50] Logged the message, Master
[16:58:06] !log restarted lsearchd on search6
[16:58:08] Logged the message, Master
[17:03:23] RECOVERY - LVS Lucene on search-pool2.svc.pmtpa.wmnet is OK: TCP OK - 0.002 second response time on port 8123
[17:08:33] PROBLEM - MySQL Slave Delay on db1017 is CRITICAL: CRIT replication delay 1856 seconds
[17:51:07] ...
[19:59:08] PROBLEM - Host db1001 is DOWN: PING CRITICAL - Packet loss = 100%
[20:00:27] RECOVERY - MySQL Slave Delay on db1017 is OK: OK replication delay NULL seconds
[20:51:45] PROBLEM - MySQL Replication Heartbeat on db42 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[22:16:57] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours
[22:16:57] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours
[23:51:37] PROBLEM - MySQL Replication Heartbeat on db42 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.