[01:23:25] (PS1) Ryan Lane: Deployment apache for sartoris project [operations/puppet] - https://gerrit.wikimedia.org/r/89493
[01:25:19] (PS2) Ryan Lane: Deployment apache for sartoris project [operations/puppet] - https://gerrit.wikimedia.org/r/89493
[01:26:20] (CR) Ryan Lane: [C: 2] Deployment apache for sartoris project [operations/puppet] - https://gerrit.wikimedia.org/r/89493 (owner: Ryan Lane)
[01:34:50] (PS1) Ryan Lane: Add a salt master override for labs [operations/puppet] - https://gerrit.wikimedia.org/r/89494
[01:35:42] (PS2) Ryan Lane: Add a salt master override for labs [operations/puppet] - https://gerrit.wikimedia.org/r/89494
[01:44:50] (PS3) Ryan Lane: Add a salt master override for labs [operations/puppet] - https://gerrit.wikimedia.org/r/89494
[01:51:42] (CR) Ryan Lane: [C: 2] Add a salt master override for labs [operations/puppet] - https://gerrit.wikimedia.org/r/89494 (owner: Ryan Lane)
[02:09:00] !log LocalisationUpdate completed (1.22wmf20) at Sun Oct 13 02:09:00 UTC 2013
[02:09:13] Logged the message, Master
[02:16:35] !log LocalisationUpdate completed (1.22wmf21) at Sun Oct 13 02:16:34 UTC 2013
[02:16:49] Logged the message, Master
[02:21:29] (PS1) Ryan Lane: Salt 0.17 compatibility for deployment scripts [operations/puppet] - https://gerrit.wikimedia.org/r/89495
[02:27:58] !log LocalisationUpdate ResourceLoader cache refresh completed at Sun Oct 13 02:27:58 UTC 2013
[02:28:09] Logged the message, Master
[03:44:59] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 304 seconds
[03:46:59] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 303 seconds
[03:48:59] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 303 seconds
[03:50:59] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 303 seconds
[03:52:29] PROBLEM - Disk space on labsdb1003 is CRITICAL: DISK CRITICAL - free space: /a 125296 MB (3% inode=99%):
[03:57:59] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 306 seconds
[04:11:59] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 305 seconds
[04:45:59] RECOVERY - MySQL Replication Heartbeat on db1046 is OK: OK replication delay 0 seconds
[05:25:34] RobH: http://seriss.com/people/erco/unixtools/hostnames.html enjoy :)
[05:29:09] PROBLEM - SSH on searchidx1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:29:19] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:30:19] RECOVERY - RAID on searchidx1001 is OK: OK: State is Optimal, checked 1 logical drive(s), 4 physical drive(s)
[05:30:59] RECOVERY - SSH on searchidx1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[05:39:35] (PS1) TTO: Add Portale namespace to wgContentNamespaces for itwikivoyage [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/89496
[05:48:19] (PS1) TTO: Set up autopatroller right on eswikivoyage [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/89497
[05:54:29] PROBLEM - Disk space on labsdb1003 is CRITICAL: DISK CRITICAL - free space: /a 125157 MB (3% inode=99%):
[06:05:59] PROBLEM - Puppet freshness on cp4001 is CRITICAL: No successful Puppet run in the last 10 hours
[06:25:59] PROBLEM - Puppet freshness on cp4014 is CRITICAL: No successful Puppet run in the last 10 hours
[06:26:59] PROBLEM - Puppet freshness on cp4019 is CRITICAL: No successful Puppet run in the last 10 hours
[06:27:59] PROBLEM - Puppet freshness on cp4005 is CRITICAL: No successful Puppet run in the last 10 hours
[06:27:59] PROBLEM - Puppet freshness on cp4015 is CRITICAL: No successful Puppet run in the last 10 hours
[06:29:59] PROBLEM - Puppet freshness on cp4017 is CRITICAL: No successful Puppet run in the last 10 hours
[06:29:59] PROBLEM - Puppet freshness on lvs4001 is CRITICAL: No successful Puppet run in the last 10 hours
[06:59:29] PROBLEM - Disk space on labsdb1003 is CRITICAL: DISK CRITICAL - free space: /a 125507 MB (3% inode=99%):
[07:03:29] PROBLEM - RAID on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:06:09] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:12:19] PROBLEM - Disk space on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:20:19] RECOVERY - Disk space on snapshot3 is OK: DISK OK
[07:27:29] RECOVERY - RAID on snapshot3 is OK: OK: no RAID installed
[07:28:59] RECOVERY - DPKG on snapshot3 is OK: All packages OK
[07:30:19] PROBLEM - Disk space on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:31:19] RECOVERY - Disk space on snapshot3 is OK: DISK OK
[07:32:29] PROBLEM - RAID on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:35:29] RECOVERY - RAID on snapshot3 is OK: OK: no RAID installed
[07:38:09] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:39:59] RECOVERY - DPKG on snapshot3 is OK: All packages OK
[07:44:09] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:44:59] RECOVERY - DPKG on snapshot3 is OK: All packages OK
[08:16:59] PROBLEM - Puppet freshness on neon is CRITICAL: No successful Puppet run in the last 10 hours
[08:18:59] PROBLEM - Puppet freshness on bast4001 is CRITICAL: No successful Puppet run in the last 10 hours
[08:18:59] PROBLEM - Puppet freshness on cp4002 is CRITICAL: No successful Puppet run in the last 10 hours
[08:18:59] PROBLEM - Puppet freshness on cp4003 is CRITICAL: No successful Puppet run in the last 10 hours
[08:18:59] PROBLEM - Puppet freshness on cp4004 is CRITICAL: No successful Puppet run in the last 10 hours
[08:18:59] PROBLEM - Puppet freshness on cp4006 is CRITICAL: No successful Puppet run in the last 10 hours
[08:18:59] PROBLEM - Puppet freshness on cp4007 is CRITICAL: No successful Puppet run in the last 10 hours
[08:18:59] PROBLEM - Puppet freshness on cp4008 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:00] PROBLEM - Puppet freshness on cp4009 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:00] PROBLEM - Puppet freshness on cp4010 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:01] PROBLEM - Puppet freshness on cp4011 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:01] PROBLEM - Puppet freshness on cp4012 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:02] PROBLEM - Puppet freshness on cp4013 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:02] PROBLEM - Puppet freshness on cp4016 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:03] PROBLEM - Puppet freshness on cp4018 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:03] PROBLEM - Puppet freshness on cp4020 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:04] PROBLEM - Puppet freshness on lvs4002 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:04] PROBLEM - Puppet freshness on lvs4003 is CRITICAL: No successful Puppet run in the last 10 hours
[08:19:05] PROBLEM - Puppet freshness on lvs4004 is CRITICAL: No successful Puppet run in the last 10 hours
[08:29:29] PROBLEM - Disk space on labsdb1003 is CRITICAL: DISK CRITICAL - free space: /a 125606 MB (3% inode=99%):
[09:05:29] PROBLEM - Disk space on labsdb1003 is CRITICAL: DISK CRITICAL - free space: /a 121866 MB (3% inode=99%):
[09:41:29] PROBLEM - Disk space on labsdb1003 is CRITICAL: DISK CRITICAL - free space: /a 124944 MB (3% inode=99%):
[11:08:29] PROBLEM - Disk space on labsdb1003 is CRITICAL: DISK CRITICAL - free space: /a 125244 MB (3% inode=99%):
[12:16:35] Hm, so it goes up and down
[12:41:29] PROBLEM - Disk space on labsdb1003 is CRITICAL: DISK CRITICAL - free space: /a 125476 MB (3% inode=99%):
[15:26:19] PROBLEM - Disk space on stafford is CRITICAL: DISK CRITICAL - free space: /var/lib/puppet 757 MB (3% inode=95%):
[15:43:19] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:45:19] RECOVERY - RAID on searchidx1001 is OK: OK: State is Optimal, checked 1 logical drive(s), 4 physical drive(s)
[16:06:59] PROBLEM - Puppet freshness on cp4001 is CRITICAL: No successful Puppet run in the last 10 hours
[16:26:59] PROBLEM - Puppet freshness on cp4014 is CRITICAL: No successful Puppet run in the last 10 hours
[16:27:59] PROBLEM - Puppet freshness on cp4019 is CRITICAL: No successful Puppet run in the last 10 hours
[16:28:59] PROBLEM - Puppet freshness on cp4005 is CRITICAL: No successful Puppet run in the last 10 hours
[16:28:59] PROBLEM - Puppet freshness on cp4015 is CRITICAL: No successful Puppet run in the last 10 hours
[16:30:59] PROBLEM - Puppet freshness on cp4017 is CRITICAL: No successful Puppet run in the last 10 hours
[16:30:59] PROBLEM - Puppet freshness on lvs4001 is CRITICAL: No successful Puppet run in the last 10 hours
[16:32:09] (CR) Yurik: [C: -1] "(1 comment)" [operations/puppet] - https://gerrit.wikimedia.org/r/88261 (owner: Dr0ptp4kt)
[16:53:00] (PS2) Andrew Bogott: Change role::labs-mysql-server to use the mysql module. [operations/puppet] - https://gerrit.wikimedia.org/r/89404
[16:53:01] (PS1) Andrew Bogott: Change the labs rt role to use the mysql module. [operations/puppet] - https://gerrit.wikimedia.org/r/89539
[18:17:59] PROBLEM - Puppet freshness on neon is CRITICAL: No successful Puppet run in the last 10 hours
[18:19:26] !log restarting gmetad on nickel to clear old metrics
[18:19:31] what was the last message sent here?
[18:19:37] before ori-l
[18:19:41] Logged the message, Master
[18:19:59] PROBLEM - Puppet freshness on bast4001 is CRITICAL: No successful Puppet run in the last 10 hours
[18:19:59] PROBLEM - Puppet freshness on cp4002 is CRITICAL: No successful Puppet run in the last 10 hours
[18:19:59] PROBLEM - Puppet freshness on cp4003 is CRITICAL: No successful Puppet run in the last 10 hours
[18:19:59] PROBLEM - Puppet freshness on cp4004 is CRITICAL: No successful Puppet run in the last 10 hours
[18:19:59] PROBLEM - Puppet freshness on cp4006 is CRITICAL: No successful Puppet run in the last 10 hours
[18:19:59] PROBLEM - Puppet freshness on cp4007 is CRITICAL: No successful Puppet run in the last 10 hours
[18:19:59] PROBLEM - Puppet freshness on cp4009 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:00] PROBLEM - Puppet freshness on cp4008 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:00] PROBLEM - Puppet freshness on cp4010 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:01] PROBLEM - Puppet freshness on cp4011 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:01] PROBLEM - Puppet freshness on cp4012 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:02] PROBLEM - Puppet freshness on cp4016 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:02] PROBLEM - Puppet freshness on cp4013 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:03] PROBLEM - Puppet freshness on cp4018 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:03] PROBLEM - Puppet freshness on cp4020 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:04] PROBLEM - Puppet freshness on lvs4002 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:04] PROBLEM - Puppet freshness on lvs4003 is CRITICAL: No successful Puppet run in the last 10 hours
[18:20:05] PROBLEM - Puppet freshness on lvs4004 is CRITICAL: No successful Puppet run in the last 10 hours
[18:31:27] RickyB98: check the logs http://ur1.ca/edq22
[19:31:39] !log restarting gmetad on nickel to clear old metrics (again)
[19:31:53] Logged the message, Master
[19:45:29] PROBLEM - MySQL Processlist on db1040 is CRITICAL: CRIT 1 unauthenticated, 0 locked, 0 copy to table, 71 statistics
[19:46:29] RECOVERY - MySQL Processlist on db1040 is OK: OK 0 unauthenticated, 0 locked, 0 copy to table, 0 statistics
[20:52:41] (PS1) Bartosz Dziewoński: Explicitly set 'watchcreations' and 'watchdefault' options to false [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/89603
[21:28:13] MatmaRex: I suppose https://bugzilla.wikimedia.org/show_bug.cgi?id=54680 has finished?
[21:31:18] I bet it hasn't
[21:31:31] Selecting next 10000 rows... processing...6640000 done.
[21:32:18] Oh. Thanks Reedy
[21:32:24] out of 19624409
[21:32:34] About a third done
[21:42:17] (CR) Reedy: "https://gerrit.wikimedia.org/r/#/c/89329/ wants merging first" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/89331 (owner: Reedy)
[21:45:19] PROBLEM - Host cp1052 is DOWN: PING CRITICAL - Packet loss = 100%