[00:02:14] New patchset: Demon; "More fixups to gerrit:" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25462 [00:03:09] New patchset: Ryan Lane; "Initial commit of debian directory" [operations/debs/gerrit] (master) - https://gerrit.wikimedia.org/r/25463 [00:03:09] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25462 [00:03:31] Change merged: Ryan Lane; [operations/debs/gerrit] (master) - https://gerrit.wikimedia.org/r/25463 [00:04:15] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25462 [00:18:46] New patchset: Demon; "Two minor gerrit fixes:" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25464 [00:19:42] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25464 [00:19:44] New patchset: Ryan Lane; "Use unattended-upgrades in Labs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25465 [00:20:40] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25464 [00:20:40] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25465 [00:21:32] New patchset: Dzahn; "use keys function from puppetlibs-stdlib instead to get hash keys as an array" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25466 [00:22:26] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25466 [00:31:05] New patchset: Dzahn; "planet, and now fix the way the hash values are referenced in .erb" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25469 [00:32:07] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25469 [00:32:25] crap [00:42:52] New patchset: Faidon; "swift: add support for timeline/math paths in rewrite.py" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/24303 [00:43:49] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/24303 [00:43:52] New patchset: Faidon; "swift: add support for timeline/math paths in rewrite.py" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/24303 [00:44:47] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/24303 [00:46:12] New review: Faidon; "Might be too evil, since it restarts services. But it's better than having no security updates. Let'..." 
[operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/25465 [00:46:55] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [00:51:49] New patchset: Dzahn; "planet: do not use $ sign in hash lookup" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25472 [00:52:44] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25472 [00:56:25] New review: Dzahn; "yay:) thanks peter" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25472 [00:59:49] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [01:01:54] New review: Krinkle; "Does this apply to the labs servers or the project instances as well?" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/25465 [01:42:07] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 268 seconds [01:43:01] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 281 seconds [01:43:37] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 2 seconds [01:46:10] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 21 seconds [01:53:49] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours [02:11:13] PROBLEM - SSH on fenari is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:12:34] PROBLEM - HTTP on fenari is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:14:59] PROBLEM - check_all_memcacheds on spence is CRITICAL: (Service Check Timed Out) [02:17:40] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online [02:26:04] PROBLEM - check_all_memcacheds on spence is CRITICAL: (Service Check Timed Out) [02:28:47] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online [02:29:49] PROBLEM - NTP on fenari is CRITICAL: NTP CRITICAL: No response from NTP server [02:34:44] 
fenari is full of fail [02:36:36] !log powercycle fenari, it's nonresponsive even via drac terminal [02:36:53] Logged the message, Master [02:38:58] RECOVERY - HTTP on fenari is OK: HTTP OK HTTP/1.1 200 OK - 4416 bytes in 0.004 seconds [02:39:43] RECOVERY - SSH on fenari is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [02:43:28] RECOVERY - NTP on fenari is OK: NTP OK: Offset 4.67300415e-05 secs [03:00:22] New review: Siebrand; "Can has yes please!" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/12188 [03:02:17] New patchset: Jgreen; "adding root@boron's key to fundraising backupmover's authorized keys" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25474 [03:03:29] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25474 [03:04:17] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25474 [03:28:23] New review: Ryan Lane; "Only to instances." 
[operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/25465 [03:29:49] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [03:29:49] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours [03:29:49] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours [03:29:49] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours [03:29:49] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [03:33:35] New patchset: Jgreen; "add misc::fundraising::backup::archive to aluminium/grosley" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25475 [03:41:09] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25475 [04:01:28] PROBLEM - Squid on brewster is CRITICAL: Connection refused [04:02:58] PROBLEM - Puppet freshness on cp1040 is CRITICAL: Puppet has not run in the last 10 hours [04:42:43] RECOVERY - Squid on brewster is OK: TCP OK - 0.003 second response time on port 8080 [04:47:22] PROBLEM - Squid on brewster is CRITICAL: Connection refused [05:26:19] PROBLEM - Lucene on search1016 is CRITICAL: Connection timed out [05:29:10] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [05:39:49] RECOVERY - Lucene on search1016 is OK: TCP OK - 0.027 second response time on port 8123 [05:47:28] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [05:48:40] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.037 second response time on port 8123 [05:49:25] PROBLEM - Lucene on search1016 is CRITICAL: Connection timed out [05:55:07] RECOVERY - Lucene on search1016 is OK: TCP OK - 0.027 second response time on port 8123 [05:55:07] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [05:56:19] RECOVERY - LVS 
Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.030 second response time on port 8123 [06:04:16] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [06:05:37] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 3.025 second response time on port 8123 [06:06:49] PROBLEM - Lucene on search1016 is CRITICAL: Connection timed out [06:08:02] RECOVERY - Lucene on search1016 is OK: TCP OK - 0.027 second response time on port 8123 [06:18:33] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours [06:19:00] PROBLEM - Lucene on search1016 is CRITICAL: Connection timed out [06:24:51] RECOVERY - Lucene on search1016 is OK: TCP OK - 0.028 second response time on port 8123 [06:37:36] PROBLEM - Puppet freshness on cp1030 is CRITICAL: Puppet has not run in the last 10 hours [06:56:27] PROBLEM - Puppet freshness on cp1029 is CRITICAL: Puppet has not run in the last 10 hours [07:09:21] PROBLEM - Puppet freshness on cp1032 is CRITICAL: Puppet has not run in the last 10 hours [07:16:15] RECOVERY - Squid on brewster is OK: TCP OK - 0.007 second response time on port 8080 [07:21:38] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours [07:25:32] PROBLEM - Puppet freshness on cp1034 is CRITICAL: Puppet has not run in the last 10 hours [07:25:32] PROBLEM - Puppet freshness on cp1035 is CRITICAL: Puppet has not run in the last 10 hours [07:41:35] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [07:44:35] PROBLEM - Puppet freshness on cp1033 is CRITICAL: Puppet has not run in the last 10 hours [07:54:38] PROBLEM - Puppet freshness on cp1036 is CRITICAL: Puppet has not run in the last 10 hours [08:58:57] New review: Hashar; "That is good for me. I am a bit too busy this week to handle shell requests though :-/ If nobody doe..." 
[operations/mediawiki-config] (master); V: 0 C: 1; - https://gerrit.wikimedia.org/r/23059 [09:03:18] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours [09:03:18] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [09:36:53] New patchset: Dereckson; "(bug 31067) Namespace configuration for az.wikibooks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25493 [09:54:18] PROBLEM - Puppet freshness on mw11 is CRITICAL: Puppet has not run in the last 10 hours [09:54:18] PROBLEM - Puppet freshness on mw12 is CRITICAL: Puppet has not run in the last 10 hours [09:54:18] PROBLEM - Puppet freshness on mw13 is CRITICAL: Puppet has not run in the last 10 hours [09:54:18] PROBLEM - Puppet freshness on mw1 is CRITICAL: Puppet has not run in the last 10 hours [09:54:18] PROBLEM - Puppet freshness on mw14 is CRITICAL: Puppet has not run in the last 10 hours [09:54:19] PROBLEM - Puppet freshness on mw10 is CRITICAL: Puppet has not run in the last 10 hours [09:54:19] PROBLEM - Puppet freshness on mw15 is CRITICAL: Puppet has not run in the last 10 hours [09:54:20] PROBLEM - Puppet freshness on mw16 is CRITICAL: Puppet has not run in the last 10 hours [09:54:20] PROBLEM - Puppet freshness on mw3 is CRITICAL: Puppet has not run in the last 10 hours [09:54:21] PROBLEM - Puppet freshness on mw5 is CRITICAL: Puppet has not run in the last 10 hours [09:54:21] PROBLEM - Puppet freshness on mw2 is CRITICAL: Puppet has not run in the last 10 hours [09:54:22] PROBLEM - Puppet freshness on mw6 is CRITICAL: Puppet has not run in the last 10 hours [09:54:22] PROBLEM - Puppet freshness on mw4 is CRITICAL: Puppet has not run in the last 10 hours [09:54:23] PROBLEM - Puppet freshness on mw8 is CRITICAL: Puppet has not run in the last 10 hours [09:54:23] PROBLEM - Puppet freshness on mw7 is CRITICAL: Puppet has not run in the last 10 hours [09:54:24] PROBLEM - Puppet freshness on mw9 is CRITICAL: 
Puppet has not run in the last 10 hours [10:00:18] PROBLEM - Puppet freshness on locke is CRITICAL: Puppet has not run in the last 10 hours [10:20:42] PROBLEM - Host ms-be6 is DOWN: PING CRITICAL - Packet loss = 100% [10:40:27] RECOVERY - Host ms-be6 is UP: PING OK - Packet loss = 0%, RTA = 1.41 ms [10:47:57] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [10:54:06] PROBLEM - Host ms-be6 is DOWN: PING CRITICAL - Packet loss = 100% [10:55:03] !log ignore ms-be6 messages, I'm trying to get into the bleeping lsi raid util [10:55:09] RECOVERY - Host ms-be6 is UP: PING OK - Packet loss = 0%, RTA = 0.42 ms [10:55:15] Logged the message, Master [11:00:51] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [11:54:48] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours [13:31:07] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [13:31:07] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours [13:31:07] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [13:31:07] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours [13:31:07] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours [13:50:30] cmjohnson1: thank goodness [13:50:55] apergos1 I couldn't agree more [13:50:57] I'm pulling my hair out trying to get into the lsi raid configuration util, it won't pass through my ctrl-h (nor with esc-ctrl-h) [13:51:20] maybe you could do this and set up the disks? [13:51:38] odd...i will have to do it local [13:53:07] apergos1: don't spend much time on it...be6 is the one set for exchange [13:53:16] what was the last message you saw from me? 
[13:54:03] cmjohnson1: [13:54:35] that you couldn't enable jbod [13:54:43] that card doesn't support jbod [13:54:45] no I mean here in irc just now [13:55:12] oh...u couldn't do the lsi raid cnfg [13:55:15] right [13:55:24] would you be able to get into it and set up the disks? [13:56:10] not in dc at the moment...i will be there shortly [13:56:13] I spent a long time trying to invoke it and failing, it just ignores the key sequence from me and boots [13:56:14] ah [13:56:15] ok [13:56:24] will you have time for that today? [13:56:45] this will mean a re-install after the disks are configured, as it is right now they aren't seen by the os [13:56:51] i should [13:57:04] ok, thank you [13:57:05] np...think we've reinstalled about 25x by now [13:57:11] uh huh [13:58:13] !log dist-upgrade & reboot grosley [13:58:24] Logged the message, Master [13:59:37] PROBLEM - Host grosley is DOWN: CRITICAL - Host Unreachable (208.80.152.164) [14:00:58] RECOVERY - Host grosley is UP: PING OK - Packet loss = 0%, RTA = 0.43 ms [14:04:16] PROBLEM - Puppet freshness on cp1040 is CRITICAL: Puppet has not run in the last 10 hours [15:06:34] !log indium dist-upgrade & reboot [15:06:45] Logged the message, Master [15:20:58] !log authdns-update for wikimediafoundation.info [15:21:08] Logged the message, RobH [15:30:22] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [15:33:13] !log mc1002 (rt3612) & mc1006 (rt3613) offline for memory repair/swap [15:33:23] Logged the message, RobH [15:41:51] New patchset: Demon; "Stop using gerrit2 for replication purposes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25508 [15:42:55] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." 
[operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/25508 [15:43:58] New patchset: Demon; "Stop using gerrit2 for replication purposes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25508 [15:44:58] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25508 [15:45:59] New review: Demon; "PS2 fixes a stupid typo." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/25508 [15:53:58] !log mc1002 repaired rt3612 [15:54:07] Logged the message, RobH [15:54:22] New patchset: Demon; "Stop using gerrit2 for replication purposes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25508 [15:55:19] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25508 [15:57:11] notpeter: mc1002 is all yours [15:59:20] !log mc1006 memory repaired rt3613 [15:59:29] Logged the message, RobH [16:00:01] New patchset: Demon; "Stop using gerrit2 for replication purposes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25508 [16:01:04] New review: Demon; "PS3 fixes another stupid typo, PS4 adds a new role so I can easily test the destination side in labs." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/25508 [16:01:04] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25508 [16:19:16] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours [16:37:39] !log cp1031 mainboard replaced per rt3614, will need reinstall [16:37:49] Logged the message, RobH [16:38:27] PROBLEM - Puppet freshness on cp1030 is CRITICAL: Puppet has not run in the last 10 hours [16:42:01] !log ms-be6 powering down [16:42:11] Logged the message, Master [16:44:53] PROBLEM - Host ms-be6 is DOWN: PING CRITICAL - Packet loss = 100% [16:57:38] PROBLEM - Puppet freshness on cp1029 is CRITICAL: Puppet has not run in the last 10 hours [17:10:32] PROBLEM - Puppet freshness on cp1032 is CRITICAL: Puppet has not run in the last 10 hours [17:11:07] New patchset: Alex Monk; "(bug 40575) Allow large account creation for ptwikiversity workshop in an hour" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25514 [17:13:14] ^ any ops around who can review/deploy that? [17:14:16] Change merged: Demon; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25514 [17:16:15] <^demon> Krenair: Done & synced. 
[17:17:06] thanks [17:22:32] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours [17:26:35] PROBLEM - Puppet freshness on cp1034 is CRITICAL: Puppet has not run in the last 10 hours [17:26:35] PROBLEM - Puppet freshness on cp1035 is CRITICAL: Puppet has not run in the last 10 hours [17:36:31] RECOVERY - Host ms-be6 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms [17:42:30] !log bring db62 down to swap disk controller card per ct/asher [17:42:40] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [17:42:40] Logged the message, Master [17:45:31] PROBLEM - Puppet freshness on cp1033 is CRITICAL: Puppet has not run in the last 10 hours [17:46:32] ok well I've stopped swift and am trying a puppet run to see if partitioning works over there on ms-be6 [17:46:41] oughta be interesting [17:47:19] PROBLEM - swift-account-auditor on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [17:47:19] PROBLEM - swift-container-auditor on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [17:47:19] PROBLEM - swift-object-auditor on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [17:47:34] yeah hush you [17:47:37] PROBLEM - swift-account-reaper on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [17:47:46] PROBLEM - swift-container-replicator on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [17:48:04] PROBLEM - swift-object-replicator on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [17:48:04] PROBLEM - swift-account-replicator on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python 
/usr/bin/swift-account-replicator [17:48:04] PROBLEM - swift-container-server on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [17:48:04] PROBLEM - swift-object-server on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [17:48:07] yay I see a partitioning on /dev/sdh that supposedly succeeded. we'll know when the filesystem mounts come up though [17:48:22] PROBLEM - swift-object-updater on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-updater [17:48:40] PROBLEM - swift-account-server on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [17:48:40] PROBLEM - swift-container-updater on ms-be6 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [17:53:28] RECOVERY - swift-container-auditor on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [17:53:46] RECOVERY - swift-account-server on ms-be6 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [17:53:47] RECOVERY - swift-container-replicator on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [17:53:55] RECOVERY - swift-account-reaper on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [17:53:55] RECOVERY - swift-object-auditor on ms-be6 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [17:54:04] RECOVERY - swift-account-replicator on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [17:54:04] RECOVERY - swift-object-server on ms-be6 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [17:54:31] 
RECOVERY - swift-object-replicator on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [17:54:31] RECOVERY - swift-object-updater on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater [17:54:49] RECOVERY - swift-account-auditor on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [17:54:58] RECOVERY - swift-container-updater on ms-be6 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [17:55:34] RECOVERY - swift-container-server on ms-be6 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [17:55:35] PROBLEM - Puppet freshness on cp1036 is CRITICAL: Puppet has not run in the last 10 hours [17:56:08] New patchset: preilly; "add Dialog Sri Lanka configuration" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25520 [17:56:17] paravoid: ^^ can you merge this [17:57:36] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25520 [18:03:31] RECOVERY - Host db1012 is UP: PING OK - Packet loss = 0%, RTA = 26.54 ms [19:03:45] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [19:03:45] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours [19:08:45] New patchset: Cmjohnson; "Removing decommissioned servers from dhcpd file s0-9600" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25535 [19:09:58] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25535 [19:10:30] !log search32 going down to swap cpu's (dell request) [19:10:42] Logged the message, Master [19:13:30] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100% [19:15:12] Change abandoned: Cmjohnson; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25535 [19:34:03] RECOVERY - Host search32 is UP: PING OK - Packet loss = 0%, RTA = 0.54 ms [19:48:45] PROBLEM - Host ms-be6 is DOWN: PING CRITICAL - Packet loss = 100% [19:49:12] RECOVERY - Host ms-be6 is UP: PING OK - Packet loss = 0%, RTA = 0.27 ms [19:49:59] PROBLEM - NTP on ms-be6 is CRITICAL: NTP CRITICAL: Offset unknown [19:53:08] RECOVERY - NTP on ms-be6 is OK: NTP OK: Offset 0.0005081892014 secs [19:55:14] PROBLEM - Puppet freshness on mw1 is CRITICAL: Puppet has not run in the last 10 hours [19:55:14] PROBLEM - Puppet freshness on mw10 is CRITICAL: Puppet has not run in the last 10 hours [19:55:14] PROBLEM - Puppet freshness on mw12 is CRITICAL: Puppet has not run in the last 10 hours [19:55:14] PROBLEM - Puppet freshness on mw13 is CRITICAL: Puppet has not run in the last 10 hours [19:55:14] PROBLEM - Puppet freshness on mw14 is CRITICAL: Puppet has not run in the last 10 hours [19:55:15] PROBLEM - Puppet freshness on mw16 is CRITICAL: Puppet has not run in the last 10 hours [19:55:15] PROBLEM - Puppet freshness on mw2 is CRITICAL: Puppet has not run in the last 10 hours [19:55:16] PROBLEM - Puppet freshness on mw3 is CRITICAL: Puppet has not run in the last 10 hours [19:55:16] PROBLEM - Puppet freshness on mw11 is CRITICAL: Puppet has not run in the last 10 hours [19:55:17] PROBLEM - Puppet freshness on mw4 is CRITICAL: Puppet has not run in the last 10 hours [19:55:17] PROBLEM - Puppet freshness on mw15 is CRITICAL: Puppet has not run in the last 10 hours [19:55:18] PROBLEM - Puppet freshness on mw5 is CRITICAL: Puppet has not run in the last 10 hours [19:55:18] PROBLEM - Puppet 
freshness on mw7 is CRITICAL: Puppet has not run in the last 10 hours [19:55:19] PROBLEM - Puppet freshness on mw8 is CRITICAL: Puppet has not run in the last 10 hours [19:55:19] PROBLEM - Puppet freshness on mw6 is CRITICAL: Puppet has not run in the last 10 hours [19:55:20] PROBLEM - Puppet freshness on mw9 is CRITICAL: Puppet has not run in the last 10 hours [20:01:14] PROBLEM - Puppet freshness on locke is CRITICAL: Puppet has not run in the last 10 hours [20:09:48] !log authdns-update for new misc servers mgmt in eqiad [20:09:58] Logged the message, RobH [20:49:14] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [21:02:03] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [21:55:54] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours [22:09:10] New patchset: Ori.livneh; "Config to enable footerCleanup module in Vector extension." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/23632 [22:09:24] New review: Ori.livneh; "Ready for deployment." [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/23632 [22:11:36] New review: OliverKeyes; "Awesome :)." [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/23632 [22:31:52] New patchset: Ori.livneh; "Config to enable footerCleanup module in Vector extension." 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/23632 [22:40:37] New patchset: Ori.livneh; "Create wmgUseMicroDesign config var; use it to enable Vector changes" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25598 [22:42:40] Change merged: Ori.livneh; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25598 [22:43:13] Change abandoned: Robmoen; "(no reason)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/23632 [23:25:51] New patchset: Dereckson; "(bug 40582) Temporary event logo for sv.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25599 [23:26:55] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:31:43] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours [23:31:43] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [23:31:43] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours [23:31:43] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours [23:31:43] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [23:32:55] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.860 seconds