[01:50:33] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours
[01:50:33] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours
[01:51:03] PROBLEM - MySQL Replication Heartbeat on db42 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:19:03] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1571s
[02:26:03] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1991s
[02:37:13] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 19s
[02:41:23] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 31s
[02:45:13] PROBLEM - Puppet freshness on brewster is CRITICAL: Puppet has not run in the last 10 hours
[03:50:36] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours
[04:17:56] RECOVERY - Disk space on es1004 is OK: DISK OK
[04:22:46] RECOVERY - MySQL disk space on es1004 is OK: DISK OK
[04:37:22] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No
[05:48:45] PROBLEM - MySQL Replication Heartbeat on db42 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:53:56] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 440397 MB (3% inode=99%):
[10:02:36] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 390001 MB (3% inode=99%):
[10:29:26] PROBLEM - RAID on searchidx2 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:40:26] RECOVERY - RAID on searchidx2 is OK: OK: State is Optimal, checked 4 logical device(s)
[10:50:40] PROBLEM - MySQL Replication Heartbeat on db42 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:15:50] RECOVERY - MySQL slave status on es1004 is OK: OK:
[12:01:41] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours
[12:01:41] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours
[12:55:36] PROBLEM - Puppet freshness on brewster is CRITICAL: Puppet has not run in the last 10 hours
[13:11:36] New patchset: Dzahn; "REVIEW REQUESTED major cleanup and refactoring and some parameterization of udp2log class, but should be no substantive changes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2083
[13:24:54] New review: Dzahn; "looks good. just fixed wrapped line 48 and cosmetic changes (whitespace), but is it missing class ud..." [operations/puppet] (production); V: 1 C: 1; - https://gerrit.wikimedia.org/r/2083
[13:45:51] RECOVERY - MySQL Slave Delay on db42 is OK: OK replication delay 0 seconds
[13:50:31] RECOVERY - MySQL Replication Heartbeat on db42 is OK: OK replication delay 0 seconds
[14:00:51] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours
[18:07:46] PROBLEM - MySQL Slave Delay on db42 is CRITICAL: CRIT replication delay 441 seconds
[18:30:06] RECOVERY - MySQL Slave Delay on db42 is OK: OK replication delay 23 seconds
[18:39:39] !log running authdns-update to activate be.wikimedia.org
[18:39:40] Logged the message, Master
[19:27:52] PROBLEM - MySQL Slave Delay on db42 is CRITICAL: CRIT replication delay 267 seconds
[19:39:12] RECOVERY - MySQL Slave Delay on db42 is OK: OK replication delay 1 seconds
[19:40:07] mutante: https://bugzilla.wikimedia.org/show_bug.cgi?id=33672
[19:41:40] hexmode: want me to test because it says "only from Germany"??
[19:42:06] mutante: if you could, yes
[19:42:26] if you see the problem, could you diagnose a fix?
[19:42:57] hmm.. one of the images is flipped
[19:43:03] well, I'm seeing different images
[19:43:50] looks like 120px wasn't purged and is an old copy with a different exif-rotation
[19:44:38] 120px has Last-Modified: Sun, 01 May 2011 16:24:22 GMT
[19:48:05] Platonides: could you update the bug w/ your comments?
[19:48:21] otherwise, I'll be tempted to cut-n-paste from IRC
[19:48:22] ;)
[19:51:09] hexmode, what do you see at http://commons.wikimedia.org/wiki/File:Sinai_Red_sea_ecological_disaster_Nabq_nature_reserve.jpg#filehistory ?
[19:51:33] PROBLEM - ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2513*
[19:51:44] the thumbs for the last two entries look the same to me
[19:52:25] Platonides: the thumb for all but the third one doesn't match the image for me
[19:52:47] hexmode: solution = purge thumb i guess.. but i'm not sure what's the best way currently
[19:52:58] http://commons.wikimedia.org/wiki/File:Sinai_Red_sea_ecological_disaster_Nabq_nature_reserve.jpg?action=purge doesn't seem to work
[19:54:46] * mutante notices the "(request rotation)" link
[19:55:24] also see the text when hovering over it
[19:55:34] Platonides: oo! I did purge and now the top thumbnail matches!
[19:56:04] is this related to somebody requesting a rotation?
[19:56:08] via that link
[19:56:26] mutante: ask saibo in -commons for help
[19:57:41] what happens if you send a manual purge to the european squids?
[19:59:51] you mean using mwscript purgeList.php?
[20:01:54] mutante: do you want OP's help?
[20:02:53] hexmode: talking to Saibo now
[20:03:28] mutante: yeah, saw that. thanks!
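(For reference, the two purge approaches discussed above would look roughly like the sketch below. This is illustrative only: the thumbnail URL, including the <hash>/ path segment, is a placeholder rather than the exact URL from the bug report, and purgeList.php must be run on a host where the mwscript wrapper is available.)

    # Purge the file description page (and trigger thumbnail regeneration) via the web interface:
    curl -s 'http://commons.wikimedia.org/wiki/File:Sinai_Red_sea_ecological_disaster_Nabq_nature_reserve.jpg?action=purge' > /dev/null

    # Send an explicit purge for a stale thumbnail URL to the frontend caches with the
    # purgeList.php maintenance script, which reads URLs from stdin:
    echo 'http://upload.wikimedia.org/wikipedia/commons/thumb/<hash>/Sinai_Red_sea_ecological_disaster_Nabq_nature_reserve.jpg/120px-Sinai_Red_sea_ecological_disaster_Nabq_nature_reserve.jpg' | mwscript purgeList.php --wiki=commonswiki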
[22:12:22] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours
[22:12:22] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours
[22:58:34] PROBLEM - Disk space on srv219 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=60%): /var/lib/ureadahead/debugfs 0 MB (0% inode=60%):
[22:58:44] PROBLEM - Disk space on srv223 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=60%): /var/lib/ureadahead/debugfs 0 MB (0% inode=60%):
[23:06:34] PROBLEM - Puppet freshness on brewster is CRITICAL: Puppet has not run in the last 10 hours
[23:10:24] RECOVERY - Disk space on srv223 is OK: DISK OK
[23:43:44] RECOVERY - Disk space on srv219 is OK: DISK OK
[23:46:01] !log moving instances from virt2 to virt1 to rebalance compute cluster
[23:46:02] Logged the message, Master
[23:51:40] RECOVERY - Squid on brewster is OK: TCP OK - 0.003 second response time on port 8080