[00:08:25] (CR) CSteipp: "I was able to trigger the overflows when I copied the function out and called it directly. I'm pretty sure it can't be triggered in the co" [operations/software/varnish/varnishkafka] - https://gerrit.wikimedia.org/r/127804 (owner: CSteipp)
[02:22:44] !log LocalisationUpdate completed (1.24wmf1) at 2014-04-26 02:22:41+00:00
[02:22:55] Logged the message, Master
[02:31:43] !log LocalisationUpdate completed (1.24wmf2) at 2014-04-26 02:31:41+00:00
[02:31:50] Logged the message, Master
[03:11:22] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Apr 26 03:11:16 UTC 2014 (duration 11m 15s)
[03:11:30] Logged the message, Master
[03:35:51] PROBLEM - RAID on db1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:35:51] PROBLEM - MySQL Replication Heartbeat on db1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:36:40] RECOVERY - MySQL Replication Heartbeat on db1021 is OK: OK replication delay -0 seconds
[03:36:40] RECOVERY - RAID on db1021 is OK: OK: optimal, 1 logical, 2 physical
[04:25:30] PROBLEM - Disk space on lvs3002 is CRITICAL: DISK CRITICAL - free space: / 1505 MB (3% inode=97%):
[04:26:30] RECOVERY - Disk space on lvs3002 is OK: DISK OK
[04:26:51] PROBLEM - Disk space on lvs3003 is CRITICAL: DISK CRITICAL - free space: / 1565 MB (3% inode=97%):
[04:27:51] PROBLEM - Disk space on lvs3004 is CRITICAL: DISK CRITICAL - free space: / 1685 MB (3% inode=97%):
[04:32:52] RECOVERY - Disk space on lvs3003 is OK: DISK OK
[04:32:52] RECOVERY - Disk space on lvs3004 is OK: DISK OK
[06:17:00] PROBLEM - LVS HTTP IPv6 on bits-lb.ulsfo.wikimedia.org_ipv6 is CRITICAL: Connection timed out
[06:18:00] PROBLEM - LVS HTTP IPv6 on mobile-lb.ulsfo.wikimedia.org_ipv6 is CRITICAL: Connection timed out
[06:18:20] PROBLEM - LVS HTTPS IPv6 on upload-lb.ulsfo.wikimedia.org_ipv6 is CRITICAL: Connection timed out
[06:18:51] RECOVERY - LVS HTTP IPv6 on mobile-lb.ulsfo.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 21340 bytes in 0.214 second response time
[06:19:11] RECOVERY - LVS HTTPS IPv6 on upload-lb.ulsfo.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 652 bytes in 0.371 second response time
[06:19:40] PROBLEM - LVS HTTPS IPv6 on mobile-lb.ulsfo.wikimedia.org_ipv6 is CRITICAL: Connection timed out
[06:19:51] RECOVERY - LVS HTTP IPv6 on bits-lb.ulsfo.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 3875 bytes in 0.215 second response time
[06:20:30] RECOVERY - LVS HTTPS IPv6 on mobile-lb.ulsfo.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 21403 bytes in 0.437 second response time
[07:13:19] what's going on?
[09:56:49] (CR) Odder: admin module for user/group/permissions cleanup (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/129501 (owner: Rush)
[11:01:03] (PS2) Giuseppe Lavagetto: Use jinja2 templates, various fixes. [operations/software] - https://gerrit.wikimedia.org/r/129456
[11:03:22] <_joe_> I managed to do this from within emacs \o/
[11:06:17] (CR) Giuseppe Lavagetto: [C: 2] "Need this in labs." [operations/software] - https://gerrit.wikimedia.org/r/129456 (owner: Giuseppe Lavagetto)
[12:00:20] (PS1) Hashar: Move license text to LICENSE and start README [operations/software] - https://gerrit.wikimedia.org/r/129883
[12:03:04] (PS1) Hashar: puppet-compare: provision python-dev [operations/software] - https://gerrit.wikimedia.org/r/129884
[12:06:14] (CR) Hashar: "I did:" [operations/software] - https://gerrit.wikimedia.org/r/129884 (owner: Hashar)
[12:12:03] (CR) Giuseppe Lavagetto: [C: 2] puppet-compare: provision python-dev [operations/software] - https://gerrit.wikimedia.org/r/129884 (owner: Hashar)
[12:56:15] ACKNOWLEDGEMENT - Host db1016 is DOWN: PING CRITICAL - Packet loss = 100% Sean Pringle Exploded. Investigating... - The acknowledgement expires at: 2014-04-30 12:55:13.
[12:57:58] !log powercycle db1016 unresponsive
[12:58:05] Logged the message, Master
[13:02:20] RECOVERY - Host db1016 is UP: PING OK - Packet loss = 0%, RTA = 2.90 ms
[13:05:10] PROBLEM - mysqld processes on db1016 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld
[13:25:10] RECOVERY - mysqld processes on db1016 is OK: PROCS OK: 1 process with command name mysqld
[13:26:47] !log db1016 xfs head behind tail. reverted to last snapshot volume
[13:26:51] PROBLEM - MySQL Replication Heartbeat on db1016 is CRITICAL: CRIT replication delay 75987 seconds
[13:26:53] Logged the message, Master
[13:27:30] PROBLEM - MySQL Slave Delay on db1016 is CRITICAL: CRIT replication delay 75847 seconds
[13:28:11] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[13:38:21] (CR) Steinsplitter: [C: 1] Have Commons on Beta Labs use $stdlogo [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/122084 (owner: Mattflaschen)
[13:40:07] (PS6) Hoo man: Have Commons on Beta Labs use $stdlogo [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/122084 (owner: Mattflaschen)
[13:40:50] (CR) Hoo man: "Rebased" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/122084 (owner: Mattflaschen)
[13:41:00] (CR) Hoo man: [C: 2] "Beta-only change" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/122084 (owner: Mattflaschen)
[13:41:08] (Merged) jenkins-bot: Have Commons on Beta Labs use $stdlogo [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/122084 (owner: Mattflaschen)
[13:42:41] !log hoo updated /a/common to {{Gerrit|Ic98928d54}}: Have Commons on Beta Labs use $stdlogo
[13:42:47] Logged the message, Master
[13:43:20] !log hoo synchronized wmf-config/InitialiseSettings-labs.php 'Syncing for cluster consistency'
[13:43:25] Logged the message, Master
[14:25:11] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[18:14:18] (CR) preilly: [C: 1] "Why did you +1 this @Yuvipanda?" [operations/puppet] - https://gerrit.wikimedia.org/r/129729 (owner: Andrew Bogott)
[19:13:59] (CR) Tim Landscheidt: "@Yuvipanda: With Ic37af0fd68fbabe0c8defbdc865364e142118290 now merged, do you still need this or can you abandon it?" [operations/puppet] - https://gerrit.wikimedia.org/r/125241 (owner: Yuvipanda)
[23:27:49] !log aaron synchronized php-1.24wmf2/includes/profiler/Profiler.php '7e20cdd2ba0381b81d3b43c8743fa4202a76bd61'
[23:27:57] Logged the message, Master