[00:04:47] (CR) Aaron Schulz: [C: 1] "Looks OK so far" [operations/puppet] - https://gerrit.wikimedia.org/r/108165 (owner: Chad)
[00:10:06] (PS3) Se4598: Resetting legacy channel names on labs and enabling IRC-RC echo again [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108163
[00:11:43] (CR) Se4598: "PS2: whitespace issue fixed" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108163 (owner: Se4598)
[00:11:50] PROBLEM - Check status of defined EventLogging jobs on vanadium is CRITICAL: CRITICAL: Stopped EventLogging jobs: consumer/vanadium consumer/server-side-events-log consumer/mysql-db1047 consumer/client-side-events-log consumer/all-events-log multiplexer/all-events processor/server-side-events processor/client-side-events forwarder/8422 forwarder/8421
[00:12:34] that's me
[00:14:50] RECOVERY - Check status of defined EventLogging jobs on vanadium is OK: OK: All defined EventLogging jobs are runnning.
[00:35:33] (PS1) BryanDavis: logstash: Exclude api logs [operations/puppet] - https://gerrit.wikimedia.org/r/108184
[00:35:35] (PS1) BryanDavis: kibana: Reduce cache duration for metadata [operations/puppet] - https://gerrit.wikimedia.org/r/108185
[00:35:41] (CR) jenkins-bot: [V: -1] logstash: Exclude api logs [operations/puppet] - https://gerrit.wikimedia.org/r/108184 (owner: BryanDavis)
[00:37:19] wtf jenkins?
[00:37:53] (PS2) BryanDavis: logstash: Exclude api logs from storage [operations/puppet] - https://gerrit.wikimedia.org/r/108184
[00:41:21] ori: https://gerrit.wikimedia.org/r/#/c/108184
[01:01:50] PROBLEM - Puppet freshness on hooft is CRITICAL: Last successful Puppet run was Fri 17 Jan 2014 06:59:52 PM UTC
[01:03:48] (PS2) BryanDavis: kibana: Reduce cache duration for metadata [operations/puppet] - https://gerrit.wikimedia.org/r/108185
[01:09:43] (PS3) Ori.livneh: logstash: Exclude api logs from storage [operations/puppet] - https://gerrit.wikimedia.org/r/108184 (owner: BryanDavis)
[01:10:06] (CR) Ori.livneh: [C: 2 V: 2] logstash: Exclude api logs from storage [operations/puppet] - https://gerrit.wikimedia.org/r/108184 (owner: BryanDavis)
[01:10:45] (PS3) Ori.livneh: kibana: Reduce cache duration for metadata [operations/puppet] - https://gerrit.wikimedia.org/r/108185 (owner: BryanDavis)
[01:11:03] (CR) Ori.livneh: [C: 2 V: 2] kibana: Reduce cache duration for metadata [operations/puppet] - https://gerrit.wikimedia.org/r/108185 (owner: BryanDavis)
[01:11:54] ori: Do I just need to force a puppet run now?
[01:12:07] bd808: already on it
[01:12:13] Coolio
[01:17:12] bd808: done
[01:17:41] Thanks. I see that the api events went away already.
[01:22:36] bd808: also, we could, without modifying kibana itself, load a javascript file containing site-specific customizations; one of the things it could do is configure a global transformRequest function that modifies HTTP requests before they're dispatched
[01:22:45] bd808: it's described in this SO answer: http://stackoverflow.com/a/12191613/582542
[01:23:16] it's one way to work around the PUT limitation, though not necessarily the best one.
[01:23:21] !log [01:18:21] ok !log exporting no routes over XO transit in sdtpa
[01:23:26] * Reedy kicks morebots
[01:23:29] Logged the message, Master
[01:25:28] ori: Ah. Yeah I could play with that and see what happens. I'm trying to figure out right now how to load kibana locally and proxy via logstash apache so I can test the bleeding edge kibana version.
[01:25:58] just deploy it and revert if it sucks, no one is using this other than you at the moment
[01:26:00] thank you Reedy
[01:27:19] ori: True enough. Dinner first I think. My hunger is distracting :)
[01:27:30] ciao
[01:28:02] !log Leslie Carr is signing off. au revoir!
[01:28:09] Logged the message, Mistress of the network gear.
[01:28:28] LeslieCarr: saddest log message ever, have a good time!
[01:28:38] Have fun LeslieCarr
[01:39:50] (CR) Anomie: "Personally, I think bug 57659 should block this. Before Flow starts being deployed to non-testing wikis, it really needs a non-crap API." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/107553 (owner: Spage)
[01:43:54] lolol
[02:12:58] !log LocalisationUpdate completed (1.23wmf10) at 2014-01-18 02:12:58+00:00
[02:13:06] Logged the message, Master
[02:26:05] !log LocalisationUpdate completed (1.23wmf11) at 2014-01-18 02:26:05+00:00
[02:26:11] Logged the message, Master
[02:46:09] !log LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-18 02:46:09+00:00
[02:46:15] Logged the message, Master
[03:30:08] deploying CSS-only fix to ULS for image /* @embed */
[03:32:52] !log ori synchronized php-1.23wmf10/extensions/UniversalLanguageSelector 'Update ULS to master for I862b01e6b (@embed fix)'
[03:32:59] Logged the message, Master
[03:38:05] !log ori synchronized php-1.23wmf11/extensions/UniversalLanguageSelector/resources/js/ext.uls.webfonts.js 'Update UniversalLanguageSelector to master for I2da436caa: Wait till rendering thread completion before applying webfonts (Bug: 59958)'
[03:38:11] Logged the message, Master
[03:38:24] er, wrong sync.
[03:39:00] !log ori synchronized php-1.23wmf11/extensions/UniversalLanguageSelector 'Update ULS to master for I862b01e6b (@embed fix)'
[03:39:06] Logged the message, Master
[03:43:24] bd808: naughty naughty!
[03:44:04] ori: what?
[03:50:11] !log marc truncated /var/log/ganglia/ganglia_parser.log again. Check incoming email to ops-l
[03:50:18] Logged the message, Master
[03:52:12] !log marc (on neon)
[03:52:19] Logged the message, Master
[04:02:50] PROBLEM - Puppet freshness on hooft is CRITICAL: Last successful Puppet run was Fri 17 Jan 2014 06:59:52 PM UTC
[04:17:20] (PS1) BryanDavis: logstash: Reduce filter worker count to 1 [operations/puppet] - https://gerrit.wikimedia.org/r/108195
[04:24:13] (PS2) BryanDavis: logstash: Reduce filter worker count to 1 [operations/puppet] - https://gerrit.wikimedia.org/r/108195
[04:25:30] ori: ^ May fix the crashes in logstash
[04:32:40] PROBLEM - MySQL Processlist on db1021 is CRITICAL: CRIT 0 unauthenticated, 0 locked, 0 copy to table, 124 statistics
[04:35:50] PROBLEM - MySQL Processlist on db1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[04:36:40] RECOVERY - MySQL Processlist on db1021 is OK: OK 0 unauthenticated, 0 locked, 0 copy to table, 16 statistics
[04:41:09] (CR) Ori.livneh: [C: 2] logstash: Reduce filter worker count to 1 [operations/puppet] - https://gerrit.wikimedia.org/r/108195 (owner: BryanDavis)
[04:43:19] bd808: that is god telling you parsing udp2log is wrong
[04:43:41] Probably.
[04:43:55] ignore at your own peril, etc
[04:44:14] Loki told me it would all work out in the end
[05:27:01] (PS1) Ori.livneh: Drop unused import in asset-check.py [operations/puppet] - https://gerrit.wikimedia.org/r/108198
[05:27:13] (CR) Ori.livneh: [C: 2 V: 2] Drop unused import in asset-check.py [operations/puppet] - https://gerrit.wikimedia.org/r/108198 (owner: Ori.livneh)
[06:03:20] PROBLEM - Host mw31 is DOWN: PING CRITICAL - Packet loss = 100%
[06:03:50] RECOVERY - Host mw31 is UP: PING OK - Packet loss = 0%, RTA = 35.32 ms
[06:05:50] PROBLEM - Apache HTTP on mw31 is CRITICAL: Connection refused
[06:06:50] RECOVERY - Apache HTTP on mw31 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.445 second response time
[07:03:50] PROBLEM - Puppet freshness on hooft is CRITICAL: Last successful Puppet run was Fri 17 Jan 2014 06:59:52 PM UTC
[09:24:10] PROBLEM - Apache HTTP on mw1153 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:25:00] PROBLEM - Apache HTTP on mw1155 is CRITICAL: Connection timed out
[09:25:10] PROBLEM - Apache HTTP on mw1159 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:25:40] PROBLEM - Apache HTTP on mw1160 is CRITICAL: Connection timed out
[09:25:40] PROBLEM - LVS HTTP IPv4 on rendering.svc.eqiad.wmnet is CRITICAL: Connection timed out
[09:25:50] PROBLEM - Apache HTTP on mw1154 is CRITICAL: Connection timed out
[09:26:00] PROBLEM - Apache HTTP on mw1156 is CRITICAL: Connection timed out
[09:26:20] PROBLEM - Apache HTTP on mw1158 is CRITICAL: Connection timed out
[09:27:10] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[09:27:50] PROBLEM - Apache HTTP on mw1157 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:28:40] RECOVERY - Apache HTTP on mw1157 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.055 second response time
[09:29:00] RECOVERY - Apache HTTP on mw1153 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.050 second response time
[09:29:10] RECOVERY - Apache HTTP on mw1159 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.219 second response time
[09:29:20] RECOVERY - Apache HTTP on mw1158 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.044 second response time
[09:29:30] RECOVERY - Apache HTTP on mw1160 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.077 second response time
[09:29:40] RECOVERY - LVS HTTP IPv4 on rendering.svc.eqiad.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 65429 bytes in 0.199 second response time
[09:29:43] RECOVERY - Apache HTTP on mw1154 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.059 second response time
[09:30:50] RECOVERY - Apache HTTP on mw1156 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.062 second response time
[09:31:50] RECOVERY - Apache HTTP on mw1155 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.066 second response time
[10:04:50] PROBLEM - Puppet freshness on hooft is CRITICAL: Last successful Puppet run was Fri 17 Jan 2014 06:59:52 PM UTC
[10:30:10] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[10:35:11] !log powercycling ms-be1012; down for 12h, console unresponsive
[10:35:18] Logged the message, Master
[10:35:51] rtt min/avg/max/mdev = 932.587/1188.287/1899.698/254.427 ms, pipe 2
[10:35:58] funny to reboot servers like that
[10:38:00] RECOVERY - Host ms-be1012 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms
[10:50:53] (PS1) Springle: repool db1041 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108209
[10:51:28] (CR) Springle: [C: 2] repool db1041 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108209 (owner: Springle)
[10:51:35] (Merged) jenkins-bot: repool db1041 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108209 (owner: Springle)
[10:52:26] !log springle synchronized wmf-config/db-eqiad.php 'repool db1041'
[10:52:32] Logged the message, Master
[10:59:10] (PS1) Springle: reassign db1033 to s7 during schema changes [operations/puppet] - https://gerrit.wikimedia.org/r/108210
[11:00:55] (CR) Springle: [C: 2] reassign db1033 to s7 during schema changes [operations/puppet] - https://gerrit.wikimedia.org/r/108210 (owner: Springle)
[11:04:03] !log xtrabackup clone db1007 to db1033
[11:04:10] Logged the message, Master
[11:48:48] (PS1) Odder: Add Niharika Kohli's blog to the English Planet [operations/puppet] - https://gerrit.wikimedia.org/r/108213
[12:34:00] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:35:00] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[12:48:35] (PS2) JanZerebecki: Remove wiktionary.wikipedia.org from rewrites as it is not in DNS. [operations/apache-config] - https://gerrit.wikimedia.org/r/92799 (owner: Reedy)
[13:05:50] PROBLEM - Puppet freshness on hooft is CRITICAL: Last successful Puppet run was Fri 17 Jan 2014 06:59:52 PM UTC
[13:13:41] (CR) JanZerebecki: [C: -1] "Why would this need to allow insecure connections?" [operations/apache-config] - https://gerrit.wikimedia.org/r/83565 (owner: Reedy)
[13:17:50] PROBLEM - MySQL Processlist on db1015 is CRITICAL: CRIT 0 unauthenticated, 0 locked, 0 copy to table, 134 statistics
[13:18:50] RECOVERY - MySQL Processlist on db1015 is OK: OK 0 unauthenticated, 0 locked, 0 copy to table, 2 statistics
[13:26:49] (PS6) JanZerebecki: Move a lot of the miscellaneous wikis out of their own specific docroots [operations/apache-config] - https://gerrit.wikimedia.org/r/90703 (owner: Reedy)
[15:13:34] (PS1) Springle: warm up db1033 in s7 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108217
[15:13:52] (CR) Springle: [C: 2] warm up db1033 in s7 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108217 (owner: Springle)
[15:13:58] (Merged) jenkins-bot: warm up db1033 in s7 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108217 (owner: Springle)
[15:14:48] !log springle synchronized wmf-config/db-eqiad.php 'warm up db1033 in s7'
[15:14:54] Logged the message, Master
[15:30:28] (PS1) Springle: depool db1028 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108218
[15:30:41] (CR) Springle: [C: 2] depool db1028 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/108218 (owner: Springle)
[15:31:37] !log springle synchronized wmf-config/db-eqiad.php 'depool db1028'
[15:31:44] Logged the message, Master
[15:38:11] !log xtrabackup clone db1007 to db1028
[15:38:17] Logged the message, Master
[16:06:50] PROBLEM - Puppet freshness on hooft is CRITICAL: Last successful Puppet run was Fri 17 Jan 2014 06:59:52 PM UTC
[18:06:20] PROBLEM - Apache HTTP on mw1158 is CRITICAL: Connection timed out
[18:07:10] PROBLEM - Apache HTTP on mw1153 is CRITICAL: Connection timed out
[18:07:10] PROBLEM - Apache HTTP on mw1159 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:07:50] PROBLEM - Apache HTTP on mw1154 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:08:40] RECOVERY - Apache HTTP on mw1154 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.063 second response time
[18:09:10] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[18:09:20] RECOVERY - Apache HTTP on mw1158 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.166 second response time
[18:10:00] RECOVERY - Apache HTTP on mw1153 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time
[18:10:00] RECOVERY - Apache HTTP on mw1159 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.062 second response time
[18:58:30] PROBLEM - Host mw31 is DOWN: PING CRITICAL - Packet loss = 100%
[18:59:40] RECOVERY - Host mw31 is UP: PING OK - Packet loss = 0%, RTA = 35.34 ms
[19:07:50] PROBLEM - Puppet freshness on hooft is CRITICAL: Last successful Puppet run was Fri 17 Jan 2014 06:59:52 PM UTC
[19:09:10] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[19:16:38] (CR) Umherirrender: "FYI: The new limits only affect loggedin users, not anon users, because it not set for 'ip' or/and 'newbie' or 'anon'." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/90265 (owner: Aaron Schulz)
[19:38:18] (PS2) Matanya: ldap: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - https://gerrit.wikimedia.org/r/107823
[19:55:33] (PS2) Matanya: udp2log: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - https://gerrit.wikimedia.org/r/107828
[19:57:27] (CR) Matanya: "The $host variable is defined in the same scope the template is called, so it should work with an out of scope lookup." [operations/puppet] - https://gerrit.wikimedia.org/r/107555 (owner: Matanya)
[19:58:27] (CR) Matanya: "important typo: should work without an out of scope lookup." [operations/puppet] - https://gerrit.wikimedia.org/r/107555 (owner: Matanya)
[20:02:51] (CR) Matanya: [C: 1] Add Niharika Kohli's blog to the English Planet [operations/puppet] - https://gerrit.wikimedia.org/r/108213 (owner: Odder)
[20:03:58] i just realized (..no pun intended) that i have fundamentally misunderstood virtual resources all this time
[20:04:57] hi ori
[20:05:06] hello
[20:05:32] do you know if the fiber cut was restored?
[20:07:13] i'm not sure. i think so, but some links may still be impacted.
[20:08:17] (CR) Tim Landscheidt: [C: -1] "I'm a bit confused, but I don't quite understand the current code to see if it is equivalent to the proposed change." [operations/puppet] - https://gerrit.wikimedia.org/r/107823 (owner: Matanya)
[20:08:59] ori: As I still have virtual resources to conquer, what was your misconception?
[20:10:06] i thought that it's a way for multiple manifests to express a dependency on a resource without running into duplicate definitions
[20:10:13] that is subtly wrong
[20:10:34] you can't include two classes that each specify @myvirtualresource { 'foo': }
[20:10:58] no, you can't
[20:11:00] you can have multiple classes declare @myvirtualresource { 'foo': }, but you are supposed to only include one
[20:11:14] and realize it from the class that depends on it
[20:11:30] That's ... strange.
[20:12:07] the intent is to allow for multiple implementations
[20:12:50] i agree that they are not very useful
[20:16:15] ori: the right term is barely useful
[20:16:56] it's amazing that i considered the puppet code in mediawiki-vagrant to be of a high quality but it is littered with voodoo virtual resources everywhere that were never doing what i thought they were doing
[20:17:12] cleaning it up now
[20:17:31] (CR) Matanya: "Honestly, i don't understand how it worked until now, without the ldap::server::config. the change here fixes the scope of the lookup var," [operations/puppet] - https://gerrit.wikimedia.org/r/107823 (owner: Matanya)
[20:30:29] scfc_de: further explained in the commit message for https://gerrit.wikimedia.org/r/108223
[22:08:50] PROBLEM - Puppet freshness on hooft is CRITICAL: Last successful Puppet run was Fri 17 Jan 2014 06:59:52 PM UTC
[22:21:01] (PS1) Matanya: beta: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/108289
[22:21:42] (CR) jenkins-bot: [V: -1] beta: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/108289 (owner: Matanya)
[22:22:42] (PS2) Matanya: beta: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/108289
[22:23:59] (CR) Matanya: "I have created : https://gerrit.wikimedia.org/r/#/c/108289/ in response to your suggestion." [operations/puppet] - https://gerrit.wikimedia.org/r/108041 (owner: Hashar)
[23:59:21] !log Updated kibana to bef3db2