[00:00:02] (that's kind of over-generalized, but you get the idea)
[00:02:39] yeah, it is weird. i have left rillke a msg on his talk page, i am sure he'll find a fix for this again :D
[00:07:13] (03CR) 10GWicke: [C: 031] Use timeout to kill hanging PHP processes [operations/puppet] - 10https://gerrit.wikimedia.org/r/131236 (owner: 10Aaron Schulz)
[00:52:34] (03CR) 10Ori.livneh: "PS3 renames the module from EditStream to Changes, prompted by Timo's germane observation that not all recent changes are edits, and chann" [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[02:12:50] PROBLEM - Disk space on virt0 is CRITICAL: DISK CRITICAL - free space: /a 3800 MB (3% inode=99%):
[02:15:46] !log LocalisationUpdate completed (1.24wmf2) at 2014-05-03 02:14:43+00:00
[02:15:58] Logged the message, Master
[02:20:50] PROBLEM - Disk space on virt0 is CRITICAL: DISK CRITICAL - free space: /a 3434 MB (3% inode=99%):
[02:27:45] !log LocalisationUpdate completed (1.24wmf3) at 2014-05-03 02:26:41+00:00
[02:27:50] Logged the message, Master
[03:01:01] RECOVERY - Disk space on virt0 is OK: DISK OK
[03:10:47] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat May 3 03:09:40 UTC 2014 (duration 9m 39s)
[03:10:54] Logged the message, Master
[05:36:17] (03CR) 10Ori.livneh: [C: 032] Use timeout to kill hanging PHP processes [operations/puppet] - 10https://gerrit.wikimedia.org/r/131236 (owner: 10Aaron Schulz)
[06:02:23] !log disabled puppet on osmium to test hhvm build
[06:02:29] Logged the message, Master
[08:29:20] PROBLEM - Puppet freshness on osmium is CRITICAL: Last successful Puppet run was Sat May 3 05:29:08 2014
[09:55:22] (03PS1) 10Odder: Add SVG logos for fifteen Wikisource wikis [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/131255 (https://bugzilla.wikimedia.org/52019)
[11:30:20] PROBLEM - Puppet freshness on osmium is CRITICAL: Last successful Puppet run was Sat May 3 05:29:08 2014
[14:31:20] PROBLEM - Puppet freshness on osmium is CRITICAL: Last successful Puppet run was Sat May 3 05:29:08 2014
[15:21:00] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 8 below the confidence bounds
[15:34:30] PROBLEM - MySQL Replication Heartbeat on db1022 is CRITICAL: CRIT replication delay 309 seconds
[15:34:51] PROBLEM - MySQL Slave Delay on db1022 is CRITICAL: CRIT replication delay 317 seconds
[15:42:50] RECOVERY - MySQL Slave Delay on db1022 is OK: OK replication delay 0 seconds
[15:43:30] RECOVERY - MySQL Replication Heartbeat on db1022 is OK: OK replication delay -1 seconds
[15:48:01] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 8 below the confidence bounds
[15:58:46] Any ideas why a domain that works in the U.S. might not work in France?
[15:59:03] I'm running curl -I against it and it responds nicely from tools-login.wmflabs.org
[15:59:30] But when I do the same thing from here, it responds with a 'couldn't resolve host' message
[15:59:34] wikisource.pl is the domain
[16:02:54] the other server is, I think, in Roubaix in France: 94.23.242.48 is the IP address
[16:20:34] (03CR) 10Krinkle: "I couldn't agree more. Especially considering we're not shutting down irc.wikimedia.org. This will start out as an experimental new servic" [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[16:26:15] <_joe_> twkozlowski: can you please describe again your problem?
[16:26:42] <_joe_> what curl are you performing exactly?
[16:27:51] sounds like a DNS caching issue, try to nslookup the host on the various machines
[16:29:33] <_joe_> hoo: or we may have some error in our geoip config (not that probable anyway)
[16:29:48] <_joe_> anyway, I wanted to check
[16:33:38] (03PS3) 10Nemo bis: Remove dead ULS configs after I49e812eae32266f165591c75fd67b86ca06b13f0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/115880
[16:37:14] (03PS4) 10Nemo bis: Remove dead ULS configs after I49e812eae32266f165591c75fd67b86ca06b13f0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/115880
[16:42:15] (03Abandoned) 10Nemo bis: Bring together all Translate/Language configurations and dependencies [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/115881 (owner: 10Nemo bis)
[16:50:53] (03CR) 10Merlijn van Deen: [C: 04-1] "Looks good to me overall. I have one issue on the Python code (the first inline comment) and two nitpicks (the second and third comment)" (033 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[16:54:00] RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected
[17:08:36] _joe_: sorry, been away
[17:09:04] _joe_: the domain wikisource.pl is unavailable to me here in Poland, and also in France, it appears
[17:09:36] _joe_: but it does respond to curl -I with HTTP 301 when I run the command from the login shell on tools-login.wmflabs.org
[17:09:39] <_joe_> twkozlowski: the curl you perform please :)
[17:10:12] curl -I wikisource.pl
[17:10:25] (03PS7) 10BryanDavis: Provision scap scripts using trebuchet [operations/puppet] - 10https://gerrit.wikimedia.org/r/129814
[17:10:38] <_joe_> mmmh it's simply not in the external DNS I'd say
[17:10:41] <_joe_> let me check
[17:11:09] _joe_: I'm logged in right now to an OVH-owned server in France, and curl -I doesn't work from here, either
[17:12:06] <_joe_> twkozlowski: the reason it does not work is that the A record for wikisource.pl returns empty from your dns
[17:12:33] <_joe_> which traces it up to ns{0,1,2}.wikimedia.org
[17:12:52] <_joe_> so, I see that from one of our hosts I do get an A record
[17:13:05] 208.80.154.224 is the answer I'm seeing
[17:13:09] which is a WMF server
[17:13:36] <_joe_> thus, this is a dns configuration
[17:13:59] <_joe_> I honestly know too little of our dns setup to know if this is intentional - I tend to assume it is.
[17:14:06] https://gerrit.wikimedia.org/r/#/c/126968/
[17:14:12] https://gerrit.wikimedia.org/r/#/c/126969/
[17:14:24] These are the two patched I did with mutante
[17:14:29] patches*
[17:14:39] <_joe_> twkozlowski: wait. there is something *very* wrong about my ISP's DNS
[17:14:54] <_joe_> twkozlowski: dig +trace wikisource.pl
[17:15:29] <_joe_> that works
[17:16:21] http://www.downforeveryoneorjustme.com/wikisource.pl
[17:16:38] (But it /is/ down for me :)
[17:16:55] <_joe_> twkozlowski: what DNS do you use on your servers where you see the problem?
[17:18:10] <_joe_> I just noticed I configured my dns to forward queries to google dns instead of my ISP's dns because they fiddle with responses. but any other DNS I query does not fail
[17:18:44] (03CR) 10BryanDavis: "Should we try something else with this or just remove the cherry-pick from beta and forget about it?" [operations/puppet] - 10https://gerrit.wikimedia.org/r/123444 (owner: 10Hashar)
[17:18:47] 8.8.8.8 at home and 213.186.33.99 on the server
[17:19:08] <_joe_> twkozlowski: again, 8.8.8.8 is failing
[17:19:33] <_joe_> and this is /googles/ fault
[17:20:04] <_joe_> I misinterpreted the result because we both were using that DNS
[17:20:08] (03PS4) 10Krinkle: Add 'rcstream' module for broadcasting recent changes over WebSockets [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[17:20:15] (03CR) 10Krinkle: "Reading through the documentation (it mentions "recent changes", and "RCFeed" from mediawiki a fair bit), I think it makes sense to stretc" [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[17:25:16] _joe_: Mhhm. I just changed my DNS to OpenDNS's 208.67.222.222 and indeed it works.
[17:32:20] PROBLEM - Puppet freshness on osmium is CRITICAL: Last successful Puppet run was Sat May 3 05:29:08 2014
[17:33:32] <_joe_> twkozlowski: told you, blame google, the new microsoft
[17:37:51] :-)
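[Editor's note: the comparison _joe_ does by hand above — dig +trace against the authoritative servers versus individual recursive resolvers — can be scripted. The sketch below is a hypothetical helper, not something from the log; the domain and resolver addresses are the ones mentioned in the conversation, and it assumes dig is installed.]

```python
#!/usr/bin/env python
# Hypothetical helper (not from the log): ask several resolvers for the same
# A record and print each answer, to see which resolver is returning empty.
import subprocess

DOMAIN = 'wikisource.pl'
RESOLVERS = [
    ('Google DNS', '8.8.8.8'),               # returned an empty answer here
    ('OpenDNS', '208.67.222.222'),           # worked for twkozlowski
    ('authoritative', 'ns0.wikimedia.org'),  # WMF nameserver, per dig +trace
]

for label, server in RESOLVERS:
    try:
        answer = subprocess.check_output(
            ['dig', '+short', '@' + server, DOMAIN, 'A']).decode().strip()
    except subprocess.CalledProcessError as err:
        answer = 'dig failed: %s' % err
    print('%-15s %-20s %s' % (label, server, answer or '(empty answer)'))
```

[An empty answer from one recursive resolver while the authoritative nameservers return 208.80.154.224 points at that resolver or its cache rather than at the zone itself, which matches the conclusion reached above.]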
[17:38:59] (03CR) 10Ori.livneh: Add 'rcstream' module for broadcasting recent changes over WebSockets (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[17:39:36] (03CR) 10Ori.livneh: Add 'rcstream' module for broadcasting recent changes over WebSockets (034 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[17:42:05] (03PS5) 10Krinkle: Add 'rcstream' module for broadcasting recent changes over WebSockets [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[17:42:34] (03CR) 10Krinkle: "Actually renamed the role file, previous patch only made the rename inside file contents." [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[17:42:35] Krinkle: yes, i was about to fix that ;)
[17:42:42] sorry
[17:42:50] nah glad you beat me to it
[17:43:20] the production redis cluster is password-protected so if we want to use it we should support password uris
[17:43:36] otherwise we can use a separate redis instance
[17:45:46] and tolerate a redis:// prefix
[17:45:50] (03CR) 10Krinkle: Add 'rcstream' module for broadcasting recent changes over WebSockets (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[17:46:20] that way you can just copy-paste the value from $wgRCFeeds['redis']['uri']
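[Editor's note: a minimal sketch of what "support password uris and tolerate a redis:// prefix" could look like on the Python side. The function name and defaults are made up for illustration; this is not the actual rcstream code, just the shape of the idea using the redis-py client.]

```python
# Accept the same redis:// URI that $wgRCFeeds['redis']['uri'] holds,
# including an optional password, and build a client from it.
import redis  # redis-py package

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


def redis_client_from_uri(uri):
    """e.g. redis_client_from_uri('redis://:s3kr3t@example-redis-host:6379/0')"""
    if '://' not in uri:
        uri = 'redis://' + uri  # tolerate a bare host:port value
    parsed = urlparse(uri)
    if parsed.scheme != 'redis':
        raise ValueError('expected a redis:// URI, got %r' % uri)
    return redis.StrictRedis(
        host=parsed.hostname or 'localhost',
        port=parsed.port or 6379,
        password=parsed.password or None,
        db=int(parsed.path.lstrip('/') or 0),
    )
```

[redis-py also ships redis.from_url(), which understands the same redis:// form, so another option is to pass the configured URI straight through to that.]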
[17:46:34] ori: consider the changeset unlocked.
[17:46:45] btw, I'm setting up mediawiki-vagrant for the first time at the moment
[17:46:47] gerrit mutex!
[17:47:04] Hoping it'll be possible to test this class locally.
[17:47:21] Do we have a connection with operations/puppet?
[17:47:46] Or at least a recommended workflow to import the class there.
[17:48:02] (or should I boot a labs instance and selfpuppetmaster it and then check it out with git there?)
[17:48:34] yeah, it's one of the things we're working towards
[17:48:49] Krinkle: labs + selfpuppetmaster is a better test for ops/puppet than vagrant at the moment
[17:49:01] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 3 below the confidence bounds
[17:49:06] bd808: not for that change, since it requires trusty
[17:49:13] ook
[17:49:15] Krinkle: you should edit Vagrantfile and s/precise/trusty/g
[17:49:22] does our vagrant one run trusty?
[17:49:25] Right..
[17:49:45] I guess that means I need to start over? Assuming that's not something vagrant can migrate
[17:49:47] it works with trusty now, but still uses precise by default, because there are a lot of software components that we haven't packaged yet for trusty
[17:49:53] no, sorry :(
[17:49:54] k
[17:50:25] ori: is 'vagrant destroy' enough? (before I change it to trusty and up again)
[17:50:28] You just need to `vagrant destroy; vagrant up`
[17:50:32] k
[17:54:36] Krinkle: i used this to generate fake edits locally for testing: https://gist.github.com/atdt/027eab6cf1db8e310f56
[17:54:46] so you don't have to keep going back to your dev wiki to make silly edits
[17:55:12] ori: I was planning on doing a setTimeout tail recursion using mw.Api from the browser console
[17:55:30] But this is closer to the source :)
[18:23:59] Krinkle: i updated the code and added additional files to https://gist.github.com/atdt/027eab6cf1db8e310f56
[18:24:07] that should be everything you need
[18:25:31] note that test.html has to be served via http, you can't just open it, so you can use "python -m SimpleHTTPServer 9292" in the directory where you save the file on your host, and then navigate to 127.0.0.1:9292
[18:25:49] ori: Interesting, patrolled is tinyint but bot is boolean
[18:25:52] something to look into
[18:26:07] Krinkle: ah, is it working?
[18:26:15] no, I'm not there yet
[18:26:17] just looking at the gist
[18:26:34] still building the linux box (could be done, but working on doc.wm.o at the moment)
[18:26:40] i would have preferred lowerInitialCamelCase too, rather than words_with_underscore
[18:26:52] for keys in the json object
[18:26:55] but the tinyint sounds like a big
[18:26:57] *bug
[18:27:00] yeah
[18:27:06] I mean the sql field is tinyint
[18:27:21] but should be boolean in the feed formatter
[18:27:33] at the very least consistent between patrolled/bot/minor
[18:29:04] ditto minor
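[Editor's note: to illustrate the tinyint-versus-boolean inconsistency flagged above — the real fix belongs in MediaWiki's feed formatter on the PHP side, but a consumer-side normalisation would look roughly like the sketch below. Only the field names (patrolled, bot, minor) come from the chat; everything else is illustrative.]

```python
# Illustrative only: coerce tinyint-style 0/1 flags in a change event to real
# booleans, so patrolled/bot/minor are consistent for consumers of the feed.
import json

FLAG_FIELDS = ('patrolled', 'bot', 'minor')


def normalise_change(raw_json):
    """Return the change event as a dict with boolean flag fields."""
    change = json.loads(raw_json)
    for field in FLAG_FIELDS:
        if field in change:
            change[field] = bool(change[field])
    return change
```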
[18:34:27] (03CR) 10Ori.livneh: Add 'rcstream' module for broadcasting recent changes over WebSockets (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/131040 (owner: 10Ori.livneh)
[18:35:54] just pick a random name, sartoris-style
[18:36:03] it'll help with the indecisiveness :P
[18:36:53] (just kidding, I prefer meaningful names)
[18:39:27] paravoid: heh :)
[19:28:01] RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected
[19:37:01] (03PS1) 10Hoo man: Remove $Id$ SVN leftover from db configuration [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/131276
[19:37:38] (03CR) 10Hoo man: [C: 032] "documentation only change" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/131276 (owner: 10Hoo man)
[19:37:45] (03Merged) 10jenkins-bot: Remove $Id$ SVN leftover from db configuration [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/131276 (owner: 10Hoo man)
[19:41:19] !log hoo synchronized wmf-config/ 'Documentation only change'
[19:41:27] Logged the message, Master
[20:33:20] PROBLEM - Puppet freshness on osmium is CRITICAL: Last successful Puppet run was Sat May 3 05:29:08 2014
[20:53:01] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data exceeded the critical threshold [500.0]
[21:06:01] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% data above the threshold [250.0]
[23:02:20] PROBLEM - Host ms-be3003 is DOWN: PING CRITICAL - Packet loss = 100%
[23:02:40] PROBLEM - Host cp3012 is DOWN: PING CRITICAL - Packet loss = 100%
[23:03:31] PROBLEM - Host nescio is DOWN: PING CRITICAL - Packet loss = 100%
[23:03:40] PROBLEM - Host bits-lb.esams.wikimedia.org is DOWN: PING CRITICAL - Packet loss = 100%
[23:03:51] PROBLEM - Host cp3022 is DOWN: PING CRITICAL - Packet loss = 100%
[23:05:21] hi.
[23:05:40] RECOVERY - Host cp3012 is UP: PING OK - Packet loss = 0%, RTA = 96.97 ms
[23:05:40] RECOVERY - Host cp3022 is UP: PING OK - Packet loss = 0%, RTA = 96.93 ms
[23:05:40] RECOVERY - Host ms-be3003 is UP: PING OK - Packet loss = 0%, RTA = 95.42 ms
[23:06:01] RECOVERY - Host nescio is UP: PING OK - Packet loss = 0%, RTA = 97.57 ms
[23:06:10] RECOVERY - Host bits-lb.esams.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 97.22 ms
[23:06:21] hrm no idea what that was about
[23:06:35] nescio did not reboot
[23:34:22] PROBLEM - Puppet freshness on osmium is CRITICAL: Last successful Puppet run was Sat May 3 05:29:08 2014