[02:22:35] Zhaofeng_Li: It probably wouldn't be a good idea; we use the grid for resource allocation and gearman wouldn't play nice with it.
[03:31:30] Is the "Migration to eqiad" announcement at the top of [[Help:Tool Labs]] still relevant?
[04:01:17] Pathoschild: Probably not; we wanted to keep it up as long as it was plausible that someone had been away and had issues with their tools because of it, but I should expect that there are few enough of them left that individual help would suffice.
[04:01:56] Sounds good. I'll leave a see-also link at the bottom so we can find it if needed.
[04:03:06] Or just remove it, since it's in [[Category:Tool Labs]].
[04:03:07] Pathoschild: Remember to move the subpages - I had begun reorganizing the docs somewhat and much is now on subpages.
[04:04:29] Will do.
[04:06:15] But yeah, I agree the Help: namespace is a better home.
[04:16:32] All subpages moved and categorised.
[08:26:11] Wikimedia Labs / deployment-prep (beta): File upload area resorts to 0777 permissions for uploaded content - https://bugzilla.wikimedia.org/73206#c2 (Antoine "hashar" Musso (WMF)) We had system users created in LDAP already, bug 66575 for cxserver and bug 63329 for parsoid. Maybe we need to create i...
[08:28:21] Seems Google Chrome 41 no longer likes the SSL certificates you've issued for 4 years: http://i.imgur.com/BTyYah3.png
[08:31:55] ebraminio: complain to Chromium project
[08:32:18] ebraminio: that cert was issued just a year ago and we are going to migrate bugzilla soon
[08:33:45] ebraminio: hmmm interesting
[08:33:56] thanks for this
[08:34:38] I knew Chrome 41 was going to be more stringent on SHA1 certificates, I hadn't realized they were also going to do the > 39 months thing then
[08:44:59] bah
[08:45:11] paravoid: just sent a mail to ops, hadn't noticed you replied there :D
[08:46:58] thank you
[08:57:05] ebraminio: so the gist of it is, chrome 41 is due to be released in April
[08:57:16] and April is when the CA Browser Forum's criteria are tightened
[08:58:05] we'll fix this eventually, but our deadline is April, not November, so you might have to suffer for a while longer :)
[08:58:28] Interesting. Thank you :)
[08:59:06] ebraminio: and Faidon kindly filed an internal bug about it. You might want to file a public one in bugzilla so we can point end users to it :]
[08:59:18] one sure thing: that is a nice catch
[09:01:55] Thank you
[09:03:27] hashar: please don't
[09:03:31] having to sync two bugs
[09:03:47] well one is going to be internal (RT) isn't it ?
[09:03:57] it's fine
[09:04:03] we'll migrate it to phabricator eventually
[09:04:24] there is no "core-ops" in bugzilla mind you, never was
[09:16:16] paravoid: yeah so we end up filing public bugs on bugzilla and referring to the RT ticket :D
[09:16:53] no we generally don't do that
[12:26:10] Tool Labs tools / Erwin's tools: Erwin's tools: Broken layout due to shared.css 404 Not Found - https://bugzilla.wikimedia.org/71482 (Krinkle)
[13:10:23] can someone please re-add doctaxon to tools.giftbot? i accidentally removed them and now there's no one left in the group
[13:11:31] err, the user is named taxonbot
[13:13:57] and Doc Taxon in wikitech
[13:34:15] coren maybe, when you're there?
[13:54:59] gifti: Gimme a sec
[13:55:33] gifti: But not you?
[13:56:34] gifti: {{done}}
[14:06:22] Coren: many thanks
[14:29:23] I've written a service for translating links based on their interwikis. An example: http://tools.wmflabs.org/linkstranslator/?p=France&p=Germany&from=enwiki&to=dewiki I wrote that for my wiki gadgets and tools but hope it will be useful for someone else or even bots. It is done via nodejs and also temporarily caches them for better performance; it also resolves local wiki redirects.
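The service above can be tried directly from the command line; a minimal example using the URL given in the log (the shape of the response is not shown in the log, so treat the output format as unspecified):

    # ask linkstranslator to map the enwiki pages "France" and "Germany" to their dewiki titles
    curl 'http://tools.wmflabs.org/linkstranslator/?p=France&p=Germany&from=enwiki&to=dewiki'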
[15:59:02] Coren: A moment of your time?
[17:53:36] so, I am getting a 502 Bad Gateway nginx error on my node app that's running on labs
[17:53:39] any ideas?
[17:53:44] post requests work great
[17:53:52] but that happens for get requests
[17:54:10] which are nearly identical except for the get vs post thing.
[17:54:12] mvolz: heya! 502 is probably your webserver not responding...
[17:54:25] yeah well the thing is, the post requests work
[17:54:28] :P
[17:54:44] mvolz: let me tail the nginx logs, moment
[17:55:18] sweet thanks. are those possible for me to look at (i have no idea where they live)
[17:55:56] mvolz: can you make the request now?
[17:56:14] mvolz: they live on dynamicproxy-gateway.eqiad.wmflabs, not many people have direct access.
[17:56:30] mvolz: also what host are you hitting? also which URL?
[17:56:37] ok
[17:56:49] https://citoid.wmflabs.org/api?search=10.1038%2Fng.590&format=mwDeprecated
[17:57:44] 2014/11/10 17:56:29 [error] 12326#0: *176299388 upstream prematurely closed connection while reading response header from upstream, client: xxx.xxx.xxx.12, server: , request: "GET /api?search=10.1038%2Fng.590&format=mwDeprecated HTTP/1.1", upstream: "http://10.68.17.143:1970/api?search=10.1038%2Fng.590&format=mwDeprecated", host: "citoid.wmflabs.org"
[17:58:12] mvolz: so your node app is crashing in some fashion in the middle of processing the request, and just for GET
[17:58:39] weird. works on localhost and nothing interesting is making it to my logs :/
[17:58:56] thanks :)
[17:59:27] mvolz: yw! let me know if you want me to tail again
[18:00:29] gi11es: hey! Thanks for silencing the cron :) did you get all of 'em?
[18:01:02] nvm, there are interesting things in my logs, ha
[18:01:11] :D
[18:01:39] YuviPanda: as far as I know, yes
[18:01:50] gi11es: thanks! :) I'll poke again if I see more
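One way to chase a 502 like the one above is to bypass the dynamicproxy and hit the node process directly from inside the instance; a rough sketch, where the backend address and port come from the nginx log line quoted above and the log file path is only a guess:

    # reproduce the failing GET against the node app itself, skipping nginx
    curl -v 'http://10.68.17.143:1970/api?search=10.1038%2Fng.590&format=mwDeprecated'
    # watch the app's own output while repeating the request; an exception thrown
    # in the GET handler will show up here rather than in the proxy logs
    tail -f ~/citoid.log    # log location is a guess; running the app in the foreground works too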
[18:46:47] andrewbogott: Sorry, I had to go out to buy a window. What's up?
[18:46:56] Coren: errant baseball?
[18:47:23] Coren: I wrote to the ops list with details. Just looking to fob off this puppet/packaging problem that's driving me crazy.
[18:47:36] No, new wall - but the contractor was done way early and would have stalled waiting for the window otherwise.
[18:48:24] Ah, that thing is still beating you up? :-(
[18:49:53] It's the same manifest, although I think a different issue.
[18:49:54] Well, not sure.
[18:50:39] The thing with the config files that you diagnosed is now fixed. But now I've removed anything to do with config files and it's still broken
[18:50:59] o_O
[18:51:24] Well, most likely I'm making a dumb, obvious mistake :)
[18:51:27] I kind of hope so
[18:52:20] I don't see any dumb mistakes in there.
[18:52:39] (Nor smart ones, for that matter).
[19:20:18] hmm, my vagrant is stuck on trying to set up the NFS mount. anyone seen something like that before ?
[19:20:30] mount -o 'noatime,rsize=32767,wsize=32767,async' 10.11.12.1:'/Users/hartman/Development/vagrant/vagrant-math' /vagrant (sudo=true)
[19:20:33] DEBUG ssh: Sending SSH keep-alive...
[19:20:43] did it ask for sudo?
[19:21:04] yup
[19:21:13] thedj: It can take a while sometimes. On my OSX host I have seen it take up to 3min :(
[19:21:28] # Sometimes restarting the nfs server seems to wake it up
[19:21:54] o_0 not working at all then
[19:22:07] i dunno, i think it is confused with the interface it's trying to use...
[19:22:30] what host os and what version of Vagrant (vagrant --version)
[19:23:20] Vagrant 1.6.3 on yosemite
[19:23:40] * thedj runs apt-get upgrade...
[19:24:31] hmm, this seems to be more common...
[19:24:36] stackoverflow says: sudo rm -f /etc/exports
[19:24:37] The file will be recreated during the vagrant up process.
[19:25:28] If it keeps giving you fits, you can disable nfs with `vagrant config nfs_shares no`
[19:26:20] That will make our Vagrantfile config fall back to the provider's native share mode
[19:33:44] bd808: killing the vagrant lines in the host's /etc/exports did the trick indeed
[19:33:52] cool
[19:33:57] well. not really :)
[19:34:26] but it's nice that it boots normally now
[19:39:19] or not...
[19:40:18] no nfs it is then.
[19:42:06] Hello! I have an application on tools.wmflabs.org that is currently coming back "not found".
[19:42:45] What can I do to restore it? (I can log in, and see the scripts and my data in the directory; I gather there was an outage on the 6th but things are back up?)
[19:43:52] JMarkOckerbloom: tried: webservice start
[19:43:55] ?
[19:45:18] Haven't tried that yet; do I need to give it particular arguments, or just "webservice start"?
[19:46:25] you log in, then do 'become toolname', then run 'webservice status' (or start or stop)
[19:46:58] Thanks! Looks like it's back now.
[19:51:15] ah, now i understand why i couldn't connect to my instance...
[19:51:56] took lynx to figure it out :)
[19:52:05] File not found: /vagrant/mediawiki/skins/Vector/SkinVector.php
[19:52:12] Error decompressing http://localhost/wiki/Main_Page with zlib: incorrect header check
[20:05:28] In general, if the server goes down again like that, is there anything I can do to ensure my service comes back up with the server?
[20:05:50] JMarkOckerbloom: yes, use bigbrother.
[20:06:00] Coren: did the docs for bigbrother ever migrate to wikitech?
[20:06:40] i don't think so
[20:06:43] https://wikitech.wikimedia.org/w/index.php?search=bigbrother&title=Special%3ASearch&go=Go
[20:07:27] If someone could put up docs on using bigbrother on the tools server, I'd appreciate it.
[20:12:31] JMarkOckerbloom: there should be something in the mailing list archives
[20:15:17] one of the most annoying issues of librsvg was fixed recently, what should be done for updating the wikimedia package?
[20:15:44] Thanks! Found a note from July from Mac Pelletier; looks like if I create a ~/.bigbrotherrc file with the single line "webservice", it should renew service after restart.
[20:15:45] ebraminio: file a bug, point to new packages?
[20:16:02] (sorry, Marc Pelletier, not Mac)
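A minimal sketch of the bigbrother setup JMarkOckerbloom describes, run from a Tool Labs login session; "mytool" stands in for the actual tool name:

    become mytool                        # switch to the tool account
    echo "webservice" > ~/.bigbrotherrc  # the single line telling bigbrother to keep the webservice running
    webservice start                     # make sure the webservice is up right now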
[20:19:36] YuviPanda: I hope it was like submitting a patch for a wikimedia git somewhere
[20:19:44] *hoped
[20:19:56] ebraminio: kind of, but we also need to upload the new package
[20:22:57] YuviPanda: I think it is not packaged yet but the issue is one of the most annoying ones, which is fixed: https://bugzilla.gnome.org/show_bug.cgi?id=620923
[20:23:38] ebraminio: awww, that'll make things much harder
[20:26:06] ebraminio: still, do file a bug?
[20:27:16] YuviPanda: No, and it really really should. I will do so presently.
[20:27:29] ty
[20:34:49] Just recently the bug affected lots of SVGs uploaded on http://commons.wikimedia.org/wiki/Category:Material_design_icons I hope you fix it soon
[21:39:25] !log deployment-prep deployment-pdf02 is not responding to git-deploy for OCG
[21:39:32] Logged the message, Master
[21:40:16] !log updated OCG to version d9855961b18f550f62c0b20da70f95847a215805 (skipping deployment-pdf02)
[21:40:16] updated is not a valid project.
[21:41:32] !log deployment-prep updated OCG to version d9855961b18f550f62c0b20da70f95847a215805 (skipping deployment-pdf02)
[21:41:35] Logged the message, Master
[21:42:03] cscott: Any good error message from the failure?
[21:42:28] I just restarted the salt-minion process on pdf02 to see if that would help
[21:43:27] bd808: no, just:
[21:43:27] Repo: ocg/ocg
[21:43:27] Tag: ocg/ocg-sync-20141110-213043
[21:43:27] 1/2 minions completed fetch
[21:43:29] Details:
[21:43:29] i-000005d2.eqiad.wmflabs:
[21:43:31] fetch status: 128 [started: 3 mins ago, last-return: 3 mins ago]
[21:43:42] going to retry the sync now after your restart
[21:44:05] "status: 128" is the magic bit. I'll see if I can find out what that error code means in the source
[21:44:48] there were some dmesg lines on deployment-pdf02 complaining that the nfs server had gone down and then came back up, i was wondering if that's what might have killed salt-minion
[21:45:18] bd808: i just retried it, it's still failing with status 128.
[21:45:38] I don't see that code in the module -- https://github.com/wikimedia/operations-puppet/blob/production/modules/deployment/files/modules/deploy.py
[21:45:53] I'll run manually on pdf02 and maybe see the real problem
[21:48:06] cscott: "Command '/usr/bin/git fetch' failed with return code: 128"
[21:48:14] "output: error: object file .git/objects/13/46045ceeb701bff79d352feef7ddb1c1cab127 is empty"
[21:48:23] "fatal: loose object 1346045ceeb701bff79d352feef7ddb1c1cab127 (stored in .git/objects/13/46045ceeb701bff79d352feef7ddb1c1cab127) is corrupt"
[21:48:33] that's certainly strange
[21:48:41] filesystem corruption?
[21:48:47] I got that by running `sudo salt-call deploy.fetch 'ocg/ocg'` on pdf02
[21:48:54] maybe 'git gc' will take care of it, since it's a loose object?
[21:49:03] * bd808 can try
[21:49:23] bd808: i'm busy finishing the ocg and parsoid deploys, but when i'm done with that i can lend a hand to debug this further
[21:50:26] cscott: Looks like the local clone is jacked up. git-gc can't fix it.
[21:50:40] Probably needs to be blown away and resynced
[21:50:51] I'll leave that fun for you :)
[21:51:12] bd808: ok. pdf02 is the scratch copy of the service anyway.
[21:51:40] cscott: https://wikitech.wikimedia.org/wiki/Trebuchet#Troubleshooting may be of use when you try to rebuild it
[21:51:52] thanks
[22:35:07] bd808: ok, i rsync'ed .git from pdf01 to pdf02, everything looks happy now.
[22:35:19] cool
[22:35:29] you should !log it :)
[22:36:38] `git deploy status restart` still says:
[22:36:38] i-000005d2.eqiad.wmflabs: True
[22:36:38] i-00000396.eqiad.wmflabs: No status available
[22:36:55] though. that's pdf01 which has the issue. i've been just manually restarting pdf01 for a while.
[22:37:25] !log deployment-prep rsync'ed .git from pdf01 to pdf02 to resolve git-deploy issues on pdf02 (git fsck on pdf02 reported lots of errors)
[22:37:29] Logged the message, Master
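A rough reconstruction of the repair sequence described above, to be run on deployment-pdf02; the repository path and the rsync source are assumptions based on the instance names in the log:

    cd /srv/deployment/ocg/ocg                # assumed location of the Trebuchet checkout
    git fsck --full                           # confirms the empty/corrupt loose objects
    # git gc could not repair them, so copy the healthy .git over from pdf01 instead
    rsync -a deployment-pdf01.eqiad.wmflabs:/srv/deployment/ocg/ocg/.git/ .git/
    sudo salt-call deploy.fetch 'ocg/ocg'     # re-run the fetch that was failing with status 128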