[00:00:57] legoktm: That's ... odd (I'm sure "tim" is not forwarded to my mail address :-), so the domain should be honoured).
[00:01:30] Let me see if I can find something in the logs.
[00:02:24] the email you sent just ended up in .wikipedia's too...let me make sure I don't have a crazy thunderbird filter on
[00:03:32] Nope, I can see it in the exim mainlog: "2014-07-25 00:01:50 1XASxK-0002vv-86 <= scfc@tools.wmflabs.org H=tools-login.eqiad.wmflabs [10.68.16.7] U=Debian-exim P=esmtp S=765 id=E1XASxK-0001Cf-4s@tools-login.eqiad.wmflabs"
[00:03:40] "2014-07-25 00:01:50 1XASxK-0002vv-86 => legoktm.wikipedia@gmail.com R=dnslookup T=remote_smtp H=gmail-smtp-in.l.google.com [74.125.193.27] X=TLS1.0:RSA_ARCFOUR_SHA1:16"
[00:03:50] "2014-07-25 00:01:50 1XASxK-0002vv-86 Completed"
[00:05:06] legoktm: Yep, it even translates legoktm@tim-landscheidt.de => legoktm.wikipedia@gmail.com; so our exim config seems to be broken?!
[00:05:13] lolwat
[00:05:38] is it the legoktm@ part?
[00:06:22] Probably. I'll file a bug; debugging exim isn't that much fun at this time of day here :-).
[00:09:36] well, as long as the emails get to me somehow I guess it works :P
[00:15:15] Wikimedia Labs / tools: Mail from the command line appears to ignore the domain name if the local part exists as a local user - https://bugzilla.wikimedia.org/68545 (Tim Landscheidt) NEW p:Unprio s:normal a:Tim Landscheidt "echo Test | mail -s Test tim@tim-landscheidt.de" gets delivered to...
[00:19:48] (PS1) Dzahn: add jouncebot changes to -operations channel [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/149201
[00:20:48] (CR) Mwalker: [C: 2] add jouncebot changes to -operations channel [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/149201 (owner: Dzahn)
[00:20:51] (Merged) jenkins-bot: add jouncebot changes to -operations channel [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/149201 (owner: Dzahn)
[00:41:10] Sandaru?
[00:41:13] are you here?
[01:31:58] Wikimedia Labs: centralauth_p is missing tables - https://bugzilla.wikimedia.org/66533#c5 (Marc A. Pelletier) Ignore my previous comment; that wasn't intended on this bug. I have to check whether there are suppression issues in those tables first. More news soon.
[01:32:42] Wikimedia Labs: Replicate centralauth.renameuser_status table to labs - https://bugzilla.wikimedia.org/68356#c1 (Marc A. Pelletier) That table is available (but empty atm).
[01:34:58] Wikimedia Labs: centralauth_p is missing tables - https://bugzilla.wikimedia.org/66533#c6 (Kunal Mehta (Legoktm)) There are suppressed usernames in all 3 tables. spoofuser won't have the exact username since it would probably be normalized though.
[01:36:28] Wikimedia Labs: Replicate centralauth.renameuser_status table to labs - https://bugzilla.wikimedia.org/68356#c2 (Kunal Mehta (Legoktm)) Yes, it maintains status of CentralAuth renameuser jobs and if there are no jobs running (most of the time) the table is empty.
[03:32:19] Coren: can we get a flag for jsub to just print out what the args it came up with are (and not actually run qsub)
[03:32:20] ?
[06:06:51] eh. Not possible to create instance with medium/large disk size from config dropdown?
[09:18:16] how much space does labs provide?? (to store 1000s of pdfs)
[09:45:09] !log deployment-prep rebasing puppet repo to get a ocg patch
[09:45:12] Logged the message, Master
[10:27:28] hashar: I can only create instance with small size. Not with medium/large.
[10:27:38] kart_: out of quota I guess
[10:27:54] oh ok :)
[10:28:06] https://wikitech.wikimedia.org/w/index.php?title=Special:NovaProject&action=displayquotas&projectname=deployment-prep
[10:28:10] Cores: 96/106
[10:28:10] RAM: 196608/262144
[10:28:10] Floating IPs: 5/5
[10:28:10] Instances: 36/40
[10:28:10] Security Groups: 2/10
[10:28:45] available: 10 cores 4 instances looooot of ram
[10:32:26] kart_: no clue what can happen
[10:32:48] hashar: this is for language team (the above issue)
[10:32:51] kart_: for the language project look at https://wikitech.wikimedia.org/w/index.php?title=Special:NovaProject&action=displayquotas&projectname=language
[10:32:57] that shows you the quotas for your project
[10:33:01] might want to delete some old instance
[10:39:17] hashar: sure. thanks!
[10:56:29] (CR) Qgil: "What is the status of this patch that now is more than a year old? It looks like it was close to be merged, but now it depends on an aband" [labs/toollabs] - https://gerrit.wikimedia.org/r/70110 (owner: Platonides)
[11:00:25] (CR) Qgil: "No reply and no reviews. Should this changeset be abandoned or is it still WIP?" [labs/toollabs] - https://gerrit.wikimedia.org/r/104397 (owner: Tim Landscheidt)
[13:03:57] Wikimedia Labs / tools: Mail from the command line appears to ignore the domain name if the local part exists as a local user - https://bugzilla.wikimedia.org/68545#c1 (Tim Landscheidt) On tools-login: | scfc@tools-login:~$ { echo 'Subject: Test, please ignore'; echo 'To: legoktm@tim-landscheidt.de';...
[13:09:57] Wikimedia Labs / tools: Mail from the command line appears to ignore the domain name if the local part exists as a local user - https://bugzilla.wikimedia.org/68545#c2 (Tim Landscheidt) And, indeed: | scfc@tools-mail:~$ sudo exim -d -bt legoktm@tim-landscheidt.de | [...] | >>>>>>>>>>>>>>>>>>>>>>>>>>>>...
[14:15:49] is it still possible to create new ubuntu-12.04-precise instances?
In my recent attempt I get a failed Puppet status
[14:15:58] Wikimedia Labs / tools: Mail from the command line appears to ignore the domain name if the local part exists as a local user - https://bugzilla.wikimedia.org/68545#c3 (Tim Landscheidt) NEW>ASSI According to http://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_default_configuration_fil...
[14:16:32] physikerwelt: Can you log into the instance, or is the console output in wikitech meaningful?
[14:16:47] I can login but I can not get root access
[14:18:08] What does /var/log/puppet.log say?
[14:18:13] the first error message I see in the console log is Jul 25 14:10:53 mathoid-puppet ntpd[5269]: bind(22) AF_INET6 fe80::f816:3eff:feaa:bdad%2#123 flags 0x11 failed: Cannot assign requested address
[14:19:12] Yep, that's a recurring error that can be safely ignored. But you mentioned a "failed Puppet status".
[14:19:42] I copied the output to https://gist.github.com/physikerwelt/995b5c77f6f3fc4198ab
[14:21:28] beta labs not answering at all right now http://en.wikipedia.beta.wmflabs.org/
[14:22:08] physikerwelt: The latter part looks alright, the first line fishy: "ould not parse configuration file: Certificate names must be lower case; see #1168". What's the missing part? (Often, Puppet will right itself after a second run, so waiting half an hour *might* solve this as well.)
[14:22:55] scfc_de: A "C" is missing
[14:24:10] scfc_de: I can use a trusty instance if I get a negative answer to my question about the ubuntu version used in production "for the mathoid developments I'm trying to test my puppet role. the manual on https://wikitech.wikimedia.org/wiki/Help:Self-hosted_puppetmaster says that I should use a precise instance. is that still up to date for new developments?"
[14:25:19] physikerwelt: I did create a Precise instance just ... yesterday? And it worked fine. Re what's used in production, don't know.
[14:25:19] scfc_de: Since I have no root access I can not trigger a new puppet run... Thus I'll wait another 30 minutes and see what happens
[14:25:33] (PS2) Andrew Bogott: Fix anchor links from status to list page [labs/toollabs] - https://gerrit.wikimedia.org/r/106173 (owner: Tim Landscheidt)
[14:26:26] scfc_de: did you fix it. Now everything is all right.
[14:26:40] (CR) Andrew Bogott: [C: 2] Fix anchor links from status to list page [labs/toollabs] - https://gerrit.wikimedia.org/r/106173 (owner: Tim Landscheidt)
[14:27:37] physikerwelt: Hmmm.
[14:27:46] Wikimedia Labs / deployment-prep (beta): beta labs not responding; API shows 503 from varnish - https://bugzilla.wikimedia.org/68574 (Chris McMahon) NEW p:Unprio s:normal a:None Neither http://en.wikipedia.beta.wmflabs.org/ nor http://en.wikipedia.beta.wmflabs.org/w/api.php are responding r...
[14:27:58] Wikimedia Labs / deployment-prep (beta): beta labs not responding; API shows 503 from varnish - https://bugzilla.wikimedia.org/68574 (Chris McMahon) s:normal>major
[14:28:21] (CR) Andrew Bogott: [C: 1] "This needs a by-hand rebase." [labs/toollabs] - https://gerrit.wikimedia.org/r/119882 (https://bugzilla.wikimedia.org/62710) (owner: Tim Landscheidt)
[14:30:12] scfc_de: I think my fault was that I was irritated by the puppet status failed... I think the way to go is just to wait
[14:30:32] andrewbogott: With the recent commit, all the stuff in labs/toollabs that touches changelog needs to be rebased; I'll do that in a bit.
[14:30:45] scfc_de: ok
[14:31:07] I'm trying to take care of Quim's request to unblock those patches, but it turns out I don't have merge rights anyway :)
[14:34:33] Coren: you mean you don't have merge rights either? Who does, then?
[14:35:09] No, I mean that I do but that I thought all of ops did.
[14:35:40] And that gerrit is an inscrutable black box and I'm not sure how to go about making this match reality.
:-)
[14:35:59] Coren: poke qchris and things will be alright :D
[14:36:02] I will look...
[14:37:19] * Coren goes to take a peek at those patches too.
[14:37:23] And now I'm looking at https://gerrit.wikimedia.org/r/#/admin/projects/labs/toollabs,access which is my favorite interface ever
[14:38:38] Hm, I guess 'push' == 'submit'?
[14:39:00] (CR) coren: [C: -2] "Time travel snapshots are not coming back; they were too unstable for long-term use. Backups of user data are coming back, but not in the" [labs/toollabs] - https://gerrit.wikimedia.org/r/76313 (owner: Platonides)
[14:40:16] dammit
[14:40:34] qchris: do you understand how to manage gerrit privs?
[14:42:46] Oh bah, version numbering woes.
[14:43:40] I need to do a lot of cherry picking and merges for this. "yeay".
[14:43:48] * Coren beats self up.
[14:44:38] I just added all 'ops-reviewers' as owner of the project. Which apparently does… nothing
[14:48:19] * qchris reads backscroll
[14:50:04] andrewbogott: You just want to get changes merged?
[14:50:25] andrewbogott: Add yourself to the labs-toollabs group.
[14:50:29] qchris: I want all Ops to be able to merge changes in that repo
[14:50:32] Ideally
[14:50:55] Then add that group to the labs-toollabs group
[14:51:01] https://gerrit.wikimedia.org/r/#/admin/groups/461,members
[14:51:06] ok...
[14:51:13] why did the thing I just did (ops-reviewers) not help?
[14:51:17] You can use ldap/.... to get ldap groups.
[14:51:26] Oh, maybe because that group is empty
[14:52:04] "ldap/ops" might be the group you're looking for.
[14:52:16] But Owners need not have "Submit" access.
[14:52:18] scfc_de: In theory you now have +2/merge in that repo. Do you?
[14:52:26] Oh, where is that managed then?
[14:52:36] At https://gerrit.wikimedia.org/r/#/admin/projects/labs/toollabs,access
[14:52:43] Last row.
[14:52:59] But please just use the labs-toollabs group.
[14:52:59] Ah, sure. OK.
[14:53:44] I added ldap/ops to labs-toollabs right now.
[14:53:50] Wait, so did I...
[14:54:14] well, anyway
[14:54:42] (PS4) coren: become: Add --help option [labs/toollabs] - https://gerrit.wikimedia.org/r/119882 (https://bugzilla.wikimedia.org/62710) (owner: Tim Landscheidt)
[14:54:44] Looks like I have 'submit' now, so that's good. Thanks qchris
[14:54:48] yw
[14:55:55] andrewbogott: That said, I have a metrick fsckton of hand merges to do, so beware. :-)
[14:56:13] Coren: Yeah, I'm staying out of it until I hear otherwise
[14:56:23] (CR) coren: [C: 2] become: Add --help option [labs/toollabs] - https://gerrit.wikimedia.org/r/119882 (https://bugzilla.wikimedia.org/62710) (owner: Tim Landscheidt)
[14:57:36] andrewbogott: At lunch ATM, 'll try later.
[14:57:46] no rush!
[14:59:00] andrewbogott: Did legoktm ping you about email output from echo on wikitech?
[14:59:50] andrewbogott: I haven't seen an echo email since my patch went live and legoktm had an idea of what the problem and fix was.
[15:00:07] (PS3) coren: Package webservice [labs/toollabs] - https://gerrit.wikimedia.org/r/122841 (https://bugzilla.wikimedia.org/66845) (owner: Tim Landscheidt)
[15:01:02] andrewbogott: It had something to do with $wgEchoEnableEmailBatch and adding a cron job but I don't know the specifics
[15:01:14] Wikimedia Labs / deployment-prep (beta): beta labs mysteriously goes read-only overnight - https://bugzilla.wikimedia.org/65486#c12 (Chris McMahon) This seems to have been fixed by https://gerrit.wikimedia.org/r/#/c/149052/ Thanks Sam!
[15:01:14] Wikimedia Labs / deployment-prep (beta): beta labs mysteriously goes read-only overnight - https://bugzilla.wikimedia.org/65486 (Chris McMahon) NEW>RESO/FIX
[15:02:00] bd808: um… no, this is the first I'm hearing of it
[15:02:10] (CR) coren: [C: 2] Package webservice [labs/toollabs] - https://gerrit.wikimedia.org/r/122841 (https://bugzilla.wikimedia.org/66845) (owner: Tim Landscheidt)
[15:02:17] bd808: wasn't echo just sending empty emails before?
[15:02:21] Or was it sort of working?
[15:02:44] It was "sort of" working in that it sent an email that said "you have a notification"
[15:02:50] woo
[15:02:54] Now no emails at all
[15:03:07] So now it's more broken, but also better :)
[15:03:16] yeah. :/
[15:03:39] Kunal thought that the emails were probably piling up in the db
[15:03:46] with no job to tell them to send
[15:04:13] Why would that have changed?
[15:04:57] Wikimedia Labs / deployment-prep (beta): beta labs not responding; API shows 503 from varnish - https://bugzilla.wikimedia.org/68574 (Greg Grossmeier) p:Unprio>Immedi s:major>blocke
[15:05:08] The supposition was that adding the batch formatting messages like I did may have changed the internal behavior of Echo.
[15:05:48] (PS5) coren: become: Make more user-friendly [labs/toollabs] - https://gerrit.wikimedia.org/r/147096 (https://bugzilla.wikimedia.org/68156) (owner: Tim Landscheidt)
[15:06:21] bd808: ok… I guess we wait for legoktm to show up and enlighten us...
[15:06:52] Bah, idiot merged a conflict
[15:08:15] Wikimedia Labs: [Regression] wikitech.wikimedia.org is sending empty Echo notification emails - https://bugzilla.wikimedia.org/53778#c8 (Bryan Davis) [2014-07-24T03:36:49] andrewbogott_afk: I haven't gotten any echo notification emails from wikitech since Sun, 20 Jul 2014 21:38:05 +0000. I'm more...
[15:08:41] (PS6) coren: become: Make more user-friendly [labs/toollabs] - https://gerrit.wikimedia.org/r/147096 (https://bugzilla.wikimedia.org/68156) (owner: Tim Landscheidt)
[15:09:56] andrewbogott: I poked him about it again yesterday morning and he said he'd put something in bugzilla but I can't find it. I pasted some old irc log info into bug 53778.
[15:10:07] ok, thanks
[15:10:08] (CR) coren: [C: 2] become: Make more user-friendly [labs/toollabs] - https://gerrit.wikimedia.org/r/147096 (https://bugzilla.wikimedia.org/68156) (owner: Tim Landscheidt)
[15:10:12] I'll poke him again at some point today
[15:14:41] (PS2) coren: Fix Lintian errors in misctools man pages [labs/toollabs] - https://gerrit.wikimedia.org/r/122628 (owner: Tim Landscheidt)
[15:16:24] (PS3) coren: Fix Lintian errors in misctools man pages [labs/toollabs] - https://gerrit.wikimedia.org/r/122628 (owner: Tim Landscheidt)
[15:21:28] coren .. an update ?
[15:31:30] (CR) coren: [C: 2] Fix Lintian errors in misctools man pages [labs/toollabs] - https://gerrit.wikimedia.org/r/122628 (owner: Tim Landscheidt)
[15:33:58] (PS2) coren: Simplify toolwatcher [labs/toollabs] - https://gerrit.wikimedia.org/r/122094 (owner: Tim Landscheidt)
[15:36:19] (CR) coren: [C: 2] Simplify toolwatcher [labs/toollabs] - https://gerrit.wikimedia.org/r/122094 (owner: Tim Landscheidt)
[15:38:54] (PS3) coren: Retrieve $PACKAGE_VERSION from debian/changelog [labs/toollabs] - https://gerrit.wikimedia.org/r/106747 (owner: Tim Landscheidt)
[15:40:11] (CR) coren: [C: 2] Retrieve $PACKAGE_VERSION from debian/changelog [labs/toollabs] - https://gerrit.wikimedia.org/r/106747 (owner: Tim Landscheidt)
[15:41:29] (PS4) coren: Fix Lintian errors in jobutils man pages [labs/toollabs] - https://gerrit.wikimedia.org/r/106648 (owner: Tim Landscheidt)
[15:42:38] (CR) coren: [C: 2] Fix Lintian errors in jobutils man pages [labs/toollabs] -
https://gerrit.wikimedia.org/r/106648 (owner: Tim Landscheidt)
[15:44:09] (PS5) coren: Work around pbuilder not properly setting $USER [labs/toollabs] - https://gerrit.wikimedia.org/r/106283 (owner: Tim Landscheidt)
[15:49:05] petan: hi, we restarted wmbot yesterday which fixed an issue where it would not join new channels anymore, and also updated the docs, because the user it runs as changed to "wm-bot" (from wmib)
[15:49:31] after that it worked again, now i hear "it has a funny name in #mediawiki" but not sure if that is caused by it
[15:49:49] !log deployment-prep Restarted logstash on deployment-logstash1
[15:58:17] (CR) coren: [C: 2] Work around pbuilder not properly setting $USER [labs/toollabs] - https://gerrit.wikimedia.org/r/106283 (owner: Tim Landscheidt)
[16:00:07] (PS5) coren: Fix build and run-time dependencies [labs/toollabs] - https://gerrit.wikimedia.org/r/106281 (owner: Tim Landscheidt)
[16:00:08] !log deployment-prep udp2log events not being sent from deployment-bastion to deployment-logstash1
[16:00:39] !log deployment-prep Stopped udp2log and started udp2log-wm with no apparent effect
[16:02:35] (CR) coren: [C: 2] Fix build and run-time dependencies [labs/toollabs] - https://gerrit.wikimedia.org/r/106281 (owner: Tim Landscheidt)
[16:02:56] hi
[16:03:11] I had a question about tools
[16:03:37] so when I ssh to my tools account, I can see cgi-bin, public html, some logs, and any folders I created myself
[16:03:50] but from the web I can only see the contents of public_html right?
[16:04:04] so if I wanted to make part of my backend php private
[16:04:28] could I put the file in one of my self created folders in the project's root directory
[16:04:45] and then import it in main.php in the public_html folder?
[16:04:48] like is that possible?
[16:04:56] kangaroopower: Keeping in mind that any code running on tools must be open source.
[16:05:26] kangaroopower: What you suggest is generally the way things like credentials, etc are stored.
[16:05:35] Most of my code is- like 99% is. This is just managing hash generation to keep track of which user is which
[16:05:49] kangaroopower: You also want to make sure file permissions aren't being overly permissive
[16:06:00] like with chmod?
[16:06:05] Right
[16:06:09] so just do chmod 770 on a file
[16:06:24] and then nobody can access it from the web, even if it's in public_html?
[16:06:38] !log deployment-prep `tcpdump -n udp dst port 8324` shows packets leaving deployment-bastion for deployment-logstash1
[16:06:41] Well, 770 is probably not what you want since it also makes it executable; but chmod o= will make sure only you and your tool have access.
[16:07:03] No, if it's in public_html then the user that reads it is your tool's.
[16:07:06] yeah- 660 would probably make more sense right?
[16:07:45] wait even if I chmod a file to be 660, or something and then put it in public_html, a user could still see it?
[16:09:09] (PS2) coren: Make dir a normal fd. [labs/toollabs] - https://gerrit.wikimedia.org/r/70110 (owner: Platonides)
[16:09:45] (CR) coren: "Rebased on HEAD" [labs/toollabs] - https://gerrit.wikimedia.org/r/70110 (owner: Platonides)
[16:10:42] (PS3) coren: Make dir a normal fd. [labs/toollabs] - https://gerrit.wikimedia.org/r/70110 (owner: Platonides)
[16:12:40] (PS2) coren: Unreviewed changes to rmtool [labs/toollabs] - https://gerrit.wikimedia.org/r/104397 (owner: Tim Landscheidt)
[16:13:15] coren?
[16:13:17] (CR) coren: [C: 2] Make dir a normal fd. [labs/toollabs] - https://gerrit.wikimedia.org/r/70110 (owner: Platonides)
[16:13:41] kangaroopower: Well, the web server can still access it; so it'll do its thing (execute it, show it, etc)
[16:13:48] bblack: beta networking seems borked.
I'm wondering if it is because of https://gerrit.wikimedia.org/r/#/c/146091/
[16:14:14] oh ok
[16:14:31] so what does chmod o do?
[16:15:12] chmod o= makes [o]ther rights = "" nothing.
[16:15:12] bd808: that was preceded by a step-1 commit that ensured => absent
[16:15:25] is there some host I can look at?
[16:15:33] hmmm...
[16:15:53] http://en.wikipedia.beta.wmflabs.org/ seems dead.
[16:16:16] https://gerrit.wikimedia.org/r/#/c/146090/
[16:16:22] yeah, I get a varnish error after a while
[16:16:28] bblack: and udp2log packets are leaving deployment-bastion but not getting to deployment-logstash1
[16:16:38] * greg-g waits for it to timeout again to verify it
[16:16:42] ^ that went in like 9 days ago and supposedly disabled the nat rules, the step 2 today was just cleanup
[16:16:49] so if my file was named security.php
[16:16:58] The two failures may be unrelated
[16:16:59] what would be the command I'd use on it
[16:17:03] like in full
[16:17:06] bd808: could it have been delayed due to that "puppet not running because /var was full" issue?
[16:17:07] sorry im new to this
[16:17:23] greg-g: I hadn't heard about that
[16:17:27] * greg-g finds
[16:17:28] scfc_de: The rmtool stuff should probably be abandoned; the thing needs a rewrite given all the changes in web services, databases, etc.
[16:17:55] * bd808 tries to narrow down when logstash traffic died
[16:18:38] Coren: Probably, if petan doesn't take it up.
[16:18:39] bd808: nvm, sorry, I'll step out, it was re Jenkins: https://bugzilla.wikimedia.org/show_bug.cgi?id=68254
[16:18:59] bblack, greg-g: The last event to be relayed via udp2log was at 2014-07-25T14:45:04.835Z
[16:19:20] just a sec, and I'll get back to this
[16:21:35] 2014-07-25T08:08:00Z should have been about when the step2 natfix removal went in
[16:21:57] well, that's when it merged on palladium. labs merges are manual right?
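The "what would be the command ... in full" question about security.php gets no direct reply in the log. A minimal sketch of what `chmod o=` does to such a file follows; the scratch directory and the starting mode 664 are assumptions made only so the demo has a known starting point, and on Tool Labs the file would sit in the tool's own directory instead.

```shell
#!/bin/sh
# Scratch directory so nothing real is touched (assumption for this demo).
workdir=$(mktemp -d)
file="$workdir/security.php"
: > "$file"

# Start from a known mode: owner and group read/write, others read-only.
chmod 664 "$file"
mode_before=$(ls -l "$file" | cut -c1-10)

# Clear every permission bit for "other"; owner and group bits are untouched.
chmod o= "$file"
mode_after=$(ls -l "$file" | cut -c1-10)

echo "before: $mode_before"
echo "after:  $mode_after"

# Remove the scratch directory again.
rm -rf "$workdir"
```

As noted in the conversation, this only keeps other shell users out. A file under public_html is still read by the web server running as the tool's own user, so the permission bits alone do not hide it from the web.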
[16:23:14] bblack: beta pulls in the prod git branch once an hour'ish
[16:23:51] * bd808 goes to look at merge log on deployment-salt
[16:24:57] bblack: Confirmed that we pulled that patch in beta at 2014-07-25T08:17:03
[16:25:29] So likely not that if things kept working for several more hours
[16:25:50] hmmmm
[16:26:21] bblack: Can you look at deployment-cache-text02 and see if you can tell anything about why varnish is timing out?
[16:26:38] * bd808 keeps looking at puppet merge history
[16:26:58] lookin
[16:30:44] varnish is getting timeouts from the appserver backend, which doesn't respond to even e.g. GET /
[16:30:55] do you know offhand what machine that is?
[16:31:20] deployment-mediawiki01 and deployment-mediawiki02
[16:31:44] * bd808 looks at https://gerrit.wikimedia.org/r/#/c/148098/
[16:32:20] Apache seems borked on deployment-mediawiki01
[16:32:33] `curl http://localhost/` not responding
[16:33:13] deployment-mediawiki02 as well
[16:33:33] my guess is something to do with it running HHVM?
[16:33:38] apache not even running....
[16:33:44] I'm looking at 01
[16:35:14] I restarted hhvm + apache2 services on 01, but things still look borked
[16:35:22] hhvm had a bunch of zombie shellscript children :/
[16:35:44] The apache error logs on 02 look weird
[16:35:50] no messages for days
[16:35:59] and the last was "shutting down"
[16:36:04] I think regular text traffic only goes to 01, but 02 is used for others like bits
[16:36:18] * bd808 smashes things with kill
[16:38:03] http://en.m.wikipedia.beta.wmflabs.org/wiki/Main_Page loads for me now
[16:38:23] (+ non-m)
[16:38:25] yup.
[16:38:45] unless you saw something else interesting, I'm going with "HHVM died"
[16:38:58] !log deployment-prep Killed apache2+hhvm and restarted on deployment-mediawiki0[12]
[16:40:25] bblack: pstree showed "hhvm─┬─271*[sh]"
[16:40:38] Which seems non-optimal
[16:41:05] yeah mine had hundreds of:
[16:41:07] apache 13967 2706 0 Jul24 ?
00:00:00 [sh]
[16:41:13] where 2706 was the hhvm main proc
[16:41:19] ori: hhvm weirdness? Do anything to that in beta lately?
[16:41:42] bblack: That sounds like it would match the pstree output
[16:42:14] That would be forking for lua I think. the in process lua was turned off
[16:42:39] And... I think I saw something about disabling a timeout on it or something
[16:42:57] ori: lua sh running forever under hhvm?
[16:43:15] * bd808 says his name three times: ori ori ori
[16:43:38] bblack: sorry for blaming you first off
[16:43:46] but thanks for the help
[16:43:50] it's not unusual for me to break betalabs, it's ok :)
[16:44:53] My first thought was network because of the udp2log stuff I was seeing too.
[16:45:38] well I guess if no requests are being processed there's no logs either
[16:45:38] I forgot to check that there was gas in the tank before tearing into the electrical system
[16:46:09] yeah. Although there should be jobs I would think. I'll keep looking at that
[16:46:27] break betalabs >> break prod. :-)
[16:47:24] Yeah beta is for breaking, but it needs watching and fixing soon after :)
[16:49:48] * bd808 sees traffic in logstash now that apaches are serving
[16:51:31] !log
[16:51:54] Where is the dam bot that's supposed to update SAL for me?
[16:54:02] bd808: hhvm?
[16:54:27] thanks bblack :)
[16:55:25] greg-g: Looks like it
[16:56:08] I saw something in backscroll of #-core about ori changing hhvm config earlier. I'll go see if the timeline matches up
[16:56:24] Also logmsgbot is not here and that makes me sad
[16:56:45] I copy pasted a bunch of !log from me into SAL
[16:57:23] greg-g: [19:23] < ori> OK, I merged the config change for Labs, so we'll probably know within the next hour or so if we have additional bugs on our hands
[16:57:35] hah
[16:58:03] 19:23 what time?
[16:58:11] MDT
[16:58:17] * greg-g nods
[16:58:29] so 18:23 in SF
[16:58:33] * greg-g nods
[16:58:45] UTC-6
[16:58:56] My dumb irc client won't show utc stamps.
I need to patch that
[16:59:34] /script exec $ENV{'TZ'}='UTC';
[16:59:41] so that puts it about 1:22 before the last event to udplog
[17:00:13] so ... in a hour we might be hosed again?
[17:00:20] probably :)
[17:01:15] !log
[17:01:15] * greg-g sets timer
[17:01:26] who is the keeper of that bot?
[17:01:44] maybe it runs on hhvm
[17:01:50] :P
[17:02:01] what's the name? wm-bot?
[17:03:02] labs-morebots apparently
[17:06:43] From an old SAL entry -- to restart morebots: ssh to tool labs, become morebots, then: qdel $(qstat | grep production | cut -d' ' -f 1) ; sleep 5 ; jstart -N production /usr/lib/adminbot/adminlogbot.py --config ./confs/production-logbot.py
[17:08:32] it's allegedly documented at https://wikitech.wikimedia.org/wiki/Morebots
[17:09:39] Where do I find a list of people associated with a tool? The tools side of labs is still a mystery to me
[17:10:34] http://tools.wmflabs.org/
[17:10:38] YuviPanda: Can you restart morebots or get it to join here again at least?
[17:11:04] lol, i'm a maintainer as well ?
[17:11:13] but ori is as well:)
[17:11:33] bd808: oh?
[17:11:35] * YuviPanda reads backscroll
[17:11:53] mutante: want to do it or should I take a hsot?
[17:11:53] *shot
[17:12:13] YuviPanda: please do, i'm surprised i'm listed as maintainer
[17:12:18] :D
[17:12:30] i dont think i ever used "become morebots" or "become anything" in toollabs
[17:13:00] Wikimedia Labs / deployment-prep (beta): beta labs not responding; API shows 503 from varnish - https://bugzilla.wikimedia.org/68574#c1 (Greg Grossmeier) p:Immedi>High s:blocke>major Bryan kicked HHVM and things are back (just slow for me). But still investigating. 16:57 < bd808> greg-...
[17:14:30] bd808: ^ there you go
[17:14:38] !log tool.morebots restarted labs-morebot
[17:14:39] tool.morebots is not a valid project.
[17:14:41] YuviPanda: ty
[17:14:45] !log morebots restarted labs-morebot
[17:14:45] morebots is not a valid project.
[17:14:49] !log tools-morebots restarted labs-morebot
[17:14:49] tools-morebots is not a valid project.
[17:14:55] !log local-morebots restarted labs-morebot
[17:14:55] local-morebots is not a valid project.
[17:15:04] !log local.morebots restarted labs-morebot
[17:15:04] local.morebots is not a valid project.
[17:15:09] !log tools.morebots restarted labs-morebot
[17:15:10] Logged the message, Master
[17:15:12] finally
[17:15:19] :)
[17:15:26] !log deployment-prep Morebots is back!
[17:15:28] Logged the message, Master
[17:15:37] !log logging it works
[17:15:37] logging is not a valid project.
[17:16:14] good to know that they are "tools.foo"
[17:17:29] Wikimedia Labs / deployment-prep (beta): beta labs not responding; API shows 503 from varnish - https://bugzilla.wikimedia.org/68574#c2 (Bryan Davis) The last event seen in logstash was at 2014-07-25T14:45:04.835Z. Ori's irc message would have been around 2014-07-25T01:23Z.
[17:28:36] bd808: I think the clock goes the other way right?
[17:28:50] 13:23Z rather than 01:23Z
[17:29:52] It's 17:29Z now and 11:29 local. That message was from 7pm last night in my timezone
[17:29:54] oh, no, you're right :P
[17:30:13] I cheated and had gcal tell me
[17:34:13] Wikimedia Labs / deployment-prep (beta): beta labs not responding; API shows 503 from varnish - https://bugzilla.wikimedia.org/68574#c3 (Bryan Davis) In apache error logs I see lots and lots and lots of: [Fri Jul 25 14:45:59.516788 2014] [proxy_fcgi:error] [pid 17215] (70014)End of file found: [clien...
[17:43:58] Wikimedia Labs / deployment-prep (beta): beta labs not responding; API shows 503 from varnish - https://bugzilla.wikimedia.org/68574 (Ori Livneh)
[18:01:58] one question- is it possible to use an editor other than vim for editing files when ssh'd from terminal
[18:07:19] yeah there are lots of editors usually, what type do you prefer?
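The 19:23 MDT versus UTC back-and-forth above can be checked mechanically. This is a hedged sketch that assumes GNU `date` (the `-d` option; BSD and macOS `date` use different flags) and assumes the message date was 2014-07-24, inferred from "7pm last night" in the log; the variable names are made up for illustration.

```shell
#!/bin/sh
# Ori's message: 19:23 MDT, i.e. UTC-6 (date 2014-07-24 is an assumption, see lead-in).
msg_utc=$(date -u -d '2014-07-24 19:23:00 -0600' '+%Y-%m-%dT%H:%MZ')
msg_epoch=$(date -u -d '2014-07-24 19:23:00 -0600' '+%s')

# Last event seen in logstash, truncated to whole seconds.
last_epoch=$(date -u -d '2014-07-25 14:45:04 UTC' '+%s')

# Gap between the two, split into whole hours and leftover minutes.
gap=$(( last_epoch - msg_epoch ))
gap_h=$(( gap / 3600 ))
gap_m=$(( gap % 3600 / 60 ))

echo "config change announced at $msg_utc"
echo "gap to last event: ${gap_h}h ${gap_m}m"
```

Under those assumptions the message lands at 01:23Z, which agrees with the figure in the bug comment, and the gap to the last logstash event is a bit over thirteen hours rather than about one.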
[18:07:32] and i even looked it up but he had no patience
[18:07:35] nanon
[18:07:38] nano
[18:07:42] oh yeah
[18:16:17] ori: bd808: struggling with HHVM arent you ?
[18:16:55] AH01067: Failed to read FastCGI header <-- I guess that is hhvm segfaulting / dying whatever isn't it ?
[18:17:33] hashar: It's just moving pretty fast. Having Brett in the office meant pushing faster to full web use.
[18:17:45] ah yeah Brett
[18:17:46] awesome
[18:17:58] We wanted to get as many crashes as we could to happen while we had his attention :)
[18:18:04] on my side (and Krinkle|detached ) we need to get Jenkins slaves running Trusty
[18:18:42] Yeah that will be helpful for sure.
[18:18:50] also I noticed some bug about libboost and luasandbox. Maybe our luasandbox expect an older libboost than the one in trusty
[18:19:06] that sounded like something got removed in libboost version provided by Trusty
[18:19:13] but then anomie/maxsem/tim would know better
[18:19:38] Yeah. i'm staying out of c/c++ land on this :)
[18:20:00] I'm supposed to be working on SUL features anyway :P
[18:20:14] I believe HHVM has higher priority :d
[18:20:40] Depends on who you ask I think. Dan may not agree. :)
[18:20:58] let's take a vote?
[18:21:15] * bd808 just wants everyone to get along
[18:21:25] we have been talking about PHP 5.4 / hhvm for something like 2 years
[18:21:30] (probably less)
[18:21:38] And SUL for ... 10?
[18:21:39] and we all know it is going to be a huge step forward for our infra
[18:21:52] so SUL can wait an additional quarter
[18:21:57] just my opinion though
[18:21:58] or
[18:22:07] we could double mw/core team :D
[18:22:20] (or fire us all and hire more productive folks instead)
[18:22:29] And make it all twice as late?
[18:22:43] hehe
[18:22:58] * bd808 believes that 9 women can't make a baby in a month
[18:23:26] bd808: Have faith in science
[18:23:34] though after 9 months you roughly have 9 babies
[18:23:49] anyway back on subject.
I noticed that hhvm has error logs pointing to /var/log/hhvm/error.log [18:23:56] wondering whether it supports logging to syslog [18:24:15] or on beta, we can point to /data/project/log/ to have a central place for all hhvm instances to report to [18:24:16] I think it does. We need to get a handle on logging for it for sure [18:24:38] filing a bug for it [18:24:48] "make HHVM log to a central place" [18:25:07] This is related -- https://bugzilla.wikimedia.org/show_bug.cgi?id=68459 [18:25:54] I think ori had a plan of for that involving a django app that correlates logs [18:26:08] wow cna't type [18:27:30] ah thanks [18:27:32] commented on it [18:27:46] Wikimedia Labs / deployment-prep (beta): HHVM crash logs need to go somewhere more visible than /tmp on the apache hosts - https://bugzilla.wikimedia.org/68459#c1 (Antoine "hashar" Musso) The HHVM instances log to /var/log/hhvm/error.log . We should get all the logs centralized at some place, for examp... [18:28:12] bd808: also it seems hhvm crashes but is still up :/ so puppet does not restart it [18:28:15] which is unfortunate [18:29:47] !log deployment-prep Added twentyafterfour and several other WMF staff to under_NDA sudo group [18:29:50] Logged the message, Master [18:30:17] Hello, is there anybody? [18:31:04] I need some help, please. [18:31:20] !ask | Plenz [18:31:20] Plenz: Hi, how can we help you? Just ask your question. [18:31:40] @ask [18:31:43] bah [18:31:53] user63: is wmbot, huh [18:31:57] Yeah, it is [18:31:59] !help | bd808 [18:31:59] bd808: !documentation for labs !wm-bot for bot [18:32:06] * hashar grins [18:32:08] My project osm4wiki stopped working for unknown reasons. And I am on holiday, all my passwords and access addresses are at home. [18:32:27] ? [18:32:33] I do not feel able to repair my project right now. [18:33:24] Is there an admin who can simply restart my project? [18:33:45] Plenz, is osm4wiki a tool on toollabs, or a labs project?
[18:33:58] it is a tool on bastion [18:34:35] Plenz: 'on bastion' doesn't make any sense, the bastions are only used to provide access to other projects. [18:34:50] The address is https://tools.wmflabs.org/osm4wiki/ [18:36:03] Sorry, I never understood the "bastion" things completely. I transferred my project from my private server to tools, and it worked for many months. [18:36:05] OK, and what do you typically do to restart the project? [18:36:57] I started it once, many months ago, and I don't remember how I did it. I am on vacation, I have only my notebook, and all addresses are at home. [18:37:46] kangaroopower: nano [18:38:02] i can't choose my own editor? [18:38:41] kangaroopower: you can suggest new patches to install one [18:38:53] where and how? [18:39:08] kangaroopower: but you can't just install software like that [18:39:12] kangaroopower: gerrit.. one sec [18:39:59] Or the editor you want may already be present. What is it? [18:40:16] Plenz: is that any better? [18:40:27] sublime text lol. I like using a mouse [18:40:32] kangaroopower: modules/toollabs/manifests/dev_environ.pp [18:40:55] thnx mutante [18:41:14] you will see a list of package names and bug numbers there [18:41:45] you can make a new bug like the others and ask for it [18:41:52] and/or just submit an actual patch to the file [18:42:24] Now my internet connection was interrupted [18:42:47] so I go to toollabs [18:42:49] on gerrit [18:42:57] and then to modules in the toollabs repo? [18:43:50] kangaroopower: do you know how to git clone stuff? [18:44:07] I think I need PuTTY and I have to change many settings, but I don't remember where I found all those instructions [18:44:25] @mutante yes [18:44:27] mutante: sublime text won't really work, it, uh, requires a windowing system and stuff :) [18:44:28] Sorry, I really remember nothing :( [18:44:37] Plenz, I restarted your web service, is that not what you needed? [18:44:48] YuviPanda: i was waiting for that part to be sorted out on gerrit .
but heh [18:44:53] mutante: tch tch [18:44:59] I did all these things many months ago and I have not touched it for a long time. [18:45:05] @Yuvi, in the future would it be possible to use editors outside of terminal? [18:45:29] kangaroopower: sadly, not that I can think of. You can, however, use http://wbond.net/sublime_packages/sftp [18:46:50] @YuviPanda and I'll be able to access tools from there and edit my project files? [18:47:14] kangaroopower: should, yeah. I haven't tried it myself (I don't use Sublime), but give it a shot? :) [18:47:21] ok cool [18:47:23] thanks! [18:47:29] :) [18:48:03] oh- one last question. I asked this before but got confused when I got into terminal [18:48:07] Plenz: I'm sorry, I do not understand what your question is now. I restarted your web service, now https://tools.wmflabs.org/osm4wiki/ displays different behavior from before. [18:48:25] so if I have a file that's privatestuff.php [18:48:39] and I want to make it so that my main.php file in public html can access it [18:48:40] Yes it works again, THANKS!!! [18:48:49] but other people can't see it [18:48:52] how would I do that [18:49:18] I hope it will not stop again for the next 2 weeks, then I am at home and I can take care of it myself [18:50:02] For example this link shows it: https://tools.wmflabs.org/osm4wiki/cgi-bin/wiki/wiki-osm.pl?article=Zeche_Zollern&section=Koordinaten&project=de [18:50:11] Plenz: Ok! It should be stable, barring any major tools upset. [18:51:00] Thank you again. In case of trouble I will join the chat again. [18:53:35] Yep, always happy to help :) [18:54:12] OK, bye and have a nice evening [20:34:24] well done! [21:51:37] andrewbogott: hi, do you have some time to look into the email thing? [21:51:48] I think you need to run the processEmailBatch.php script [21:51:56] legoktm: sure. [21:52:03] Can you catch me up on why you think it worked before and then stopped?
[21:52:51] sure [21:53:05] so in https://github.com/wikimedia/mediawiki-extensions-OpenStackManager/commit/fa7ed40d2140ba00728fdb998b2ce14a9d97e8cd bd808 added the email-body-batch-message and email-body-batch-params options [21:53:19] I think that enabled batching support for email notifications, which requires a separate script [21:53:28] can you check to see if there is anything in the echo_email_batch table? [21:53:37] ok [21:54:02] * bd808 only wanted emails with useful content :( [21:55:04] legoktm: indeed, there are ~180 records [21:55:29] So this is like the jobqueue, where someone has to periodically run a job to send those emails [21:55:40] so, if you run the processEmailBatch.php script in echo, it should start firing out emails [21:55:44] yeah, kind of [21:55:57] it's in Echo's maintenance directory [21:56:17] I wonder if I really want to dump those on everyone… seems like it might be more polite to just empty that table [21:56:20] so we probably just need that script on a cronjob, let me see how often prod is running it [21:56:22] before testing [21:56:48] umm, I don't think you can just delete the table [21:56:51] the rows* [21:56:55] I'm not sure what would happen... [21:56:57] ok [21:57:00] well, spam ho! [21:57:00] * legoktm looks [21:57:21] yeah, so it updates a timestamp stored in a different table when you send it out [21:57:38] ok. I'll just run the script. Presumably all 180 of those aren't going to the same user :) [21:58:08] bd808: how's your inbox looking? [21:58:40] * bd808 waits for gmail to see stuff [21:58:41] legoktm: is the right solution to add a cron, or to turn off batch processing so that emails are sent right away? [21:59:18] either works I suppose [21:59:29] Is there a flag to have the job queue run the emails out? [21:59:53] I assume wikitech has some sort of job queue runner [22:00:10] $wgEchoUseJobQueue was disabled on prod because it was too slow [22:00:22] well, it was disabled in the entire extension [22:00:47] oh.
that's what I get for reading wiki pages for docs :) [22:01:47] andrewbogott: No flood of emails in my inbox yet but I'll keep an eye out [22:03:41] hm... [22:03:52] bd808: you were getting them before, right? So we know it's not a filtering issue? [22:04:31] Yeah I have many "new notification at wikitech" emails in my archive [22:04:50] But no new ones, huh? [22:04:54] I guess we'll give it a few minutes... [22:05:49] Nope no new ones. Last was X-Received: Sun, 20 Jul 2014 14:38:06 -0700 (PDT) [22:05:51] yeah, I don't have any emails either... [22:06:07] andrewbogott: is the table empty now? [22:07:20] legoktm: yes [22:07:47] legoktm: In case you missed this in #wikimedia-operations…. https://gerrit.wikimedia.org/r/#/c/149459/ [22:08:18] +1'd [22:08:26] well, if the table is empty *someone* had to get the emails [22:08:31] Maybe :) [22:08:34] Or we're still missing a step [22:09:23] did the script output stuff? [22:09:38] $this->output( "processing user_Id " . $userId . " \n" ); [22:09:59] yeah, looks perfectly reasonable. https://dpaste.de/cW0d [22:11:16] So, where do those emails go after they leave mediawiki? What other logfiles should I look in? [22:11:56] I'm not sure, Echo uses UserMailer::send, which should just be php's mail() function [22:12:31] unless you have $wgSMTP set, in case it'll use the configured SMTP server [22:12:39] in that case* [22:13:26] nope, not set [22:14:57] legoktm: so, what do you think? Should I merge that crontab patch and just wait a day and see what happens? Or are there other things to investigate? [22:15:47] we could figure out which users correspond to the user_ids that were outputted and ask them if they got any email? [22:16:28] hmmm still no mailz for me. Would user_id correspond with uid in ldap? 
[22:16:37] I don't think so [22:16:41] * bd808 is uid 3518 [22:16:48] https://wikitech.wikimedia.org/w/api.php?action=query&meta=userinfo [22:16:57] what does that say (need to be logged in) [22:17:08] 1604 [22:17:21] o_O I wasn't in the list [22:18:21] what's your name on wikitech? [22:18:32] But based on [[Special:Notifications]] there are at least 5 emails I didn't get [22:18:40] andrewbogott: BryanDavis [22:18:46] bd808: uh, what are your email preferences set to at Special:Prefs? [22:19:02] https://wikitech.wikimedia.org/wiki/Special:Preferences#mw-prefsection-echo specifically [22:19:15] namely, do you have digest enabled? [22:19:34] Yeah, looks like you're 1604 [22:19:38] legoktm: individual notifications, email for all but edit revert [22:20:01] hmm [22:20:08] so you shouldn't ever be in the bundling script [22:20:13] it should just send right away [22:20:17] * bd808 nods [22:20:26] Oh, the batch mode is only for people with digest turned on? [22:20:33] yeah >.> [22:20:36] ugh [22:20:40] which is part of why I wondered if my patch made things not work at all [22:21:16] I was poking buttons in the dark, but trying to make things match what I thought Thanks was doing [22:22:20] it might not be your patch then [22:22:45] andrewbogott: so I think setting up the cronjob is probably needed for some other notifications and a good thing to have, but not why these emails stopped sending [22:22:47] but it worked until Monday. At least emails went out [22:23:02] yeah, seems like [22:24:38] legoktm: I was patterning my changes based on https://github.com/wikimedia/mediawiki-extensions-Thanks/blob/master/Thanks.hooks.php#L134-L149 [22:25:02] yeah, your code looks fine [22:26:07] * andrewbogott is half working, half cooking, will drift in and out of this conversation [22:27:15] https://github.com/wikimedia/mediawiki-extensions-OpenStackManager/blob/master/OpenStackManager.php#L436 what is that for?
[22:28:21] Krenair added it in https://github.com/wikimedia/mediawiki-extensions-OpenStackManager/commit/e44ee70b38cd4e747cbdb4cb5b3e6a78d5865a41 [22:28:42] I think it removes the default recipient [22:30:16] I have no idea at this point. It's just OSM emails that aren't being sent right? [22:30:38] Yeah. I've gotten page edit and other emails from wikitech [22:32:55] ugh [22:33:05] I really don't know what's happening then [22:33:12] heh [22:33:33] [15:32:06] petan: I just received an e-mail for the notification from https://wikitech.wikimedia.org/wiki/Special:Notifications which says "Petrb added you to project [[[No page]]] [22:33:33] [15:32:06] 07:20, 21 June 2013 [22:33:36] LOL [22:34:16] That's not really new [22:34:47] (in appearance) [22:35:28] Nemo_bis: the date? [22:35:50] helderwiki: so we were trying to figure out why some emails weren't sending, and we ran a script to clear out any queued emails and it seems some were really old [22:35:53] https://bugzilla.wikimedia.org/show_bug.cgi?id=43743 [22:36:29] Nemo_bis: oh, idk about that. I was more about how the email should have been sent out last year [22:36:29] Ah sorry, I meant the [[[No page]]] part [22:37:23] yeah I know about that bug :( [22:38:22] there is also "[[:[No page]]] was linked from Nova Resource:Tools. [[Special:WhatLinksHere/[No page]|See all links to this page]]." [22:38:31] in my list of notifications on wikitech [22:40:44] bd808: tangentially… if I update wikitech to the latest mediawiki, what branch is that? [22:40:48] yeah, that's an Echo bug that hasn't been fixed [22:42:03] andrewbogott: 1.24wmf15 according to https://www.mediawiki.org/wiki/Special:Version [22:42:17] andrewbogott: wikipedias are on wmf14 [22:42:23] bd808: thanks [22:43:26] Bah, I can't remember how to do this… git checkout -b wmf/1.24wmf14 origin/1.24wmf14 <- except not quite that [22:44:23] That should work. You may want to add --track in there too [22:44:44] to git checkout -t origin/....
should work [22:44:53] It doesn't work, though, because the origin/1.24wmf14 part is wrong [22:45:13] and origin/wmf/1.24wmf14 [22:45:21] s/and/ah/ [22:45:22] ah, you're right. Got it. [22:45:23] Thanks [22:46:10] * andrewbogott wonders if submodule update will break SMW [22:46:14] I feel like we tried to tag it properly... [23:49:30] hey, I'm looking for a tool to get a list of articles in a category that don't exist in another language [23:49:40] can someone help me [23:49:40] Hi Helmoony, just ask! There is no need to ask if you can ask [23:51:03] thank you [23:51:16] do you have a link for that tool?