[00:17:04] [bz] (NEW - created by: Addshore, priority: High - enhancement) [Bug 48894] Include pagecounts dumps in datasets - https://bugzilla.wikimedia.org/show_bug.cgi?id=48894 [03:24:28] !reboot [03:24:41] !reboot is https://wikitech.wikimedia.org/wiki/Nova_Resource:Bots/Documentation/wm-bot [03:24:41] Key was added [03:57:45] Ryan_Lane: hi [04:03:57] legoktm: hi, are you there? [04:48:27] Hi all [05:49:12] Can't anything be done to stop these spikes? [05:50:03] NFS? [05:50:13] they're on it. Coren had no power for 4 days, which kinda put a stop on things [05:51:10] YuviPanda, actually Coren said it's people abusing crontabs. [05:51:28] And not jsubbing the scripts. [05:51:32] the last explanation I saw was the controller (hardware) itself being messed up [05:51:39] and yes, it got less bad after the crons were kicked out [06:28:53] YuviPanda, Cyberpower678: yep controller issues [06:30:30] Ryan_Lane: did Dell respond? [06:30:37] no clue [07:39:14] !ping [07:39:14] ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" [07:39:18] wtf [07:40:34] Isn't that Helpmebot's thing? [07:41:31] !ping [07:41:31] ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" [07:41:44] !ping [07:41:44] ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" [07:41:59] !ping is pong [07:41:59] ?? [07:41:59] petan, ^ [07:41:59] omg [07:41:59] wait [07:42:00] This key already exist - remove it, if you want to change it [07:43:04] XDD [07:43:11] petan, nice ping [07:44:54] Hi petan [07:45:10] hey [07:45:30] I just want to say hello to you [07:45:42] have a nice time :) [07:45:57] thanks [07:46:53] wm-bot is the new Helpmebot [07:47:40] !trout is goes fishing, catches a trout, and smack $1 with it, over and over. [07:47:40] This key already exist - remove it, if you want to change it [07:47:46] !trout petan [07:47:46] I am taking a 10-pound rainbow trout out of a river, and smacks petan with it over and over.
[07:48:04] !trout del [07:48:04] Successfully removed trout [07:48:18] !trout is goes fishing, catches a trout, and smack $1 with it, over and over. [07:48:19] Key was added [07:48:22] #wikimedia-labs-offtopic for this [07:48:28] !trout petan [07:48:28] goes fishing, catches a trout, and smack petan with it, over and over. [07:48:38] petan, I'm done. [08:57:58] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Unprioritized - normal) [Bug 51700] https://login.wikimedia.beta.wmflabs.org/ trapped in an infinite self-redirect - https://bugzilla.wikimedia.org/show_bug.cgi?id=51700 [09:03:38] !log deployment-prep restarted varnish text cache [09:03:41] Logged the message, Master [09:28:29] [bz] (NEW - created by: Addshore, priority: High - enhancement) [Bug 48894] Include pagecounts dumps in datasets - https://bugzilla.wikimedia.org/show_bug.cgi?id=48894 [10:52:37] !ping [10:52:37] ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" [10:53:00] Is NFS having issues again...? [10:54:37] toollabs died again [10:54:49] :( [10:55:37] works for me? [10:55:41] YuviPanda what doesn't work? [10:55:58] it's back [10:56:02] legoktm: geeek [10:56:02] :\ [10:56:20] petan: so i have to re-run my jobs? [10:57:30] petan: .. [10:57:33] !reboot [10:57:34] https://wikitech.wikimedia.org/wiki/Nova_Resource:Bots/Documentation/wm-bot [10:57:41] ^^^ that key...
[10:57:42] Amir1: depends [10:57:52] Amir1: check if they run or not [11:10:17] petan: I checked, they're working [11:10:33] thank you for guidance anyway [12:12:42] [bz] (ASSIGNED - created by: Niklas Laxström, priority: High - major) [Bug 48203] Purging does not work on deployment-prep / beta labs - https://bugzilla.wikimedia.org/show_bug.cgi?id=48203 [12:16:00] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Unprioritized - normal) [Bug 51874] vhtcpd needs to support purge request send over unicast - https://bugzilla.wikimedia.org/show_bug.cgi?id=51874 [13:16:12] * YuviPanda waves [13:16:12] * YuviPanda just crossed 26 hours awake, not sure how useful he'll be today [13:16:12] wow! [13:16:51] 26 hours awake is far too many [13:17:04] doing a sleep cycle reset :) 5 more hours! [13:17:20] kma500: my sleep cycle had 'drifted' to a point where I was sleeping at 7am everyday. drastic action was necessary :) [13:18:24] YuviPanda: I hope the drastic action works :) [13:18:33] kma500: :) [13:19:03] it's been productive, however. 27 commits in that period of time [13:20:30] I don't know how people manage to be productive on no sleep. Congrats! It's a talent. [13:21:15] :) let's hope I can pop it to 30 before I hit the bed. [13:21:23] sleep also feels very much sweeter after these things [13:22:21] I took a shower! That is because I had hot water. Today is a good day. [13:22:29] I imagine. [13:22:47] You have power, Coren? [13:22:53] Coren: I had really hot shower for one full hour! [13:22:59] keeps the sleep away :) [13:23:02] kma500: Yes, as of last evening. [13:23:07] excellent! [13:23:16] Coren: also welcome back to civilization! Happy to have you back whole and unharmed :) [13:23:28] I also need to buy myself a bigger generator. [13:24:21] funny how important this power thing is :) [13:24:58] It's actually a little scary when we notice how horribly dependent we are on it. [13:26:01] Coren where do you live? [13:26:07] Coren US? Canada?
[13:26:19] I am wondering which country has problems with power in 2013 :D [13:26:30] Canada [13:26:42] heh [13:26:50] but I guess it's not a problem in big cities only [13:27:06] Canada, in a suburb near Montreal; it was a freak storm. I get reliable power most of the time, but uprooted trees falling on power lines all over the place does not care. :-) [13:27:18] Storms know no boundaries [13:28:01] http://www.upi.com/Top_News/World-News/2013/07/20/At-least-one-killed-as-thunderstorms-sock-Ontario-and-Quebec/UPI-47391374352019/?spt=hs&or=tn [13:28:41] We gots tornadoes. I think the last time we had tornadoes was in the 70s. :-) [13:29:11] * T13|stillSleepy needs a shower too... [13:29:33] * T13|stillSleepy 's cats are avoiding him... :0 [13:29:40] Fun fact: a heat wave being broken by the Jet Stream shifting south all at once and bringing in a huge cold air mass slamming in the local air = bad news. [13:31:10] heh [13:31:20] Coren: has the freak weather stuff passed? [13:31:53] Freak weather indeed! [13:32:02] YuviPanda: Yeah, we've been back to normal weather for two days; the weather was crazy for just the one day but it took several more to fix the damage. There are still people without power. [13:32:38] ow. [13:34:08] Izawayz: ping [13:34:14] rschen7754|away: ping [13:34:40] Coren: how long do we need to wait for last 2 users who were informed 2 months ago that bots project will be closing before we finally kill their bots and shut it down? XD [13:35:17] petan: Have you announced a date? You want to do that at least two weeks in advance, on labs-l. [13:35:19] like 1 instance - 4cpu / 8gb of ram 80gb of storage is being blocked from deletion because of 1 tiny bot using like 20m of ram and no cpu [13:36:01] The other question is whether you know they've actually received the notification. 
:-) [13:36:12] Coren: I don't even remember but I know there were lot of announcements in lab-l and huge banner "DO NOT USE THIS INSTANCE IT WILL BE DELETED IN FEW DAYS" in motd [13:36:38] petan: That works. If you want to be generous, give them at least until the end of the week. [13:36:57] I will just wait then but it was Ryan who was like "omg we are out of storage, plz delete some instances" [13:37:00] BRB. Breakfast. [13:39:57] Hey, someone said a while ago that they were looking into getting me a shiny labs-mysql that won't hurt everyone. Did that ever happen? [13:40:22] I just realized I've kind of forgotten about that investigation I was doing a couple weeks ago that made nfs so sad [13:45:38] hi sumanah [13:45:45] hi kma500! how's it going? [13:45:58] good! how are you? [13:46:11] I'm all right :) [13:47:24] so http://etherpad.wmflabs.org/pad/p/Tool_Labs_Sprint_July_23 [13:47:29] I'm going to be in the office btw 10-10:30, but offline til then. Anything doc-related we need to figure out early? [13:47:35] yes. [13:48:26] kma500: I figure anyone who wants to can just dive in and start typing [13:48:32] e.g. what I just did on line 56 [13:48:52] kma500: you saw what Coren just said about his power problem :( [13:48:57] Just saw that. [13:49:08] Silke_WMDE_: hi, do you want to concentrate on any part of the docs in particular? [13:49:09] Coren has power again! [13:49:49] As long as you're fine with a very free-flowing process where people can just type on the Etherpad to fill in individual parts of the outline, I think we're set :) [13:51:22] Okay. Sounds good! [13:51:29] great! [13:51:37] it's not even 7am where you are - amazing that you are up :) [13:52:14] sumanah: It's not my fault! My husband set the alarm for five... [13:52:30] But I did want to check in since I'll be getting in on the later side. [13:52:36] Didn't want to hold anything up. [13:53:20] sumanah: I'm resetting my sleep cycle today (uptime: 27h right now!) so won't be there today. 
Will look at it and fill in things tomorrow when I wake up :) [13:54:40] !ping [13:54:40] ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" ¤*POOF*¤ "Wadda need?" [13:54:43] am connected. nice [13:54:47] wm-bot: ping is pong [13:54:47] Hi YuviPanda, there is some error, I am a stupid bot and I am not intelligent enough to hold a conversation with you :-) [13:54:55] !ping del [13:54:55] !ping del [13:54:55] Successfully removed ping [13:54:57] Unable to find the specified key in db [13:55:06] !ping is pong [13:55:06] :) [13:55:06] ty petan [13:55:08] Key was added [13:55:50] see you later! [13:57:31] YuviPanda: :) [13:57:37] kma500: makes sense, thanks [14:00:07] sumanah: give me a few minutes, meeting [14:08:36] sure, Silke :) [14:10:38] manybubbles_: It's pending on some love from one of our DB people. Since it's becoming pressing for you, do you want me to poke Ken to see if he can spare one of them for a half-hour? [14:11:34] Coren: that'd be wonderful! Ken has asked me to poke him if I need something so I can do it too. My problem with this is that I just keep forgetting about it because I had to put it on hold. [14:12:45] manybubbles_: Part of the issue is that I think Asher is still OoO. I'll poke Ken, but it won't hurt if you also tell him that's a blocker for you. [14:17:14] Coren: emailed. it isn't a blocker - just kind of annoying not to have [14:17:38] Well, you can't really do what it was you needed to do until that's available. In my book, that's a blocker. :-) [14:18:20] (Also sent email) [14:21:10] Coren: how it looks with the dedicated sql server you worked on week ago [14:21:39] petan: Needs love from one of our DBAs. See above exchange with manybubbles [14:21:40] :-) [14:21:51] aha [14:22:01] oooh - if other people want it then more power! [14:25:50] Coren: did you notice mails are broken on tools? [14:25:57] Coren: I am not receiving any e-mail :( [14:26:09] petan: Have you checked the logs?
[14:26:09] last mail is from 20+ days ago [14:26:13] logs where [14:26:22] mail system is very undocumented [14:26:26] !toolsadmin [14:26:26] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Documentation/Admin [14:31:29] hi Coren [14:31:55] a new column has been added to wikidata's wb_terms table [14:31:56] aude: Hello. [14:32:07] is there anything specific that needs to happen to have it appear in toollabs? [14:32:12] like rebuild a view? [14:32:13] petan: It's a normal exim4 on, predictably enough, tools-mail. :-) [14:32:31] * aude would be reassured to see it there :) [14:33:17] Hi! [14:33:31] Coren: it doesn't let me ssh :( [14:33:37] aude: What's the new column? [14:33:44] Coren: term_weight [14:33:56] it's there since Thursday [14:34:07] aude: I see it replicated so yeah, only the view being rebuilt is needed. I can do that if you give me a minute. [14:34:13] ok, thanks [14:34:13] petan: Hang on, will see why. [14:34:27] we'll be enabling wikidata to use the column in a few hours [14:36:12] aude: Should be there now. [14:36:25] awesome, thanks Coren ! [14:37:04] btw, my bot is running on toollabs now and it's doing well :) [14:37:14] great :) [14:37:27] petan: Ah, found why. When I rebooted the cluster after the NFS explosion, tools-mail was forgotten so it doesn't seem home. [14:37:28] * aude making regular backups in case stuff gets corrupt again [14:37:36] k [14:37:42] petan: Prepare to be spammed once the mail queues flush. :-) [14:37:49] cool [14:37:58] Coren: i am making use of postgres and postgis, though which i am doing via external (personally) hosted api [14:38:03] Coren: I presume there's a cluster reboot checklist to add that to now? :) [14:38:11] what are the plans to support postgres and postgis in toollabs? (any plans) [14:38:17] aude: cool [14:38:30] anything we can help with to make it happen?
[14:39:00] sumanah: Not as such, beyond "Don't reboot the cluster, but if you have to reboot every host" which isn't proof against "Oops, forgot that one." :-) [14:39:07] aude: you helping document stuff? http://etherpad.wmflabs.org/pad/p/Tool_Labs_Sprint_July_23 [14:39:14] sumanah: not today [14:39:24] Coren: that seems inelegant to me .... but I defer to your judgment [14:39:57] sumanah: I don't think there's a whole section I can work on, it's rather single questions I can answer with the help of docs and (many) others I can't. [14:39:58] * aude thinks the documentation is decent or at least i was able to figure most out [14:40:01] sumanah: They are, thankfully, not order dependent. :-) [14:40:03] Silke_WMDE_: go ahead then :) [14:40:32] (I.e.: There is no black magic in having the cluster come up "right") [14:40:34] aude: yeah, fortunately this is an improvement sprint and not a write-from-scratch thing! [14:40:40] sumanah: yeah! [14:40:50] aude: you could write a case study - those are helpful [14:40:54] certainly the toolserver wiki has a lot of good info for newbies [14:40:58] @seen orsa [14:40:58] Technical_13: Last time I saw orsa they were quitting the network with reason: Remote host closed the connection N/A at 7/22/2013 8:44:59 PM (17h55m59s ago) [14:40:59] like how to do a query, and that's important [14:41:07] * aude needed that at one point [14:41:17] @notify orsa [14:41:17] I'll let you know when I see orsa around here [14:41:35] petan: tools-mail should let you in now, btw. [14:41:59] petan: Incidentally, since you're in the right group, no host should ever not let you in. If that happens, it's unfailingly a bug. [14:42:11] aude: what are you using postgist/postgres for? [14:42:15] * YuviPanda is interested too! [14:42:18] sumanah: something that i would like is https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Rules be filled in [14:42:20] aude: then that would be a great thing to link to in the FAQ [14:42:21] PostGIS ftw! 
:) [14:42:28] aude: have you put that in the Etherpad? [14:42:38] kma500: ^ aude is a Tool Labs user and has some requests :) [14:42:44] aude: I'm waiting on Luis / legal on this; we have a draft ready but it needs final tweaks and their blessing. [14:42:55] aude: It may well be stalled on the privacy policy stuff though. [14:42:57] YuviPanda: i am adding geocoordinates to wikidata and among other checks, i check it is geocoded in the correct province or state (that i am working on) [14:43:11] aude: where is your postgis instance hosted? your own vps? [14:43:16] and make a bounding box of the state, so i can select a batch of geocoordinates [14:43:19] YuviPanda: my own [14:43:21] right. [14:43:22] * Coren eeps as the mail backlog suddenly gets delivered. [14:43:26] and i have a secret api for it :) [14:43:33] Coren: ok [14:43:42] even if we have postgres / postgis with out an osm db, that's useful [14:43:55] like i can have my own, maybe with all the geo_tags geocoded [14:44:01] from wikipedia and wikidata [14:44:03] sumanah: Actually, if you might be able to help there by poking Luis with a gentle "When are we getting Labs TOS?" :-) [14:44:54] sorry to spam, but i was also wondering if it would be okay, per rules to host a such an "enhanced" version or copy of the geocoordinates [14:44:59] as a tool [14:45:01] aude: I can help setup puppet, etc. I'd love to have it as part of toollabs, provided Coren agrees :) [14:45:11] YuviPanda: cool :) [14:45:12] aude: I don't see why not. Why do you think there might be an issue? [14:45:29] Coren: i don't see an issue but not 100% sure [14:45:38] Coren: would having tools-postgres be an issue? :) [14:45:41] the toolserver had issues with, at least hosting any text content [14:45:53] YuviPanda: That's at the edge of the scope; Tool Labs users (as opposed to general Labs users) normally should not need to fiddle with puppet -- that's part of the point of tools. 
:-) [14:46:13] Coren: no no, *I* was offering to setup tools-postgres via puppet so people can use postgres [14:46:13] postgres should be available to all, same way as mysql is [14:46:17] indeed. [14:46:51] it's icky (especially since geo_tags is not in the dumps yet), to try to make an external copy of stuff geocoded [14:46:58] can be done, just a lot harder [14:47:05] YuviPanda: It would; having a DB in an instance is teh evils. I do know we need postgres for a few things, though, and it would be nice to have one for testing mediawiki against as well, so I'd rather add a postgres setup to the new physical DBs I'm working on this week. [14:47:25] Coren: woo! Is that before or after the NFS? and the uwsgi? [14:47:32] aude: file a bug for postgres/postgis? [14:47:40] YuviPanda: do you want to? [14:47:42] and yes, having it on a physical thing would be nice [14:47:49] aude: No issue with serving text on Labs. The toolserver limitation was partly technical and partly legal, but neither applies in Labs. [14:47:53] aude: you've the more immediate use case, so you should :) [14:47:57] Coren: that's what i thought [14:48:16] YuviPanda: it depends... if the geo_tags dump comes first or i get impatient [14:48:35] i believe that geo tags was added to the dumps but has not run everywhere yet [14:48:46] well, it's a bug report :) [14:49:11] k [14:49:23] ty aude [14:50:53] [bz] (NEW - created by: Aude, priority: Unprioritized - normal) [Bug 51885] provide postgresql and postgis on toollabs - https://bugzilla.wikimedia.org/show_bug.cgi?id=51885 [14:50:57] there ^ [14:51:09] aude: ty! [14:51:20] Coren: any update on talking to Dell? [14:53:50] hmmmm, looks like my bot needs more work to do :) [14:54:02] YuviPanda: Nope. I was in the dark, remember? It's on my TODO for the day. [14:54:13] or it died [14:54:22] Coren: :) [14:54:28] Coren: apologies if I'm naggy. [14:54:33] * Coren has had no progress whatsoever on anything since Friday afternoon.
[14:55:32] no, my bot needs more work [15:08:58] sumanah: The pad is really well prepared with all those links! Thanks. Also to Kirsten. Btw is she here? What's the nick? [15:09:42] Silke_WMDE_: kma500. [15:09:55] ah thx scfc_de! [15:14:38] scfc_de: You answered about database connections here: https://wikitech.wikimedia.org/wiki/Nova_Resource_Talk:Tools/Help#GUI_tool_for_databasework What would be an example for such a local GUI tool to connect to tools-login? I've never used such a thing. [15:15:55] Silke_WMDE_: Me neither, I only used mysql (CLI) for testing. Any Windows users around? [15:16:29] Silke_WMDE_: http://dev.mysql.com/downloads/tools/workbench/ [15:17:02] Coren, Ok, thanks, I'll add that as an example. [15:24:39] Damianz: is cluebot running from tools now? [15:24:50] petan: 3 is [15:24:58] 3? [15:25:04] 3 processes? [15:25:07] or what [15:25:14] ClueBot III [15:25:36] hmm [15:26:44] !damianz [15:26:44] some weirdo around here [15:27:08] Thanks for the reminder wm-bot [15:27:08] Hey Technical_13, you are welcome! [15:27:42] spambot cyberbot Task / Running 2013-07-20 21:17:59 CPU: 58h56m VMEM: 325M/4.8G [15:27:51] vmem 325m/4.8g o.o [15:28:05] sounds like python [15:28:36] if all these python bots were written in c whole tools project would fit into one instance with 1 cpu and 256mb of ram :P [15:28:51] *properly written [15:29:53] Coren: i feel I have to catch up a bit... Are the two projects still called "tools" and "bots"? [15:30:09] tools and toolsbeta [15:30:17] bots is closed kind of [15:30:19] Silke_WMDE_: "toolsbeta". Bots is obsolete and on its last breath. [15:30:27] \o/ [15:30:34] given the speed of migration the last breath is going to take few years [15:30:44] :) [15:31:06] welcome back Silke_WMDE_ [15:31:21] thx JohannesK_WMDE! [15:33:43] Yesterday, I noticed that my mediawiki install on labs changed ( http://ase.wikipedia.wmflabs.org ). The /srv directory can't be found and it breaks the wiki. So far, I have been unable to fix it. 
Can anyone help or offer a suggestion? [15:41:16] @seenrx zhuy* [15:41:16] petan: Last time I saw zhufengwill they were quitting the network with reason: no reason was given at 10/15/2012 7:55:16 AM (281d7h46m0s ago) (multiple results were found: rick_zhu, Yiranzhu, xiaozhu, zhufeng, johnzhuyiyi and 16 more results) [15:41:24] @seenrx zhuyi* [15:41:24] petan: Last time I saw johnzhuyiyi they were quitting the network with reason: no reason was given at 11/7/2012 7:14:35 AM (258d8h26m49s ago) (multiple results were found: zhuyifei1999, zhuyifei1999_, zhuyifei1999_zzz, zhuyifei1999__) [15:41:35] @seenrx zhuyife.* [15:41:35] petan: Last time I saw zhuyifei1999 they were quitting the network with reason: Ping timeout: 250 seconds at 7/20/2013 9:20:29 AM (3d6h21m6s ago) (multiple results were found: zhuyifei1999_, zhuyifei1999_zzz, zhuyifei1999__) [15:49:08] petan: Who are the two last users on Bots? [15:49:21] scfc_de Izawayz and rschen7754|away [15:49:27] not only them [15:49:34] but these 2 are holding some big resources [15:49:55] also Beetstra, Damian z and me (+ addshore) [15:50:04] not sure if addshore did move all stuff or not yet [15:50:28] * sumanah comes back from meeting, reads backscroll [15:50:52] Coren: if you fwd me whatever the last mail you sent Luis was, I can sort of piggyback on that [15:52:00] Silke_WMDE_: https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta and https://wikitech.wikimedia.org/wiki/Nova_Resource:Bots - I added some information to those pages to help people understand their roles :) [15:52:14] slevinski: did you get the help you needed? [15:52:30] sumanah: My last email on the subject was basically a one-liner saying "Hey, what's up with that." :-) [15:52:30] no, still having problems. [15:52:45] sumanah: thx [15:53:01] sumanah: Sent. [15:53:27] slevinski: What project is this instance part of? 
[15:53:28] YuviPanda: why there is no toolsbeta-redis [15:53:39] signwriting project [15:53:44] you people are skipping 1 important step [15:53:55] just deploying stuff on production with no testing is evil [15:54:21] petan: phrasing like "you people" can sometimes rub people the wrong way, just an FYI :) [15:54:34] petan: technically, that'd be testing *changes* in infrastructure. New stuff can't break production when it lives on a different instance. :-) [15:54:58] petan: But yeah, it could have been tested there first. [15:55:03] ok but now when we need to test a change to redis, we need to create whole new instance first instead of just testing a change [15:55:22] slevinski: so does the /srv directory exist with wrong permissions or is it just gone? <----- uninformed question [15:55:25] also, you and your apache 2.4 rollback... :P that could be tested as well on toolsbeta [15:55:31] petan: Yes, but since we have redis in puppet creating a toolsbeta redis is as simple as just adding it. :-0 [15:55:41] petan: toolsbeta was not usable when it was deployed. [15:55:43] who knows... :P [15:55:47] why not? [15:55:53] it was usable for weeks [15:55:55] petan: Because it didn't *exist* [15:55:59] o.O [15:56:10] the /srv directory exists. Permissions look fine [15:56:13] toolsbeta is there for many weeks, that 2.4 rollback was few days ago or something [15:56:22] weeks [15:56:26] dunno [15:56:34] but I am pretty sure the time when it happened it did exist [15:56:38] petan: That's because it lingered for a long time after not having been worked on. It was done long before toolsbeta got there. [15:56:45] It looks like the /var/www directory is no longer the default web directory. [15:57:12] slevinski: Have you checked for recent puppet changes?
[15:57:15] I don't understand how did you update it then, but toolsbeta is still 2.2 while production is 2.4 [15:57:17] slevinski: As the instance was created over a year ago, I'm not sure how it was set up and what changes in the meantime have occurred. I would suggest asking Ryan_Lane or andrewbogott. [15:57:42] meh no, prod is 2.2. [15:57:43] hmmm [15:58:03] slevinski I would suggest asking bugzilla [15:58:30] !rb [15:58:30] broken? report a bug: https://bugzilla.wikimedia.org/enter_bug.cgi?product=Wikimedia%20Labs [15:59:00] I assumed it was a puppet change. Not sure how to check for recent changes. [15:59:15] recent changes are in gerrit [15:59:16] slevinski: http://hexm.de/mw-search might help you [15:59:44] slevinski: you can also search in Gerrit, e.g., https://gerrit.wikimedia.org/r/#/q/status:open+project:%255Eoperations.*,n,z [15:59:54] on how to search: https://gerrit.wikimedia.org/r/Documentation/user-search.html [16:00:04] petan: sorry, I only heard of the need to move yesterday. I'm willing to make the move now, but I would appreciate a walkthrough for getting on to tool labs, as i'm still not that savvy with all this [16:00:22] Izhidez no problem [16:00:33] Izhidez don't tell me you didn't notice any of these huge motd banners [16:00:43] like "DON'T USE THIS INSTANCE" [16:00:55] "THIS INSTANCE IS GOING TO BE DELETED, GET OUT OF HERE" [16:01:09] Izhidez: more seriously, how did you hear of the need to move? [16:01:27] Izhidez: and if you tell us where you get your Wikimedia tech-specific information, we can inform you more usefully in the future [16:01:48] slevinski: When did it work for you the last time? [16:01:50] I saw the "This project will be closing" a long time ago, but didn't think it had that much effect. [16:02:39] sumanah: it's probably partially my fault. I think the email I have in labs right now is dead, and I've been inactive a fair bit so not watching labs-l at all.
and I found out through rschen7754|away [16:03:16] Izhidez: ah, got it :) please do update your email in https://wikitech.wikimedia.org preferences [16:05:37] The problem was first noticed yesterday. It was working last week. I'd assume something happened over the weekend. [16:06:47] petan: no time yet. will do [16:06:55] sumanah: new address updated [16:07:35] petan: another thing though, I only go on labs when I have to reboot stuff...which is not that often, so it might be the reason I didn't see those notices [16:11:10] I know I saw a list of all projects in Labs somewhere. But where? any hints? [16:11:44] Silke_WMDE_: Main page, and then there's a link ("number of projects" or something like that). [16:12:30] got it, thx scfc_de [16:12:42] https://wikitech.wikimedia.org/wiki/Special:Ask/-5B-5BResource-20Type::project-5D-5D/-3F/-3FMember/-3FDescription/mainlabel%3D-2D/searchlabel%3Dprojects/offset%3D0 [16:13:03] I believe kma500 filed a bug, or has a TODO, to make that link a lot more prominent [16:13:15] you could beat her to the punch by filing it :) [16:20:00] [bz] (NEW - created by: silke.meyer, priority: Unprioritized - normal) [Bug 51889] Make the link to list of all Labs projects more visible - https://bugzilla.wikimedia.org/show_bug.cgi?id=51889 [16:21:49] Am I right that in bugzilla the "bots" component is obsolete? [16:22:13] If so, I'd file a bug for deleting it too. [16:29:16] petan: https://wikitech.wikimedia.org/wiki/Shell_Request/Lakshman is another case where the user doesn't exist. [16:31:47] [bz] (NEW - created by: silke.meyer, priority: Unprioritized - major) [Bug 51890] Bots compenent in bugzilla is obsolete - https://bugzilla.wikimedia.org/show_bug.cgi?id=51890 [16:31:54] ok, it's now 9:30am on the west coast of North America; in about half an hour I'm in a meeting when Kirsten gets back online. Argh! Ah well. [16:32:49] time zone as in waste of time...
[16:32:50] petan: creating toolsbeta-redis now [16:33:02] * YuviPanda makes the earth flat [16:34:51] * sumanah listens to dance music to self-soothe [16:34:57] Silke_WMDE_: how was your vacation? [16:35:08] awesome!!!!! [16:35:19] I wanna go again! [16:35:20] petan: also, the only notice on bots-bnr1 is "THIS PROJECT IS DEPRECATED. MOVE YOUR BOT TO TOOLS PROJECT." which says nothing about deletion [16:35:33] or not using it asap [16:35:37] petan: andrewbogott Coren|Lunch I need to add ' role::labsnfs::client [?]' to instances to get them to be on NFS, right? [16:35:37] (for toolsbeta-redis) [16:36:29] Izhidez: petan - agreed, the notice should link to a "or else" explanation, and/or include a deadline [16:36:49] YuviaPanda, that sounds right but I haven't tried it. [16:36:58] andrewbogott: okay, let me [16:37:02] andrewbogott: also, You can type Yu to autocomplete my name :) [16:37:34] petan: hmm, I can't login to toolsbeta-redis [16:38:21] andrewbogott: I just created toolsbeta-login, and applied two roles to it, and restarted. console output is stuck at toolsbeta-redis login: [16:38:23] thoughts? [16:38:25] and ssh doesn't work [16:39:35] You've clobbered your homedirs somehow… they were mounted via gluster previously, now home is managed by nfs [16:39:52] So maybe a cron needs to run to get your homedirs set back up? I don't remember quite how this works. [16:40:39] andrewbogott: who would? [16:40:39] petan: is this the same thing you faced when trying to setup tools-redis? [16:41:02] I would start by waiting a few minutes and trying again :) [16:41:18] Coren should know more. [16:43:07] andrewbogott: ok :) [16:43:28] wooo, nice. now no route to host :) [16:50:16] ok, back in about 70 min :) [16:53:51] Ryan_Lane: do you know how long it takes between "apply a role on wikitech" to having puppet set that up? [16:53:55] on a new labs instance? [16:54:06] apply a role? [16:54:33] like adding someone to a projectadmin role? 
[16:54:36] Ryan_Lane: configure an instance by ticking a box in Special:NovaInstance? [16:54:44] ah. modifying a puppet class [16:54:49] right [16:54:59] puppet is set to run every 30 mins or so [16:55:05] Ryan_Lane: Specifically enabling NFS :-). [16:55:07] any way to force it? [16:55:08] if you want it to be immediate you need to force a puppet run [16:55:15] as root: puppetd -tv [16:55:24] or: puppetd --test [16:55:27] same same [16:55:35] Thanks for the help everyone. I was able to fix the problem. Change to DocumentRoot because of /etc/apache2/sites-enabled/000-wikicontroller.conf [16:55:35] scfc_de: I realized I don't need NFS for this [16:55:35] scfc_de: I... think? [16:55:41] scfc_de: I'm able to ssh in by doing ssh toolsbeta-redis.pmtpa.wmflabs [16:55:43] scfc_de: just can't from toolsbeta-login [16:55:50] Ryan_Lane: ty [16:56:10] Ryan_Lane: ah, it says 'notice: Run of Puppet configuration client already in progress; skipping' [16:56:11] so I should just let it run [16:56:17] are there going to be logs somewhere? [16:56:22] YuviPanda: HostBased enabled on source and dest? [16:56:39] YuviPanda: /var/log/puppet.log maybe? [16:56:54] hmm [16:56:55] Could not parse configuration file: Certificate names must be lower case; see #1168 [16:57:10] Ryan_Lane: ^ [16:57:22] scfc_de: source -> my system, or toolsbeta-login? [16:59:13] YuviPanda: On toolsbeta-login, in /etc/ssh/ssh_config (not sshd_config), "HostbasedAuthentication yes" and "EnableSSHKeysign yes". Which reminds me that I still need to puppetify that. [16:59:25] scfc_de: please do! [16:59:43] YuviPanda: I don't know what you're trying to add or why puppet is bitching at you ;) [17:00:27] Ryan_Lane: 1. create toolsbeta-redis on wikitech 2. wait for it to boot 3. ssh in to make sure I can 4. apply puppet class tools::redis 5. tail -f /var/log/puppet.log 6. get that message [17:00:29] scfc_de: I see a commented out 'no' [17:00:36] scfc_de: but you should puppetize that [17:00:55] YuviPanda: Will do later. 
You can look at tools-login's ssh_config how it should be set up. [17:01:02] scfc_de: ok [17:03:26] (Makes a note that on the non-TOS extra rules, along with "run bots on -login iff you want to be flogged" should be "you must subscribe to labs-l so that you can see announcements" [17:03:46] Ryan_Lane: Do you know where Labslogbot gets the information for https://wikitech.wikimedia.org/wiki/Shell_Request/Lakshman? There's no user with that name (even as part) on wikitech. [17:04:01] scfc_de: there's a bug about that [17:04:16] YuviPanda: try to run puppetd -tv [17:04:22] do you get the error? [17:04:25] you need to be root to do it [17:04:38] running [17:04:47] let me see what it gives [17:05:06] scfc_de: even if a user fails to fully create an account (because of captcha or shell account name) it'll still create the request. it's a bug [17:05:10] YuviPanda: Sadly, autofs doesn't cope with the switch usefully; the instance needs to be rebooted. [17:05:18] Ryan_Lane: Ah, okay, thanks. [17:05:37] no immediate error, at least [17:05:37] Ryan_Lane: it runs :) [17:05:51] Ryan_Lane: warning: /Stage[main]/Toollabs/File[/data/project/.system/store/hostkey-toolsbeta-redis.pmtpa.wmflabs]: Skipping because of failed dependencies [17:05:51] but it completed [17:06:22] Coren: so this one seems to work fine without rebooting [17:06:51] Coren: I just created it, ssh'd in from my local system (via labs bastion), then applied puppet class on wikitech, ran puppetd -tv, and it works now [17:06:58] petan: toolsbeta-redis is done [17:07:00] YuviPanda: Perhaps because nobody was using /data/project and /home yet. [17:07:01] Coren: Not related to the current case, but if you're taking notes :-): Announcements that need action on the users' part should be very clear, i. e. not one sentence in a 200 line mail. [17:07:03] Coren: I don't think a redis server needs NFS, does it? [17:07:31] YuviPanda: If you want admins to be able to log in and have their dotfiles it does. 
[17:07:44] YuviPanda: You should, at any rate, presume NFS -- it'll become the default soon enough. [17:07:47] Coren: right. [17:08:02] Coren: so do I need to apply a puppet class again? there's a labsnfs::client I spotted [17:08:08] YuviPanda: did you put something into a stage? [17:08:16] Ryan_Lane: ... no? [17:08:20] stages are evil and should basically never be used [17:08:25] yurik: That's the one. [17:08:28] for the tools-redis class [17:08:32] who wrote that class? [17:08:45] that was me, but the underlying redis one is from production [17:08:48] or maybe it just depends on something that fails [17:08:57] you need to figure out what failed ;) [17:09:01] and fix it [17:09:15] Ryan_Lane: it looks like hostkey stuff failed, and scfc_de was just talking about needing to puppetize that [17:09:22] ah [17:09:35] actually, I think that is explainable [17:09:41] it tried to read /data/project/.system [17:09:45] and since I hadn't applied the NFS class [17:09:46] it failed [17:09:47] tada [17:09:51] Right. [17:10:10] That horrible hack with /data/project/.system/store is caused by the fact that we can't use exported resources in puppet. [17:10:13] sumanah: so to make my own little "workspace", I would go to https://wikitech.wikimedia.org/wiki/Special:NovaProject --> bots, and "Add service group"? [17:10:29] Coren: should I put anything in managehome or static_nfs? [17:10:30] As you apply the nfs class, it'll appear. [17:10:42] YuviPanda: Nope. Just include it straight. [17:10:44] ok [17:10:55] Coren: so now I force puppet, then restart [17:11:04] Izhidez: You want to move to Tools? Then http://tools.wmflabs.org/ => "create new tool". [17:11:07] YuviPanda: Right. It should come back with your project home. [17:11:20] okay. Running puppet now [17:11:26] Izhidez: Not bots. "tools" :-) [17:12:11] rebooted [17:13:29] still rebooting [17:15:51] Coren: right, keep getting the two mixed up [17:16:29] scfc_de: there is "create a new user", is that it? 
[17:17:12] oh i'm blind [17:17:15] disregard [17:17:27] Coren: \o/ works [17:17:35] Izhidez: You're not a member of the Tools project yet? What's your wikitech username? [17:17:40] Coren: so I'm going to 1. delete it, 2. recreate it and 3. record my steps [17:17:59] YuviPanda: How... bold. :-) [17:18:16] I'm more underlined, really [17:18:18] scfc_de: I am part of it already, I just created my tool [17:18:27] Izhidez: Perfect. [17:18:33] * Coren is kinda small caps-ish. [17:19:53] Coren: well, well, well. [17:19:55] 'Failed to create instance as the host could not be added to LDAP. ' [17:19:58] when trying to create instance [17:20:18] scfc_de: now it wants me to sudo/use a password to get to my project? [17:20:56] trying to follow https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help [17:20:58] Izhidez: You should be able to use "become TOOLNAME" to do that. [17:21:21] scfc_de: deltaquad@tools-login:~$ become deltaquad-bots [17:21:21] sudo: sorry, a password is required to run sudo [17:21:33] Izhidez: Could you log out and log in again? [17:21:39] k [17:22:01] Izhidez: Also, there may be a short delay after creating a new tool, but I think (Coren?) that's a minute or so. [17:22:39] ok, something different now [17:22:47] local-deltaquad-bots@tools-login:~$ [17:22:54] am I in my project now? [17:22:58] Izhidez: Yes. [17:23:00] scfc_de: Yeah ~60 secs, but the "a password is required" is almost certainly caused by the fact that you logged in before you created the group. Unix group membership is checked on login only, so you wouldn't have been a member in this session. [17:23:12] s/you/DQ/ [17:23:25] k, I'll start moving my things now [17:25:37] kma500: ^^ This is a FAQ that probably bears noting down. [17:26:04] Coren: (Apparently "sleep 120".) [17:26:38] I'll be back tomorrow! Everyone! Fill the etherpad with documentation! 
http://etherpad.wmflabs.org/pad/p/Tool_Labs_Sprint_July_23 [17:27:18] * Silke_WMDE_ must go to prevent her bike gets locked in the yard... [17:27:44] Coren: BTW, JohannesK_WMDE tried to ssh to tools.wmflabs.org the other day. Would be nice if we could have that sshd emit a more meaningful error message. [17:28:27] ... "more meaningful" than what? [17:28:33] Thanks, Coren. You mean the 60 sec bit? [17:29:00] kma500: More the "a password is required" bit about needing to log off and back on if you created a new tool. [17:29:27] got it. [17:29:37] I'll add it to the etherpad [17:32:36] Coren: bleh, trying to repeat. for some reason puppet keeps failing :| [17:32:46] How? [17:33:03] Coren: for some reason it tries to setup ganglia and nagios? [17:33:05] and fails [17:33:06] let me pastebin [17:33:48] Ugh. That still exists? You need to purge a package to make puppet work again. [17:33:55] * Coren tries to remember which. [17:34:03] Coren: https://dpaste.de/XmUfK/ [17:34:24] that's /var/log/puppet.log [17:35:15] IIRC, it's a currently broken package so just doing an 'aptitude update' should show you which is broken. [17:35:41] https://wikitech.wikimedia.org/wiki/User:Yuvipanda/Creating_instance_toolsbeta is what I have so far [17:36:00] ganglia-monitor [17:36:06] purge ganglia-monitor [17:36:26] Coren: hmm, apt-get update gives me [17:36:26] W: GPG error: http://ftp.osuosl.org precise Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY CBCB082A1BB943DB [17:36:40] I thought Ryan had managed to fix that in the image; apparently not in all circumstances. [17:36:51] Yes, the pubkey thing is an external problem we don't care about. :-) [17:36:54] well, apparently E: dpkg was interrupted, you must manually run 'sudo dpkg --configure -a' to correct the problem. [17:36:54] * YuviPanda runs that [17:37:17] Coren: apt-get purge ganglia-monitor asks me permission to install two packages? [17:37:18] .. 
[17:37:31] o_O [17:37:43] To *purge* a package? Heh. [17:37:49] Still do so. [17:38:05] If it's the same issue it once was, then that'll fix everything. [17:38:16] Otherwise, Ryan will be the one to bug. [17:38:37] still gives me egrep: /etc/ganglia/gmond.conf: No such file or directory [17:39:25] Coren: and major errors [17:39:34] I need to sleep now, about 34hours uptime right now :) [17:39:37] i'll pastebin and leave [17:39:40] sorry [17:39:44] the errors are bigger now [17:39:50] 'sok. I'll point Ryan at it. [17:40:13] Coren: this is toolsbeta-redis2 [17:40:19] scfc_de: https://gerrit.wikimedia.org/r/#/c/71112/ [17:40:24] scfc_de: Planning to finish that? [17:41:13] nite, Coren [17:49:15] scfc_de: can we use screen at all? [17:49:51] because that is what i'm used to using for my IRC bot [17:50:11] when i tried to do screen I got: Cannot open your terminal '/dev/pts/23' - please check. [17:51:10] Izhidez: if you need something running in the background, command & disown [17:51:48] what is "command & disown"? [17:54:24] fwilson: ^^ [17:55:32] !screenfix [17:55:33] script /dev/null [17:55:38] Izhidez: ^ [17:56:08] thank you valhallasw [18:00:02] petan: i'm off labs [18:00:21] bots* [18:00:25] bots [18:01:54] just woke up [18:02:29] petan: I should be off of bots-bnr1, but let me make sure everything works for a few hours before we delete it pls [18:02:29] rschen7754|away?? [18:02:37] no prob [18:04:14] petan|wk: yeah, the only thing i have on bots is a php webtool [18:04:38] rschen7754|away ok so all bots running under your user can be killed? [18:04:48] petan|wk: on bots, yes [18:04:56] k [18:08:11] Izawayz: or that :) & disown detaches the process from the shell [18:16:00] Izawayz: Bots should be run as jobs on the grid; running them in a screen session is evil :-). [18:16:20] Coren: Diff'ing my different repos; will upload in a bit. [18:17:47] Coren: Re tools.wmflabs.org, I think at the moment it just says "Permission denied" or something general. 
If we could instead jedi "you want to ssh tools-login.wmflabs.org", that would be great, but I don't know if it is feasible. [18:18:39] scfc_de: You realize that any error message gotten while attempting to connect comes from the ssh /client/, right? :-) [18:19:58] Coren: I do, but the "If you have trouble accessing, cf. Help#Access" doesn't come from the client :-). [18:23:17] On Gerrit => Projects => $ONEPROJECT, for example https://gerrit.wikimedia.org/r/#/admin/projects/labs/toollabs, is there no link to gitblit or what the source tree viewer is called? [18:24:26] Hmmm, apparently https://git.wikimedia.org/ which is referenced in the mailing lists sometimes, is down. [18:25:16] hi kma500! how's it going? [18:25:23] hey there! [18:25:31] I'm starting from the top and working my way down. [18:26:24] Translate in zh please. O_O zh-hant is disabled. o_O [19:08:08] <^demon> Ryan_Lane: https://gerrit.wikimedia.org/r/#/c/75366/ and https://gerrit.wikimedia.org/r/#/c/75350/. The former lets the user's browser do some caching so we have to render less stuff, the latter adds an init script so people other than me can figure out how to restart gitblit [19:09:28] [bz] (RESOLVED - created by: Michelle Grover, priority: Unprioritized - major) [Bug 51694] Beta labs is down for MobileFrontend - https://bugzilla.wikimedia.org/show_bug.cgi?id=51694 [19:10:13] oh, btw Ryan_Lane: I updated the multi instance self hosted puppet doc [19:10:19] mainly made the things you mentioned more explicit [19:10:30] lemme know if there's more you think I should do [19:10:32] cool. thanks [19:10:49] we'll have a doc sprint later. 
I'll probably clean that page up as a whole then :) [19:10:56] k [19:11:53] your changes make it more clear, though [19:11:59] so thanks for them :) [19:12:15] [bz] (RESOLVED - created by: Antoine "hashar" Musso, priority: Unprioritized - normal) [Bug 51700] https://login.wikimedia.beta.wmflabs.org/ trapped in an infinite self-redirect - https://bugzilla.wikimedia.org/show_bug.cgi?id=51700 [19:15:20] coren: I'm looking at your hackathon presentation/notes about the web cluster. What do you mean by 'Load is distributed between identical backends (statically), and they can all serve as cold spares to each other'? [19:17:48] kma500: round robin load balancing to the web servers (from the proxies) [19:17:58] the web servers can all run the tools [19:18:07] since they come from shared storage [19:18:47] kma500: That's basically it. Every web server can serve any of the tools; the proxy distributes between them. [19:18:58] Thank you! [19:20:00] [bz] (RESOLVED - created by: This, that and the other, priority: Unprioritized - enhancement) [Bug 51745] Implement a custom HTTP 500 error page on labs - https://bugzilla.wikimedia.org/show_bug.cgi?id=51745 [19:22:56] Ryan_Lane: No sticky sessions? :( [19:23:44] Ryan_Lane: ottomata - in helping prep for future doc sprints, it's GREAT to have even skeletal TODO notes. I sometimes leave those in the talk pages of the specific pages that need redoing, or file bugs, or keep a giant Etherpad [19:24:03] aye cool, good tip [19:24:09] (PS1) coren: Tool Labs: new custom error messages [labs/toollabs] - https://gerrit.wikimedia.org/r/75480 [19:24:15] Damianz: good question. no clue [19:24:27] Damianz: No, the mapping is static. [19:24:41] people should be handling sessions in shared storage or in redis [19:24:54] Coren: Which is good until you break 1 of the servers :P [19:24:58] Damianz: Which does mean that, in case of failure, the mapping has to be readjusted. [19:25:12] Damianz: Easy solution: don't break the servers. 
:-) [19:25:27] That's no fun [19:26:08] Obviously better error messages: http://tools.wmflabs.org/geohack/foo http://tools.wmflabs.org/anagrimes/cgi-bin/ [19:26:11] * Damianz goes to get dinner and draw up an openstack deployment using a netapp with the cinder plugin [19:32:32] Damianz: that would be nice [19:32:38] Damianz: is that using boot from volume? [19:32:45] It will be [19:32:58] are you controlling the api calls? [19:33:05] otherwise it's necessary for the user to make extra calls for that [19:33:32] Main usage will be the build systems creating stuff, so it's easily scriptable. [19:33:35] I was thinking of maybe using salt-cloud for api access [19:33:51] so that it can be controlled more easily [19:34:02] or maybe openstack vagrant [19:34:11] Vagrant is awesome [19:34:31] the downside to vagrant is that the code needs to be in ruby for anything custom [19:34:56] Vagrant + puppet is what I'm pushing atm, even if it's in ruby [19:35:52] Currently figuring out if I can do 10G iscsi between a dell blade center and a netapp, with a management and access network before the boss arrives tomorrow afternoon so I can give him the 'this is what we'd like to do' talk... damn him for being a month early [19:36:57] I am wondering what is that [19:36:59] Is http://etherpad.wmflabs.org/pad/p/Tool_Labs_Sprint_July_23 down for everyone? Or just me? [19:37:00] vagrant + puppet [19:37:08] I thought vagrant is just a wrapper [19:37:14] for vbox, puppet and so [19:37:18] oh... it's back! Nevermind [19:38:43] kma500: basically, for no reason we know, the Etherpad Lite installation sometimes goes down for 5 seconds. Then comes back up. [19:39:04] ah, okay. Thanks! [19:39:05] vagrant lets you create vms on different backends (vbox, lxc, openstack, aws, vmware etc) from template boxes... but you can have it kick off puppet manifests. So you can download my puppet repo, do `vagrant up build_agent__` wait 5min and you have a working vm. 
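Note on the static mapping Coren describes at 19:24:27 and 19:24:58: a minimal Python sketch of what "static, readjusted on failure" could look like. This is an illustration under assumptions, not the actual Tool Labs proxy configuration; the backend names and both function names are hypothetical, and only the tool names come from URLs mentioned in the channel.

```python
# Hypothetical backend names; the real web server names are not given in the log.
BACKENDS = ["webserver-1", "webserver-2"]


def build_static_map(tools, backends):
    """Pin every tool to one fixed backend, spreading tools evenly."""
    return {tool: backends[i % len(backends)]
            for i, tool in enumerate(sorted(tools))}


def readjust(mapping, failed, backends):
    """Return a new mapping with tools moved off a failed backend.

    Any backend can take over because, as stated in the channel, all web
    servers can serve any tool from shared storage.
    """
    alive = [b for b in backends if b != failed]
    if not alive:
        raise RuntimeError("no backends left to take over")
    new_mapping = dict(mapping)
    orphans = sorted(t for t, b in mapping.items() if b == failed)
    for i, tool in enumerate(orphans):
        new_mapping[tool] = alive[i % len(alive)]
    return new_mapping


if __name__ == "__main__":
    mapping = build_static_map(["geohack", "anagrimes", "aude"], BACKENDS)
    print(mapping)
    print(readjust(mapping, "webserver-1", BACKENDS))
```

Unlike round robin, a request for a given tool always reaches the same backend here, which is why a failure needs an explicit readjustment step.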
[19:39:32] kma500: If you come back a minute later and it is still down, Mark Holmquist is a good person to ping, as he is the one who runs that Labs project. If you type marktr that generally autocompletes into his username. :) [19:40:04] okay. thanks [19:40:22] kma500: in general we do recommend that you occasionally back up Etherpad work to something less volatile and ephemeral, like a wiki [19:41:02] okay. should i do that in my user space? Somewhere else? [19:41:54] kma500: Your user space is fine. [19:42:13] Damianz: so it's just a wrapper [19:42:25] okay. [19:42:27] kma500: also: you probably will not find this directly useful during the doc sprint this week, but https://blog.wikimedia.org/c/technology/labs/ has some blog posts you may find useful [19:42:31] I'd say 'framework', but yes [19:42:38] like a script that use the 3rd software like vbox or puppet to build a virtual boxes [19:42:52] will check it out, sumanah. thanks! [19:44:18] kma500: use etherpad.wikimedia.org rather than the wmflabs one [19:45:23] Ryan--we're already up on wmflabs. Should we move? Or just use etherpad.wikimedia.org going forward? [19:45:31] either you wish [19:45:49] services on wmflabs aren't supported ;) [19:45:56] well, not usually supported [19:46:00] I much prefer the WMFLabs instance because the scrollbar actually works [19:46:03] tools being an exception [19:46:29] sumanah: there's nothing wrong with using it as long as you are fine with it possibly not working at any point :) [19:46:39] kma500: do you understand the various levels of "support" (contrasting among random Labs projects, Tool Labs, and "production")? [19:46:52] Ryan_Lane: yup, hence my recommendation of backup. [19:46:56] this is first i've heard it [19:46:58] Ryan_Lane: lol from my experience, the wikimedia version is broken far more often [19:47:26] petan|wk: I haven't found the production version to be broken very often [19:47:26] What are the levels of support? 
[19:47:29] we're migrating to epl soon anyway [19:47:35] kma500: ok. basically, "support" in this context means: if it breaks, how much help can you expect to get from WMF Operations, and how quickly and reliably? [19:47:43] kma500: "production" support (the wikimedia projects) [19:47:50] Ryan_Lane I have, but I haven't found the wmflabs version to be ever broken [19:47:51] and related infrastructure to make it work [19:48:16] "unsupported production" - we offer it in production, but don't put a lot of effort into making sure it works [19:48:28] I find if you feed lesslie whisky firewall changes happen quickly [19:49:02] "labs" - production level support for the underlying infrastructure, but no support at all for the services hosted in it [19:49:31] Ryan_Lane: it will probably help kma500 if you give an example of, for instance, "unsupported production" [19:49:59] "semi-production labs" - projects that are supported by the foundation at a close to production level [19:50:04] tools is semi-production [19:50:12] etherpad.wikimedia.org is unsupported production [19:50:29] is there a giant list of what's at what level of support? [19:50:43] nope [19:51:01] I'm thinking about this all... [19:51:14] status.wikimedia.org has a list of production supported services [19:51:25] I would say "community supported" instead of not supported for projects that are not supported by paid staff [19:51:43] petan|wk: yeah, that's a better terminology [19:52:02] "community supported" either means really damn awesome support or it's one dude lol [19:52:22] or possibly no one :) [19:52:27] ha! [19:52:30] Damianz: there are lot of community supported projects that actually works fine, like for example..... wikipedia :D [19:52:34] okay. thanks for all the clarification. 
[19:52:39] yw [19:52:48] we probably should have a list of this somewhere [19:52:54] I don't think that #wikipedia-en-hep is full of paid supporters [19:52:55] ops needs tech writing help :) [19:52:58] * help [19:53:15] petan|wk: yeah, but it's a mixture of community support and paid foundation support [19:53:18] You should put it in a mediawiki table, then we can all experience hell maintaining it [19:53:43] Damianz: we have VE! [19:53:46] QA put together a list of different levels of QA support https://www.mediawiki.org/wiki/QA/Features_testing/levels [19:53:46] Damianz lol [19:53:50] Really nothing is totally supported or community supported, since the entire point is anyone can make changes to production, ops are just a gate [19:53:54] thinking of that [19:53:57] I'm going to upgrade MW [19:54:07] Ryan_Lane: Actually, never edited since VE went live rofl [19:54:11] Damianz: you missed where I explained what I think "support" means [19:54:28] "support" means different things imo [19:54:42] if it breaks, how much help can you expect to get, and how quickly and reliably? [19:55:06] in the context of production vs semiproduction vs etc., I think that sort of service-level-agreement stuff is what's relevant [19:55:40] certainly in the context of "who has the ability to help improve them?" then "support" can mean more things [19:55:48] Depends what's broke though - there's stuff supported by ops that 2 people know how it works in my experience [19:56:02] Granted there is someone, who can look at fixing it... 
but it can go to community for fix, review by ops, live [19:56:13] I don't think the line is clear in the context of this environment [19:56:24] btw Damianz if you have improvements to make to the information in https://www.mediawiki.org/wiki/Developers/Maintainers#Operations.2Fsystems_administration please do go ahead [19:56:28] Ops can be useful just as a 'point in the right direction' though [19:58:23] * Damianz is amused that everytime he talks to sumanah he ends up arguing, but really agrees from a different slant... not the only person either [19:59:26] Damianz: yeah, you and I have pretty different communication styles and perspectives sometimes [20:00:43] I've worked with systems administrators who have a more collaborative teaching-y style and ones who are more argue-y [20:01:05] Damianz what is difference of vagrant and open stack? [20:01:16] speaking of pointing people in the right direction, it's been really useful in the past several months to have Coren and the person on "rt duty" [20:01:17] isn't it both kind of providing similar stuff [20:01:53] Another question from Coren's slides: 'All of the project databases replicated, with access level comparable to registered user.' Can anyone clarify 'access level' a bit more? [20:02:06] kma500: http://tools.wmflabs.org/geohack/foo http://tools.wmflabs.org/anagrimes/cgi-bin/ see anything I could add? [20:02:09] kma500: https://wikitech.wikimedia.org/wiki/Interrupts_Rotation may be a useful thing for you to know about, alongside https://www.mediawiki.org/wiki/Developers/Maintainers#Operations.2Fsystems_administration .... [20:02:24] * sumanah waits for someone else to answer Kirsten's question [20:02:48] kma500: The information you can get from the database is what can be gotten at on-wiki or through the API by normal registered users (i.e.: not including +sysop or other advanced permissions) [20:03:31] Coren: do you mean to the docs on those pages? [20:03:41] I meant the error page themselves. 
[20:03:43] petan|wk: openstack is a framework for stuff, one of the things is compute - vagrant is a simple interface to creating vms on openstack (or other platform). So you can have it spin up an image in a network in a project, fast. It's like asking what's the difference between cheese and tanks [20:04:09] Coren: thanks for the clarification! [20:04:20] Damianz: Cheese tends to be softer than tanks, but generally isn't armed. [20:04:32] Coren: It can kill you at 100 feet though [20:04:40] Or at the very least make you throw up all night [20:06:42] * Coren should probably add a 403 [20:07:07] kma500: the Interrupts Rotation is basically a listing of who is interruptible to follow up on existing Operations-related requests [20:07:13] coren: I like the bit about the magical script elves [20:08:01] kma500: in #wikimedia-operations we see in the channel /topic who is the person currently on duty and for how long (right now: "on RT duty: [name] (Jul 22-26)") [20:08:46] thanks, sumanah. [20:09:55] kma500: the Operations team often uses a system called RT to track requests, work, and bugs to fix. That's why the interrupts rotation is also called "rt duty" [20:11:18] okay [20:14:01] I'm trying to rebuild a labs box. I was able to blow it away but I can't readd it. I get "Failed to create instance as the host could not be added to LDAP. " [20:14:19] manybubbles: hm. which name? [20:14:19] Do I just have to wait longer? [20:14:27] Ryan_Lane: elasticsearch3 [20:14:33] one sec [20:16:25] i'm getting lots of emails now (1st of july – today) from jsub/cron, have i to pipe jsub to /dev/null? [20:16:32] hm. it didn't successfully delete the dns entry [20:16:40] manybubbles: one sec. I'm going to manually delete it [20:16:44] thanks! [20:16:58] giftpflanze: The mail queue was wedged when the NFS server barfed. I unwedged it today. [20:17:00] Ryan_Lane: I'm probably going to rebuild all the elasticsearch[0-3] this afternoon. [20:17:53] manybubbles: done [20:18:07] thanks! 
created just fine [20:18:09] manybubbles: you need to make sure you delete an instance before adding a new one with the same name [20:18:21] there's some bugs with the dns code in that regard [20:18:34] Ryan_Lane: I did that. Should I wait some time after deleting the instance? [20:18:59] nah. it should happen immediately [20:19:02] petan|wk: I cleared https://wikitech.wikimedia.org/wiki/Shell_Request/Lakshman, but wm-bot still lists him on #wikimedia-labs-requests. You might want to look at using "https://wikitech.wikimedia.org/w/api.php" "?action=ask" "&query=" (url-hexify-string "[[Category:Shell Access Requests]] [[Is Completed::No]]|?Shell Request User Name") "&format=json" to query the list of open requests. [20:19:04] there might have just been some hiccup [20:19:29] scfc_de: hmm [20:20:08] scfc_de: is it so easy now? :o [20:20:24] before it was a pain to get a list of all open requests [20:20:35] (PS2) coren: Tool Labs: new custom error messages [labs/toollabs] - https://gerrit.wikimedia.org/r/75480 [20:20:49] Did we just switch /var/log/messages to /var/log/syslog? [20:21:02] or have I been doing /var/log/syslog for a month and not thinking about it [20:21:04] petan|wk: No, without any knowledge in SMW, it took *me* quite a while to figure that out :-). [20:23:24] err, so, what is about the jsub output? [20:24:24] Ryan_Lane: I think puppet didn't like that much either: "The certificate retrieved from the master does not match the agent's private key." [20:24:53] giftpflanze: If you don't want to be informed of successful job submissions, you can use "jsub -quiet". [20:25:02] puppet seemed to be stuck so I killed it and started it manually [20:25:35] Ryan_Lane: BTW, is the user-in-limbo-triggers-labslogbot bug tracked somewhere? In Bugzilla there wasn't anything obvious. 
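Note on scfc_de's query at 20:19:02: the same URL can be assembled in Python, shown here as a rough sketch. The base URL, parameters and query string are copied from that message; the function name is hypothetical, `urllib.parse.quote` stands in for Emacs' `url-hexify-string`, and nothing here checks that the wikitech API still accepts this query.

```python
from urllib.parse import quote

# Pieces quoted from the 20:19:02 message.
API_BASE = "https://wikitech.wikimedia.org/w/api.php"
ASK_QUERY = ("[[Category:Shell Access Requests]] [[Is Completed::No]]"
             "|?Shell Request User Name")


def build_open_requests_url(base=API_BASE, query=ASK_QUERY):
    """Assemble the ask URL (hypothetical helper name).

    safe="" makes quote() percent-encode every reserved character,
    including "[", "]", ":", "|", "?" and spaces.
    """
    return f"{base}?action=ask&query={quote(query, safe='')}&format=json"


if __name__ == "__main__":
    print(build_open_requests_url())
```

Fetching and parsing the JSON response is left out on purpose, since the response format is not shown in the channel.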
[20:25:38] that parameter seems like a dejavu [20:25:53] wow, it even has a man page :) [20:27:12] kma500: http://tools.wmflabs.org/aude/ for the 403s [20:27:52] Ryan_Lane: you can probably ignore my last complaint - if I run puppet using the wrapper script it runs ok [20:29:03] coren: your error messages crack me up! [20:29:49] Oh, you mean the secret underground lair of the maintainers? :-) [20:30:06] :) [20:30:19] I like how you pull out the note to the maintainers. [20:30:25] I think that's really useful [20:33:02] kma500: I didn't want to confuse people following links from projects about index files or permissions. :-) [20:33:36] that's kind! [20:34:18] kma500: No, it's selfish. People who get confusing error messages email the project admins thinking something broke or they did something wrong. :-) [20:34:40] one of those lovely times of enlightened self-interest :) [20:34:52] :) [20:36:29] Will toolsbeta have a request access link from its page https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta like the tools project does? Or do people request access another way? [20:37:47] kma500: Well, it could, but given that the environment is really meant for more specialized purposes you'd want people to talk to an admin first. [20:38:13] (To avoid the confusion between a /tool/ that's in beta vs a tool that needs a beta /environment/) [20:38:37] Okay. Where should I point people to talk to an admin? IRC? [20:40:22] kma500: IRC should be the first port of call since it tends to be fastest/easiest. Followed by discussion on labs-l or even just a bugzilla. Honestly, I don't expect end users to want to hop to toolsbeta unless suggested by one of us ("My tool doesn't work with Foo version 1.0 -- let's try to see if we can backport 2.0 on toolsbeta") [20:40:37] okay. Thanks! 
[20:52:06] it looks like this instance is stuck in deleting: i-000007a9 [20:52:15] We certainly don't need it any more [21:13:42] Coren: I just finished a draft of section one of the tool labs guide. Could you look it over when you have time and expand/correct as needed? It's mostly based on your presentation. [21:14:03] I was actually reading it now. :-) [21:14:17] :) [21:16:18] [bz] (NEW - created by: Chris McMahon, priority: Unprioritized - major) [Bug 50622] Special:NewPagesFeed intermittently fails on beta cluster; causes test failure - https://bugzilla.wikimedia.org/show_bug.cgi?id=50622 [21:20:17] kma500: Added notes to 1.3.3 and 1.4.1. Otherwise, it's excellent. [21:20:46] awesome. Thanks! [21:23:01] Coren is it possible that the hebrew toolserver dbs are not set up properly [21:23:05] ? [21:23:48] OrenBochman: It's possible. Wait, I expect you meant tool *labs* dbs? :-) [21:27:23] OrenBochman: I see nothing wrong, at first glance. What is the issue you are running into? [21:28:02] kma500: 1.3.3 fixed. [21:28:37] great. thanks! [21:36:07] I am running a query from http://meta.wikimedia.org/wiki/Research:Community_visualization_using_Gephi which works fine on en db but fails on he [21:36:54] the problem is it does not detect / in hebrew page names [21:37:33] it's AND `page_title` NOT LIKE '%/%' [21:37:43] which fails [21:38:16] p.s. if you run the script add a limit and wait about 12 seconds [21:38:30] for results in the hebrew db [21:39:06] ;-) too labs [21:40:43] Huh. Interesting. I'm not even sure I see how that could even be an encoding issue. Please open a bugzilla about it, this'll require more eyes than just mine. [21:41:36] wouldn't you want to use revision_userindex on that because of the join? [21:42:20] LIKE '%/%' :( [21:42:28] no way it's gonna work fast [21:43:33] I agree [21:44:34] Nettrom: I tried but it uses the page_name and namespace [21:44:46] so I don't think you save anything [21:44:51] I get ~1M pags in 2.78s? 
[21:44:55] *pages [21:45:26] that's about right - one page per user [21:47:14] I'm running a version that ends with HAVING COUNT(*) > 20 ORDER BY COUNT(*) DESC LIMIT 200; (and restrictions on the user name) [21:50:51] MaxSem: if this ever goes into production it will need a search engine or php filter [21:51:05] :) [21:52:11] right now it is for measuring interaction between newbies [21:52:33] which is more like ... research/outreach [21:59:18] OrenBochman: I can't make the query fail using the command line, though, seems to correctly remove the "/" pages there [22:03:57] I have to sign off for the day. Thanks for all the help! [22:04:02] I did not succeed in running it from the command line [22:04:24] could you send me a paste of the command you used ? [22:04:54] I was trying to save the result to a file as csv [22:05:50] I can probably display the csv as a d3.js graph [22:34:18] hm… does it take a long time for replica.my.cnf to be created in new projects? [22:36:30] or "service groups". I created "catmonitor" about two hours ago, but there is still no replica.my.cnf in /data/project/catmonitor/ [22:38:11] danmichaelo: It normally takes just a minute, but I see a typo that prevented it. Fix't. [22:38:47] great! was it a typo of mine or somewhere else? [22:39:05] No, my typo not yours. :-) [22:40:06] k :) [22:41:55] Coren|Away... Make a typo? OMG!!! [22:42:18] * T13 runs to get the camera... [22:59:42] hi kma500 how's it going? [23:47:10] Coren|Away: Any idea if there's any known problems with connection sockets in tool labs? As of a few days I'm seeing a slight increase in irc connections being terminated at seemingly random points in time with a timeout. Some of the bots get restarted by jstart, though not always it seems. [23:47:16] freenode irc bots [23:48:58] Krinkle: shouldn't be [23:49:06] Krinkle: is freenode killing your connection? 
[23:49:20] we had an issue with number of connections with them before [23:49:33] It seems to be a trend in the last 2-3 days. Just got another one (dbbot-wm in wikimedia-dev/wikimedia-operations) [23:49:46] we set up ident servers for tools so that they'd up our connection limit [23:49:47] No idea, checking logs for this one [23:50:26] PHP Warning: socket_connect(): unable to connect [110]: Connection timed out in /data/project/wmfdbbot/apps/ts-krinkle-Kribo/includes/Irc.php on line 59 [23:50:38] connection timed out? [23:50:44] that's weird [23:50:58] https://github.com/Krinkle/ts-krinkle-Kribo/blob/master/includes/Irc.php#L59 [23:50:59] indeed [23:51:00] that would be on the freenode side, likely