[12:00:24] arturo hi, I'm wondering if the ores-* hosts being down is related to https://phabricator.wikimedia.org/T187292 please?
[12:00:35] I'm getting
[12:00:36] [11:56:12] PROBLEM - Host ORES-worker04.experimental is DOWN: CRITICAL - Host Unreachable (ores-worker-04.ores.eqiad.wmflabs)
[12:00:48] [11:56:27] PROBLEM - Host ORES-lb02.Experimental is DOWN: CRITICAL - Host Unreachable (ores-lb-02.ores.eqiad.wmflabs)
[12:01:20] * arturo looking
[12:01:37] thanks :)
[12:14:05] paladox: I see labvirt1008 up and running. I couldn't find whether your machines are running or not. Still looking for it. I don't know enough openstack kungfu yet :-)
[12:17:03] paladox: instances are shut down
[12:17:06] https://www.irccloud.com/pastebin/0SW3dARG/
[12:19:02] paladox: did you try starting them via horizon? https://horizon.wikimedia.org
[12:22:15] I guess you should be able to see something similar to this from your horizon project view:
[12:22:23] https://www.irccloud.com/pastebin/0NznX1HV/
[13:00:34] yes everything on 1008 is down atm
[13:00:35] https://phabricator.wikimedia.org/T187292#3971559
[13:00:57] we could start them but it would probably reinstate the issue and we need to shut them down to shuffle them to another virt
[13:04:53] !log tools reboot tools-paws-master-01 for T187315
[13:04:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:04:59] T187315: paws: memory allocation errors in tools-paws-master-01 - https://phabricator.wikimedia.org/T187315
[13:09:21] !log tools the reboot was OK, the server seems working and kubectl sees all the pods running in the deployment (T187315)
[13:09:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:23:20] arturo sorry just had lunch, um I don't have access to them, I just monitor them using icinga2 for the ai team :)
[13:25:14] k
[13:38:12] paladox: I sent an email with affected instances and we'll start figuring out a plan for them asap
[13:38:16] thanks for asking
[13:38:28] chasemp thanks :)
[14:59:55] hmmm, wrt "thermal paste" in the last cloud-l email, did you have a fire in the server room or something?
[15:05:10] * chicocvenancio hopes not
[15:06:09] Hauskatze the servers got overheated according to the task
[15:06:27] Hauskatze https://phabricator.wikimedia.org/T187292
[15:06:43] No fire though
[15:07:21] thank God, well, I hope you have some sort of fire alarm/extinguisher mechanisms there though
[15:07:23] just in case
[15:09:10] thermal paste is this, something that all CPUs should have: https://en.wikipedia.org/wiki/Thermal_grease Sometimes it degrades/dries out etc. and servers detect that and shut down, nothing to worry about
[15:10:48] to my knowledge (I do not know the details) our servers are in facilities that are highly prepared to handle fires, which would be very surprising to happen in the first place
[15:11:31] I was told that even putting a plaque with the Wikimedia logo on the DC had to be custom ordered, so it would be OK with the security measures of the hosting place
[15:14:04] well, I expected we used some reliable and trusted site to host our servers :)
[15:14:21] and it's good to have confirmation
[15:14:46] wondering if someone from WMF has to be there on that site
[15:14:55] Hauskatze I believe WMF owns this data center :)
[15:15:49] I don't think so. eqiad is owned by Equinix Int.
[15:16:13] no, that is not true, we colocate in other datacenters, but we own the server infrastructure; it is physically locked and separated, etc.
[15:17:01] you have some images here: https://commons.wikimedia.org/wiki/Category:Wikimedia_servers_in_2015
[15:17:09] jynus oh, so the data center is not owned by wmf?
[15:17:40] if I show you the photo of the datacenter size, you will understand why not
[15:18:06] jynus: who can "visit" the site?
[15:18:33] ie, I guess sometimes we need to plug and unplug things there right?
[15:18:50] Hauskatze I think anyone on the ops team who has clearance.
[15:18:52] if you mean the wmf servers, only people the wmf sre team oks
[15:19:57] and if I go with a court order? :P
[15:20:22] that is why we have lawyers
[15:21:32] !log integration migrating integration-slave-jessie-1002 to labvirt1014
[15:21:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Integration/SAL
[15:21:45] !log integration migrating integration-slave-jessie-1001 to labvirt1015
[15:21:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Integration/SAL
[15:27:41] Hauskatze court order?
[15:29:29] paladox: in some situations a judge can order physical investigation and seizure of data or servers
[15:29:49] Oh I see. That would be why there's a legal team @ wmf :)
[15:29:50] just teasing j-ynus
[15:32:00] there can also be the situation of a meteorite hitting the datacenter
[15:34:09] jynus well won't that break everything?
[15:35:04] depends on the meteorite
[15:35:43] if small, we should be up in a few minutes on a secondary datacenter
[15:36:12] if a few kilometers in size, wikipedia would be the last of your concerns :-)
[15:36:12] technically if a continent sized meteorite hits in the wrong place we are dead that's true
[15:36:16] jynus you mean codfw?
[15:36:18] but in that case we all have bigger issues
[15:36:33] chasemp: no need for continent size
[15:36:37] if it is really small, maybe we don't notice at all?
[15:36:48] I was searching and 10 km would be enough for a mass extinction
[15:37:04] that's cheery :)
[15:37:05] That is a big rock
[15:37:09] an ICBM too
[15:37:34] but we can actually see those coming in pretty well
[15:37:37] maybe we can ask Elon Musk to send a copy of wikipedia to space
[15:37:46] lol
[15:37:57] as a backup?
[15:37:58] it's the smaller ones that will surprise us
[15:38:40] http://www.thedrive.com/news/18431/elon-musk-snuck-a-secret-payload-into-space-aboard-the-tesla-carrying-spacex-falcon-heavy
[15:39:45] jynus: I bet we could actually convince one of the space faring entities to do that
[15:40:24] as redditors note, Wikipedia is pretty compact these days
[15:41:59] https://www.reddit.com/r/AskReddit/comments/7x639l/comment/du5xybw?st=JDLURED3&sh=18433145
[15:46:49] we can probably do this for under $1000, 512GB sd cards only cost around $300
[15:46:59] * chicocvenancio goes back to work
[15:49:21] it would be worth doing for the blogpost alone
[16:50:59] https://meta.wikimedia.org/wiki/Wikipedia_to_the_Moon/About
[16:52:27] I don't think that is going to happen
[16:52:41] I mean the specific instance, not in general
[16:53:49] I don't know if that lunar xprize mission will actually succeed, but they did deliver the payload to them
[16:55:14] yea, I meant the space part
[16:55:25] not the community part
[19:49:16] harej: I'm creating some documentation for a toolforge tool. Do you have a recommended place for me to put it? I'm thinking of putting it on Meta currently.
[19:49:30] Documentation for users to use it.
[19:51:35] Niharika: that's a good question, I think there is somewhat of a convention, as in https://wikitech.wikimedia.org/wiki/Tool:Openstack-browser
[19:51:54] Niharika: what is the audience for this tool?
[19:52:23] harej: Grantees.
[19:52:35] Meta, then.
[19:52:36] chasemp: I thought wikitech was for developer-facing docs.
[19:52:39] I vote Meta as well
[19:52:45] for users
[19:52:52] Got it. Thanks y'all!
[19:52:58] Niharika: ah, makes sense yes
[19:53:07] meta would seem ideal and a link between the two?
[19:53:21] Yeah, will add one.
[20:12:45] !log codesearch added Addshore, D3r1ck01, Ladsgroup, and Prtksxna as project admins of codesearch
[20:12:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Codesearch/SAL
[20:19:11] !log codesearch added Addshore, D3r1ck01, Ladsgroup, and Prtksxna to Gerrit labs-codesearch project
[20:19:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Codesearch/SAL
[20:23:59] (CR) Legoktm: "recheck" [labs/codesearch] - https://gerrit.wikimedia.org/r/410276 (https://phabricator.wikimedia.org/T187151) (owner: Legoktm)
[20:24:20] Niharika: thanks for asking the question. For what it's worth, I think the "right" place to put the user facing docs for a tool is anywhere you feel that your target users can find and improve them.
[20:24:43] and wikitech is pretty bad for that unless your users are all Cloud VPS/Toolforge/Tech folks
[20:28:18] (CR) Jforrester: "Yay." [labs/codesearch] - https://gerrit.wikimedia.org/r/410276 (https://phabricator.wikimedia.org/T187151) (owner: Legoktm)
[20:28:51] Niharika: fwiw I put my IRC bot's main docs on its own site, see tools.wmflabs.org/zppixbot/documentation.html
[22:25:22] bd808: Do you have any advice on how to potentially avoid hacks like these - https://meta.wikimedia.org/wiki/Countervandalism_Network/Infrastructure#Mail ?
[22:30:53] Specifically the ability to address cloud vps project admins and members by canonical address (even if opt-in)
[22:41:22] Krinkle: https://phabricator.wikimedia.org/T47828
[22:42:27] legoktm: Aye, thanks. I'm getting old.