[00:38:19] Could it be utf8 (wild guess) bd808 ?
[00:39:08] its more likely LDAP or Elasticsearch errors
[00:39:22] Ah ok
[00:39:32] I've !log'ed a lot of emojis ;)
[00:48:00] Heh
[01:01:49] !log wikibrain removed 'Postgresql::Master' roles from both VMs as it was causing a puppet failure
[01:01:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikibrain/SAL
[01:11:31] !log tools.stashbot Updated to 3b5f6d0 (Write severe errors to a file on disk)
[01:11:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stashbot/SAL
[01:12:22] !log tools.stashbot Changed LDAP server setting to point to ldap-ro.eqiad.wikimedia.org
[01:12:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stashbot/SAL
[01:12:47] !log tools.stashbot Added tools-elastic-01.tools.eqiad.wmflabs to elasticsearch.servers list
[01:12:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stashbot/SAL
[01:15:57] is the puppet stuff that was announced done?
[01:27:05] Zppix: not yet, getting there
[01:27:07] Zppix: not completely, no
[01:27:28] now which reply to believe... jk, thanks for the update
[13:51:40] Hey folks. Looks like I'm still unable to create VMs that I can log into.
[13:59:07] halfak: is the VM still online? can you paste the name so I can take a look at the logs?:
[13:59:28] ores-worker-03.ores.eqiad.wmflabs
[13:59:44] thanks jeh
[14:02:52] the console log for that host is showing puppet certificate issues. I'll look into it
[14:09:00] Thanks.
[14:22:29] halfak: can you please delete ores-worker-03 and rebuild it? There was an old cert laying around for that FQDN
[14:22:40] Sure
[14:23:24] * halfak rebuilds
[14:26:22] !log ores restarting ores-uwsgi service on ores-web-01
[14:26:25] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL
[14:35:26] \o/ jeh, I'm in. Thank you
[14:53:55] !log twl update puppet certificates for new master on twlight-tracker
[14:53:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Twl/SAL
[14:54:51] !log twl update puppet certificates for new master on twlight-staging (hostname correction)
[14:54:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Twl/SAL
[15:27:04] !log ores rebuilt ores-worker-03 with role::labs::ores::worker
[15:27:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL
[20:01:04] are you all doing rolling restarts of Toolforge tools, or something?
[20:01:32] musikanimal: the kubernetes cluster is freaking out. We are trying to fix it :/
[20:01:39] okay :) Thanks
[21:00:21] Toollabs down? I'm gettin 504's from ipcheck and whois so far
[21:00:37] oh, didn't read my scrollback
[21:01:46] SQL: known see topic
[21:02:16] Zppix: Yep, hence the scrollback comment 16 seconds later :P
[21:02:28] SQL: heh I didnt see that until after i sent it
[21:16:11] Toolforge problems status update: we have narrowed the problem down to the etcd cluster that tracks state for the Kubernetes cluster. We are still not sure exactly why the etcd cluster is unhappy, but are attempting various things to diagnose further
[21:16:59] thats just tech for you
[22:10:16] Give it beer, that always works to make me happy?
[22:25:21] !log wikistream Restarted varnish (T232486)
[22:34:49] !log wikistream Manually started `HOME=/var/tmp/ NODE_ENV=production /usr/bin/nodejs /opt/wikistream/app.js >/dev/null 2>&1 &` (T232486)
[22:35:11] crap. stashbot never came back because k8s is sick
[23:02:19] I know this is a bit selfish, but is there anyway to manually kill the `pageviews` k8s webservice? I could just move it to the grid (if not temporarily). I ask only because it's such a high-traffic tool
[23:04:24] musikanimal: right now, no. WE are unable to start or stop anything on the k8s cluster
[23:05:03] I guess I could try to find the Docker container and kill it...
[23:05:28] but even that is flying blind because we can't see the state of the cluster to know where it is actually running