[08:58:42] !log toolsbeta remove puppet prefix etcd-k8s-ctest (unused)
[08:58:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[08:59:22] !log toolsbeta refresh role for servers in toolsbeta-test-k8s-{master,worker}
[08:59:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[09:00:34] !log toolsbeta remove puppet prefix toolsbeta-arturo-k8s-{etcd,master,worker} (unused)
[09:00:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[09:02:29] is it possible to log into https://shinken.wmflabs.org/user/login ?
[09:02:31] and sue it?
[09:02:36] *use it, i don't want to sue it ...
[09:03:59] addshore: I've never done that. It should be possible though
[09:04:27] i wonder what i should log in with :P
[09:05:39] no idea :-) It doesn't seem to allow LDAP credentials
[09:05:54] I suggest you open a phab task to clarify login/usage of that web page
[09:05:56] Krenair: any idea?
[09:06:47] !log toolsbeta remove puppet prefix toolsbeta-valhallasw-puppet-compiler (unused)
[09:06:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[09:14:22] I just created T235825 BTW
[09:14:25] T235825: wikitech SAL: handle '{' and '}' in log entries - https://phabricator.wikimedia.org/T235825
[09:39:43] would anyone be able to investigate why my key doesn't seem to have made it to this new instance: wikidata-icinga.wikidata-dev.eqiad.wmflabs ? I don't really know what to check myself
[09:41:23] tarrow: I can help!
[09:41:38] arturo: awesome! Thanks :)
[09:42:31] tarrow: the instance may not have completed the first puppet run yet
[09:44:55] that makes sense to me but it has been up a little while now. Any way I can trigger it? Or attach a shell in horizon or something to trigger it?
[09:45:14] we could inspect the logs and see what was wrong
[09:45:21] you can do that yourself using horizon
[09:46:25] "Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, Cluster misc not defined in wikimedia_clusters at /etc/puppet/modules/profile/manifests/base.pp:49:9"?
[09:46:25] click on the instance name in the Compute/Overview section, go to the "logs" tab and then click "View Full Log"
[09:46:48] ok, so it's a legitimate puppet issue
[09:46:50] looks suspicious to me
[09:48:19] I didn't deliberately add any extra roles
[09:49:26] godog: do you know anything about this?
[09:52:01] arturo: as far as I can see there are no extra odd roles from prefixes or anything
[09:52:22] tarrow: this seems to be a wide-spread issue
[09:52:30] I see a similar puppet issue in many other VMs
[09:53:02] gotcha; are you happy to make a ticket? I can but I can't add much insight beyond pasting the error
[09:53:27] related ticket seems to be T234232
[09:53:27] T234232: Hosts in puppet with $cluster missing from wikimedia_clusters - https://phabricator.wikimedia.org/T234232
[09:53:34] you can add your info there!
[09:56:22] uughh yeah looks like my change indeed arturo
[09:56:46] it seems we lack some global hiera key godog ?
[09:56:50] arturo: do you know how widespread it is in wmcs ?
[09:57:07] godog: every server I checked so far :-/
[09:57:29] arturo: ack, and all reporting 'misc' cluster missing ?
[09:57:49] I'm running the agent in cumin in Toolforge
[09:58:30] sending a patch now
[09:58:55] godog: yes, in Toolforge everything is about "misc"
[09:59:45] cool :)
[10:00:56] arturo: ack, thanks, https://gerrit.wikimedia.org/r/c/operations/puppet/+/544166 should fix it
[10:01:14] apologies for the breakage, I didn't foresee it'd break wmcs :|
[10:01:53] godog: +1, thanks for the quick fix
[10:03:11] arturo: np! let me know if that fixes everything or there are stragglers
[10:03:28] ok, will let you know
[10:10:35] tarrow: you might want to re-create the VM again
[10:12:40] Cool; just doing that
[10:44:12] !log admin double max_message_size from 40KB to 80KB in the cloud-admin mailing list. A simple email with a couple of quotes can go over the 40KB limit.
[10:44:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[12:29:55] I just made a new toolforge project and when I want to become the project, it gives me this error "You were added to the group tools.phabricator-reporter after you started this login session.
[12:29:55] You need to log out and in again to be able to "become phabricator-reporter".. I closed the ssh connection several times but do I need to log out and login again in wikitech.wikimedia.org to make it work?
[12:36:46] amir1: no I don't think so... do you still get the errors after logging out of your ssh session?
[12:37:00] It just got fixed. I don't know why
[12:37:16] phamhi: by the way, nice to meet you! Welcome to WMF :)
[12:37:33] ah..that's good.. nice to meet you too :)
[12:42:13] Amir1: Laaaaaaaaaaag
[12:43:17] :D
[14:27:41] !log admin deleted a bunch of leaked VMs from earlier today from the admin-monitoring project. Fullstack leaks due to an api outage, maybe?
[14:27:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:01:47] !log admin created the `eqiad1.wikimedia.cloud` DNS zone (T235846)
[16:01:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:01:52] T235846: wikimedia.cloud: setup new domain - https://phabricator.wikimedia.org/T235846
[16:09:16] andrewbogott: re: "Fullstack leaks due to an api outage, maybe?" I think it was the puppet catalog errors https://phabricator.wikimedia.org/T234232
[16:09:37] addshore: the login is guest/guest I think that's documented somewhere, but it's probably pretty hidden
[16:11:15] jeh: that fits, thanks
[16:33:34] !log toolsbeta created DNS zone `toolsbeta.eqiad1.wikimedia.cloud`
[16:33:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[16:33:42] bd808: thanks
[21:29:54] !log tools Rescheduled all grid engine webservice jobs (T217815)
[21:29:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[21:29:57] T217815: Grid Engine lighttpd error "opening temp-file failed: No such file or directory" - https://phabricator.wikimedia.org/T217815
[21:52:03] !log tools.russbot Cleared error state of grid job 9514189
[21:52:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.russbot/SAL
[21:53:11] !log tools.flickrdash Cleared error state of grid job 59725
[21:53:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.flickrdash/SAL
[21:59:47] !log tools.sge-status Started webservice, no sign in logs of why it was not running
[21:59:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sge-status/SAL
[22:09:50] !log tools Cleared error state of webgrid-generic@tools-sgewebgrid-generic-0901, webgrid-lighttpd@tools-sgewebgrid-lighttpd-09{12,15,19,20,26}
[22:09:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[22:15:31] !log tools Rescheduled continuous jobs away from tools-sgeexec-0904 because of high system load
[22:15:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL