[05:07:51] !log tools cleared out old /tmp and /var/log files on tools-sgebastion-07
[05:07:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:27:08] !log tools shutdown again tools-prometheus-01, no longer in use (T238096)
[10:27:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:27:12] T238096: Toolforge: prometheus: refresh setup - https://phabricator.wikimedia.org/T238096
[10:44:33] !log tools merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/565556 which is a behavior change to the Toolforge front proxy (T234617)
[10:44:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:44:39] T234617: Toolforge. introduce new domain toolforge.org - https://phabricator.wikimedia.org/T234617
[10:59:45] !log tools.k8s-status restarted webservice to test how it works with latest front proxy changes (T234617)
[10:59:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.k8s-status/SAL
[10:59:48] T234617: Toolforge. introduce new domain toolforge.org - https://phabricator.wikimedia.org/T234617
[11:02:44] !log tools.k8s-status requires creating the ingress object by hand. Will leave that to the tool author (bryan) (T234617)
[11:02:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.k8s-status/SAL
[11:04:25] !log tools.openstack-browser restarted webservice to test how it works with latest front proxy changes (T234617) Then realized we lack newer version of webservice and this restart was for nothing
[11:04:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.openstack-browser/SAL
[11:05:32] Lucas_WMDE: you around? if you have a webservice running in the grid, you may try now http://$tool.toolforge.org and report what you see
[11:05:56] \o/
[11:06:07] do you have one?
[11:06:07] I only have one Grid service, I think, the rest is Kubernetes
[11:06:09] *tries*
[11:06:37] https://wd-shex-infer.toolforge.org/ looks good
[11:06:54] ah, but internal server error on login (OAuth)
[11:07:21] I guess because they don’t share a session
[11:07:26] mmmm
[11:07:36] and the OAuth callback uses the old domain
[11:07:39] the return URL is still the old one
[11:07:42] yeah
[11:07:56] and there, the session (cookie) doesn’t contain the OAuth token
[11:08:01] you may need to update the code. Not required now, but when we go live with this
[11:08:09] and I never bothered to catch this as a proper error, so it crashes
[11:08:34] so is the expectation that tools are hosted under both domains for a transition period?
[11:08:41] it seems like that would make my life harder
[11:08:42] yes
[11:08:56] having a transition period?
[11:09:08] well, if the transition was using a redirect, it seems it would be easier
[11:09:25] because the tool could know it’s either hosted on X or on Y
[11:09:29] instead of potentially both at the same time
[11:10:01] having a timeline with a hard-cut would be difficult for most of the users
[11:10:21] we expect many users will need to update their webservice source code to handle the new domain, new document root, etc
[11:10:36] also, we cannot simply drop the old domain
[11:10:56] there could be static/hardcoded URLs in half of the internet that we would like to serve
[11:11:07] I’m not suggesting you drop the old domain
[11:11:39] and I was thinking the redirect might be up to the tool operator – something like `webservice start [--redirect-wmflabs-to-toolforge|--redirect-toolforge-to-wmflabs]`
[11:12:23] unfortunately it is not that easy.
The legacy systems (grid, old k8s) work using certain mechanisms and the new 2020 kubernetes cluster uses a different mechanism
[11:12:35] they all use in common the front proxy
[11:12:48] (this old k8s tool seems to work as well btw: https://lexeme-forms.toolforge.org/ – whereas https://quickcategories.toolforge.org/, which I think is using new k8s, doesn’t)
[11:13:45] yes, ironically, webservices running in the new k8s cluster need a newer version of the webservice command, which we haven't released yet
[11:14:03] so, let me write down your suggestion
[11:14:15] your suggestion is to either run under the old domain or in the new, but not both
[11:14:39] isn't this something you can decide? i.e, generate URLs (and share them) for the old or new domain?
[11:16:01] can't you detect the request URL and generate the appropriate callback URL for oauth?
[11:16:18] if I’d registered the OAuth consumer with a free callback URL, probably
[11:16:20] (again, this means coding updates on your side, I'm aware)
[11:16:44] flask’s url_for() seems to do the right thing, all the links on https://wd-shex-infer.toolforge.org/ stay within the subdomain
[11:16:51] so that’s good
[11:17:21] but if I have to register a new OAuth consumer anyways because the old one has the tools.wmflabs.org callback URL hardcoded
[11:17:25] then I think I’d rather hardcode the new callback URL
[11:17:38] and make sure that my tool redirects people to the new domain if necessary
[11:17:57] because if I support both, sooner or later they’ll just get confused anyways about why they’re not logged in at what looks like the same tool, I suspect
[11:18:21] ok, here is a proposal, let's create a phabricator ticket, child of T234617 and discuss this oauth thing which seems important to figure out before we go live with this
[11:18:22] T234617: Toolforge.
introduce new domain toolforge.org - https://phabricator.wikimedia.org/T234617
[11:18:41] sounds good
[11:18:53] I will create it, give me a second
[11:19:04] (I also don’t yet really know the best practices for switching OAuth consumers at all, tbh… last time I did it some people started getting login issues :/ )
[11:21:41] I think I have another more general comment about session handling
[11:21:52] I’ll see what task you create and decide if it fits there or I should leave it elsewhere :)
[11:22:10] T244473
[11:22:11] T244473: Toolforge: both domains in parallel and OAuth - https://phabricator.wikimedia.org/T244473
[11:22:18] I think it fits there, writing comment
[11:22:20] thanks
[11:22:21] feel free to rename the task to 'oauth & session handling'
[11:46:32] thanks Lucas_WMDE for your help :-)
[11:49:44] thanks a lot for working on this! I’m very excited :)
[11:50:08] maybe I can leave more useful comments when I get home (since this isn’t supposed to be staff activity ^^)
[13:02:30] uh
[13:02:41] arturo: I’m getting reports that my tool is broken even on wmflabs.org?
https://twitter.com/fnielsen/status/1225404037622849536
[13:02:45] /o\
[13:02:49] * Lucas_WMDE checks logs
[13:03:17] works for me
[13:03:27] Lucas_WMDE: ^^^
[13:03:59] well it only breaks when you try to use it ^^
[13:04:02] or, wait
[13:04:21] for example https://tools.wmflabs.org/lexeme-forms/api/v1/duplicates/www/de/tr
[13:04:25] (that URL should work even without logging in)
[13:04:34] but the error log looks like it might be unrelated
[13:05:08] ok let me know if I can be of any help
[13:05:48] oh, I see
[13:05:55] when my server logs say
[13:06:00] mwapi.errors.APIError: internal_api_error_TypeError: [XjwOWApAAE4AAF2Ra9gAAABW] Caught exception of type TypeError -- None
[13:06:08] that’s actually an internal error in production lol
[13:06:13] https://www.wikidata.org/w/api.php?action=wbsearchentities&format=json&uselang=de&search=tr&language=de&type=lexeme&limit=50
[13:06:25] so, not your fault and not my fault either :D
[13:09:15] hehe
[14:28:18] !log admin run hardware tests on cloudvirt1015 T220853
[14:28:25] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:28:26] T220853: VMs on cloudvirt1015 crashing - bad mainboard/memory - https://phabricator.wikimedia.org/T220853
[14:44:20] !log admin update apt packages on cloudvirt1015 T220853
[14:44:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:44:23] T220853: VMs on cloudvirt1015 crashing - bad mainboard/memory - https://phabricator.wikimedia.org/T220853
[14:59:39] I don't even remember seeing the *.toolforge.org domains being mentioned in the documentation anywhere.
[15:00:57] well, AFAIU they’re brand new, and we’re probably not supposed to use them quite yet?
[15:01:02] I assume they will be announced in due time
[15:01:34] but I remember a few weeks ago *.toolforge.org still behaved the same as tools.wmflabs.org, i. e.
tools were still mounted under a subpath (which didn’t have to match the subdomain either)
[15:36:23] Been trying to find out where toolinfo records are being stored, without much luck ... is there a specific database name and table I need to connect to, or is it somewhere in the nfs directory structure? Trying to remove some now-unused records
[15:45:56] if you do happen to get any details on that, leave a message on my talk page on wikitech
[17:40:23] having an odd mwv + labs problem, maybe someone has ideas of where to look. Essentially the webserver can no longer see /vagrant, injecting `system("ls -l /vagrant/");die();` to /var/www/w/index.php shows 0 files, while when ssh'd in clearly the directory is mounted. Sounds like apparmor or some such but not finding anything...
[17:40:47] perhaps something lxc specific
[17:41:03] bd808: in the absence of 'award token to a specific comment', thumbs up on https://phab.wiki/234617
[18:23:53] ebernhardson: I have seen something similar when apache2 starts before the mounts from the host machine are made. I would try `vagrant ssh -- sudo service apache2 restart` to see if that fixes it.
[18:25:28] bd808: success! thanks
[18:25:48] excellent
[20:00:25] I'm failing at finding...where are the ssh instructions for wmf-cloud ?
[20:01:17] ebernhardson: maybe https://wikitech.wikimedia.org/wiki/Help:Accessing_Cloud_VPS_instances ?
[20:01:35] bd808: yup! that's the one. You'd think the search team could figure out how to search for something...
[20:01:45] (lol)
[20:01:47] :P
[20:02:55] my awesome bar knew about Help:Access and it redirects there these days
[20:03:09] * bd808 <3's Firefox's awesome bar
[20:03:14] it's a fair point tbh - the apparent lack of best-match behavior always makes search a pain
[20:04:51] Wikitech has a lot of search challenges.
I think we have been making incremental progress on discoverability, but our case is so different from the normal article search on the project wikis that we will probably always wish for more.
[20:05:32] Google does search so well that it has spoiled all of us :)
[20:06:04] can't tell you the number of times I've tried to search for terms only to have the entire result list on the page be completely irrelevant because it happened to be an infobox parameter or something - that's not specific to wikitech at all
[20:06:24] but MW in general
[20:08:33] would be perfect if "wikitech" was a thing on top of https://codesearch.wmflabs.org/search/ as well
[20:08:51] or like git clone the contents and grep -r
[20:10:43] i often find, at least on wikitech, it's exacerbated by all the auto-generated pages. My first thought to find ssh instructions was search for "ProxyCommand". But it turns out the {{ProxySSH|...}} template includes ProxyCommand in many many instances. Unfortunately it's not clear how to decide which whitelists to ignore on 1k sites in 300 languages
[20:10:48] s/whitelists/templates/
[20:14:57] ebernhardson: i understand, yea. though i don't have "Search in: All" enabled so i don't see the Templates. The default appears to be: Main, Help, Tool and Nova Resource name spaces to search in
[20:16:48] mutante: in this case it's instances, so pages like mwmaint1002 in the main namespace have the word ProxyCommand via the ProxySSH template. Although it seems there were only 26 results, i should have simply looked at the end of the results where the Help: namespace ended up
[20:16:56] particularly notable when searching for ProxyCommand it shows a bunch of hits for simply "command"
[20:17:05] which means a bunch of unrelated things
[20:17:12] mutante: oh, yea you have to quote it.
Otherwise it splits StudlyCaps into two words
[20:17:56] i wonder what useful content is in the main namespace, if it's mostly things like the mwmaint1002 page, it should probably have its weight in search pushed down
[20:19:13] this is the never ending debate about wikitech :) Main namespace is SRE stuff, Help namespace is Cloud Services stuff
[20:19:15] a lot, incident documentation, runbooks ...
[20:19:33] yea, Help: means cloud
[20:19:36] for some reason
[20:19:36] splitting wikis would help some things and hurt others
[20:19:52] ahh
[20:21:38] Things got mixed together before I landed here, but were related to the start of Labs/OpenStack cluster
[20:22:15] in the way long ago, wikitech was just a VPS that brion hosted himself to keep track of how to fix the sites when they broke :)
[20:23:24] true to that original idea there is still a wikitech-static clone of it that is hosted outside WMF infra
[20:24:18] seems reasonable to give NS_HELP same weight as NS_MAIN in search then? https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/570712
[20:27:26] sure... ugh we get paged
[20:31:41] things breaking everywhere, apparently
[20:34:09] DSquirrelGM: better?
[20:34:17] things got reverted
[20:34:21] was mw deploy
[20:36:03] meant more in reply to your message about being pinged - but I was having issues trying to upload some stuff, going to try again in a bit
[20:36:44] was going to upload some user images on commons
[21:03:43] seems to be back to normal now
[21:13:41] DSquirrelGM: yes, should be. looks all good to us
[21:13:47] thanks for confirming
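
Note on the 11:17 idea of having a tool redirect people to the new domain: the URL mapping could look roughly like the sketch below. It is only an illustration, not how any tool in this log does it. It assumes the legacy layout `https://tools.wmflabs.org/<tool>/<path>` maps to `https://<tool>.toolforge.org/<path>` with the same tool name (the log notes this did not always hold), and `canonical_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

LEGACY_HOST = "tools.wmflabs.org"  # old shared domain, tools live under a subpath
NEW_DOMAIN = "toolforge.org"       # new domain, one subdomain per tool


def canonical_url(url: str) -> Optional[str]:
    """Return the toolforge.org equivalent of a legacy URL.

    Returns None when the URL is not on the legacy host (no redirect needed)
    or when no tool name can be read from the path.
    Assumption: the first path segment is the tool name and is also the
    subdomain on the new domain.
    """
    parts = urlsplit(url)
    if parts.hostname != LEGACY_HOST:
        return None
    # Split "/<tool>/<rest>" into the tool name and the remaining path.
    tool, _, rest = parts.path.lstrip("/").partition("/")
    if not tool:
        return None
    return urlunsplit(
        ("https", f"{tool}.{NEW_DOMAIN}", "/" + rest, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(canonical_url(
        "https://tools.wmflabs.org/lexeme-forms/api/v1/duplicates/www/de/tr"
    ))
```

A web framework hook (for example a Flask `before_request` handler) could call such a helper on the incoming request URL and issue a redirect when it returns a value, so that the OAuth session cookie only ever lives on one domain.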