[00:00:13] bd808: ok cool, pasting it was more or less asking if this one deserves tickets
[00:00:55] musikanimal: I don't have enough user rights to poke at that database server to help diagnose the issue. A phabricator task tagged with #data-services and #DBA would probably be a good place to start an investigation.
[00:02:32] okay, thanks! just wanted to check in case you knew upfront that this was a me problem (which is entirely possible :)
[00:07:18] musikanimal: It might be a you problem, but in reality it is probably a somebody-else-overloading-ToolsDB problem, if you haven't really changed your code and the usage of that tool hasn't changed dramatically
[00:07:39] mutante: I filed T214447 for the NRPE check failure
[00:07:39] T214447: Toolforge alert for "worker class instances not spread out enough" - https://phabricator.wikimedia.org/T214447
[00:09:51] bd808: thanks! linking it
[07:42:18] been having issues with bot connectivity for the past month or so, possibly longer. I kept thinking we broke something, but now I'm looking at https://tools.wmflabs.org/openstack-browser/project/cvn and notice that it says floating IPs are not allocated in the new region.
[07:42:32] Could someone help to confirm that the new instances do not have a floating IP like they used to?
[08:52:36] !log packaging upgraded packages on builder01
[08:52:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Packaging/SAL
[08:53:25] Krinkle: there might have been an issue in the migration. does horizon allow you to allocate a new floating IP?
[16:06:34] Technical Advice IRC meeting starting in 0 minutes in channel #wikimedia-tech, hosts: @Thiemo_WMDE & @nuria - all questions welcome, more info: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[16:13:14] !help can someone please install bc on toolforge stretch bastions? i need it for compiling
[16:13:14] annika: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[16:14:35] annika: file a sub-task of T55704 to request a new package
[16:14:36] T55704: Packages to be added to toollabs puppet - https://phabricator.wikimedia.org/T55704
[17:44:46] !log tools Added rules to default security group for prometheus monitoring on port 9100 (T211684)
[17:44:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:44:50] T211684: Toolforge: Port sge.py stats to Prometheus - https://phabricator.wikimedia.org/T211684
[19:19:28] A few questions...
[19:20:04] How can I find the IPs that outreachdashboard.wmflabs.org is using to make edits (as seen by the production wikis)?
[19:21:31] And will the IP(s) change when I move it to a new VPS (to get it onto Debian), or does the outreachdashboard.wmflabs.org alias provide a stable IP that can be kept even when the VPS changes?
[19:22:25] Context here is that it's starting to hit ratelimits, mostly because of high usage on cz.wikipedia, so we're probably going to need the same kind of throttle exemption that's already in place for dashboard.wikiedu.org: https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/InitialiseSettings.php#L8427
[19:22:37] It will change
[19:24:14] cool. Urbanecm has offered to get the exemption added, once we know the stable IP after the move.
[19:24:19] Not sure how to find it, though.
[19:24:48] Should be in Horizon
[19:24:52] (horizon.wikimedia.org)
[19:29:33] Horizon says it's 172.16.2.96, but I get 185.15.56.1 when I do `dig +short myip.opendns.com @resolver1.opendns.com`
[19:36:00] ragesoss, 185.15.56.1 is probably your public IP
[19:36:34] but Wikipedia will see the other one?
[19:36:49] yes, because the edit comes from Dashboard and not from your computer
[19:37:16] oh, I mean I ran that dig command from the dashboard server.
[19:37:25] aha
[19:37:49] that's the public IP that's used when the machine makes requests to the internet
[19:38:35] I think it uses the private IP when communicating with Wikipedia, but I may be wrong
[19:40:10] yes, the private IP is being used, definitely
[19:41:51] !log project-proxy deleting old eqiad-region proxy nodes novaproxy-01 and novaproxy-02
[19:41:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Project-proxy/SAL
[19:43:18] !log bastion deleting old eqiad-region bastions
[19:43:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Bastion/SAL
[19:49:10] !log tools shutting down eqiad-region proxies tools-proxy-01 and tools-proxy-02
[19:49:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[19:57:38] gtirloni: Regarding CVN and floating IPs - the Horizon interface shows me the quotas only on the summary page, not per region. And it tells me I'm using 2/2, and they're associated with the new region's instances.
[19:58:00] gtirloni: so I guess the openstack browser looks at it differently and somehow knows, or wrongly thinks, it's used in the old region.
[20:00:46] Krinkle: I see 2 floating IPs from the new region (185.x.x.x) associated with 2 VMs there.. do you need a 3rd floating IP?
[20:04:37] gtirloni: nope, all good.
[20:04:53] gtirloni: I noticed that on https://tools.wmflabs.org/openstack-browser/project/cvn it says the IPs are used in the other region instead.
[20:05:12] gtirloni: and I thought maybe it would explain the connectivity issues I'm seeing with the CVN's IRC bots (which require a dedicated IP to avoid Freenode's rate limits)
[20:05:23] something like maybe they're not actually attached properly post-migration
[20:05:25] anyway, nevermind
[20:15:15] Krinkle: oh. that display on openstack-browser is a bit messed up. floating ip quotas work differently in the Neutron region and we don't have access to the right data to show them yet.
[20:15:45] the unreleased floating ips in the old region are probably a different bug/issue
[20:30:19] help - is PAWS public down? 502 Bad Gateway
[20:32:49] * chicocvenancio checks the paws proxy
[20:44:48] !log tools.oauth-hello-world Moved webservice from Stretch job grid to PHP 7.2 Kubernetes
[20:44:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.oauth-hello-world/SAL
[20:46:10] !log paws moving paws_public proxy_pass to https://172.16.6.39 in paws-proxy-01
[20:46:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[20:47:17] chicocvenancio: thanks, seems to be back up
[20:49:15] thanks for the message, things are moving around and sometimes I don't notice a change or anticipate it will break something
[20:53:36] chicocvenancio: I'm hoping it's nothing to do with load or crashing, just config?
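The public-vs-private IP confusion in the exchange above (172.16.2.96 from Horizon vs 185.15.56.1 from `dig`) comes down to address ranges: Cloud VPS instances carry RFC 1918 private addresses and reach the internet through a NAT gateway with a public address. A minimal sketch using Python's `ipaddress` module, with the two addresses taken from the conversation:

```python
import ipaddress

# The two addresses seen for the outreachdashboard server in the log above.
horizon_ip = ipaddress.ip_address("172.16.2.96")  # instance address shown in Horizon
public_ip = ipaddress.ip_address("185.15.56.1")   # egress address reported by OpenDNS

# 172.16.0.0/12 is an RFC 1918 private range, so this address is only
# routable inside the cloud network.
print(horizon_ip.is_private)  # True
print(horizon_ip in ipaddress.ip_network("172.16.0.0/12"))  # True

# The NAT egress address is what the wider internet sees.
print(public_ip.is_private)  # False
```

Which address a production wiki sees then depends on routing: traffic that stays inside the Wikimedia network can arrive from the private address, while traffic leaving to the internet arrives from the public NAT address, matching the "the private IP is being used" conclusion above.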
[20:53:46] !log tools Deleted broken tools-sgewebgrid-lighttpd-0904 instance via Horizon (T214519)
[20:53:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[20:53:49] T214519: tools-sgewebgrid-lighttpd-0904 has bad state in OpenStack - https://phabricator.wikimedia.org/T214519
[20:58:37] fuzheado: just config for the region change in Cloud Services
[20:59:11] PAWS can handle a lot more than the usual traffic, and no single user is usually capable of overloading it
[20:59:18] chicocvenancio: Excellent - just wanted to make sure I wasn't triggering anything :) not sure how much of a load that service gets
[21:04:17] !log tools Building new tools-sgewebgrid-lighttpd-0904 instance (T214519)
[21:04:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[21:04:20] T214519: tools-sgewebgrid-lighttpd-0904 has bad state in OpenStack - https://phabricator.wikimedia.org/T214519
[21:05:13] fuzheado: usually very little, until a technical event gets it to be very much
[21:05:50] thanks
[22:09:43] !log tools Deleted tools-sgewebgrid-lighttpd-0904 instance via Horizon, used wrong base image (T214519)
[22:09:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[22:09:46] T214519: tools-sgewebgrid-lighttpd-0904 has bad state in OpenStack - https://phabricator.wikimedia.org/T214519
[22:18:24] !log tools Building new tools-sgewebgrid-lighttpd-0904 instance using Stretch base image (T214519)
[22:18:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[22:18:27] T214519: tools-sgewebgrid-lighttpd-0904 has bad state in OpenStack - https://phabricator.wikimedia.org/T214519
[22:33:11] i switched my tool to the stretch grid and the scripts seem to be a little unstable (several ssl connection errors) but this could also be unrelated
[22:35:53] annika: please keep an eye on it for a day or two and open a bug in phabricator if you can find some data that we might be able to debug with.
[22:36:36] ok
[22:37:37] annika: there is T213475 which has caused problems for a couple of people
[22:37:37] T213475: Wikimedia varnish rules no longer exempt all Cloud VPS/Toolforge IPs from rate limits (HTTP 429 response) - https://phabricator.wikimedia.org/T213475
[22:38:28] I'm not sure at the moment how long it will be until we find a fix for that issue that satisfies everyone
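While a fix for the T213475 rate limiting is pending, the usual client-side mitigation for HTTP 429 responses is to back off exponentially between retries. A hypothetical sketch of that pattern; `do_request` and its 429 behaviour are illustrative, not anything from the conversation:

```python
import time

def backoff_delays(base=1.0, factor=2.0, retries=5, cap=60.0):
    """Exponential backoff schedule (in seconds) for retrying after HTTP 429."""
    return [min(cap, base * factor ** i) for i in range(retries)]

def fetch_with_backoff(do_request, retries=5):
    """Call do_request() until it stops returning 429, backing off in between.

    do_request is a hypothetical zero-argument callable returning
    (status_code, body); it stands in for whatever HTTP call the tool makes.
    """
    for delay in backoff_delays(retries=retries):
        status, body = do_request()
        if status != 429:
            return status, body
        time.sleep(delay)  # wait before retrying the rate-limited request
    return do_request()  # final attempt after exhausting the schedule
```

With the defaults this retries after 1, 2, 4, 8, and 16 seconds; capping the delay keeps a long outage from producing multi-minute sleeps.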