[00:26:04] <bd808>	 !log tools Deleted DNS record for trusty-dev.tools.wmflabs.org
[00:26:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[00:26:59] <bd808>	 !log tools Deleted DNS record for login-trusty.tools.wmflabs.org
[00:27:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[01:10:12] <fuzheado>	 Anyone know what to do with this, when I try "webservice stop"?
[01:10:13] <fuzheado>	 ValueError: Found 2 objects of type ReplicaSet matching selector name=wd-depicts: <ReplicaSet wd-depicts-1035148579>, <ReplicaSet wd-depicts-1769414028>. See https://phabricator.wikimedia.org/T156626 
[01:10:21] <fuzheado>	 I cannot make sense of the phab link
[01:13:28] <zhuyifei1999_>	 fuzheado: can you comment on the task with the full traceback?
[01:13:41] <fuzheado>	 OK, will do
[01:15:43] <fuzheado>	 added
[01:16:44] <bd808>	 fuzheado: there is a duplicate ReplicaSet. You can see it with `kubectl get replicaset`.
[01:17:26] <bd808>	 I'm going to try to kill the one that is not attached to any running pods
[01:18:25] <fuzheado>	 ok thanks
[01:18:33] <bd808>	 !log tools.wd-depicts Deleted orphan replicaset wd-depicts-1035148579
[01:18:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wd-depicts/SAL
[01:18:47] <bd808>	 fuzheado: I think webservice stop should work for you now
[01:19:54] <fuzheado>	 hey thanks - it did
[01:20:31] <fuzheado>	 I'm having a slightly different issue though with attaching to mariadb - "WARNING: Server public key has changed. It means either you're under attack or the administrator has changed the key. New public fingerprint is: 52:3f:d0:2d:0f:8b:6c:6c:40:78:49:9e:cd:be:75:1d:1f:52:fc:d3"
[01:21:15] <bd808>	 !fingerprints
[01:21:16] <wm-bot>	 ssh keys for bastion hosts: https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints
[01:21:16] <fuzheado>	 I thought this was resolved last week but seems to be presenting a problem again
[01:21:32] <bd808>	 expected fingerprints are ^^ there
[01:21:57] <fuzheado>	 Yeah, I thought this was all OK last week with the fingerprints there, but MySQL workbench is kicking up this error now
[01:22:43] <bd808>	 it's in your local config unless the fingerprint you are getting does not match the host you are tunneling through
[01:24:44] <fuzheado>	 is tunneling via tools-login.wmflabs.org still the right way to go?
[01:32:54] <hare>	 tools-login.wmflabs.org should work
[01:33:53] <hare>	 fuzheado: in the process of OS upgrades we set up a new bastion. Hence different fingerprint. You will need to go into ~/.ssh/known_hosts and clear that line. 
[01:34:10] <fuzheado>	 Yeah I just did that and I think it's OK now
[01:34:12] <fuzheado>	 thanks
[13:22:42] <kostajh>	 I'm probably missing something obvious, but why are POST requests to tools.wmflabs.org/sonarqubebot showing up as GET requests with the content body missing?
[13:28:48] <thedj>	 kostajh: showing up where ?
[13:29:32] <kostajh>	 so, if I create `public_html/index.php` on the tools server, and put print_r($_REQUEST) or print_r($_POST), both are empty
[13:30:18] <kostajh>	 Maybe I missed something in setup. This is my first time trying to create something on tools
[13:32:51] <gtirloni>	 I see POST requests in access.log
[13:32:56] <thedj>	 but how are you making a request ?
[13:33:52] <kostajh>	 hrm. I'm using the Postman app on my computer
[13:34:07] <gtirloni>	 `curl -sL -X POST https://tools.wmflabs.org/sonarqubebot`
[13:35:03] <thedj>	 kostajh: are you using the x-www-form-urlencoded body type ?
[13:35:06] <gtirloni>	 tools.wmflabs.org - [26/Mar/2019:13:21:03 +0000] "POST /sonarqubebot HTTP/1.1" 301 0 "-" "PostmanRuntime/7.3.0"
[13:35:28] <gtirloni>	 notice the 301 redirect, not sure how postman interprets that
[13:35:37] <thedj>	 oh also, it's print_r($_POST, true)
[13:35:51] <kostajh>	 thedj: no, I was using raw
[13:36:06] <thedj>	 ah ok
[13:36:20] <kostajh>	 yeah I see the POST in the access.log
[13:36:31] <kostajh>	 but not in PHP
[13:36:59] <thedj>	 kostajh: so the raw body of other content types than form goes to the php input stream
[13:37:32] <thedj>	 if you are not using a php higher level framework (like guzzle for instance), you need to read it and parse it from the input stream
[13:37:39] <kostajh>	 you can see the code of the app in src/Controller/WebhookController.php
[13:37:50] <kostajh>	 I believe it's getting it from the input stream, let me check
[13:37:55] <thedj>	 or use webform encoding in postman, and then it does show up in $_POST
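A rough illustration of the point above, sketched in Python rather than PHP: form-encoded bodies are the case the runtime parses into fields for you, while a raw body of another content type has to be read from the input stream and parsed by the app. `parse_body` is a hypothetical helper, not part of any tool discussed here, and it only handles the two content types shown.

```python
import json
from urllib.parse import parse_qs


def parse_body(content_type: str, raw: bytes) -> dict:
    """Hypothetical helper: decode a request body based on its Content-Type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        # The form-encoded case: this is what ends up in $_POST in PHP.
        fields = parse_qs(raw.decode("utf-8"))
        return {key: values[0] for key, values in fields.items()}
    if media_type == "application/json":
        # A raw JSON body is not parsed automatically; the app must do it.
        parsed = json.loads(raw.decode("utf-8"))
        return parsed if isinstance(parsed, dict) else {"value": parsed}
    # Any other type is left unparsed in this sketch.
    return {}
```

With this sketch, `parse_body("application/x-www-form-urlencoded", b"abc=def")` gives a dict with one field, while the same bytes sent as `text/plain` give an empty dict, which is roughly what an empty `$_POST` looks like from the app's side.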
[13:39:47] <kostajh>	 and if you look in var/log/dev.log you can see that Symfony is seeing the incoming request as a GET
[13:45:36] <gtirloni>	 I can confirm the web server where your tool is running is receiving a POST request. Still looking
[13:46:41] <anomie>	 bstorm_: FYI, https://phabricator.wikimedia.org/T212972#5057586
[13:47:10] <gtirloni>	 $ curl -sL -X POST -d "abc=def" https://tools.wmflabs.org/sonarqubebot/info.php | grep REQUEST_METHOD
[13:47:11] <gtirloni>	 <tr><td class="e">$_SERVER['REQUEST_METHOD']</td><td class="v">POST</td></tr>
[13:47:14] <arturo>	 anomie: perhaps a bit early for her
[13:47:24] <gtirloni>	 kostajh: ^^
[13:47:54] <anomie>	 arturo: It's not very urgent.
[13:48:29] <arturo>	 ok
[13:49:11] <kostajh>	 gtirloni: thanks for looking at this. I'll try to track it down later, maybe it's a configuration issue with my app
[13:49:46] <gtirloni>	 kostajh: yeah, could be that. I'd try with a simpler php script and confirm first. feel free to ping us here again
[13:57:58] <kostajh>	 gtirloni: I updated info.php. I don't know why all those values are empty
[13:58:12] <gtirloni>	 let me check
[14:04:20] <gtirloni>	 kostajh: https://www.php.net/print_r  (`return` should be false if you want to print it)
[14:04:55] <kostajh>	 drp
[14:05:14] <kostajh>	 gtirloni: sorry, that's what I had first but changed based on a comment above. thanks! 
[14:05:25] <gtirloni>	 np :)
[14:05:41] <kostajh>	 I'll comment again when I sort out what's happening with Symfony and why it's seeing GET
[14:08:26] <kostajh>	 lol... ok. https://github.com/postmanlabs/postman-app-support/issues/450#issuecomment-435352122 /facepalm
[14:12:08] <gtirloni>	 oh
[14:12:18] <gtirloni>	 are you using http:// or https:// ? we're enforcing https now
[14:12:30] <gtirloni>	 which adds that 301 redirect
[14:40:39] <kostajh>	 gtirloni: yeah, I just had `tools.wmflabs.org/sonarqubebot` in Postman which defaulted to http
[14:41:32] <gtirloni>	 ok
[14:41:36] <gtirloni>	 the world makes sense again :)
[14:42:38] <kostajh>	 :)
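Why the POST arrived as a GET can be sketched as follows, assuming the client follows the historical redirect behavior that the HTTP specification permits: on a 301 or 302, many clients retry a POST as a GET and drop the body, while 307 and 308 require the method to be kept. `method_after_redirect` is a hypothetical name for illustration and does not describe Postman's actual internals.

```python
def method_after_redirect(method: str, status: int) -> str:
    """Method a typical client uses for the follow-up request after a redirect.

    Assumption: the client rewrites POST to GET on 301/302, which is common
    but not universal behavior.
    """
    method = method.upper()
    if status == 303 and method != "HEAD":
        # 303 See Other: the follow-up request is a GET.
        return "GET"
    if status in (301, 302) and method == "POST":
        # Historical rewrite: the POST body is lost along with the method.
        return "GET"
    # 307 and 308 preserve the method; other methods pass through unchanged.
    return method
```

Under that assumption, a POST to the `http://` URL that gets the 301 to `https://` becomes a GET with no body, matching what Symfony logged. Using the `https://` URL directly avoids the redirect altogether.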
[14:44:43] <kostajh>	 follow up question... :) My tool is going to listen to webhooks from a third party service (SonarQube) and will post a comment in gerrit on a related patch with info from the webhook. SonarQube can use HTTP basic auth, which would be nice so that I wouldn't have to try to verify that the incoming post request is from SonarQube by looking at IP addresses or other metadata
[14:45:02] <kostajh>	 is it possible to use http basic auth on tools? I didn't see mention of it in the wiki
[14:46:06] <chicocvenancio>	 kostajh: in Toolforge? I guess you can set it in the lighttpd config
[14:46:20] <chicocvenancio>	 Haven't tested it. 
[14:47:03] <chicocvenancio>	 But, that sounds unsafe
[14:47:12] <kostajh>	 chicocvenancio: ok, I'll try it. I didn't see htpasswd in path so I assumed it was not functional, but I'll see if it works
[14:47:24] <kostajh>	 which part? :)
[14:47:45] <chicocvenancio>	 Using http Auth in Toolforge for anything secret
[14:48:25] <chicocvenancio>	 Not sure from your description what the repercussions of your credentials leaking would be
[14:51:41] <gtirloni>	 kostajh: I think we generally prefer something like oauth (dont ask me how to do that though, not my area)
[14:52:09] <kostajh>	 yeah, that won't work for this case, unfortunately
[14:52:55] <thedj>	 it's a webhook, usually those don't exchange private info; He job id is done. go look it up.
[14:53:55] <kostajh>	 right. example data is here T217008#5058345
[14:54:01] <thedj>	 well sometimes they do, but... then use proper auth indeed.
[14:54:04] <stashbot>	 T217008: Report results from SonarCloud to Gerrit - https://phabricator.wikimedia.org/T217008
[14:54:29] <kostajh>	 the issue is that I wouldn't want a malicious user to POST bogus requests to tools.wmflabs.org/sonarqubebot and that end up spamming gerrit
[14:54:38] <thedj>	 if you post directly to gerrit though, that might be used for spamming. will have to guard against that.
[14:55:01] <kostajh>	 I guess I'll need to keep track of taskIds that have been seen before, and also validate that they exist in SonarQube, before processing them
[14:55:12] <thedj>	 that could work.
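The idea of tracking seen taskIds and validating them upstream could look roughly like the sketch below. Everything here is hypothetical: `WebhookGate` and `exists_upstream` are illustrative names, the lookup against SonarQube is passed in as a plain callable rather than a real API call, and the in-memory set would need to be replaced with persistent storage in a real tool.

```python
from typing import Callable, Set


class WebhookGate:
    """Hypothetical guard: process each task id at most once, and only if it is real."""

    def __init__(self, exists_upstream: Callable[[str], bool]) -> None:
        # exists_upstream stands in for a lookup against the third-party service.
        self._exists_upstream = exists_upstream
        self._seen: Set[str] = set()

    def should_process(self, task_id: str) -> bool:
        if task_id in self._seen:
            # Replayed webhook: already handled, do not comment again.
            return False
        if not self._exists_upstream(task_id):
            # Unknown id: likely a bogus request, and not recorded as seen.
            return False
        self._seen.add(task_id)
        return True
```

A handler would call `should_process(task_id)` before posting anything to gerrit, so a replayed or made-up request is dropped without producing a comment.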
[14:55:46] <chicocvenancio>	 http auth doesn't sound terrible to me, as long as you're aware of the limitations
[15:10:55] <chicocvenancio>	 kostajh: I'm looking at https://redmine.lighttpd.net/projects/lighttpd/wiki/docs_modauth and can't quite figure out what you need to add to get it working
[15:11:07] * chicocvenancio dislikes lighttpd
[15:58:49] <dschwen_>	 Good morning (timezone restrictions may apply)! My gerrit login is broken. We had an issue with my username (dschwen) a few months back (andrewbogott helped me get into horizon back then). Seems like this affects my gerrit account as well. When I try to sign in I get "Cannot assign user name "dschwen" to account 6995; name already in use."
[16:00:31] <andrewbogott>	 dschwen_: I don't know anything about gerrit but I'm trying to scare up someone helpful
[16:01:56] <andrewbogott>	 It sounds like tyler is the best bet, I'll see if I can get him to join here.
[16:02:02] <andrewbogott>	 do you need gerrit access to fix wma?
[16:08:43] <twentyafterfour>	 dschwen: I'll take a look
[16:14:58] <twentyafterfour>	 andrewbogott: dschwen: it looks like the username in gerrit is Dschwen rather than dschwen 
[16:15:10] <dschwen_>	 I need to add a new ssh key
[16:15:18] <dschwen_>	 let me try that name
[16:15:53] <dschwen_>	 hm, that does not work either (the error again shows the lower case version)
[17:21:39] <andrewbogott>	 !log cloudinfra adding Krenair as projectadmin as per T218448
[17:21:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cloudinfra/SAL
[17:21:42] <stashbot>	 T218448: Volunteer NDA for Alex Monk - https://phabricator.wikimedia.org/T218448
[17:31:38] <arturo>	 !log tools T218126 create VM instances tools-sssd-sgeexec-test-[12]
[17:31:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:31:41] <stashbot>	 T218126: LDAP: try how sssd works with our servers - https://phabricator.wikimedia.org/T218126
[17:31:43] <arturo>	 cc bstorm_ ^^^
[17:32:17] <bstorm_>	 Cool, :)
[17:43:20] <andrewbogott>	 !log cloudinfra removing "profile::ldap::client::labs::restricted_to: ops" from project puppet to allow volunteer admin access
[17:43:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cloudinfra/SAL
[19:46:10] <thedj>	 hehe
[19:46:11] <thedj>	 https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#While_you_are_here;_prep_for_database_changes!
[19:46:37] <thedj>	 i assume some overlap in the crowd here, figured this wouldn't hurt ;)
[20:26:21] <bd808>	 thedj: thank you for promoting the cloud-announce list :)
[21:11:32] <thedj>	 bd808: i have some nerve.. i'm not even on it myself ;)
[21:11:54] <bd808>	 lol. Are you on cloud@?
[21:11:59] <thedj>	 nope
[21:12:06] * bd808 sighs
[21:12:11] <bd808>	 ;)
[21:12:14] <thedj>	 but i browse the archives every once in a while
[21:13:42] <thedj>	 too many channels ;) besides most stuff passes by in phab etc.
[22:00:20] <gtirloni>	 !log tools downtimed toolschecker
[22:00:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[22:00:35] <bd808>	 thanks gtirloni 
[22:02:35] <thedj>	 and it's back..
[22:02:37] <thedj>	 apache2    1247  1282         www-data    8w      REG              254,3 2735195750     267013 /var/log/apache2/other_vhosts_access.log.1 (deleted)
[22:02:40] <thedj>	 apache2    1247  1282         www-data    9w      REG              254,3 3436494527     267026 /var/log/apache2/tiles.wmflabs.org_access.log.1 (deleted)
[22:02:43] <thedj>	 apache2    1247  1286         www-data    2w      REG              254,3     600944     267004 /var/log/apache2/error.log.1 (deleted)
[22:02:46] <thedj>	 apache2    1247  1286         www-data    7w      REG              254,3 3602305225     267027 /var/log/apache2/tiles.wmflabs.org_error.log.1 (deleted)
[22:02:49] <thedj>	 apache2    1247  1286         www-data    8w      REG              254,3 2735195750     267013 /var/log/apache2/other_vhosts_access.log.1 (deleted)
[22:02:53] <thedj>	 apache2    1247  1286         www-data    9w      REG              254,3 3436494527     267026 /var/log/apache2/tiles.wmflabs.org_access.log.1 (deleted)
[22:03:45] <bd808>	 thedj: hmm... so apache is holding the prior inodes open after logrotate runs?
[22:03:53] <thedj>	 seems so
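To estimate how much disk space the rotated-but-still-open logs are holding, the listing above can be summarized with a small script. This is a sketch under assumptions: the lines are lsof-style output as pasted, fields are read from the right so an optional thread-id column does not matter, file names contain no spaces, and `deleted_bytes_held` is a hypothetical helper name.

```python
from typing import Dict, Iterable, Tuple


def deleted_bytes_held(lsof_lines: Iterable[str]) -> Dict[Tuple[str, str], Tuple[str, int]]:
    """Map (device, inode) to (path, size) for files that are open but deleted."""
    held: Dict[Tuple[str, str], Tuple[str, int]] = {}
    for line in lsof_lines:
        fields = line.split()
        if len(fields) < 6 or fields[-1] != "(deleted)":
            continue
        # Counting from the right: DEVICE, SIZE, NODE, NAME, "(deleted)".
        device, size, node, name = fields[-5], fields[-4], fields[-3], fields[-2]
        if not size.isdigit():
            continue
        # The same inode appears once per thread and descriptor; count it once.
        held[(device, node)] = (name, int(size))
    return held


def total_bytes(held: Dict[Tuple[str, str], Tuple[str, int]]) -> int:
    """Sum the sizes of the distinct deleted files."""
    return sum(size for _name, size in held.values())
```

Deduplicating by device and inode matters here because the same log file shows up under several threads of the same apache2 process.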
[22:04:13] <thedj>	 but it has postrotate and prerotate setup...
[22:04:24] <thedj>	 and that should prevent that right ?
[22:05:22] <bd808>	 I wonder if the problem is graceful reloads and some client that is keeping a child from being restarted?
[22:05:42] <thedj>	 hmm, seems like 3 of the apache children didn't get reloaded this morning...
[22:06:23] <thedj>	 yeah i suspect those children are stuck in some sort of mod_tile F'u indeed.
[22:06:44] <thedj>	 note how the graph changes this morning at 6:25
[22:06:44] <thedj>	 https://tiles.wmflabs.org/munin/maps-wmflabs/maps-tiles1.maps-wmflabs/renderd_processed.html
[22:06:55] <thedj>	 just after logrotate
[22:08:32] <thedj>	 ah.. even though logrotate refers to an apache-prerotate, that directory doesn't actually exist in the default setup...
[22:10:45] <bd808>	 It's not the normal setup around here, but I always liked piped logs better than logrotate for apache2 log management -- https://httpd.apache.org/docs/trunk/logs.html#piped
[22:11:41] <bd808>	 in a pre-fork model piped logs basically starts a co-process for each apache2 child and streams the log data to that program on stdin
[22:12:24] <bd808>	 https://httpd.apache.org/docs/trunk/programs/rotatelogs.html
[22:14:31] <thedj>	 hmm, apparently apache2 reload which the postrotate script does is a downstream addition. Some people advise switching back to apache2 graceful.
[22:18:23] <thedj>	 k. did an apache restart, and now all resources are released again. will configure logrotate to do apache2 graceful, see if the problem returns. If it returns, i'll just bruteforce it to apache2 restart.. I'm not gonna debug mod_tile to figure out which process gets stuck in a signal.
[22:20:21] <thedj>	 so apache2 reload tells an apache process to interrupt a request (instead of completing it, like with graceful). i suspect the mod_tile extension barfs on that.
[22:22:15] <thedj>	 or better described, it interrupts the request handling and asks it to clean up asap. as opposed to graceful (let request complete and then reload config and logs) and restart (stop process completely)
[22:23:33] <bd808>	 thedj: at least you are getting closer to finding the root problem!
[22:24:40] <thedj>	 yup
[22:27:29] <thedj>	 bd808: i also have a very good indicator now of when this problem is occurring I think. That graph makes it very visible. actually it looking so different is what made me take a look on the server, because i figured something must have been off...