[07:43:57] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [labs/tools/heritage] - https://gerrit.wikimedia.org/r/394245 (owner: L10n-bot)
[16:58:36] !log deployment-prep Running cleanupUsersWithNoId.php on Beta Cluster, see T181731
[16:58:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[16:58:43] T181731: Run maintenance/cleanupUsersWithNoId.php on all wikis - https://phabricator.wikimedia.org/T181731
[17:49:42] !log deployment-prep Finished running cleanupUsersWithNoId.php on Beta Cluster for T181731
[17:49:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[17:49:49] T181731: Run maintenance/cleanupUsersWithNoId.php on all wikis - https://phabricator.wikimedia.org/T181731
[18:17:53] (PS1) BryanDavis: !log: only update Phabricator tasks if we are responding in channel [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394356
[18:18:42] (CR) jerkins-bot: [V: -1] !log: only update Phabricator tasks if we are responding in channel [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394356 (owner: BryanDavis)
[18:19:30] (PS2) BryanDavis: !log: only update Phabricator tasks if we are responding in channel [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394356
[18:20:07] (CR) jerkins-bot: [V: -1] !log: only update Phabricator tasks if we are responding in channel [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394356 (owner: BryanDavis)
[18:20:41] oh F-you jerkins and flake8
[18:26:49] (PS3) BryanDavis: !log: only update Phabricator tasks if we are responding in channel [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394356
[18:26:51] (PS1) BryanDavis: flake8: E722 do not use bare except [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394359
[18:30:35] (CR) Chad: [C: 2] flake8: E722 do not use bare except [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394359 (owner: BryanDavis)
[18:31:05] (Merged) jenkins-bot: flake8: E722 do not use bare except [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394359 (owner: BryanDavis)
[18:33:29] (CR) BryanDavis: [C: 2] !log: only update Phabricator tasks if we are responding in channel [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394356 (owner: BryanDavis)
[18:34:01] (Merged) jenkins-bot: !log: only update Phabricator tasks if we are responding in channel [labs/tools/stashbot] - https://gerrit.wikimedia.org/r/394356 (owner: BryanDavis)
[18:35:18] (PS1) Ottomata: Add dummy cergen created certificates for kafka-jumbo [labs/private] - https://gerrit.wikimedia.org/r/394361
[18:35:58] (CR) Ottomata: [V: 2 C: 2] Add dummy cergen created certificates for kafka-jumbo [labs/private] - https://gerrit.wikimedia.org/r/394361 (owner: Ottomata)
[18:39:21] hello
[18:39:30] Hello Superyetkin
[18:39:51] I have an SQL query for listing uncategorized pages but it shows cached results, I think
[18:40:14] for instance, this page on trwiki does not have any categories but is not listed there
[18:40:25] https://tr.wikipedia.org/wiki/Mono_(yaz%C4%B1l%C4%B1m)
[18:41:04] the query is as follows:
[18:41:05] SELECT page_title, page_len FROM page LEFT JOIN categorylinks ON page_id = cl_from WHERE cl_from IS NULL AND page_namespace = 0 AND page_is_redirect = 0 ORDER BY page_len ASC
[18:41:32] is there a way to disable query caching?
[18:41:43] Are you using quarry?
[18:41:50] Superyetkin: we talked about this a couple of days ago didn't we? There isn't any cache to disable
[18:41:57] no, my tool is on labs
[18:42:01] http://tools.wmflabs.org/superyetkin/kategorisizsayfalar.php
[18:42:22] (PS1) Ottomata: Add dummy profile::kafka::broker::ssl_password for kafka jumbo [labs/private] - https://gerrit.wikimedia.org/r/394362
[18:43:01] bd808: I am looking for a way to make my tools give consistent results :)
[18:46:44] (CR) Ottomata: [V: 2 C: 2] Add dummy profile::kafka::broker::ssl_password for kafka jumbo [labs/private] - https://gerrit.wikimedia.org/r/394362 (owner: Ottomata)
[18:49:25] If Superyetkin had stuck around I could tell them that the prod database says that page is in https://tr.wikipedia.org/wiki/Kategori:Kullan%C4%B1mdan_kald%C4%B1r%C4%B1lm%C4%B1%C5%9F_parametreli_kaynak_%C5%9Fablonu_i%C3%A7eren_sayfalar
[18:57:00] (PS1) Greg Grossmeier: releng: Cleanup [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/394366
[18:57:52] If Superyetkin comes back -- https://phabricator.wikimedia.org/P6409 -- I think they will want to figure out how to exclude hidden categories from their search
[18:59:43] Ok will look out for them bd808
[18:59:57] !log deployment-prep Testing stashbot fix for double phab logging (T181731)
[19:00:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[19:00:05] T181731: Run maintenance/cleanupUsersWithNoId.php on all wikis - https://phabricator.wikimedia.org/T181731
[19:06:46] bd808: Could use some help on https://phabricator.wikimedia.org/T181742 if you have a minute
[19:07:54] Krinkle: my first guess would be that it is running out of memory
[19:08:35] bd808: Yeah, I've increased the memory from 2G to 3G, and that seems to have fixed it for now, but I'd still like to see a record of why it was killed for future reference.
[19:08:40] have you tried -mem 4096m or something similar?
[19:08:58] *nod* grid engine can be annoying for this
[19:09:04] It usually takes a few weeks before it gets in this state, so I'd rather know for sure :)
[19:09:18] our accounting file gets so big that it is hard to search :/
[19:09:24] Also, I suppose there is no way to figure out where that memory is going?
[19:09:50] It's a simple php script that uses very few variables of any kind. It shells out to git a few times, never holding on to anything.
[19:09:58] I don't know why the memory grows.
[19:10:41] probably git I would suppose. the memory limit should be for all the sub-processes that it forks too
[19:10:49] Also odd is that each run is independent. yet it seems somehow, that after it runs every hour for a few weeks, it gets to a point that after that, each run will OOM.
[19:11:09] until I run it manually, and then it is fine again for a while
[19:11:23] As if Git is building up something internally that makes each 'git pull' more expensive.
[19:11:25] Weird right?
[19:11:34] Maybe it's triggering git-gc or something.
[19:11:43] that would be a reasonable guess
[19:11:52] and git-gc can be a real resource hog
[19:12:55] Krinkle: is "kf*" "krinkle function" :)
[19:13:01] :D
[19:13:17] https://github.com/Krinkle/mw-tool-snapshots/blob/master/scripts/updateSnaphots.php
[19:13:55] also some functions from https://github.com/Krinkle/toollabs-base/blob/e01c08c44ffb49f3a82650c2581dea37e2665a3c/src/GlobalFunctions.php#L303
[19:14:01] Yeah, this is legacy stuff, very MW inspired.
[19:16:40] Krinkle: I'll look around to see if I have notes on digging results out of the grid engine accounting files after lunch
[19:16:52] bd808: thanks
[19:17:00] We did some poking at this for https://tools.wmflabs.org/grid-jobs/
[19:17:31] This query has no shortage of problems and creative solutions - https://www.google.co.uk/search?q=git+gc+memory+usage
[19:17:51] So I'll use that as starting point. Probably gonna disable git-gc and maybe run it from a separate cron so that at least the failure is isolated.
[20:13:54] (CR) Legoktm: "Why remove CodeSniffer and commit-message-validator?" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/394366 (owner: Greg Grossmeier)
[20:23:03] valhallasw`cloud: fyi wikibugs kept pinging out last night around 10pm UTC-6 (Central Standard Time aka America/Chicago)
[22:12:46] no_justification: I guess from bawolff's comments on T118799 someone could just change those errors back to debug instead of warning
[22:12:47] T118799: XMPReader::parse exceptions - https://phabricator.wikimedia.org/T118799
[22:13:30] maybe fix things so that the filename is always in the log message too I guess
[22:13:54] oops. meant that for -core :)
[22:16:03] I would be fine with INFO or DEBUG
[22:16:07] Long as I don't see them :p
[22:59:28] I can't login to toolforge
[22:59:47] mbh: whats it say?
[23:01:25] "Can't connect to host more than 15 seconds". I connect through WinSCP. https://i.imgur.com/SD1kzjx.png
[23:01:29] tools-login seems to be sick. I'm trying to get into it right now
[23:02:07] mbh: as a short term work around, try using tools-dev.wmflabs.org instead of tools-login
[23:03:31] yes, it works
[23:04:00] I'll see if I can figure out what is wrong with tools-login
[23:04:13] load average is 36 :/
[23:06:23] !log tools rebooting login.tools.wmflabs.org due to overload
[23:06:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[23:18:16] chasemp: is bastion-03 still trying to shutdown?
[23:18:50] bd808: it said it went down and I lost the terminal
[23:18:53] but if it's not back yet
[23:19:04] I guess we should jumpstart it via hard reboot
[23:19:56] * zhuyifei1999_ thinks cloud would benefit from some magic sysrq keys
[23:21:29] chasemp: it's not letting my root key in at least with a hard connection close
[23:22:01] bd808: can you reboot via labcontrol I have to take a phone call
[23:22:03] ?
[23:22:08] chasemp: I'll go have horizon kick it in the ass
[23:22:13] (PS1) Chad: All-Users and All-Projects to #wikimedia-releng reporting [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/394498
[23:23:34] !log tools Hard reboot of tools-bastion-03 via Horizon
[23:23:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[23:33:09] legoktm: does wikibugs self-update if we +2 changes like https://gerrit.wikimedia.org/r/#/c/394498/1
[23:33:31] bd808: yes
[23:34:15] heh. I don't have +2 there I guess
[23:34:31] (CR) Legoktm: [C: 2] All-Users and All-Projects to #wikimedia-releng reporting [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/394498 (owner: Chad)
[23:34:50] something's wrong and I'm on the phone andrewbogott is DNS having an issue?
[23:34:51] (Merged) jenkins-bot: All-Users and All-Projects to #wikimedia-releng reporting [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/394498 (owner: Chad)
[23:34:59] I keep getting alerts from bastion-03 about resolution issues...
[23:35:01] (CR) jenkins-bot: All-Users and All-Projects to #wikimedia-releng reporting [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/394498 (owner: Chad)
[23:35:10] bd808: oops I added you to the ACL
[23:35:47] chasemp: I'll take a look
[23:35:58] tools-bastion-03 : Nov 30 23:22:24 : diamond : unable to resolve host tools-bastion-03
[23:41:32] chasemp: I think those were from the period where it was sick due to load/IOPS. It looks fine now.
[23:42:02] yeah, I got an alert that bastion-03 was all the way down
[23:42:40] yeah, chasemp tried to reboot it but it hung on shutdown so I hard rebooted via horizon
[23:43:11] the reboot attempt was at 23:06 and the hard boot at 23:23
[23:43:32] anything before 23:23 is certainly suspect
[23:43:49] the last email I see was from 23:19
[23:43:50] I'm about to call an insurance company and, hence, will be tied up more-or-less for ever… things are working properly now, right?
[23:44:06] andrewbogott: I think so. have fun being a grown up
[23:44:33] Lol
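
Note on the 18:41 uncategorized-pages query: the 18:57:52 message suggests excluding hidden categories from the search. The following is only a rough sketch of one way that could look, not the query from P6409 or from the tool itself. It assumes hidden categories are marked by a `hiddencat` row in `page_props` on the category page and that `categorylinks.cl_to` holds the category title. It runs against an in-memory SQLite database rather than the wiki replicas, with a cut-down schema, and the page titles, `build_demo_db` and `uncategorized_pages` are hypothetical names made up for the demonstration.

```python
import sqlite3

# Cut-down stand-in for the MediaWiki tables involved (assumption: the real
# tables have many more columns; only the ones the query touches are kept).
SCHEMA = """
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER,
                   page_title TEXT, page_len INTEGER, page_is_redirect INTEGER);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
CREATE TABLE page_props (pp_page INTEGER, pp_propname TEXT, pp_value TEXT);
"""

# A page counts as uncategorized when it has no link to a *visible* category.
# Links to categories whose page carries the 'hiddencat' property are ignored.
QUERY = """
SELECT p.page_title, p.page_len
FROM page AS p
WHERE p.page_namespace = 0
  AND p.page_is_redirect = 0
  AND NOT EXISTS (
    SELECT 1
    FROM categorylinks AS cl
    LEFT JOIN page AS cat
      ON cat.page_namespace = 14 AND cat.page_title = cl.cl_to
    LEFT JOIN page_props AS pp
      ON pp.pp_page = cat.page_id AND pp.pp_propname = 'hiddencat'
    WHERE cl.cl_from = p.page_id
      AND pp.pp_page IS NULL  -- keep only links to non-hidden categories
  )
ORDER BY p.page_len ASC
"""


def build_demo_db():
    """Create an in-memory database with a few hypothetical demo pages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?, ?, ?)",
        [
            (1, 0, "Only_hidden", 500, 0),      # article in a hidden category only
            (2, 0, "Visible", 900, 0),          # article in a visible category
            (3, 0, "Bare", 100, 0),             # article with no categories at all
            (4, 0, "Some_redirect", 20, 1),     # redirect, must be skipped
            (10, 14, "Hidden_tracking", 50, 0), # hidden category page
            (11, 14, "Cities", 60, 0),          # ordinary category page
        ],
    )
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?)",
        [(1, "Hidden_tracking"), (2, "Cities")],
    )
    conn.execute("INSERT INTO page_props VALUES (10, 'hiddencat', '')")
    return conn


def uncategorized_pages(conn):
    """Return (title, length) rows for pages lacking any visible category."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for title, length in uncategorized_pages(build_demo_db()):
        print(title, length)
```

With the original LEFT JOIN ... IS NULL form, a page that sits only in a hidden tracking category still has a `categorylinks` row and so is never listed, which would match what was reported at 18:40:14 and 18:49:25. The NOT EXISTS form keeps the same namespace and redirect filters and only changes which category links count.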