[00:08:11] RECOVERY - Puppet errors on tools-exec-1441 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:23:59] PROBLEM - Puppet errors on tools-exec-1433 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[02:24:39] PROBLEM - Puppet errors on tools-webgrid-generic-1402 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:34:36] Toolforge: Raise tool memory limit - similarity - https://phabricator.wikimedia.org/T176527#3628965 (Surlycyborg) Hmm, I did try a couple more low-hanging fruits to reduce the size of the data, which does make a difference locally. Still no luck launching the tool on the grid though. However, I do see somet...
[02:35:31] PROBLEM - Puppet errors on tools-webgrid-generic-1403 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[02:59:02] RECOVERY - Puppet errors on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:04:39] RECOVERY - Puppet errors on tools-webgrid-generic-1402 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:10:32] RECOVERY - Puppet errors on tools-webgrid-generic-1403 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:01:59] PROBLEM - Puppet errors on tools-exec-1401 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[04:16:56] RECOVERY - Puppet errors on tools-exec-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:48:16] Toolforge: Raise tool memory limit - similarity - https://phabricator.wikimedia.org/T176527#3628972 (zhuyifei1999) Is the source code available somewhere? Maybe someone could do some memory profiling and see what is going on.
[06:34:56] PROBLEM - Puppet errors on tools-exec-1414 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[06:40:16] VPS-project-Wikistats: Miraheze wikistats new wikis not updating - https://phabricator.wikimedia.org/T176535#3628995 (Reception123)
[07:13:05] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[07:14:57] RECOVERY - Puppet errors on tools-exec-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:48:03] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1413 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:30:39] PROBLEM - Puppet errors on tools-exec-1440 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[13:30:43] (CR) Lokal Profil: [C: -1] "Much better with Counter (how have I never used that one before!)" (2 comments) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379685 (https://phabricator.wikimedia.org/T117330) (owner: Jean-Frédéric)
[13:31:35] (PS5) Lokal Profil: Harvest the source page of unknown fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379685 (https://phabricator.wikimedia.org/T117330) (owner: Jean-Frédéric)
[13:35:05] (CR) Lokal Profil: [C: 2] Harvest the source page of unknown fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379685 (https://phabricator.wikimedia.org/T117330) (owner: Jean-Frédéric)
[13:35:52] (Abandoned) Lokal Profil: Harvest the source page of unknown fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379126 (https://phabricator.wikimedia.org/T117330) (owner: Lokal Profil)
[13:36:24] (CR) Lokal Profil: [C: 1] "recheck" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379685 (https://phabricator.wikimedia.org/T117330) (owner: Jean-Frédéric)
[13:41:40] (CR) Jean-Frédéric: [C: 2] Harvest the source page of unknown fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379685 (https://phabricator.wikimedia.org/T117330) (owner: Jean-Frédéric)
[13:42:43] (Merged) jenkins-bot: Harvest the source page of unknown fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379685 (https://phabricator.wikimedia.org/T117330) (owner: Jean-Frédéric)
[13:43:36] (CR) jenkins-bot: Harvest the source page of unknown fields [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379685 (https://phabricator.wikimedia.org/T117330) (owner: Jean-Frédéric)
[13:44:41] !log tools.heritage Deploy latest from Git master: 30af42c (T117330)
[13:44:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL
[13:44:46] T117330: Solution of "unknown fields" from monument lists - https://phabricator.wikimedia.org/T117330
[14:10:38] RECOVERY - Puppet errors on tools-exec-1440 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:30:49] PAWS: PAWS - Redirect loop detected - https://phabricator.wikimedia.org/T175454#3593792 (Vodenbot) I am using PAWS and I have the same problem since yesterday - when I log in - I receive a blank page on URL: https://paws.wmflabs.org/paws/user/vodenbot/ When I get to initial page to stop server and try to s...
[14:54:43] (PS6) Jean-Frédéric: Use isort to sort imports [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378741
[14:55:24] (CR) Jean-Frédéric: Use isort to sort imports (1 comment) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378741 (owner: Jean-Frédéric)
[14:57:28] Toolforge: Raise tool memory limit - similarity - https://phabricator.wikimedia.org/T176527#3628719 (valhallasw) After some fiddling with `qacct`, this is the end state before the process was killed: ``` valhallasw@tools-bastion-02:~/accountingtools$ qacct -j 9906175 -f similarity.acct ======================...
[14:57:45] PROBLEM - Puppet errors on tools-exec-1437 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:14:14] (PS1) Jean-Frédéric: Catch `PageSaveRelatedError` in save_to_wiki_or_local [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379970 (https://phabricator.wikimedia.org/T176530)
[15:27:45] RECOVERY - Puppet errors on tools-exec-1437 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:27:53] (PS1) Jean-Frédéric: Respect `skip` config when making stats in missing_commonscat_links [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379971 (https://phabricator.wikimedia.org/T176528)
[15:47:52] (PS2) Jean-Frédéric: Respect `skip` config when making stats [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379971 (https://phabricator.wikimedia.org/T176528)
[15:47:54] (PS1) Jean-Frédéric: Guard against empty totals when making statistics [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379974 (https://phabricator.wikimedia.org/T176528)
[16:14:47] (CR) Lokal Profil: "good catch. I've rebuilt the way statistics are processed in unused images (loop over returned results only) but this will help until that" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379971 (https://phabricator.wikimedia.org/T176528) (owner: Jean-Frédéric)
[16:14:52] (CR) Lokal Profil: [C: 2] Respect `skip` config when making stats [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379971 (https://phabricator.wikimedia.org/T176528) (owner: Jean-Frédéric)
[16:16:39] (Merged) jenkins-bot: Respect `skip` config when making stats [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379971 (https://phabricator.wikimedia.org/T176528) (owner: Jean-Frédéric)
[16:18:53] (CR) jenkins-bot: Respect `skip` config when making stats [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379971 (https://phabricator.wikimedia.org/T176528) (owner: Jean-Frédéric)
[16:24:45] (CR) Lokal Profil: [C: 1] "IIRC one of the scripts tries to catch this error later on. that would need to be updated now that it won't bubble up." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379970 (https://phabricator.wikimedia.org/T176530) (owner: Jean-Frédéric)
[16:32:30] Data-Services, XTools: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3629464 (jcrespo) > Is it possible to give s51290 read-only access to labsdb1001 until I have a chance to update the troublesome scripts? Done.
[16:49:42] (PS1) Jean-Frédéric: Add Croatia in Croatian (hr_hr) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379986 (https://phabricator.wikimedia.org/T174505)
[17:14:48] (PS1) Jean-Frédéric: Update Iraq and Egypt in Arabic [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379991 (https://phabricator.wikimedia.org/T174261)
[17:17:08] (CR) Zoranzoki21: [C: 1] "Looks good to me, but someone else must approve" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379991 (https://phabricator.wikimedia.org/T174261) (owner: Jean-Frédéric)
[17:19:47] Toolforge, DBA: Hight replag on s1, s3, s5 - https://phabricator.wikimedia.org/T176557#3629533 (Framawiki)
[17:23:36] (PS1) Jean-Frédéric: Add Uganda in English (ug_en) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379992 (https://phabricator.wikimedia.org/T174426)
[17:29:50] Toolforge, DBA: Hight replag on s1, s3, s5 - https://phabricator.wikimedia.org/T176557#3629533 (jcrespo) I see them 0 on other hosts, have you tried those? ``` wikireplica-analytics.eqiad.wmnet Shard Lag (seconds) Lag (time) s1 0.0000 00:00:00 s2 0.0000 00:00:00 s3 0.0000 00:00:00 s4 0.0000 00:00:00 s5...
[17:33:53] is there some magic way to make pushes from my tool account to my tool repo just work?
[17:34:01] without manually entering passwords, I mean
[17:44:02] Toolforge, DBA: Hight replag on s1, s3, s5 - https://phabricator.wikimedia.org/T176557#3629566 (jcrespo) @MahmoudHashemi @Slaporte s52467 / s52467__hashtags seem to be overloading labs hosts with multiple heavy querying connections, generating lag and other user complaints, I have throttled the account t...
[17:47:26] tgr: tool repo being?
[17:47:37] (phabricator/gerrit/github?)
[17:49:40] tgr: but I don't think there's a way without having credentials within the tool account (either a password or a private key)
[17:50:23] valhallasw`cloud: phabricator (the one that Striker offers to create automatically)
[17:50:23] Toolforge, DBA: Hight replag on s1, s3, s5 - https://phabricator.wikimedia.org/T176557#3629533 (mahmoud) It's hard to say what can be done about this without more information. There's user-generated load from visits to http://tools.wmflabs.org/hashtags but there's also batch traffic as well. Both are nec...
[17:51:53] tgr: what I would do is create a separate phabricator user for the project, give that user access to the repo, and put the private key for that user in the project directory
[17:54:13] on a collaborative project I'd probably want to know who made the patch
[17:55:27] but mostly I am interested in how a new user will figure out this ecosystem
[17:55:39] have them set the author line?
[17:55:59] alias git to something that checks $SUDO_USER ?
[17:56:02] tgr: I think normally you would do development locally and only pull from the tool
[17:56:46] alias git="HOME=/home/$SUDO_USER git"
[17:57:27] ^ the magic line to let it use the user's git settings rather than the project's. Generates a bunch of warnings though due to file rights
[17:57:50] would need the tool user to be able to read the user's git config etc.
[17:57:56] should be possible
[17:58:01] Toolforge, DBA: Hight replag on s1, s3, s5 - https://phabricator.wikimedia.org/T176557#3629585 (jcrespo) Please create separate accounts for separate tools- if users overload a tool, and it starts to make bad/lots of queries, the full account will get affected. Folks will not get their data if the replic...
[18:03:02] Toolforge, DBA: Hight replag on s1, s3, s5 - https://phabricator.wikimedia.org/T176557#3629589 (mahmoud) That's an interesting idea. It would really help to have answers to my questions if we are forced to register an account per language. For convenience: * What is the limit on connections per account...
[18:12:14] (PS2) Jean-Frédéric: Catch `PageSaveRelatedError` in save_to_wiki_or_local [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379970 (https://phabricator.wikimedia.org/T176530)
[18:18:27] (PS4) Jean-Frédéric: Filter out blacklisted categories during categorisation [labs/tools/heritage] - https://gerrit.wikimedia.org/r/328348 (https://phabricator.wikimedia.org/T153744)
[18:18:46] (CR) Jean-Frédéric: "I rebased this." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/328348 (https://phabricator.wikimedia.org/T153744) (owner: Jean-Frédéric)
[18:21:07] (CR) Jean-Frédéric: [C: 2] Catch `PageSaveRelatedError` in save_to_wiki_or_local [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379970 (https://phabricator.wikimedia.org/T176530) (owner: Jean-Frédéric)
[18:22:40] (Merged) jenkins-bot: Catch `PageSaveRelatedError` in save_to_wiki_or_local [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379970 (https://phabricator.wikimedia.org/T176530) (owner: Jean-Frédéric)
[18:24:12] (CR) jenkins-bot: Catch `PageSaveRelatedError` in save_to_wiki_or_local [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379970 (https://phabricator.wikimedia.org/T176530) (owner: Jean-Frédéric)
[18:25:21] !log tools.heritage Deploy latest from Git master: 1a7b88c (T176528), 150f545 (T176530)
[18:25:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL
[18:25:27] T176528: ErfgoedBot missing_commonscat_links crashes with NoneType - https://phabricator.wikimedia.org/T176528
[18:25:27] T176530: ErfgoedBot write-to-wiki crashes when hitting AbuseFilter (PageSaveRelatedError) - https://phabricator.wikimedia.org/T176530
[18:26:40] (CR) Jean-Frédéric: [C: 2] "I’ve just noticed this is a big problem so will go ahead and merge this :)" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/328348 (https://phabricator.wikimedia.org/T153744) (owner: Jean-Frédéric)
[18:28:20] (Merged) jenkins-bot: Filter out blacklisted categories during categorisation [labs/tools/heritage] - https://gerrit.wikimedia.org/r/328348 (https://phabricator.wikimedia.org/T153744) (owner: Jean-Frédéric)
[18:29:14] (CR) jenkins-bot: Filter out blacklisted categories during categorisation [labs/tools/heritage] - https://gerrit.wikimedia.org/r/328348 (https://phabricator.wikimedia.org/T153744) (owner: Jean-Frédéric)
[18:31:37] !log tools.heritage Deploy latest from Git master: f8a1b8a (T153744)
[18:31:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL
[18:31:43] T153744: Add blacklist to ErfgoedBot categorisation process - https://phabricator.wikimedia.org/T153744
[18:42:42] (CR) Lokal Profil: "the script which needs amending is https://github.com/wikimedia/labs-tools-heritage/blob/30af42cb0085c0350131253ccbb4bc1ae8681a79/erfgoed" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379970 (https://phabricator.wikimedia.org/T176530) (owner: Jean-Frédéric)
[18:47:55] PROBLEM - Puppet errors on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[18:50:05] (CR) Lokal Profil: "does this then mean that there is something creating idx entries without using `makeIdx`" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378975 (https://phabricator.wikimedia.org/T174503) (owner: Jean-Frédéric)
[18:54:03] (CR) Jean-Frédéric: "> does this then mean that there is something creating idx entries" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378975 (https://phabricator.wikimedia.org/T174503) (owner: Jean-Frédéric)
[19:19:37] Toolforge, DBA: High replag on s1, s3, s5 - https://phabricator.wikimedia.org/T176557#3629653 (Reedy)
[19:27:55] RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:22:18] (CR) Lokal Profil: [C: 2] Update Iraq and Egypt in Arabic [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379991 (https://phabricator.wikimedia.org/T174261) (owner: Jean-Frédéric)
[20:23:56] (Merged) jenkins-bot: Update Iraq and Egypt in Arabic [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379991 (https://phabricator.wikimedia.org/T174261) (owner: Jean-Frédéric)
[20:24:49] (CR) jenkins-bot: Update Iraq and Egypt in Arabic [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379991 (https://phabricator.wikimedia.org/T174261) (owner: Jean-Frédéric)
[22:02:20] PAWS: PAWS - Redirect loop detected - https://phabricator.wikimedia.org/T175454#3593792 (Aftabuzzaman) Same problem here. i'm getting "HTTP ERROR 503" for https://paws.wmflabs.org/paws/user/aftabuzzaman/
[22:05:27] (PS5) Lokal Profil: Group unused images per source page [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379141 (https://phabricator.wikimedia.org/T117327)
[22:05:33] (CR) jenkins-bot: [V: -1] Group unused images per source page [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379141 (https://phabricator.wikimedia.org/T117327) (owner: Lokal Profil)
[22:08:39] (PS6) Lokal Profil: Group unused images per source page [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379141 (https://phabricator.wikimedia.org/T117327)
[22:18:15] (PS1) Lokal Profil: Add default instructions to top of unused images reports [labs/tools/heritage] - https://gerrit.wikimedia.org/r/380058
[22:44:58] (PS7) Lokal Profil: Group unused images per source page [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379141 (https://phabricator.wikimedia.org/T117327)
[23:17:31] (PS1) Lokal Profil: [WIP]Restructure missing_commonscat_links Statistics [labs/tools/heritage] - https://gerrit.wikimedia.org/r/380060
[23:19:38] (CR) Lokal Profil: "https://gerrit.wikimedia.org/r/380060 and https://gerrit.wikimedia.org/r/#/c/379141" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379974 (https://phabricator.wikimedia.org/T176528) (owner: Jean-Frédéric)
[23:31:08] PROBLEM - Puppet errors on tools-exec-1407 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]