[00:04:08] RECOVERY - Puppet errors on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:18:15] cloud-services-team (FY2017-18), Goal, Patch-For-Review, User-bd808: Perform initial Cloud Services rebranding - https://phabricator.wikimedia.org/T168480#3620109 (Quiddity)
[00:20:02] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1413 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:39:39] PROBLEM - Puppet errors on tools-exec-1417 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[00:51:59] PROBLEM - Puppet errors on tools-exec-1433 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[01:09:41] RECOVERY - Puppet errors on tools-exec-1417 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:02:00] RECOVERY - Puppet errors on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:23:03] PROBLEM - Puppet errors on tools-exec-1433 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[03:28:59] PROBLEM - Puppet errors on tools-exec-1425 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[03:33:00] RECOVERY - Puppet errors on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:28:59] RECOVERY - Puppet errors on tools-exec-1425 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:56:00] bd808: I apparently closed irc and forgot :/ I couldn't even ssh into 1439 so I rebooted it from horizon, it came up looking okay
[04:56:26] usually I look and top and iotop and see if something is furiously writing to NFS
[05:12:45] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[05:52:47] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:30:04] PROBLEM - Puppet errors on tools-exec-1439 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[07:05:04] RECOVERY - Puppet errors on tools-exec-1439 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:27:10] (CR) Jean-Frédéric: [C: 2] Ensure skipped image categorizations are mentioned in stats [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378641 (https://phabricator.wikimedia.org/T174871) (owner: Lokal Profil)
[07:27:43] (CR) Jean-Frédéric: "recheck" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379140 (owner: Lokal Profil)
[07:28:53] (CR) jerkins-bot: [V: -1] Make all erfgoedbot scripts respect the skipping mechanisms. [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379140 (owner: Lokal Profil)
[07:29:24] PROBLEM - Puppet errors on tools-worker-1017 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[07:33:19] (CR) Jean-Frédéric: "> It should be possible to solve this via a config/environment" (1 comment) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378800 (https://phabricator.wikimedia.org/T174614) (owner: Lokal Profil)
[08:08:52] Cloud-Services, Toolforge, DBA: Disabling general.confirmeduser from dbreports for using up too much db resources - https://phabricator.wikimedia.org/T131956#3620427 (jcrespo) p50380g50440 was running several queries that were never going to stop executing, and causing 1 day of lag on labsdb1001: ``...
[08:09:25] RECOVERY - Puppet errors on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:18:46] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[08:45:27] (CR) Lokal Profil: "I forgot to update the tests when I originally introduced the test mechanism. will add it now" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379140 (owner: Lokal Profil)
[08:53:45] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:14:46] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[09:29:50] Cloud-VPS, Operations-Software-Development, Patch-For-Review: Install cumin in the WMCS infrastructure - https://phabricator.wikimedia.org/T175712#3620637 (hashar) >>! In T175712#3607926, @chasemp wrote: >>>! In T175712#3601612, @hashar wrote: >> As a side effect, #beta-cluster-infrastructure and #c...
[09:32:44] (PS2) Lokal Profil: Make all erfgoedbot scripts respect the skipping mechanisms. [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379140
[09:38:20] Data-Services, XTools: s51187 and p50380g50692 database users are generating excessive lag on replica service - https://phabricator.wikimedia.org/T172882#3620646 (jcrespo) dplbot/s51290 seems to keep creating issues. In this case, the lag was caused by a different issue (another user creating heavy queri...
[09:54:48] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:32:43] PROBLEM - Puppet errors on tools-exec-1434 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[10:37:29] (PS3) Lokal Profil: Add mechanism for storing wikipage locally instead of writing to wiki [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378800 (https://phabricator.wikimedia.org/T174614)
[10:45:35] (CR) jerkins-bot: [V: -1] Add mechanism for storing wikipage locally instead of writing to wiki [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378800 (https://phabricator.wikimedia.org/T174614) (owner: Lokal Profil)
[11:02:34] (PS4) Lokal Profil: Add mechanism for storing wikipage locally instead of writing to wiki [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378800 (https://phabricator.wikimedia.org/T174614)
[11:02:53] (CR) Lokal Profil: Add mechanism for storing wikipage locally instead of writing to wiki (1 comment) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378800 (https://phabricator.wikimedia.org/T174614) (owner: Lokal Profil)
[11:07:43] RECOVERY - Puppet errors on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:15:46] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[11:45:37] Cloud-VPS, Operations-Software-Development, Patch-For-Review: Install cumin in the WMCS infrastructure - https://phabricator.wikimedia.org/T175712#3620881 (Volans) >>! In T175712#3601612, @hashar wrote: > As a side effect, #beta-cluster-infrastructure and #continuous-integration-infrastructure would...
[11:48:27] Cloud-VPS, Operations-Software-Development: Install cumin in the WMCS infrastructure - https://phabricator.wikimedia.org/T175712#3620883 (Volans) Open>Resolved Cumin master is now installed on `labpuppetmaster100[1-2].wikimedia.org` and is able to connect via SSH to cloud instances, including the...
[12:20:48] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:51:48] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[12:53:11] (PS3) Lokal Profil: Make all erfgoedbot scripts respect the skipping mechanisms. [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379140
[13:16:47] (PS4) Lokal Profil: Ensure skipped image categorizations are mentioned in stats [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378641 (https://phabricator.wikimedia.org/T174871)
[13:17:13] Cloud-VPS, Operations-Software-Development: Install cumin in the WMCS infrastructure - https://phabricator.wikimedia.org/T175712#3621057 (hashar) >>! In T175712#3620881, @Volans wrote: >>>! In T175712#3601612, @hashar wrote: >> As a side effect, #beta-cluster-infrastructure and #continuous-integration-i...
[13:20:33] (CR) Lokal Profil: [C: 2] "Last patch was just a rebase. Re +2:ing" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378641 (https://phabricator.wikimedia.org/T174871) (owner: Lokal Profil)
[13:21:47] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:49:45] Hello! I need some help with database access at Toolforge. I want to access the databases with php (mysqli_connect("localhost","my_user","my_password","my_db");) but after migrating Tool Labs to Toolforge, old connection data doesnt seem to work. Could someone help me?
[14:49:59] Sanyi4: so you have a tool that has stopped working?
[14:50:06] yes
[14:50:45] if you are literally using localhost that may be your problem. the database servers run separate from the Toolforge environment
[14:52:11] The username and password you need to use will be in the $HOME/replica.my.cnf file owned my your tool
[14:52:22] Technical Advice IRC meeting in 10 minutes in channel #wikimedia-tech, hosts: @addshore & @Tobi_WMDE_SW - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[14:52:41] no, it's just a template. i need the things, that i have to put there in place of "localhost"...
[14:53:46] Sanyi4: the database hostname depend on what database you are searching. The naming conventions are given at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database
[14:55:46] ok, i will have a look. thanks!
[15:14:11] I am facing trouble creating repository for my tool at TOOLFORGE. It says no phabricator account found for tool maintainer whereas i have updated my account details.
[15:14:18] Please help.
[15:15:45] yasha: the first thing I would recommend is logging out of toolsadmin and then logging back in
[15:16:05] there is a bug with caching that may be fixed by that
[15:16:58] alright! m trying it out
[15:18:14] bd808: It still says the same error.. :(
[15:20:19] exact error is- "No Phabricator accounts found for tool maintainers." while creating the a new diffusion repository
[15:20:57] though my toolforge account is well linked with phabricator account
[15:22:55] yasha: hmm... which tool? I can peek into the database
[15:22:56] (CR) Lokal Profil: Harvest the source page of unknown fields (1 comment) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/379126 (https://phabricator.wikimedia.org/T117330) (owner: Lokal Profil)
[15:25:00] bd808: tool's name is devyasha (tools.devyasha)
[15:25:22] maintainer- Yashasingh
[15:25:26] cloud-services-team (FY2017-18), Goal, Patch-For-Review: Define a metric to track OpenStack system availability - https://phabricator.wikimedia.org/T167556#3621499 (Andrew) I've added fullstack success % to the above graph. We still need to add some auto-cleanup functions to the fullstack test to ke...
[15:54:07] Striker: "No Phabricator accounts found for tool maintainers." while creating the a new diffusion repository - https://phabricator.wikimedia.org/T176325#3621593 (bd808)
[16:00:55] Hi! Is any volunteer using Pycharm IDE here. To understand what scripts/webservice does when a user runs it as webservice --backend kubernetes start. For debugging to be easy I was trying to use Pycharm.
[16:01:55] (not me)
[16:02:24] But, unable to set environment.
[16:02:56] zhuyifei1999: for debugging do you use pdb?
[16:03:14] I think that's going to be very tricky Mridu. There is no easily repeatable runtime environment for working on the webservice command locally
[16:03:22] * zhuyifei1999_ shamefully use print
[16:03:39] debugging via logs is not shameful :)
[16:07:23] Striker: "No Phabricator accounts found for tool maintainers." while creating the a new diffusion repository - https://phabricator.wikimedia.org/T176325#3621632 (bd808) One thing I see at https://toolsadmin.wikimedia.org/tools/id/devyasha is that the tool is a maintainer of itself. I'm not sure if this is th...
[16:08:03] bd808: logs, the one which were using a day before? less /data/project/thankyou/uwsgi.log.
[16:13:48] Striker: "No Phabricator accounts found for tool maintainers." while creating the a new diffusion repository - https://phabricator.wikimedia.org/T176325#3621639 (bd808) @Yashasingh I removed the tool from being a maintainer of itself. That is not needed for any operations. Please try to create a diffusion re...
[16:32:20] If I run webservice --backend kubernetes start. I am getting "Could not find a public_html folder or a .lighttpd.conf file in your tool home" error.While webservice --backend=kubernetes python start works fine. But, then I am not abe to see any prints.
[16:32:38] able*
[16:32:53] Cloud-Services, cloud-services-team (Kanban), Patch-For-Review: Investigate and implement alternative for showmount based check at instance boot time - https://phabricator.wikimedia.org/T171508#3621705 (madhuvishy) I was able to change the firstboot script, that runs when a new instance is created an...
[16:49:51] PROBLEM - Puppet errors on tools-exec-1402 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[16:52:03] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1417 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[16:52:23] PROBLEM - Puppet errors on tools-exec-1430 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[16:52:34] !log tools apt-get install --only-upgrade apache2; service apache2 restart on tools-puppetmaster-01
[16:52:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:54:02] Striker: "No Phabricator accounts found for tool maintainers." while creating the a new diffusion repository - https://phabricator.wikimedia.org/T176325#3621842 (Yashasingh) @bd808 I tried opening in incognito window too. The problem persists . I'm attaching the screenshot . {F9674142}
[16:55:54] Mridu: you are really not going to be able to trace the internals of `webservice start` live on Toolforge. The task is really to read the source code and describe what it does.
[17:00:31] bd808: ok. Will do it without debugging then.
[17:27:02] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1417 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:27:22] RECOVERY - Puppet errors on tools-exec-1430 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:29:50] RECOVERY - Puppet errors on tools-exec-1402 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:26:43] Tools: Tool "ifttt-testing" loads assets from many sites, mixed http/https - https://phabricator.wikimedia.org/T172609#3622255 (D3r1ck01) p:Triage>Normal
[18:44:09] (CR) Lokal Profil: "Would you also need to handle the `ALTER TABLE monuments_all ADD INDEX idx_ctry_municp(country,municipality);` line from INSTALL?" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/378974 (owner: Jean-Frédéric)
[19:14:30] halfak: u2041__ores_p is using 20Gb of disk space on a somewhat overcrowded db server… is that all gold or is there some cleaning up to be done?
[19:14:42] Striker: "No Phabricator accounts found for tool maintainers." while creating the a new diffusion repository - https://phabricator.wikimedia.org/T176325#3622422 (bd808) Just to check and see if repo creation was broken universally, I created https://toolsadmin.wikimedia.org/tools/id/bd808-test/repos/id/tool-...
[19:14:45] andrewbogott, that is as intended.
[19:14:53] ok
[19:15:06] there's a proposal to move it to the new DB servers.
[19:15:23] https://phabricator.wikimedia.org/T173513
[19:15:25] yep, we'll get there. Just trying to keep the old one from croaking in the meantime :)
[19:15:31] Gotcha :)
[19:15:53] halfak: not sure if you saw it, but I sent you an email with a couple of questions about that too.
[19:16:20] mostly about figuring out how dire it would be for that data to not be available for a while
[19:16:40] we are seriously worried that labsdb1001 and 1003 could die any day
[19:17:38] bd808, Still catching up on email after a vacation.
[19:17:48] halfak: no worries. :)
[19:17:59] Generally, I think some important research will slow down, but nothing currently in use.
[19:18:04] those mail queues can get deep pretty fast
[19:25:25] Striker: "No Phabricator accounts found for tool maintainers." while creating the a new diffusion repository - https://phabricator.wikimedia.org/T176325#3622454 (Yashasingh) Alright sir , I'll recheck to find out , if there's some fault , from my side.
[19:47:45] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[19:48:25] Striker: "No Phabricator accounts found for tool maintainers." while creating the a new diffusion repository - https://phabricator.wikimedia.org/T176325#3622483 (bd808) >>! In T176325#3622454, @Yashasingh wrote: > Alright sir , I'll recheck to find out , if there's some fault , from my side. I think the err...
[19:55:30] there might be something cool here for bot and tool folks -- https://labs.loc.gov/lc-for-robots/ -- APIs and bulk data on Library of Congress collections
[20:11:04] andrewbogott: Thanks for the email about the disk usage on labsdb1001! Am looking into it, are “u3290__2013categories” and “u3290__2012categories” that you mention in the email database names? If so, how do I connect to the server to find them?
[20:12:17] Nettrom: yes, the database name
[20:13:14] Striker: "No Phabricator accounts found for tool maintainers." while creating the a new diffusion repository - https://phabricator.wikimedia.org/T176325#3622587 (Yashasingh) I'll try logging out from all my accounts and then logging in back . I'll try possible ways to link all my accounts with same user hand...
[20:13:33] Nettrom: as for how to connect to them… I think you can just use mysql directly from the toolforge bastion. bd808 might be able to feed you ready-made commands...
[20:13:54] andrewbogott: n/m, found them
[20:14:21] Nettrom: they're yours, right? It's possible i'm misinterpreting the db name...
[20:14:33] andrewbogott: yeah, they’re mine
[20:14:41] ok, great. Thanks for looking.
[20:14:52] andrewbogott: no problem, sorry to have kept these lying around
[20:18:01] these are just holding old data from a published study
[20:19:08] andrewbogott: I’ve dropped both databases and the tables they were holding
[20:19:35] Nettrom: great, I'll watch for my alert to go green :)
[20:22:46] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:57:57] !log git disabling puppet on gerrit-test3 temp, trying to reproduce the slave issue in prod for gerrit
[20:58:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL
[21:13:49] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[21:23:21] Cloud-Services, monitoring, Labs-Sprint-109, Patch-For-Review, and 3 others: Monitor nova services - https://phabricator.wikimedia.org/T90784#3622920 (Andrew) This is modestly different, but needs to be retitled. T42022 is about public http APIs, this is about internal services which can break d...
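Note on the two database threads above (the mysqli_connect question at 14:49 and the "how do I connect" question at 20:11): the advice in both was to take the credentials from $HOME/replica.my.cnf and to use a real database host rather than localhost. A minimal Python sketch of that advice follows. It is not an official Toolforge recipe. It assumes replica.my.cnf is an INI-style MySQL option file with a [client] section containing user and password; the function names are hypothetical, and the host has to be looked up on https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database because the log does not state it.

```python
import configparser


def parse_credentials(cnf_text):
    """Return (user, password) from the text of a MySQL option file.

    Assumption: the file has a [client] section with `user` and
    `password` keys, as MySQL option files conventionally do.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cnf_text)
    client = parser["client"]
    # Option-file values may be wrapped in quotes; strip them if present.
    return client["user"].strip("'\""), client["password"].strip("'\"")


def mysql_command(cnf_path, host, database):
    """Build an argv list for the stock `mysql` command-line client.

    `host` is a placeholder supplied by the caller; the naming scheme
    is documented on the Help:Toolforge/Database page, not here.
    """
    return ["mysql", f"--defaults-file={cnf_path}", f"--host={host}", database]
```

In a tool, the text would typically come from something like `pathlib.Path.home().joinpath("replica.my.cnf").read_text()`, and the argv list from `mysql_command` could be passed to `subprocess.run` on the bastion. The same user, password and host are what would replace "my_user", "my_password" and "localhost" in the PHP call quoted at 14:49.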
[21:23:44] Cloud-Services, monitoring, Labs-Sprint-109, Patch-For-Review, and 3 others: Monitor nova-scheduler log for lost contact with compute nodes - https://phabricator.wikimedia.org/T90784#3622921 (Andrew)
[21:25:20] !log git switching slave = true on gerrit-test3
[21:25:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL
[21:25:25] no_justification ^^
[21:25:31] lets see if it works now :)
[21:26:30] works for me
[21:26:32] [2017-09-20 21:26:07,767] [main] INFO com.google.gerrit.sshd.SshDaemon : Started Gerrit SSHD-CORE-1.2.0 on *:29418
[21:26:32] [2017-09-20 21:26:07,768] [main] INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 2.13.9-2-g99a8c8bc51-dirty ready
[21:26:32] [2017-09-20 21:26:07,768] [main] INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 2.13.9-2-g99a8c8bc51-dirty ready
[21:27:38] PROBLEM - Puppet errors on tools-exec-1411 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[21:53:48] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:02:37] RECOVERY - Puppet errors on tools-exec-1411 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:47:02] Cloud-Services, Toolforge: Add api_url to meta_p.wiki database - https://phabricator.wikimedia.org/T93483#1138365 (bd808) The place to add this would be https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/role/files/labs/db/views/maintain-meta_p.py The schema change woul...
[23:47:17] Data-Services: Add api_url to meta_p.wiki database - https://phabricator.wikimedia.org/T93483#3623254 (bd808)
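Note on the "Puppet errors" alerts that recur throughout this log: each one reports the share of datapoints above a threshold of 0.0. A rough sketch of how such a message could be produced is given below. It assumes the check simply counts the datapoints in some window that exceed the threshold and reports OK when that share is under 1%. The window size, any warning level and the real check's code are not shown in the log, and the function names are hypothetical.

```python
def percent_above(datapoints, threshold=0.0):
    """Percentage of datapoints strictly greater than `threshold`."""
    if not datapoints:
        return 0.0
    above = sum(1 for value in datapoints if value > threshold)
    return 100.0 * above / len(datapoints)


def status(datapoints, threshold=0.0, ok_below_percent=1.0):
    """Format a status line in the style of the alerts in this log.

    Simplification (an assumption): only OK and CRITICAL are modelled.
    """
    pct = percent_above(datapoints, threshold)
    if pct < ok_below_percent:
        return f"OK: Less than {ok_below_percent:.2f}% above the threshold [{threshold}]"
    return f"CRITICAL: {pct:.2f}% of data above the critical threshold [{threshold}]"
```

Under this reading, a figure such as 22.22% would correspond to 2 failing datapoints out of 9, and 66.67% to 2 out of 3 (or 6 out of 9), which would also explain why the same hosts flip between PROBLEM and RECOVERY as individual Puppet runs fail and then succeed.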