[00:18:41] PROBLEM - Puppet run on tools-worker-1029 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[00:21:48] PROBLEM - Puppet run on tools-worker-1028 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[00:47:24] PROBLEM - Puppet run on tools-bastion-03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[01:27:23] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:52:53] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[03:27:51] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:48:19] PAWS, Pywikibot-OAuth: PAWS can not login - https://phabricator.wikimedia.org/T136114#3078718 (Tgr) Removing the extension tag as there does not seem to be anything actionable on that side.
[03:48:53] Labs-project-other, MediaWiki-extensions-OAuth: Add OAuth 2.0 support to MediaWiki - https://phabricator.wikimedia.org/T125337#3078720 (Tgr) p:Triage>Low
[04:08:15] PAWS, MediaWiki-extensions-OAuth, Pywikibot-OAuth: PAWS can not login - https://phabricator.wikimedia.org/T136114#3078755 (Tgr) On second thought I'll keep the tag and hide it on the workboard. Sorry for the noise.
[06:48:23] PROBLEM - Puppet run on tools-bastion-03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[06:53:52] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[07:28:22] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:28:52] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:49:52] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[09:29:51] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:59:24] Tool-Labs-tools-Other: Add Read me to repository tool-editathonstat - https://phabricator.wikimedia.org/T159818#3079846 (MarcoAurelio)
[13:08:29] Tool-Labs-tools-Other: Add Read me to repository tool-editathonstat - https://phabricator.wikimedia.org/T159818#3079889 (MarcoAurelio) a:Ranjithsiji>None R2064 is empty and has no code. Where is this repo hosted? (Removing assignee as I think it's a mistake, feel free to correct me if I'm wrong).
[13:09:32] Tool-Labs-tools-Other: Add Read me to repository tool-editathonstat - https://phabricator.wikimedia.org/T159818#3079891 (MarcoAurelio) a:Ranjithsiji
[13:12:33] Tool-Labs-tools-Other: Add Read me to repository tool-editathonstat - https://phabricator.wikimedia.org/T159818#3079897 (MarcoAurelio) Okay it seems https://github.com/ranjithsiji/tools-editathonstat is the source. I'll arrange to have the Phab repo to mirror the GitHub one.
[14:50:51] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:23:55] Labs, Labs-Infrastructure: Labvirt1001 has insanely slow IO - https://phabricator.wikimedia.org/T159835#3080282 (Andrew)
[15:24:37] Labs, Labs-Infrastructure, Operations, ops-eqiad: Labvirt1001 has insanely slow IO - https://phabricator.wikimedia.org/T159835#3080282 (Andrew)
[15:30:53] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:04:40] Labs, Wikimedia-Labs-General: Rename project bots to wm-bot - https://phabricator.wikimedia.org/T57691#3080462 (Andrew) Open>Resolved a:Andrew We have a new project, wm-bot, where wm-but is being moved.
[16:13:47] hi andrewbogott I'm hoping to work on migrating maps-warper from precise soon. i don't think an upgrade would do well, the instance had an old template and suffers from very restrictive disk space, and /mnt broke a while back. Is it possible to get a new instance with the more generous defaults to effectively replace the server but still keep the maps-warper name?
[16:14:24] chippy: sure. Do you not have permissions to create instances? Or are you over quota?
[16:14:46] I'm not sure, I have not created one before. Would it be in Horizon?
[16:15:10] yep
[16:15:33] I don't mind creating it for you, just want to make sure nothing is broken with self-serve :)
[16:16:10] andrewbogott, the button says Launch Instance (quota exceeded)
[16:16:26] ok, I'll mess with the quotas. What flavor of instance do you want?
[16:17:17] just seeing whats available
[16:18:22] Labs: Request increased quota for labs project - https://phabricator.wikimedia.org/T159843#3080530 (Andrew)
[16:18:39] Labs: Request increased quota for maps labs project - https://phabricator.wikimedia.org/T159843#3080544 (Andrew)
[16:19:01] andrewbogott, m1.large I think should do it
[16:19:21] ok
[16:19:53] Labs: Request increased quota for maps labs project - https://phabricator.wikimedia.org/T159843#3080530 (chasemp) This is temporary and in service of precise deprecation? +1
[16:20:29] Labs: Request increased quota for maps labs project - https://phabricator.wikimedia.org/T159843#3080556 (Andrew) This is for one Large size instance: 16G and 8 CPUs. And, yes, we'll lower the quota after the corresponding precise instance is gone.
[16:22:15] chippy: try now?
[16:22:22] okay
[16:22:28] lets give it a go
[16:23:17] Labs: Request increased quota for maps labs project - https://phabricator.wikimedia.org/T159843#3080575 (Andrew) I granted this. Will rename the bug to reflect future quota reduction after the precise instance is cleaned up.
[16:23:31] Labs: Revert increased quota for maps labs project - https://phabricator.wikimedia.org/T159843#3080577 (Andrew)
[16:24:31] andrewbogott, do I launch from the image list, choose an image, click the button?
[16:24:41] or from the launch instance button above the instance list?
[16:25:30] chippy: I think they're equivalent. You should get a wizard with various options for creation.
[16:25:36] ahh ok :)
[16:29:05] spawning
[16:31:47] !log tools Depooled tools-webgrid-lighttpd-1409 for cold migrating to different labvirt
[16:31:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:32:05] andrewbogott, okay appears be running via horizon now
[16:32:14] maps-warper2
[16:32:49] chippy: great. It'll take 3 or 4 minutes to come up, then you should be able to ssh in.
[16:32:58] thank you andrewbogott :)
[16:34:01] would I need to mount /home and /nfs (I imagine not)
[16:34:20] the various nfs mount points i mean
[16:41:53] PROBLEM - Host tools-webgrid-lighttpd-1409 is DOWN: CRITICAL - Host Unreachable (10.68.18.43)
[16:47:53] andrewbogott, hmm, i'm unable to ssh, says no route to host, no ping, and console log shows several Could not request certificate: getaddrinfo: Name or service not known" after the "Creating a new SSL key for..." line
[16:49:18] chippy: looking...
[16:51:53] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[16:53:36] chippy: something drastic has happened to that instance that I haven't seen before. Probably best to just delete it and try again.
[16:54:12] andrewbogott, okay, I will do that
[16:54:43] It doesn't look to me like you made any mistake in creating it. I'm not sure what's going on.
[16:54:48] we'll see if it happens a second time :/
[16:55:09] there was an earlier error about network unreachable, but I dunno
[17:06:53] chippy: this one looks better but I'm going to reboot it to see what's going on with /home
[17:07:04] okay thanks, was about to ask!
[17:10:43] chasemp: can you log into maps-warper2.maps.eqiad.wmflabs and see what's happening with nfs mounting?
[17:10:54] Seems like you were the most recent to mess with nfs vs. puppet
[17:11:08] andrewbogott: yes but I'm in a meeting for the next hour
[17:11:10] I'll try to look
[17:11:16] ok
[17:11:48] chippy: nfs is pretty messed up there, I'd like to wait for chase to look if you can wait a bit.
[17:11:52] Again, it's nothing you did
[17:12:03] andrewbogott: honestly itlooks fine to me
[17:12:03] andrewbogott, sure I'm happy to wait
[17:12:15] andrewbogott: it self corrected?
[17:12:17] what do you see?
[17:12:27] https://www.irccloud.com/pastebin/mtMG8slm/
[17:12:31] chasemp: it's empty?
[17:12:35] And not empty, on the other instance that mounts it
[17:12:47] ahhh
[17:12:48] yes ok
[17:13:04] andrewbogott: so madhuvishy moved some maps specific things to statically managed so we could use the labstore1003 storage
[17:13:05] contrast with
[17:13:12] https://www.irccloud.com/pastebin/KE2zU2NF/
[17:13:21] madhuvishy: I think there is a new instance for a maps project we need to add to teh export file manually^
[17:13:42] andrewbogott: we let them know when we did it new instances were going to require manual effort iirc
[17:13:50] yeah
[17:13:50] chasemp: ok
[17:13:57] I think that 'them' is not a clear set of people these days
[17:14:01] yeah
[17:14:23] andrewbogott: once we get the old cluster reimaged and such we can move things back but 1003 was never meant for dynamic export and is trusty still
[17:14:39] so it was more sane to make this one instance a one-off
[17:14:42] anyway, cool
[17:14:45] madhuvishy knows what's up
[17:15:01] madhuvishy: let me know when I should remount things :)
[17:15:02] i can make ethe change
[17:15:38] thanks madhuvishy :)
[17:16:07] for future role::labs::nfs::misc::maps_project_internal_ips in misc.yaml needs to be appended with the ip
[17:16:13] andrewbogott: which instance is this?
[17:16:23] maps-warper2
[17:16:29] madhuvishy: 10.68.20.112
[17:16:42] perfect, i'm patching puppet
[17:17:31] RECOVERY - Host tools-webgrid-lighttpd-1409 is UP: PING OK - Packet loss = 0%, RTA = 1.55 ms
[17:21:17] !log tools tools-webgrid-lighttpd-1409 migrated to labvirt1011 and repooled
[17:21:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:24:15] Tool-Labs-tools-Pageviews: Add CirrusSearch to Massviews - https://phabricator.wikimedia.org/T159858#3080864 (MusikAnimal)
[17:26:52] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:27:38] andrewbogott: i enabled exports for maps-warper2
[17:27:47] madhuvishy: thanks!
[17:27:58] also the cold migrate finished
[17:33:35] chasemp, madhuvishy, I'm still seeing a totally empty labstore1003-maps on maps-warper2.
[17:33:46] I've tried rebooting, re-running puppet, and unmounting/remounting
[17:36:59] madhuvishy: ^ did you update and run puppet on 1003 and do exportfs -ra
[17:37:18] then reboot the VM if the output of showmount -e labstore1003.eqiad.wmnet is sane
[17:37:53] chasemp: ah yes i forgot the exports
[17:38:26] I can do it
[17:38:28] if you aren't already
[17:39:09] andrewbogott: i'm doing that now
[17:39:14] thanks
[17:39:20] ah i see reboot
[17:39:33] yep, working now
[17:39:45] chippy: maps-warper2 should be ready to go now, at last :)
[17:39:56] thanks!
[17:42:51] thanks all too. yes it's good to go.
[17:45:56] i'm tracking my work for this on https://phabricator.wikimedia.org/T159846 So i shall try to get the server setup this week, andrewbogott. In terms of data migration apart from config files its mainly a pg data directory which I will dump and restore via the nfs home directory
[17:46:10] great
[17:46:14] thanks chippy
[17:46:23] thanks for all your work!
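The check chasemp describes at 17:37:18 (look at `showmount -e labstore1003.eqiad.wmnet` and confirm it is "sane" before rebooting the VM) could be sketched roughly as below. This is only an illustration, assuming the usual `showmount -e` layout of a header line followed by `path client,client,...` rows. The export path in the sample and the first IP are made up; only the hostname and 10.68.20.112 come from the log, and the function names are hypothetical.

```python
from typing import Dict, Set


def exported_clients(showmount_output: str) -> Dict[str, Set[str]]:
    """Map each exported path to the set of clients listed for it."""
    exports: Dict[str, Set[str]] = {}
    for line in showmount_output.splitlines():
        line = line.strip()
        # Skip blank lines and the "Export list for <host>:" header.
        if not line or line.lower().startswith("export list for"):
            continue
        parts = line.split(None, 1)
        path = parts[0]
        clients = parts[1] if len(parts) > 1 else ""
        exports[path] = {c.strip() for c in clients.split(",") if c.strip()}
    return exports


def is_exported_to(showmount_output: str, client_ip: str) -> bool:
    """True if client_ip is listed literally for at least one export.

    Exact string match only: wildcard or netmask entries are not expanded.
    """
    return any(
        client_ip in clients
        for clients in exported_clients(showmount_output).values()
    )


# Hypothetical sample; the path and the first IP are assumptions.
SAMPLE = """Export list for labstore1003.eqiad.wmnet:
/srv/maps 10.68.0.1,10.68.20.112
"""

print(is_exported_to(SAMPLE, "10.68.20.112"))
```

In practice the text would come from running `showmount -e` against the NFS server after `exportfs -ra`. If the new instance's IP is missing, the export list was not reloaded, which matches what happened here.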
[17:46:33] andrewbogott: can you make T159846 a subtask of the precise ticket
[17:46:35] T159846: Wikmaps Warper - Migrate / Upgrade maps-warper from Precise to Trusty - https://phabricator.wikimedia.org/T159846
[17:46:47] ok
[17:46:52] !log wikispeech Deploy latest from Git master: aca4abacc4a41145d3f30ba5434416aed4018896 (T151884)
[17:46:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikispeech/SAL
[17:46:55] T151884: Change text to glyphs - https://phabricator.wikimedia.org/T151884
[17:47:14] ( andrewbogott sorry I would do it but I'm splitting my brain here in like 4 documents )
[17:47:49] Labs, Labs-Infrastructure, Patch-For-Review: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#3080923 (Andrew)
[17:48:03] Labs, Labs-Infrastructure, Patch-For-Review: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2963643 (Andrew)
[17:48:32] Tool-Labs-tools-Pageviews: Add CirrusSearch to Massviews - https://phabricator.wikimedia.org/T159858#3080864 (Mrjohncummings) This will allow me to provide a reporting mechanism to show organisations where their text is being used on Wikipedia, this will hopefully encourage more organisations to make their t...
[17:59:07] !log tools depooling, migrating tools-exec-1416 as part of ongoing labvirt1001 issues
[17:59:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[18:05:24] PROBLEM - Host tools-exec-1416 is DOWN: CRITICAL - Host Unreachable (10.68.23.14)
[18:18:07] Labs: Labs instance snuggle-en.snuggle.eqiad.wmflabs needs to be upgraded, replaced, or deleted - https://phabricator.wikimedia.org/T158970#3053261 (chasemp) just a friendly ping @Halfak that we are a few weeks away from shutting this down
[18:19:28] Labs, Labs-Infrastructure, Patch-For-Review: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#3081079 (chasemp) > >>>! In T143349#2963953, @TParis wrote: >> I'm in the middle of a move from Hawaii to Texas, Deltaquad would be the >> best poc. But if it...
[18:26:04] Labs, Huggle: Labs instance huggle.huggle.wmflabs needs to be replaced or deleted - https://phabricator.wikimedia.org/T157710#3081146 (chasemp) To clarify a bit: we are also worried about the use cases of huggle but we are not prepared to take it over as a managed service either. We are hopeful someone...
[18:52:53] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[18:53:24] Tool-Labs-tools-Pageviews: Expand Massviews options to include mainspace and show other source-specific data - https://phabricator.wikimedia.org/T159868#3081298 (MusikAnimal)
[18:55:40] (PS4) BryanDavis: Add rewritten crontab in Python [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[18:56:26] (CR) BryanDavis: "> Uploaded patch set 4." [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[18:57:07] (CR) jerkins-bot: [V: -1] Add rewritten crontab in Python [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[19:05:53] (PS5) BryanDavis: Add rewritten crontab in Python [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[19:07:35] (CR) jerkins-bot: [V: -1] Add rewritten crontab in Python [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[19:08:28] (CR) BryanDavis: "> Uploaded patch set 5." [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[19:09:54] Labs: Labs instance snuggle-en.snuggle.eqiad.wmflabs needs to be upgraded, replaced, or deleted - https://phabricator.wikimedia.org/T158970#3081369 (Halfak) Deletion has been scheduled. Thanks for your incredible patience.
[19:10:28] Labs: Labs instance snuggle-en.snuggle.eqiad.wmflabs needs to be upgraded, replaced, or deleted - https://phabricator.wikimedia.org/T158970#3053261 (Halfak) Open>Resolved
[19:10:33] \o/
[19:27:25] RECOVERY - Host tools-exec-1416 is UP: PING OK - Packet loss = 0%, RTA = 2.67 ms
[19:27:51] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:29:01] (PS6) BryanDavis: Add rewritten crontab in Python [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[19:53:37] thanks halfak!
[19:53:55] no, thank you. :D
[19:54:18] You've got patience. The best patience. Everyone agrees.
[19:54:23] ;)
[20:18:29] Hi all, I'm running a longer query on tool labs and get the error "Lost connection to MySQL server during query". Is there a setting I can adjust to avoid this timeout? Thanks
[20:20:00] Who operates: "cewbot “cewbot”"
[20:21:54] TabbyCat: kanashimi is the shell name of the operator
[20:22:09] hall1467: it depends on /how long/ as there are timeouts enforced on query lengths
[20:22:16] and I cannot recall what they are offhand
[20:22:30] chasemp: thanks, it's because it join/parts from -i18n and we don't know what that bot is for :)
[20:22:39] not flooding, it simply does nothing
[20:22:50] TabbyCat: sure, I don't really know them so you'll have to reach out
[20:23:02] pkill 9 is the solution :P
[20:23:25] they have stated 'I want to create a wiki library for bots running on Tool Labs.'
[20:23:45] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Cewbot
[20:23:58] * chasemp shrugs yeah I dunno what they are doing
[20:26:20] chasemp: Mine goes for about an hour and a half. Is it possible to adjust timeouts some?
[20:27:44] hall1467: I think not...but I also don't know the threshold, your best bet is to make a ticket outlinging and one of the DBA's could advise. if you are going for some specific wikidb's it could be we have an alternative setup to query too
[20:28:42] hall1467: so labsdb-analytics.eqiad.wmnet is there as experimental but does not have all wiki's so not sure if useful for you but should be more performant w/ high threshoelds
[20:31:18] chasemp: Thanks for the advice! I'll look into labsdb-analytics
[20:53:54] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[21:20:34] (CR) BryanDavis: [C: -1] Add rewritten crontab in Python (5 comments) [labs/toollabs] - https://gerrit.wikimedia.org/r/336998 (https://phabricator.wikimedia.org/T156174) (owner: Zhuyifei1999)
[21:28:52] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:46:32] (PS1) BryanDavis: [WIP] jsub: Remove support for release=precise [labs/toollabs] - https://gerrit.wikimedia.org/r/341666 (https://phabricator.wikimedia.org/T94792)
[21:47:14] legoktm: you busy?
[21:47:27] Labs, Labs-Infrastructure, Patch-For-Review: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#3082000 (Andrew)
[21:48:40] Labs, Labs-Infrastructure, Patch-For-Review: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2963953 (Andrew)
[21:53:56] (CR) jerkins-bot: [V: -1] [WIP] jsub: Remove support for release=precise [labs/toollabs] - https://gerrit.wikimedia.org/r/341666 (https://phabricator.wikimedia.org/T94792) (owner: BryanDavis)
[22:02:00] Labs: horizon puppet panel vs. hiera parsing - https://phabricator.wikimedia.org/T156655#3082034 (Andrew) Open>Invalid I'm not sure this is a real bug so much as me misunderstanding yaml.
[22:09:35] Tool-Labs-tools-Pageviews: Add CirrusSearch to Massviews - https://phabricator.wikimedia.org/T159858#3082096 (Mrjohncummings) Incredible, thanks so much :) I will include this tool in a section on metrics on https://en.wikipedia.org/wiki/Wikipedia:Adding_open_license_text_to_Wikipedia Thanks again
[22:26:48] Labs, Huggle: Labs instance huggle.huggle.wmflabs needs to be replaced or deleted - https://phabricator.wikimedia.org/T157710#3082205 (Petrb) Yes I will try to upgrade it when I have time
[22:42:19] bd808: i got a quick question is it possible to set 2 projects to one instance of stashbot in 1 channel?
[22:49:51] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[23:04:15] Tool-Labs-tools-Zppixbot, Wikibugs, QuarryBot-enwiki: Wikibugs doesnt send phab updates to ##Zppix-Wikipedia - https://phabricator.wikimedia.org/T159612#3082329 (Zppix) Bump, awaiting reply
[23:12:44] Zppix: those words don't quite make sense to me. you want to !log to two places in parallel?
[23:13:13] No, be able to choose between one or the other
[23:13:35] this channel does that with !log
[23:13:41] so yes it is possible
[23:13:55] In that case ill pm you with somethimg
[23:16:54] andrewbogott: I am working with Dr. Cohl on the Math project and need to set up a proxy. Can you help elevate me to admin position?
[23:19:12] unclear how I'm meant to respond to that
[23:19:28] He left you dont i guess lol andrewbogott
[23:24:28] Labs: check on the nova-api upstart logs - https://phabricator.wikimedia.org/T159141#3082363 (Andrew) There doesn't seem to be a good way to win this one. I already added 'delaycompress' to the logrotate script to prevent cronspam (https://gerrit.wikimedia.org/r/#/c/313558/) but now upstart just writes to t...
[23:29:53] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:59:27] Labs, Tool-Labs, Tools-Kubernetes: Make tools-webservice use the official kubernetes python client rather than pykube - https://phabricator.wikimedia.org/T159892#3082418 (yuvipanda)