[00:03:27] * bd808 clings tightly to his NT 4.0 3.5" floppies [00:05:53] bd808, c'mon - you're old enough to remember OS/2 [00:06:33] I have Windows 2.0 on 5.25" disk still in shrink wrap ;) [00:07:28] my first paid IT gig was migrating an OS/2 network to NT 3.5 [00:08:25] * bd808 earned his grey hairs before they fell out [01:39:32] bd808, madhuvishy: this may be a naive question, but could https://phabricator.wikimedia.org/T167086 also involve a different cap for long-running sql queries on PAWS? [01:39:48] (keeping in mind that PAWS is perhaps more often used for larger one-time tasks - analysis or otherwise - than labs in general, as opposed to generating ongoing load. i'm just making self-interested guesses here though ;) [03:39:58] HaeB: that sounds like a question to talk over with the DBAs, but generally I'd say that if you can abuse the DB via PAWS that will just make more people use PAWS. ;) [03:40:39] I think what you really want is indices in the analytics replicas [03:40:52] which is also a DBA question [03:42:53] bd808: to be clear, i'm not asking because PAWS/labs is faster or better indexed for my current purposes than the analytics slaves (it doesn't seem to be, but i havent researched this much) ... [03:43:48] ...rather because i would prefer PAWS because it's public, replicable by community members, and of course offers jupyter notebooks [03:44:25] *nod* those are all good reasons to use PAWS [03:45:17] I'm a bit leery of violating the "net neutrality" of the wiki replica dbs and giving one app a fastlane over others [03:46:30] I think there has been some discussion in the past however about having a db host for long running things [03:46:55] I'd have to dig through phab to see if that discussion ever made it there. [03:49:59] the only part you can get in the analytics cluster is jupyter via SWAP [04:27:03] bd808: interesting; let me know in case you find that phabricator link. 
we had the same discussion with analytics some months ago about hive/hadoop (regarding the possibility of "nicing" long-running webrequest queries), but not about sql tables IIRC [04:38:02] Labs, Tool-Labs: Memory Exhausted Near / Tool labs error while querying with Python - https://phabricator.wikimedia.org/T93074#3331212 (bd808) Open>declined closing this very old issue [04:39:16] Labs, Labs-Infrastructure, DBA: Database upgrade MariaDB 10: Metadata access in INFORMATION_SCHEMA causes complete blocks - https://phabricator.wikimedia.org/T71182#3331214 (bd808) a:Springle>None [04:39:24] Labs, Labs-Infrastructure, DBA: Database upgrade MariaDB 10: Discrepancies with logging table on different wikis - https://phabricator.wikimedia.org/T71127#3331215 (bd808) a:Springle>None [04:43:10] Labs, MediaWiki-extensions-OpenStackManager, Wikimedia-Labs-General, Patch-For-Review, Regression: [Regression] wikitech.wikimedia.org is sending empty Echo notification emails - https://phabricator.wikimedia.org/T55778#3331216 (bd808) Open>declined This has either been fixed or every... [04:48:21] Labs, Labs-Infrastructure: Allow service users to be added and removed from sudo policies - https://phabricator.wikimedia.org/T48656#3331220 (bd808) Open>declined We are phasing out service groups for all projects except #tool-labs [04:50:44] Labs, Labs-Infrastructure: Synchronize mediawiki groups with LDAP have all projects require shell group - https://phabricator.wikimedia.org/T46167#3331222 (bd808) Open>declined [04:55:32] how do I give another user access to a tool? [04:55:42] so that they can `become foo` and edit its files etc [04:57:06] Labs, Operations: virbr0 interface present in some virt hosts - https://phabricator.wikimedia.org/T83732#917870 (bd808) Is this still an issue that needs to be fixed? 
[04:57:36] madhuvishy: I recommend against 'cloud', see https://www.gnu.org/philosophy/words-to-avoid.html#Cloud, who do I bring this up with? [04:58:52] gry: Hi, for your first question, https://tools.wmflabs.org/?list, find your tool, click on the manage members link beneath the tool [04:59:49] as for the cloud naming, labs-l has all the announcements on naming, and our recent consultation on renaming - https://wikitech.wikimedia.org/wiki/User:BryanDavis/Rebranding_Cloud_Services_products [04:59:52] Labs, Operations, wikitech.wikimedia.org: Turn on Cirrus replicas for labswiki (wikitech) - https://phabricator.wikimedia.org/T83760#3331225 (bd808) @EBernhardson is this bug report still valid or just ancient cruft? [05:00:58] madhuvishy, who do I send my commentary to? to labs-l? [05:01:32] gry: the cloud services name was finalized as the foundation team name and announced, but all the product naming was proposed and up for discussion until recently. feel free to comment on the Talk page on the Rebranding consultation, in a new section [05:01:38] gry: the ship has sailed on the use of "cloud" [05:02:23] bd808: it is disengaging, users need to understand they themselves can write the tools, not only use a cloud 'service' provided to them as a 'consumer', i would suggest something else (not labs and not cloud) [05:02:37] bd808: i won't insist, it is not my project, but i'll put a comment on the talk page [05:02:43] The lead paragraph of https://en.wikipedia.org/wiki/Cloud_computing exactly describes our services [05:02:55] Tools is going to continue being called Tools/ToolForge too [05:03:15] bd808: cloud computing is a service, it is not a collaboration platform (which lab is) [05:03:37] Labs, Operations, wikitech.wikimedia.org: Turn on Cirrus replicas for labswiki (wikitech) - https://phabricator.wikimedia.org/T83760#3331228 (EBernhardson) Open>Resolved a:EBernhardson It looks like everything in deployment-prep is using either one or two 
replicas. Should be fine. [05:04:04] The team and the collective environment are named "cloud services". The products are Wikimedia VPS and ToolForge [05:04:18] and other things actually [05:04:31] bd808: yes, product names are ok. the team name is misleading I think [05:04:52] https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_Introduction tries to explain [05:04:57] your objection is noted, but that decision was made in December [05:05:16] bd808: yes, thanks :) [05:06:54] I tried to be on labs-l but it was too much irrelevant traffic that I couldn't assist with, I will try again I guess [05:09:31] gry: we try to only send important announcements there on maintenance, upgrades and any product/platform updates, what kinda things do you think are noise there? [05:13:47] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Restrict access to users' edit stats unless opted-in - https://phabricator.wikimedia.org/T165401#3331232 (Samwilson) @MusikAnimal @Matthewrbowker : I know I listed 'top edited pages' above under the things that Edit Counter hides, but I'm right in thinking t... [06:15:16] a user is registered at wikitech, but https://wikitech.wikimedia.org/w/index.php?title=Special:NovaServiceGroup&action=managemembers&projectname=tools&servicegroupname=tools.gpy top dropdown does not list them, is it a wrong place to add new maintainers to a tool? 
[06:15:18] madhuvishy: one sec [06:16:17] madhuvishy: I'll get back to you about which messages are noisy in a few hours/days, I don't have it online, sorry - will let you know [06:16:34] (it is on some computer at home and I have no idea which or why) [08:13:16] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1426 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [08:48:15] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1426 is OK: OK: Less than 1.00% above the threshold [0.0] [09:47:39] Tool-Labs-tools-Erwin's-tools: Add option to count logs as well in xContribs - https://phabricator.wikimedia.org/T89602#3331581 (Nemo_bis) Open>Resolved Despite what I said, the difference in graphs is not great e.g. for an account like mine: {F8410675} {F8410677} Additionally, we may want to handle... [09:49:18] Labs, Tool-Labs, Goal: Contact tool maintainters using large amounts of disk space - https://phabricator.wikimedia.org/T136212#3331586 (Nemo_bis) [09:49:21] Labs, Tool-Labs: toolserver-home-archive is using 52G on Tools - https://phabricator.wikimedia.org/T136202#3331584 (Nemo_bis) Open>declined > I guess I could uncompress the archive and use something like filelight. Turns out I don't have enough disk space for this right now. Currently there's no... 
[11:18:35] (PS1) Ayounsi: Add mock rancid ssh key [labs/private] - https://gerrit.wikimedia.org/r/357791 [11:20:43] (CR) Volans: [C: 1] "LGTM" [labs/private] - https://gerrit.wikimedia.org/r/357791 (owner: Ayounsi) [11:23:36] (CR) Ayounsi: [V: 2 C: 2] Add mock rancid ssh key [labs/private] - https://gerrit.wikimedia.org/r/357791 (owner: Ayounsi) [11:43:49] PROBLEM - High iowait on tools-exec-1435 is CRITICAL: CRITICAL: tools.tools-exec-1435.cpu.total.iowait (>20.00%) [11:53:48] RECOVERY - High iowait on tools-exec-1435 is OK: OK: All targets OK [11:54:54] PROBLEM - Puppet errors on tools-exec-1435 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [12:19:56] RECOVERY - Puppet errors on tools-exec-1435 is OK: OK: Less than 1.00% above the threshold [0.0] [13:47:49] Labs, Operations: virbr0 interface present in some virt hosts - https://phabricator.wikimedia.org/T83732#3332217 (chasemp) Open>Resolved a:chasemp It seems not, I'm going to close this but anyone who knows differently please reopen ```for i in `cat labvirt`; do echo $i; ssh $i.eqiad.wmnet 'i... [13:54:55] PROBLEM - Puppet errors on tools-exec-1420 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [13:54:57] PROBLEM - Puppet errors on tools-exec-1439 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [13:55:03] Labs, Labs-Infrastructure, Operations, Patch-For-Review, Wikimedia-Incident: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#3332228 (chasemp) Resolved>Open ```elukey@deployment-aqs03:~$ dig -x 10.68.17.125 +short elukey ci-jessie-wi... 
[13:55:11] PROBLEM - Puppet errors on tools-exec-1422 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [13:57:05] PROBLEM - Puppet errors on tools-exec-1408 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [13:57:31] PROBLEM - Puppet errors on tools-exec-1403 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [13:57:44] PROBLEM - Puppet errors on tools-exec-1406 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [13:57:59] Labs, Operations: Tools puppet failing: Detail: undefined method `>>' for "24443.99":String - https://phabricator.wikimedia.org/T167412#3332240 (chasemp) [13:58:09] PROBLEM - Puppet errors on tools-exec-1409 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [13:58:26] PROBLEM - Puppet errors on tools-exec-1411 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [13:58:32] PROBLEM - Puppet errors on tools-exec-1436 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [13:58:50] PROBLEM - Puppet errors on tools-exec-1431 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [13:59:58] PROBLEM - Puppet errors on tools-exec-1434 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [14:00:26] PROBLEM - Puppet errors on tools-exec-1441 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [14:01:30] Labs, Operations: Tools puppet failing: Detail: undefined method `>>' for "24443.99":String - https://phabricator.wikimedia.org/T167412#3332329 (chasemp) Related? ```Commit: d3dc61097073773b308f2cc1bb9352c4aea61be8 Author: Alexandros Kosiaris Date: (5 hours ago) 2017-06-08... 
[14:02:40] PROBLEM - Puppet errors on tools-exec-1404 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [14:02:41] Labs, Operations: Tools puppet failing: Detail: undefined method `>>' for "24443.99":String - https://phabricator.wikimedia.org/T167412#3332347 (chasemp) This is probably from an operation against this fact: > sudo facter -p | grep swapsize_mb swapsize_mb => 24443.99 Where that fact is now a string... [14:03:58] PROBLEM - Puppet errors on tools-exec-1428 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [14:04:06] PROBLEM - Puppet errors on tools-exec-1440 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [14:04:54] Labs, Operations: Tools puppet failing: Detail: undefined method `>>' for "24443.99":String - https://phabricator.wikimedia.org/T167412#3332386 (chasemp) p:Triage>Normal [14:05:28] PROBLEM - Puppet errors on tools-exec-1417 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [14:05:48] PROBLEM - Puppet errors on tools-exec-1442 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [14:07:13] PROBLEM - Puppet errors on tools-exec-1401 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [14:07:31] PROBLEM - Puppet errors on tools-exec-1405 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [14:07:46] PROBLEM - Puppet errors on tools-exec-1414 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [14:07:59] PROBLEM - Puppet errors on tools-exec-1438 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [14:08:17] PROBLEM - Puppet errors on tools-exec-1427 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [14:08:23] PROBLEM - Puppet errors on tools-exec-1418 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [14:08:30] PROBLEM - Puppet errors on tools-exec-1424 is CRITICAL: CRITICAL: 33.33% of data 
above the critical threshold [0.0] [14:09:03] ^ that noise should start to resolve in a few minutes [14:09:32] PROBLEM - Puppet errors on tools-exec-1423 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [14:09:36] PROBLEM - Puppet errors on tools-exec-1421 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [14:10:57] PROBLEM - Puppet errors on tools-exec-1435 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [14:11:09] PROBLEM - Puppet errors on tools-exec-1410 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [14:13:08] Labs, Operations: Tools puppet failing: Detail: undefined method `>>' for "24443.99":String - https://phabricator.wikimedia.org/T167412#3332403 (chasemp) turns out 37b83e8b2c04a58f555ee5627a415561ab792d26 unintentionally resulted in this ```diff --git a/modules/toollabs/templates/gridengine/host-vmem.er... [14:13:15] PROBLEM - Puppet errors on tools-exec-1432 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [14:14:13] PROBLEM - Puppet errors on tools-exec-1426 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [14:14:14] Labs, Operations: Tools puppet failing: Detail: undefined method `>>' for "24443.99":String - https://phabricator.wikimedia.org/T167412#3332404 (chasemp) Quoting @faidon from irc: ```yeah my suggestion wrt this would be a) swap = 3*ram is just silly obsolete advice, half a gig of swap should be plenty/e... 
[14:14:33] PROBLEM - Puppet errors on tools-exec-1415 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [14:14:37] Labs, Operations: host-vmem.erb is doing operations that make no sense - https://phabricator.wikimedia.org/T167412#3332409 (chasemp) [14:14:49] PROBLEM - Puppet errors on tools-exec-1419 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [14:14:51] Labs, Operations, Patch-For-Review, cloud-services-team (Kanban): rebuild tools-grid-master as a large instance - https://phabricator.wikimedia.org/T162955#3332411 (chasemp) [14:14:55] Labs, Operations: host-vmem.erb is doing operations that make no sense - https://phabricator.wikimedia.org/T167412#3332240 (chasemp) [14:15:05] PROBLEM - Puppet errors on tools-exec-1416 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [14:15:16] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [14:15:24] Labs, Wikidata, Patch-For-Review, User-Ladsgroup, Wikidata-Sprint: Open view for term_full_entity_id in wb_terms table in labs - https://phabricator.wikimedia.org/T167114#3332415 (daniel) We need to time this with our deployment schedule, running the migration script, and announcing changes. 
[14:18:23] PROBLEM - Puppet errors on tools-exec-1430 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [14:19:25] PROBLEM - Puppet errors on tools-exec-1433 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [14:19:43] PROBLEM - Puppet errors on tools-exec-1425 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [14:19:47] PROBLEM - Puppet errors on tools-exec-1429 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [14:20:21] PROBLEM - Puppet errors on tools-exec-1402 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [14:20:41] PROBLEM - Puppet errors on tools-exec-1437 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [14:20:45] Labs, Labs-Infrastructure, Operations, Patch-For-Review, Wikimedia-Incident: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#3332427 (hashar) From my digging: | May 9th | `652785` | June 8th | `692016` So I guess 505374 is a few months old. [14:23:38] PROBLEM - Puppet errors on tools-exec-1413 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [14:24:57] Labs, Wikidata, Patch-For-Review, User-Ladsgroup, Wikidata-Sprint: Open view for term_full_entity_id in wb_terms table in labs - https://phabricator.wikimedia.org/T167114#3332432 (chasemp) Open>stalled >>! In T167114#3318129, @gerritbot wrote: > Change 357369 had a related patch set u... 
[14:25:14] Labs, Wikidata, Patch-For-Review, User-Ladsgroup, Wikidata-Sprint: Open view for term_full_entity_id in wb_terms table in labs - https://phabricator.wikimedia.org/T167114#3332436 (chasemp) p:High>Normal [14:32:33] RECOVERY - Puppet errors on tools-exec-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [14:33:25] RECOVERY - Puppet errors on tools-exec-1411 is OK: OK: Less than 1.00% above the threshold [0.0] [14:34:55] RECOVERY - Puppet errors on tools-exec-1420 is OK: OK: Less than 1.00% above the threshold [0.0] [14:34:57] RECOVERY - Puppet errors on tools-exec-1439 is OK: OK: Less than 1.00% above the threshold [0.0] [14:35:11] RECOVERY - Puppet errors on tools-exec-1422 is OK: OK: Less than 1.00% above the threshold [0.0] [14:37:04] RECOVERY - Puppet errors on tools-exec-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [14:37:43] RECOVERY - Puppet errors on tools-exec-1406 is OK: OK: Less than 1.00% above the threshold [0.0] [14:38:09] RECOVERY - Puppet errors on tools-exec-1409 is OK: OK: Less than 1.00% above the threshold [0.0] [14:38:31] RECOVERY - Puppet errors on tools-exec-1436 is OK: OK: Less than 1.00% above the threshold [0.0] [14:38:49] RECOVERY - Puppet errors on tools-exec-1431 is OK: OK: Less than 1.00% above the threshold [0.0] [14:38:59] RECOVERY - Puppet errors on tools-exec-1428 is OK: OK: Less than 1.00% above the threshold [0.0] [14:39:03] RECOVERY - Puppet errors on tools-exec-1440 is OK: OK: Less than 1.00% above the threshold [0.0] [14:39:57] RECOVERY - Puppet errors on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0] [14:40:23] RECOVERY - Puppet errors on tools-exec-1441 is OK: OK: Less than 1.00% above the threshold [0.0] [14:40:27] RECOVERY - Puppet errors on tools-exec-1417 is OK: OK: Less than 1.00% above the threshold [0.0] [14:42:40] RECOVERY - Puppet errors on tools-exec-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [14:42:46] RECOVERY - Puppet errors on 
tools-exec-1414 is OK: OK: Less than 1.00% above the threshold [0.0] [14:44:32] RECOVERY - Puppet errors on tools-exec-1423 is OK: OK: Less than 1.00% above the threshold [0.0] [14:45:52] RECOVERY - Puppet errors on tools-exec-1442 is OK: OK: Less than 1.00% above the threshold [0.0] [14:47:12] RECOVERY - Puppet errors on tools-exec-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [14:47:31] RECOVERY - Puppet errors on tools-exec-1405 is OK: OK: Less than 1.00% above the threshold [0.0] [14:48:01] RECOVERY - Puppet errors on tools-exec-1438 is OK: OK: Less than 1.00% above the threshold [0.0] [14:48:17] RECOVERY - Puppet errors on tools-exec-1427 is OK: OK: Less than 1.00% above the threshold [0.0] [14:48:23] RECOVERY - Puppet errors on tools-exec-1418 is OK: OK: Less than 1.00% above the threshold [0.0] [14:48:31] RECOVERY - Puppet errors on tools-exec-1424 is OK: OK: Less than 1.00% above the threshold [0.0] [14:49:39] RECOVERY - Puppet errors on tools-exec-1421 is OK: OK: Less than 1.00% above the threshold [0.0] [14:50:15] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0] [14:50:57] RECOVERY - Puppet errors on tools-exec-1435 is OK: OK: Less than 1.00% above the threshold [0.0] [14:51:09] RECOVERY - Puppet errors on tools-exec-1410 is OK: OK: Less than 1.00% above the threshold [0.0] [14:53:16] RECOVERY - Puppet errors on tools-exec-1432 is OK: OK: Less than 1.00% above the threshold [0.0] [14:54:12] RECOVERY - Puppet errors on tools-exec-1426 is OK: OK: Less than 1.00% above the threshold [0.0] [14:54:34] RECOVERY - Puppet errors on tools-exec-1415 is OK: OK: Less than 1.00% above the threshold [0.0] [14:54:50] RECOVERY - Puppet errors on tools-exec-1419 is OK: OK: Less than 1.00% above the threshold [0.0] [14:55:08] RECOVERY - Puppet errors on tools-exec-1416 is OK: OK: Less than 1.00% above the threshold [0.0] [14:55:22] RECOVERY - Puppet errors on tools-exec-1402 is OK: OK: Less than 1.00% above the 
threshold [0.0] [14:58:23] RECOVERY - Puppet errors on tools-exec-1430 is OK: OK: Less than 1.00% above the threshold [0.0] [14:58:37] RECOVERY - Puppet errors on tools-exec-1413 is OK: OK: Less than 1.00% above the threshold [0.0] [14:59:25] RECOVERY - Puppet errors on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0] [14:59:43] RECOVERY - Puppet errors on tools-exec-1425 is OK: OK: Less than 1.00% above the threshold [0.0] [14:59:47] RECOVERY - Puppet errors on tools-exec-1429 is OK: OK: Less than 1.00% above the threshold [0.0] [15:00:41] RECOVERY - Puppet errors on tools-exec-1437 is OK: OK: Less than 1.00% above the threshold [0.0] [15:01:11] An irc driven command to silence alerts for shinken-wm is officially on my wishlist [15:03:38] My wish is for all that stuff to be in another channel entirely. [15:04:13] yes please [15:04:24] I hate this pattern here and in -ops [15:04:36] but it was argued and voted to keep it a few times in the last 3 years [15:15:35] chasemp: you have ops in this channel so you can -v the bot when needed after asking chanserv to +o you [15:16:15] we can move the bots to a bot only channel too, but in my experience that is about the same as just killing the bot [15:18:46] err.. not -v but +q [15:19:34] something like /mode #wikimedia-cloud +q shinken-wm [15:23:56] Labs, Tool-Labs: Create a fonts CDN for use on Tool Labs - https://phabricator.wikimedia.org/T110027#3332592 (zhuyifei1999) I was just working on a [[http://logio.org|log.io]] interface and had to remove all the google fonts. It looks so much worse :/ I'm willing to take this task if setting up a cachin... 
[15:29:02] Labs, Labs-Infrastructure, Operations, ops-eqiad, Patch-For-Review: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3332619 (Cmjohnson) [15:32:00] Labs, Tool-Labs: Create a fonts CDN for use on Tool Labs - https://phabricator.wikimedia.org/T110027#3332628 (zhuyifei1999) Regarding licensing: Google said that [[https://developers.google.com/fonts/|All fonts are released under open source licenses.]] [15:32:13] Labs, Tool-Labs: Create a fonts CDN for use on Tool Labs - https://phabricator.wikimedia.org/T110027#1566983 (bd808) >>! In T110027#3332592, @zhuyifei1999 wrote: > I was just working on a [[http://logio.org|log.io]] interface and had to remove all the google fonts. It looks so much worse :/ > > I'm will... [15:40:00] bd808: I'm woefully ignorant on irc admin things [15:42:05] bd808: the difference between cdnjs and google fonts is that with cdnjs we have all the files from a git repo, but for googlefonts we don't. The setup gotta be different :/ [15:42:06] bd808: that's the dominant argument previously for separating out bots as well iirc [15:42:23] chasemp: my cheat sheet is https://meta.wikimedia.org/wiki/IRC/Instructions#Channel_operator_commands [15:43:04] I just checked lighttpd's mod_proxy and it does not support ssl in the backend. can I borrow tools static server? [15:43:29] zhuyifei1999_: why can't we use https://github.com/google/fonts and https://github.com/majodev/google-webfonts-helper ? 
[15:44:17] I think https://github.com/google/fonts/tree/master/apache is all of the fonts [15:44:55] wow I didn't see that [15:44:59] (looking) [15:46:22] bd808: I was thinking more like !shinken-wm silence type thing, !shinken-wm show silence, !shinken-wm unsilence [all|foo] [15:46:33] rather than a necessary global mute [15:46:37] * chasemp mostly dreams [15:46:48] then you have to add authn+authz to the bot [15:46:56] neither of the two fonts I'm looking for is in there [15:46:59] or live with any jerk silencing it [15:47:10] bd808: sort of, username matching for a whitelist seems fine [15:47:20] or maybe I'm being too optimistic there [15:47:21] i.e. https://fonts.googleapis.com/css?family=Source+Sans+Pro:400,600 & https://fonts.googleapis.com/css?family=Inconsolata [15:47:28] and it would be all or nothing for authz [15:47:52] anyone opsen should have their username protected on irc etc [15:48:01] (unless I missed something) [15:48:09] *shrug* easy to spoof usernames generally. I added a bunch of fancy crap to stashbot because people were abusing !log in -ops [15:48:33] I think w/ my username for example you have a few seconds to auth to it before you get booted [15:48:45] that seems mostly ok if the list is 10 ppl whitelisted [15:48:48] or 4 [15:49:15] if you really really wanted to attempt to impersonate and join this channel to silence shinken-wm [15:49:30] I would be really curious as to the why :) [15:49:52] but if it makes you uncomfortable that's understandable [15:50:41] zhuyifei1999_: if we are going to do a straight proxy then it would be easiest in the nginx code that we manage in puppet for tools-static [15:50:53] yeah [15:50:56] which we can do I thing [15:50:58] *think [15:51:13] it's not like the problem is high traffic [15:51:39] (though it would be best if we could find some way to fix the hostnames from the fonts api) [15:51:50] I'm looking at https://github.com/majodev/google-webfonts-helper [15:53:29] I think google-webfonts-helper is just a 
tarball generator the more I look at it [15:54:11] there is also the mirroring hack -- https://neverpanic.de/blog/2014/03/19/downloading-google-web-fonts-for-local-hosting/ [15:54:48] a reverse proxy is probably the easiest thing to setup [15:56:10] and google-webfonts-helper is nodejs, meaning some ugly hacks are needed to mount it to a subpath :/ (I didn't find a better way when I worked on log.io) [15:57:37] (idk why it's called that when it doesn't own the hostname) [15:58:25] for those who care, the s4 lag is cleared up on the labsdb replicas, but will start again on Monday as the next server is updated [15:58:39] you can follow along at T166206 [15:58:40] T166206: Convert unique keys into primary keys for some wiki tables on s4 - https://phabricator.wikimedia.org/T166206 [16:09:19] Labs, Labs-Infrastructure, Operations, ops-codfw, Patch-For-Review: rack/setup/install labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T167157#3332797 (RobH) [16:14:41] Labs, Labs-Infrastructure, Operations, ops-codfw: rack/setup/install labtestnet2002 - https://phabricator.wikimedia.org/T167159#3332839 (Papaul) @chasemp do I have to put labtestnet2002 both eth0 and eth1 under labs-hosts1-b-codfw network or just plug eth1 and not put it under that network? Same... [16:21:39] Labs, Labs-Infrastructure, Operations, ops-codfw: rack/setup/install labtestnet2002 - https://phabricator.wikimedia.org/T167159#3332874 (RobH) IRC Update: We chatted about this, basically he wanted to know if we had to setup dns for both interfaces (eth0 and eth1) prior to installation. Since on... [16:31:25] since when sshing into the infrastructure is prohibited? o.O https://github.com/wikimedia/puppet/blob/production/modules/toollabs/manifests/infrastructure.pp#L11 [16:32:13] I thought I've done this for like a dozen times for debugging purposes [16:34:17] zhuyifei1999_: that class is only applied on a small number of hosts in the tools project. 
Mostly things like the Puppet, Docker, and Kubernetes masters [16:34:38] https://github.com/wikimedia/puppet/blob/production/modules/toollabs/manifests/static.pp#L10 [16:34:54] *nod* it is on the proxies too [16:35:37] and the redis and mail servers [16:35:48] I was trying to figure out the nginx settings on the hosts but now I guess I have to do it the hard way [16:36:37] zhuyifei1999_: https://github.com/wikimedia/puppet/blob/production/modules/toollabs/templates/static-server.conf.erb [16:36:51] I mean nginx.conf [16:37:26] server blocks are included by nginx.conf [16:38:13] thanks anyways [16:38:25] zhuyifei1999_: ubuntu stock config. nginx is managed with this puppet code -- https://github.com/wikimedia/operations-puppet-nginx [16:38:35] k [16:38:58] the vhost should not assume any particular global config [16:43:02] iirc, the cache stuff has to be defined globally [16:46:03] will test [17:04:34] Labs, DBA: Labs database replica drift - https://phabricator.wikimedia.org/T138967#3333050 (bd808) Cross-post from {T166091}: ```SELECT pl_namespace, pl_title FROM page JOIN pagelinks ON pl_from = page_id WHERE page_namespace=0 AND page_title="Ajinkya_Rahane" AND pl_title LIKE "Ankit%"; +--------------+... [17:04:46] Labs, DBA: Labs database replica drift - https://phabricator.wikimedia.org/T138967#3333056 (bd808) [17:04:50] Labs, DBA: Labs database corruption - https://phabricator.wikimedia.org/T166091#3285070 (bd808) [17:09:45] !drift is https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database/Replica_drift [17:09:45] Key was added [17:10:15] !lag [17:10:26] !replag [17:10:46] !replag is https://tools.wmflabs.org/replag/ [17:10:47] Key was added [17:19:29] bd808: yeah proxy_cache_path has a context of http. 
setting it in the server block will result in something like '2017/06/08 17:14:42 [emerg] 14029#0: "proxy_cache_path" directive is not allowed here in /etc/nginx/sites-enabled/static-server:39' [17:19:44] but setting it above the block should work [17:20:12] hey folks! I'm trying to create a new user database on the same host as enwiki_p; [17:20:26] But it's failing. Am I being dumb or is db creation disabled? [17:20:40] My command: CREATE DATABASE u2014__ores_p; [17:21:03] as far as I know user databases should be on the tools.host ... but i might be wrong. [17:21:28] someone (the DBA?) told me a while ago that databases on the enwiki_p/etc. hosts might get deleted etc. [17:21:47] https://nginx.org/en/docs/http/ngx_http_proxy_module.html?&_ga=2.234233182.1514498944.1496940209-1867526186.1496940209#proxy_cache_path [17:22:12] Steinsplitter, right. This is purposefully going to be created there. [17:22:20] See https://phabricator.wikimedia.org/T146718 [17:22:47] I've got a few other dbs there. I could just reuse them I guess. [17:27:17] halfak: you should be able to create databases in ToolsDB [17:27:25] Oh. I'm dumb. I transposed some digits in my user ID [17:27:34] ah ha! [17:27:40] That's what I get for looking at history and copy-pasting old commands [17:27:40] PEBCAK [17:27:45] The old command was wrong too! [17:28:19] > CREATE DATABASE u2041__ores_p;\n Query OK, 1 row affected (0.00 sec) [17:28:20] :D [17:28:22] Labs, Tool-Labs: Create a fonts CDN for use on Tool Labs - https://phabricator.wikimedia.org/T110027#3333308 (zhuyifei1999) a:zhuyifei1999 [17:29:02] zhuyifei1999_: do you think we need it to actually cache locally or can we just do a live reverse proxy that makes it look to google like our nginx is requesting everything? 
[17:29:18] halfak: yay - https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database#User_databases just in case you need to refer back in the future :) [17:30:08] bd808: I personally wouldn't want our infrastructure to ddos google so they blacklist us [17:30:21] we will not ddos google ;) [17:30:38] ddossing google sounds like fun though :/ xD [17:30:49] our server would melt long before they noticed [17:30:59] hmm [17:31:04] good point [17:31:21] to be fair they'd probably just move to a different dc :P [17:31:28] https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb [17:33:18] (I must admit that I'm not familiar with things like Hadoop or Spark) [17:33:38] they are "big data" things [17:34:09] we have a hadoop cluster in our analytics network. we also use kafka, which is mentioned in that article [17:35:58] Labs, Tool-Labs, DBA: s51127__dewiki_lists (merlbot) database using 13G on labsdb1001 (enwiki) - https://phabricator.wikimedia.org/T133325#2228208 (Luke081515) Merl has been inactive for about a year. The crontabs were commented out a year ago. My first suggestion would be: create a dump, so tha... [17:37:29] should we hide the referrer header from google? [17:38:07] it wouldn't hurt.
we could just do a fake "tools.wmflabs.org/" one [17:40:01] halfak: I can't figure out how to cram it nicely into the create database command, but you can get the active username for easy cut-n-paste with -- select substring_index(current_user(), '@', 1); [17:40:59] bd808, ooh that's a good addition to the wiki :) [17:41:04] there is probably some function where you can build up the DDL in a string and then exec it that could make it a one-liner, but too much work for a drive-by answer [17:43:49] uhhh you can also get it from the command line on tools by doing something like `grep "user" replica.my.cnf | cut -f 3 -d " "` [17:45:53] https://wikitech.wikimedia.org/w/index.php?title=Help%3ATool_Labs%2FDatabase&type=revision&diff=1761524&oldid=1761521 [17:58:16] PROBLEM - Free space - all mounts on tools-webgrid-lighttpd-1426 is CRITICAL: CRITICAL: tools.tools-webgrid-lighttpd-1426.diskspace._tmp.byte_percentfree (<100.00%) [18:00:00] bd808: there are tons of xvfb-run.* files in /tmp here [18:00:07] for the alert ^ [18:00:15] 6369 files [18:00:56] madhuvishy: yuck. probably leaking from the wsexport jobs [18:01:10] aah [18:01:11] yeah [18:01:17] I think we pinned them to a host [18:01:38] right - think I can safely delete them? [18:01:45] yeah [18:02:05] if one is in active use its file handle will be open, so deleting the file won't affect the running code [18:02:40] right [18:03:45] I've requested access to Tools, would someone mind approving my request? https://toolsadmin.wikimedia.org/tools/membership/status/25 [18:03:56] bd808: aah yikes that is not even the problem! there are also 328 epub files [18:03:57] :/ [18:04:18] and lots of calibre tmp files [18:05:29] madhuvishy: yuck. I hate the way that tool works :/ [18:05:30] nuke them all [18:05:57] davidwbarratt: {{Done}}! welcome to tool labs / ToolForge [18:06:10] bd808 thanks!
:) [18:07:19] !log tools Clean up space on /tmp on tools-webgrid-lighttpd-1426 by deleting temp files xvfb-run.* and calibre_1.25.0_tmp_* created by the wsexport tool [18:07:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [18:08:20] bd808: okay I'm not sure if I should also remove the epubs but deleting the calibre_* files dropped it from 15G to 873M. I'll make a talk for those folks [18:08:51] the epubs can go too. they are only on disk long enough for the original request to download them [18:09:11] ah okay [18:09:15] if they are hanging out it means the original request was aborted or errored out [18:10:10] hmmm [18:10:47] !log tools Delete ws-*.epub from /tmp on tools-webgrid-lighttpd-1426 [18:10:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [18:11:17] the way that tool works is ... gross. It's a PHP script that shells out to jsub to generate a file and then returns the file to the caller [18:11:39] cool, now at 14M [18:11:42] right [18:11:52] the least it could do would be to clean up the temp files [18:12:06] I love what it does but not how it does it [18:12:19] there are no safety checks in the system [18:14:36] !log tools Also delete from /tmp on tools-webgrid-lighttpd-1411 xvfb-run.*, calibre_* and ws-*.epub [18:14:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [18:18:16] RECOVERY - Free space - all mounts on tools-webgrid-lighttpd-1426 is OK: OK: All targets OK [18:40:41] madhuvishy: bd808 I think a script to clean up unused /tmp files is a good investment? [18:47:24] bd808: what do you think about: serve an instance of http://fontcdn.org/ at https://tools.wmflabs.org/googlefonts/, proxy https://fonts.googleapis.com/css with https://tools-static.wmflabs.org/googlefonts/css (with url substitution of //fonts.gstatic.com/), and proxy https://fonts.gstatic.com/s/ with https://tools-static.wmflabs.org/googlefonts/s/ ?
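The "url substitution of //fonts.gstatic.com/" step in the proposal above could look roughly like the following. This is a sketch only, not how the proxy was actually built: the function name `rewrite_font_css` is hypothetical, and it assumes the stylesheet text has already been fetched and that every font URL should be pointed at the proposed https://tools-static.wmflabs.org/googlefonts/ prefix.

```python
import re

# Target prefix taken from the proposal; the font path (/s/...) is kept as-is.
LOCAL_PREFIX = "https://tools-static.wmflabs.org/googlefonts/"

# Matches http://, https:// and protocol-relative references to the font host.
_GSTATIC = re.compile(r"(?:https?:)?//fonts\.gstatic\.com/")


def rewrite_font_css(css: str) -> str:
    """Point every fonts.gstatic.com URL in a stylesheet at the local proxy."""
    return _GSTATIC.sub(LOCAL_PREFIX, css)


if __name__ == "__main__":
    # Made-up stylesheet fragment for illustration.
    sample = "src: url(https://fonts.gstatic.com/s/example/v1/font.woff2) format('woff2');"
    print(rewrite_font_css(sample))
```

Because only the host part is replaced, a request for /googlefonts/s/... on the proxy maps one-to-one onto /s/... upstream.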
[18:49:39] uh https://github.com/thomaspark/fontcdn/blob/gh-pages/index.html this is full of idek stuff [18:52:49] yeah I guess I can just fork it so it only uses labs-local stuff [19:02:27] actually, "fontcdn" is still cooler than "googlefonts" since it's shorter [19:03:32] Labs, Labs-Infrastructure, Operations, ops-codfw, Patch-For-Review: rack/setup/install labtestnet2002 - https://phabricator.wikimedia.org/T167159#3333690 (RobH) [19:03:36] Labs, Labs-Infrastructure, Operations, netops, ops-codfw: codfw:labtestnet2002 switch port configuration - https://phabricator.wikimedia.org/T167322#3333688 (RobH) Resolved>Open Nevermind, I had a bad config and it didn't commit. I need to investigate and redo the change. [19:03:55] Labs, Labs-Infrastructure, Operations, ops-eqiad, Patch-For-Review: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3333691 (Cmjohnson) [19:04:05] Labs, Labs-Infrastructure, Operations, netops, ops-codfw: codfw: labtestneutron2002 sswitch port configuration - https://phabricator.wikimedia.org/T167326#3333692 (RobH) a:Papaul>RobH [19:06:30] Labs, Labs-Infrastructure, Operations, ops-eqiad, Patch-For-Review: rack/setup/install labvirt101[5-8] - https://phabricator.wikimedia.org/T165531#3333702 (Cmjohnson) mac addresses were added to dhcpd file, not sure if h/w raid is needed..i believe these came with a controller. Also @mark was...
[19:47:34] Labs, Labs-Infrastructure, Operations, netops, ops-codfw: codfw: labtestneutron2002 switch port configuration - https://phabricator.wikimedia.org/T167326#3333850 (Papaul) [20:04:49] Tool-Labs-tools-Xtools, MobileFrontend: Edit count on mobile differs from xtools' - https://phabricator.wikimedia.org/T163893#3333907 (Framawiki) It's probably more of an xtools problem [20:08:32] Tool-Labs-tools-Xtools, MediaWiki-API: Edit count on mobile differs from xtools' - https://phabricator.wikimedia.org/T163893#3333912 (Framawiki) The problem does not concern mobile tools, but [[ https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=info&list=users&usprop=editcou... [20:17:16] Tool-Labs-tools-Xtools, MediaWiki-API: Edit count on mobile differs from xtools' - https://phabricator.wikimedia.org/T163893#3333947 (Anomie) [20:18:18] Tool-Labs-tools-Xtools, MediaWiki-API: Edit count on mobile differs from xtools' - https://phabricator.wikimedia.org/T163893#3213645 (Anomie) While "wrong" is debatable, it comes down to {T21311}. The edit count maintained by MediaWiki doesn't match the way xtools or other things might count edits. [20:22:08] Labs: Create nova service account for openstack - https://phabricator.wikimedia.org/T167467#3333976 (chasemp) [22:33:33] Labs, Tool-Labs: wsexport tool leaking files in /tmp - https://phabricator.wikimedia.org/T166337#3293050 (madhuvishy) This caused low disk space alerts on 3 nodes today, and I deleted calibre_*, xvfb-run.* and *.epub from tools-webgrid-lighttpd-1426, 1411 and 1417 today. Repeating Andrew's request to set...
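The manual cleanup described in the wsexport task above could in principle be scripted. The following is a rough sketch rather than anything that was deployed: the function names, the one-hour age threshold, and the dry-run default are all assumptions made for illustration; only the three glob patterns come from the log. On POSIX systems, unlinking a file that a process still has open does not disturb that process, but an age threshold still seems a sensible guard against removing files mid-download.

```python
import fnmatch
import shutil
import time
from pathlib import Path

# Patterns named in the task comment above.
PATTERNS = ("calibre_*", "xvfb-run.*", "*.epub")


def find_stale(tmp_dir, patterns=PATTERNS, max_age_seconds=3600, now=None):
    """Return matching entries directly under tmp_dir older than max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    for entry in Path(tmp_dir).iterdir():
        if not any(fnmatch.fnmatch(entry.name, pattern) for pattern in patterns):
            continue
        try:
            mtime = entry.lstat().st_mtime
        except FileNotFoundError:
            continue  # entry vanished while we were scanning
        if now - mtime > max_age_seconds:
            stale.append(entry)
    return sorted(stale)


def remove_stale(paths, dry_run=True):
    """Delete the given paths, or only report them when dry_run is True."""
    handled = []
    for path in paths:
        if not dry_run:
            if path.is_dir() and not path.is_symlink():
                shutil.rmtree(path, ignore_errors=True)
            else:
                path.unlink(missing_ok=True)  # requires Python 3.8+
        handled.append(path)
    return handled
```

A cautious first run would be `remove_stale(find_stale("/tmp"))`, which only lists candidates; passing `dry_run=False` actually deletes them.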
[22:37:10] RECOVERY - Free space - all mounts on tools-webgrid-lighttpd-1417 is OK: OK: All targets OK [22:39:55] Labs, Tool-Labs, Tool-Labs-tools-Other: wsexport tool leaking files in /tmp - https://phabricator.wikimedia.org/T166337#3334284 (bd808) [22:41:41] Labs, Tool-Labs, Tool-Labs-tools-Other: wsexport tool leaking files in /tmp - https://phabricator.wikimedia.org/T166337#3293050 (bd808) Cross posted upstream at https://github.com/wsexport/tool/issues/136 [22:47:40] thanks bd808 [22:57:57] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Restrict access to users' edit stats unless opted-in - https://phabricator.wikimedia.org/T165401#3334308 (Samwilson) a:Samwilson