[00:38:53] PROBLEM - Puppet errors on tools-exec-1417 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[00:48:42] PROBLEM - Puppet errors on tools-exec-1419 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[01:08:53] RECOVERY - Puppet errors on tools-exec-1417 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:14:36] PROBLEM - Puppet errors on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[01:28:46] RECOVERY - Puppet errors on tools-exec-1419 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:54:37] RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:37:12] dschwen: I've tried to fill in some more useful information in https://wikitech.wikimedia.org/wiki/Help:Puppet
[02:49:42] PROBLEM - Puppet errors on tools-worker-1022 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[03:07:23] thx, bd808
[03:07:28] reading that page now
[03:08:21] I need to understand the modules/profiles/roles pattern
[03:08:31] and where each of those are in /var/lib/git/operations/puppet
[03:08:52] and where the appropriate place for my stuff in that repo would be
[03:09:22] do you know of any labs (pardon me, cloud) projects that have custom puppet roles / modules / profiles?
[03:29:38] RECOVERY - Puppet errors on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:45:06] PROBLEM - Puppet errors on tools-exec-1412 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[04:20:04] RECOVERY - Puppet errors on tools-exec-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:46:06] PROBLEM - Puppet errors on tools-exec-1427 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[05:03:45] dschwen: where to put things is a good question.
I don't know off the top of my head if there are any merged puppet modules or roles that are intended solely for use in a project.
[05:04:26] This commit is an example of starting a new role for use in a project that is intended to eventually be used in production -- https://github.com/wikimedia/puppet/commit/d21184dc5489492057891358e0bef35960f521af
[05:05:45] If I was going to make something just for local use in a project I would probably just make a class in modules/role/manifests like "role::my_project" and go from there
[05:07:06] In that case I would mostly ignore the documentation about "profiles" and their 'interesting' use of hiera settings that is on wikitech for our production puppet code
[05:09:38] bd808: I remember when I basically asked the same thing (custom project puppet) for v2c. the configuration is still a pain
[05:10:25] Puppet has a pretty steep learning curve in my experience
[05:10:36] yeah
[05:11:14] the "ah ha" moment for me was realizing that it is more like writing unit tests than applications
[05:11:41] the resources you declare are basically assertions about how the server should look
[05:12:13] ...and then puppet breaks with require loops :P
[05:12:27] and they are unordered unless you explicitly create dependencies
[05:12:58] using require is usually a bad idea. using include is almost always better
[05:14:35] configuration management systems are like bug trackers: they are all horrible, but eventually you pick one and get used to it
[05:14:50] see also: code review tools
[05:15:43] lol
[05:16:06] RECOVERY - Puppet errors on tools-exec-1427 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:16:43] in the production branch manifests does not contain a role directory
[05:16:47] ?!
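A minimal sketch of the project-local role suggested in the discussion, assuming a class named role::my_project as in the quoted example. The file location follows the usual Puppet autoload convention. The package, config file, and service names are hypothetical placeholders, not anything from the operations/puppet repository. It illustrates the two points made in the discussion: each resource is an assertion about the host, and ordering exists only where a dependency is declared.

```puppet
# Hypothetical file: modules/role/manifests/my_project.pp
# Class name taken from the "role::my_project" example in the log;
# every resource title below is an illustrative assumption.
class role::my_project {
    # Assertion: this package is installed.
    package { 'nginx':
        ensure => present,
    }

    # Assertion: this file exists with exactly this content.
    file { '/etc/nginx/conf.d/my_project.conf':
        ensure  => file,
        content => "# managed by puppet\n",
    }

    # Assertion: this service is running and enabled at boot.
    service { 'nginx':
        ensure => running,
        enable => true,
    }

    # Resources are unordered unless a dependency is stated explicitly.
    # '->' orders the package before the file; '~>' additionally
    # refreshes (restarts) the service when the file changes.
    Package['nginx']
    -> File['/etc/nginx/conf.d/my_project.conf']
    ~> Service['nginx']
}
```

Applying such a class to an instance would then be a matter of assigning role::my_project to it; how that assignment is made depends on the project's own Puppet configuration.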
[05:17:02] it's in modules
[05:17:17] puppet/modules/role/manifests
[05:17:22] got it
[05:17:32] the top level manifests directory isn't really used in the Labs setup
[05:22:20] I'd really like to work on an easier to use per-project puppet setup, but I never seem to find the time
[05:23:03] Yuvi took a shot at it once with something he called "puppetception" that ran Puppet from Puppet but that turned out to work pretty poorly
[05:23:45] Andrew has some ideas about using Puppet environments to make a better solution
[05:24:16] maybe I'll try to make it a hackathon project while I'm at wikimania
[05:28:24] * bd808 wanders off to find sleep
[05:42:41] bd808: if someone creates a ticket on phab about it, please cc me. I would really love such a feature
[06:33:47] PROBLEM - Puppet errors on tools-exec-1441 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[06:38:28] PROBLEM - Puppet errors on tools-exec-1424 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[06:43:10] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1414 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[07:08:50] RECOVERY - Puppet errors on tools-exec-1441 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:15:37] PROBLEM - Puppet errors on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[07:18:31] RECOVERY - Puppet errors on tools-exec-1424 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:20:39] PROBLEM - Puppet errors on tools-worker-1022 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[07:23:12] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:55:36] RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:00:43] RECOVERY - Puppet errors on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:21:41]
PROBLEM - Puppet errors on tools-worker-1022 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[09:56:41] RECOVERY - Puppet errors on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:11:55] Labs, Tool-Labs, Monumental, Privacy: Monumental imports css from fonts.googleapis.com - https://phabricator.wikimedia.org/T168786#3376740 (zhuyifei1999)
[10:14:28] Labs, Tool-Labs, Monumental, Privacy: Monumental imports css from fonts.googleapis.com - https://phabricator.wikimedia.org/T168786#3376728 (zhuyifei1999) FWIW, a reverse proxy to fonts.googleapis.com is/was being worked on in {T110027}, but currently stuck in code review.
[10:34:32] Labs, Tool-Labs, Monumental, Privacy: Monumental imports css from fonts.googleapis.com - https://phabricator.wikimedia.org/T168786#3376778 (Multichill) Thanks, https://tools.wmflabs.org/cdnjs/ , that was the url I was looking for
[10:37:26] Labs, Tool-Labs, Monumental, Privacy: Monumental imports css from fonts.googleapis.com - https://phabricator.wikimedia.org/T168786#3376780 (Ricordisamoa)
[10:37:28] Tool-Labs-tools-Other, Community-Tech-Tool-Labs, Epic: Convert all Labs tools to use cdnjs for static libraries and fonts - https://phabricator.wikimedia.org/T103934#3376779 (Ricordisamoa)
[10:52:39] PROBLEM - Puppet errors on tools-worker-1022 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[11:13:18] Labs, DBA, User-Urbanecm: Prepare and check storage layer for maiwikimedia - https://phabricator.wikimedia.org/T168788#3376801 (Urbanecm)
[11:27:39] RECOVERY - Puppet errors on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:23:41] PROBLEM - Puppet errors on tools-worker-1022 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[12:34:13] PROBLEM - Puppet errors on tools-exec-1431 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[13:09:16]
RECOVERY - Puppet errors on tools-exec-1431 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:58:43] RECOVERY - Puppet errors on tools-worker-1022 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:46:37] PROBLEM - Puppet errors on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:13:37] !log tools Restarted webservice on tools.fatameh
[15:13:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:26:37] RECOVERY - Puppet errors on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:39:24] PROBLEM - Puppet errors on tools-exec-1414 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[19:06:49] PROBLEM - Puppet errors on tools-worker-1020 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:07:19] PROBLEM - Puppet errors on tools-worker-1009 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:07:49] hmm
[19:08:09] andrewbogott: is that normal?
[19:14:24] RECOVERY - Puppet errors on tools-exec-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:36:41] Quarry: Slowdown of Quarry queries processing - https://phabricator.wikimedia.org/T168803#3377191 (Mess)
[20:45:38] (Draft1) Paladox: Enable puppet check for ores* hosts [labs/icinga2] - https://gerrit.wikimedia.org/r/361317
[20:45:40] (PS2) Paladox: Fix puppet check for hosts [labs/icinga2] - https://gerrit.wikimedia.org/r/361317
[20:45:42] (CR) Paladox: [V: 2 C: 2] Fix puppet check for hosts [labs/icinga2] - https://gerrit.wikimedia.org/r/361317 (owner: Paladox)
[20:57:36] (Draft1) Paladox: Migrate check_mem to public view [labs/icinga2] - https://gerrit.wikimedia.org/r/361318
[20:57:38] (PS2) Paladox: Migrate check_mem to public view [labs/icinga2] - https://gerrit.wikimedia.org/r/361318
[20:57:42] (CR) Paladox: [V: 2 C: 2] Migrate check_mem to public view [labs/icinga2] - https://gerrit.wikimedia.org/r/361318 (owner: Paladox)
[21:23:01] TabbyCat: if you mean Puppet flapping on the tools nodes then yes, sadly that is normal. We haven't been able to pin down the exact cause but we think it is related to NFS server contention
[21:23:39] bd808: are you replying to a comment I made some hours ago about NFS being at 100% for a tool?
[21:24:05] yes
[21:24:16] okay :)
[22:48:57] PAWS: I can not write some special characters in PAWS - https://phabricator.wikimedia.org/T136118#3377326 (Dvorapa) a:Dvorapa>None
[23:55:40] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Bogus "not a valid project" errors - https://phabricator.wikimedia.org/T168676#3377377 (kaldari) Open>Resolved a:kaldari Seems to be working now!