[00:00:10] ah right, you mentioned that in the commit message as well
[00:00:30] (CR) Alex Monk: Enable group1 wikis in RESTBase (1 comment) [puppet] - https://gerrit.wikimedia.org/r/198433 (https://phabricator.wikimedia.org/T93452) (owner: GWicke)
[00:01:24] operations: unaccepted salt keys - https://phabricator.wikimedia.org/T93455#1137514 (Dzahn) NEW
[00:02:54] operations, RESTBase: restbase1006 not showing up in graphite cassandra metrics - https://phabricator.wikimedia.org/T92989#1137534 (GWicke) Open>Resolved a:GWicke Yes, 1006 has joined the flock now. Resolving.
[00:04:42] operations: unaccepted salt keys - https://phabricator.wikimedia.org/T93455#1137537 (Dzahn) Open>Resolved a:Dzahn well.. i did.. and accepted them. please remember doing it though when reinstalling servers
[00:05:18] operations, Patch-For-Review: Setup poolcounter servers for codfw - https://phabricator.wikimedia.org/T93261#1137540 (Dzahn) update: suhail now works. OS install finished. https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=suhail just subra has the issue detecting disks.
[00:05:22] operations, ops-ulsfo: cp4009 hardware fault - https://phabricator.wikimedia.org/T92476#1137542 (Gage) This order arrived at 200 Paul today.
[00:06:42] (PS3) Dzahn: add subra/suhail to site.pp as codfw poolcounters [puppet] - https://gerrit.wikimedia.org/r/198437 (https://phabricator.wikimedia.org/T93261)
[00:07:04] gwicke, but I must ask - why?
[00:07:44] (CR) Dzahn: [C: 2] add subra/suhail to site.pp as codfw poolcounters [puppet] - https://gerrit.wikimedia.org/r/198437 (https://phabricator.wikimedia.org/T93261) (owner: Dzahn)
[00:10:34] operations, Patch-For-Review: Setup poolcounter servers for codfw - https://phabricator.wikimedia.org/T93261#1137551 (Dzahn) applied poolcounter role on suhail: Notice: /Stage[main]/Poolcounter/Package[poolcounter]/ensure: ensure changed 'purged' to 'present' (yay, package is here for trusty) ``` ii p...
[00:12:26] RECOVERY - puppet last run on mw2109 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:15:49] (PS1) Dzahn: have base::firewall on codfw poolcounters [puppet] - https://gerrit.wikimedia.org/r/198440 (https://phabricator.wikimedia.org/T93261)
[00:15:58] (CR) jenkins-bot: [V: -1] have base::firewall on codfw poolcounters [puppet] - https://gerrit.wikimedia.org/r/198440 (https://phabricator.wikimedia.org/T93261) (owner: Dzahn)
[00:16:19] (PS2) Dzahn: have base::firewall on codfw poolcounters [puppet] - https://gerrit.wikimedia.org/r/198440 (https://phabricator.wikimedia.org/T93261)
[00:17:01] Krenair: why leave out special wikis?
[00:18:45] yes
[00:22:18] (PS1) Dzahn: add ferm service for poolcounterd [puppet] - https://gerrit.wikimedia.org/r/198442 (https://phabricator.wikimedia.org/T93261)
[00:27:26] operations, Patch-For-Review: Setup poolcounter servers for codfw - https://phabricator.wikimedia.org/T93261#1137600 (Dzahn) monitoring now also includes check for process running and TCP connect to 7531 working https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=suhail patches pending t...
[00:29:07] RECOVERY - puppet last run on mw2106 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[00:30:35] (PS6) Dzahn: WIP cassandra: add ferm rules [puppet] - https://gerrit.wikimedia.org/r/197840 (https://phabricator.wikimedia.org/T92680)
[00:31:16] (CR) Dzahn: WIP cassandra: add ferm rules (1 comment) [puppet] - https://gerrit.wikimedia.org/r/197840 (https://phabricator.wikimedia.org/T92680) (owner: Dzahn)
[00:32:29] (PS7) Dzahn: WIP cassandra: add ferm rules [puppet] - https://gerrit.wikimedia.org/r/197840 (https://phabricator.wikimedia.org/T92680)
[00:46:45] operations, Patch-For-Review: contacts.wikimedia.org drupal unpuppetized - https://phabricator.wikimedia.org/T90679#1137623 (Dzahn) mailed wikitech-l and engineering-l about retiring this service
[00:50:49] operations: reinstall OCG servers - https://phabricator.wikimedia.org/T84723#1137634 (Dzahn)
[00:54:47] (PS1) BryanDavis: Make status output prettier for log files [tools/scap] - https://gerrit.wikimedia.org/r/198449
[00:55:05] Krenair, MaxSem: Since I haven’t been able to successfully debug T93436 yet, I’m just going to remove our use of TemplateParser in MobileFrontend for now. I assume we don’t want to deploy such a change on Friday evening though, correct? The errors seem to be constant, but not a flood.
[00:55:35] Can you identify anything user-facing as broken?
[00:55:42] Krenair: strangely, no
[00:56:07] ok, let's not deploy now then
[00:56:22] operations: reinstall OCG servers - https://phabricator.wikimedia.org/T84723#1137644 (Dzahn) { 'host': 'ocg1003.eqiad.wmnet', 'weight': 10, 'enabled': False } http://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&h=ocg1003.eqiad.wmnet&m=cpu_report&s=descending&mc=2&g=network_report&c=PDF+servers+eqiad
[00:56:46] (CR) BryanDavis: "Tired of seeing this in the l10nupdate log file:" [tools/scap] - https://gerrit.wikimedia.org/r/198449 (owner: BryanDavis)
[00:57:33] operations: pybal issue? - https://phabricator.wikimedia.org/T90839#1137647 (Dzahn) Open>Resolved a:Dzahn i'm closing this. one part (mw1062) was fixed by a restart and the other part, ocg, doesn't seem to be a (general) pybal issue. that said, i still don't know why ocg servers get traffic even afte...
[00:57:34] hoo, gosh, abuse log likes to hide its variable dumps
[00:57:34] operations: reinstall OCG servers - https://phabricator.wikimedia.org/T84723#1137650 (Dzahn)
[00:57:35] operations, ops-eqiad: mw1062 needs a disk replacement - https://phabricator.wikimedia.org/T86542#1137651 (Dzahn)
[00:58:11] Krenair: Hide?
[00:58:24] afl_var_dump just contains references to the text table
[00:58:35] which contains references to ExternalStorage
[00:58:49] Still?
[00:59:06] Oh that, yes
[00:59:15] 'still'? is this something we're supposed to have stopped?
[00:59:27] No, I thought you meant the serialization for a second
[00:59:33] no
[01:00:24] I can handle a bit of serialized PHP when I'm just looking for key substrings. External storage makes DB queries more difficult
[01:02:36] hoo, so why is that done instead of just storing externalstorage references straight in afl_var_dump?
[01:02:40] (CR) Krinkle: [C: 1] Make status output prettier for log files [tools/scap] - https://gerrit.wikimedia.org/r/198449 (owner: BryanDavis)
[01:03:04] I think it was supposed to be consistent with how revisions are stored
[01:05:33] hoo, basically I'm looking at https://phabricator.wikimedia.org/T39767 and the ticket I linked
[01:05:40] I wonder if these are old entries or something
[01:08:26] Yeah, that was before I switched the serialization format in early 2013
[01:08:59] I would actually like to wipe these old entries, but don't want to go through the communication mess around that
[01:09:08] they don't really serve a purpose any more
[01:10:27] hoo, the code appears to account for $article being null already
[01:10:45] I think we can build an isset check in trivially
[01:11:43] but legacy schema for the stored afcomputervariable is my only theory for this bug
[01:11:54] afcomputedvariable*
[01:16:22] hoo, indeed, the 'article' parameter was introduced later
[01:16:33] 675e4c67, Nov 9 08:36:26 2011 by ialex@
[01:17:09] https://phabricator.wikimedia.org/rEABF675e4c673ad2242a95e74ede9816b68d98a8aa35
[01:19:18] operations: reinstall OCG servers - https://phabricator.wikimedia.org/T84723#1137679 (Dzahn) while you would think ocg1003 stops getting jobs, it looks actually like this: ocg1002 and ocg1003 are both doing a lot of stuff (tail -f /srv/deployment/ocg/log/ocg.log), while ocg1001 is the one not doing anything....
[01:20:06] https://www.mediawiki.org/wiki/Special:Code/MediaWiki/102497
[01:20:23] Krenair: /nick mutante|away
[01:20:27] arr, fail, srry
[01:20:32] :)
[01:20:37] Krenair: cya:)
[01:20:49] * Krenair waves
[02:10:06] (PS1) Chmarkine: RT - Enable HSTS max-age=7 days [puppet] - https://gerrit.wikimedia.org/r/198455
[02:13:28] (PS2) Chmarkine: RT - Enable HSTS max-age=7 days [puppet] - https://gerrit.wikimedia.org/r/198455 (https://phabricator.wikimedia.org/T40516)
[02:27:17] (PS1) Chmarkine: ishmael - Enable HSTS max-age=7 days [puppet] - https://gerrit.wikimedia.org/r/198457 (https://phabricator.wikimedia.org/T40516)
[02:29:03] !log l10nupdate Synchronized php-1.25wmf21/cache/l10n: (no message) (duration: 06m 43s)
[02:29:14] Logged the message, Master
[02:33:42] !log LocalisationUpdate completed (1.25wmf21) at 2015-03-21 02:32:38+00:00
[02:33:46] Logged the message, Master
[02:35:33] !log l10nupdate Synchronized php-1.25wmf22/cache/l10n: (no message) (duration: 00m 03s)
[02:35:37] Logged the message, Master
[02:36:41] !log LocalisationUpdate completed (1.25wmf22) at 2015-03-21 02:35:37+00:00
[02:36:44] Logged the message, Master
[02:45:31] (PS1) Chmarkine: integration - Enable HSTS max-age=7 days [puppet] - https://gerrit.wikimedia.org/r/198458 (https://phabricator.wikimedia.org/T40516)
[03:06:21] operations, Phabricator, Monitoring: Phabricator reported down on status.wm.o - https://phabricator.wikimedia.org/T93443#1137752 (mmodell) Checking for a string of text on that page seems like a good test to me, just don't check for something that's defined by the UI or translation files - check for con...
[03:46:34] (PS1) Alex Monk: Add Nova_Resource namespace to default labswiki search options [mediawiki-config] - https://gerrit.wikimedia.org/r/198460 (https://phabricator.wikimedia.org/T67132)
[03:51:08] (CR) Negative24: [C: 1] Add Nova_Resource namespace to default labswiki search options [mediawiki-config] - https://gerrit.wikimedia.org/r/198460 (https://phabricator.wikimedia.org/T67132) (owner: Alex Monk)
[03:52:48] Krenair: Regarding https://gerrit.wikimedia.org/r/198460, earlier today I stupidly demonstrated the need for this :)
[03:53:53] :)
[03:54:10] (PS1) 20after4: fix puppet error due to missing parent directory [puppet] - https://gerrit.wikimedia.org/r/198461
[03:56:57] Negative24, do you have another name I might recognise you by?
[03:57:12] ?
[03:57:36] My username is globally the same
[03:58:21] ok
[03:58:34] Did you want something else?
[04:00:05] nope
[04:02:04] operations, Wikimedia-Labs-wikitech-interface: wikitech instances list is blank - https://phabricator.wikimedia.org/T89808#1137778 (mmodell) a:mmodell>None
[04:02:41] operations, Wikimedia-Labs-wikitech-interface: wikitech instances list is blank - https://phabricator.wikimedia.org/T89808#1045882 (mmodell) p:Unbreak!>Triage Just encountered this bug again myself.
[04:15:03] (CR) 20after4: [C: 2] Make status output prettier for log files [tools/scap] - https://gerrit.wikimedia.org/r/198449 (owner: BryanDavis)
[04:15:22] (Merged) jenkins-bot: Make status output prettier for log files [tools/scap] - https://gerrit.wikimedia.org/r/198449 (owner: BryanDavis)
[04:31:38] (PS1) Chmarkine: gdash - Enable HSTS max-age=7 days [puppet] - https://gerrit.wikimedia.org/r/198469 (https://phabricator.wikimedia.org/T40516)
[05:20:17] PROBLEM - Kafka Broker Messages In on analytics1021 is CRITICAL: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate CRITICAL: 787.124932326
[06:14:25] operations, Wikimedia-Labs-wikitech-interface: wikitech instances list is blank - https://phabricator.wikimedia.org/T89808#1137864 (yuvipanda) p:Triage>Normal
[06:20:37] PROBLEM - puppet last run on mw2111 is CRITICAL: CRITICAL: puppet fail
[06:28:57] PROBLEM - puppet last run on iron is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:37] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:58] PROBLEM - puppet last run on db2018 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:08] PROBLEM - puppet last run on db1059 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:16] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:07] PROBLEM - puppet last run on mw2097 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:33:29] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Mar 21 06:32:22 UTC 2015 (duration 32m 21s)
[06:33:32] Logged the message, Master
[06:33:57] PROBLEM - puppet last run on mw2093 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:33:57] PROBLEM - puppet last run on mw2113 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:16] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:17] PROBLEM - puppet last run on mw2045 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:22] (PS1) Yuvipanda: tools: Add CORS header to tools-static [puppet] - https://gerrit.wikimedia.org/r/198474 (https://phabricator.wikimedia.org/T93466)
[06:39:26] RECOVERY - puppet last run on mw2111 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[06:45:27] RECOVERY - puppet last run on mw1144 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[06:45:57] RECOVERY - puppet last run on db2018 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[06:46:08] RECOVERY - puppet last run on db1059 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures
[06:46:09] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[06:46:17] RECOVERY - puppet last run on iron is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[06:46:56] RECOVERY - puppet last run on mw2045 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[06:47:06] RECOVERY - puppet last run on mw2093 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[06:47:07] RECOVERY - puppet last run on mw2097 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[06:47:07] RECOVERY - puppet last run on mw2113 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[06:47:17] RECOVERY - puppet last run on mw1042 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:51:17] PROBLEM - puppet last run on mw1206 is CRITICAL: CRITICAL: Puppet has 1 failures
[07:08:46] RECOVERY - puppet last run on mw1206 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[09:31:07] Puppet, Multimedia, Release-Engineering, Scrum-of-Scrums, and 2 others: Create basic puppet role for Sentry - https://phabricator.wikimedia.org/T84956#1138011 (Gilles) So far it was very little
work and updating the packages I've put together could be fully automated. The rest of the packages might b...
[11:40:28] (CR) John F. Lewis: "Heh forgot that. Will ensure they are in future patches." [puppet] - https://gerrit.wikimedia.org/r/198268 (owner: John F. Lewis)
[11:40:47] PROBLEM - Host mw2147 is DOWN: PING CRITICAL - Packet loss = 100%
[14:05:26] PROBLEM - HTTP error ratio anomaly detection on graphite2001 is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 0 below the confidence bounds
[14:05:26] PROBLEM - HTTP error ratio anomaly detection on graphite1001 is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 0 below the confidence bounds
[14:28:27] PROBLEM - HTTP error ratio anomaly detection on graphite1001 is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 0 below the confidence bounds
[14:28:27] PROBLEM - HTTP error ratio anomaly detection on graphite2001 is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 0 below the confidence bounds
[14:51:54] operations, Wikimedia-Language-setup, Tracking: Wikipedias with zh-* language codes waiting to be renamed (zh-min-nan -> nan, zh-yue -> yue, zh-classical -> lzh) (tracking) - https://phabricator.wikimedia.org/T10217#1138118 (Mvolz)
[15:20:13] operations, Deployment-Systems, Release-Engineering, Services: Streamline our service development and deployment process - https://phabricator.wikimedia.org/T93428#1138133 (mark)
[15:20:15] operations, Deployment-Systems, Services: Evaluate Docker as a container deployment tool - https://phabricator.wikimedia.org/T93439#1138135 (mark)
[16:38:48] RECOVERY - Graphite Carbon on graphite2001 is OK: OK: All defined Carbon jobs are runnning.
[16:43:17] PROBLEM - Graphite Carbon on graphite2001 is CRITICAL: CRITICAL: Not all configured Carbon instances are running.
[16:47:30] operations, Deployment-Systems, Release-Engineering, Services: Streamline our service development and deployment process - https://phabricator.wikimedia.org/T93428#1138189 (GWicke)
[17:01:42] operations, Deployment-Systems, Services: Evaluate Docker as a container deployment tool - https://phabricator.wikimedia.org/T93439#1138203 (GWicke)
[17:05:09] operations, Deployment-Systems, Services: Evaluate Docker as a container deployment tool - https://phabricator.wikimedia.org/T93439#1138205 (GWicke)
[17:05:12] operations, Deployment-Systems, Release-Engineering, Services: Streamline our service development and deployment process - https://phabricator.wikimedia.org/T93428#1138204 (GWicke)
[17:06:38] operations, Deployment-Systems, Release-Engineering, Services: Streamline our service development and deployment process - https://phabricator.wikimedia.org/T93428#1136887 (GWicke)
[17:10:34] operations, Deployment-Systems, Services: Evaluate Docker as a container deployment tool - https://phabricator.wikimedia.org/T93439#1138212 (ori) > Using init scripts from packages inside the container would require root How come? You mention that systemd can run docker images, which is interesting....
[17:52:17] RECOVERY - HTTP error ratio anomaly detection on graphite1001 is OK: OK: No anomaly detected
[17:52:17] RECOVERY - HTTP error ratio anomaly detection on graphite2001 is OK: OK: No anomaly detected
[18:00:57] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.01
[18:13:25] operations, Phabricator: Phabricator's phd can't sudo to user phd - https://phabricator.wikimedia.org/T93477#1138260 (Krenair)
[18:41:36] operations, MediaWiki-extensions-TimedMediaHandler, Multimedia: Support VP9 in TMH (Unable to decode) - https://phabricator.wikimedia.org/T55863#1138284 (TheDJ)
[18:47:16] PROBLEM - Host mw2027 is DOWN: PING CRITICAL - Packet loss = 100%
[18:48:27] RECOVERY - Host mw2027 is UP: PING OK - Packet loss = 0%, RTA = 42.94 ms
[19:00:47] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.0133333333333
[19:15:10] operations, Deployment-Systems, Services: Evaluate Docker as a container deployment tool - https://phabricator.wikimedia.org/T93439#1138337 (GWicke) > How does that work, exactly? Can systemd be made to invoke the init script somehow? Most docker applications aren't started using an init script execut...
[20:00:38] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.00666666666667
[20:03:16] operations, Deployment-Systems, Services: Evaluate Docker as a container deployment tool - https://phabricator.wikimedia.org/T93439#1138357 (GWicke)
[20:05:25] operations, Deployment-Systems, Services: Evaluate Docker as a container deployment tool - https://phabricator.wikimedia.org/T93439#1138358 (GWicke)
[20:08:01] operations, Deployment-Systems, Services: Evaluate Docker as a container deployment tool - https://phabricator.wikimedia.org/T93439#1138364 (GWicke)
[20:10:25] !log performing slow rolling restart of restbase cassandra cluster to apply config changes from puppet
[20:10:30] Logged the message, Master
[20:32:01] operations, Parsoid, Services: Move Parsoid config into ops/puppet - https://phabricator.wikimedia.org/T92636#1138382 (GWicke) @faidon, deploy repos are ultimately hardcoded config files. Those don't scale very well to many nodes and cluster setups. For example, we often break configs in beta labs beca...
[20:39:15] Puppet, Multimedia, Release-Engineering, Scrum-of-Scrums, and 2 others: Create basic puppet role for Sentry - https://phabricator.wikimedia.org/T84956#1138417 (hashar) The use of dh-virtualenv is for Zuul T48552 . It is not validated by ops yet though. Note that whenever you update a python module p...
[20:57:37] PROBLEM - puppet last run on amssq33 is CRITICAL: CRITICAL: puppet fail
[21:00:27] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.01
[21:16:17] RECOVERY - puppet last run on amssq33 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[21:50:40] git review is having errors on my machine because of a 404 on the commit msg hook
[21:53:05] Negative24, what does `git remote -v` say?
[21:53:45] Krenair: the https remotes for ops puppet
[21:54:08] okay, can you try ssh://negative24@gerrit.wikimedia.org:29418/operations/puppet.git instead?
[21:54:24] `git remote set-url origin ...`
[21:54:39] Krenair: I remember switching to https instead of ssh because of an error but I forget what...
[21:55:42] Krenair: Ah that worked. I wonder why git clone wasn't working with ssh before...
[21:59:13] Negative24, https://phabricator.wikimedia.org/T93489
[21:59:18] this has been happening to me for a while
[21:59:46] Subbed. Thanks for the heads up
[21:59:57] Whoops
[22:00:00] :P
[22:00:15] Clicking without looking :)
[22:02:12] (PS1) Negative24: Remove dummy redirect for fab-01 [puppet] - https://gerrit.wikimedia.org/r/198535
[22:05:37] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.00666666666667
[22:06:19] (PS2) Alex Monk: Remove dummy redirect for fab-01 [puppet] - https://gerrit.wikimedia.org/r/198535 (owner: Negative24)
[22:08:19] Krenair: The Jenkins warnings about puppet lint puppet urls not starting with modules don't mean much, right?
[22:10:18] I have no idea, I don't really contribute to puppet.git
[22:10:28] I'm a MediaWiki developer :)
[22:10:39] I do touch mediawiki-config.git regularly
[22:11:51] Ok. I looked it up on the files it referenced and it's really just the private puppet repo references.
Also, vim needs spellcheck :)
[22:17:36] PROBLEM - HTTP 5xx req/min on graphite2001 is CRITICAL: CRITICAL: 7.14% of data above the critical threshold [500.0]
[22:17:37] PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL: CRITICAL: 7.14% of data above the critical threshold [500.0]
[22:29:09] RECOVERY - HTTP 5xx req/min on graphite2001 is OK: OK: Less than 1.00% above the threshold [250.0]
[22:29:09] RECOVERY - HTTP 5xx req/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[22:31:39] j/join ##idf
[22:58:27] RECOVERY - Graphite Carbon on graphite2001 is OK: OK: All defined Carbon jobs are runnning.
[23:02:56] PROBLEM - Graphite Carbon on graphite2001 is CRITICAL: CRITICAL: Not all configured Carbon instances are running.
[23:10:37] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.0133333333333