[00:00:45] <icinga-wm>	 RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge.
[00:01:38] <grrrit-wm>	 (03PS3) 10Dzahn: contint: disable unattended upgrade [puppet] - 10https://gerrit.wikimedia.org/r/210391 (https://phabricator.wikimedia.org/T98876) (owner: 10Hashar)
[00:03:03] <grrrit-wm>	 (03CR) 10Dzahn: [C: 032] contint: disable unattended upgrade [puppet] - 10https://gerrit.wikimedia.org/r/210391 (https://phabricator.wikimedia.org/T98876) (owner: 10Hashar)
[00:04:13] <logmsgbot>	 !log ori Synchronized php-1.26wmf6/includes: 9bf0236c20, 2d3c9233ed (duration: 00m 17s)
[00:04:21] <morebots>	 Logged the message, Master
[00:13:28] <wikibugs>	 6operations, 10ops-eqiad: ssh connection to some management servers fails, a hard reset may be needed - https://phabricator.wikimedia.org/T99805#1304212 (10Dzahn) p:5Triage>3High
[00:13:51] <wikibugs>	 6operations, 10Deployment-Systems: errors reported by "eventual_consistency_deployment_server_init" on new deploy server - https://phabricator.wikimedia.org/T99928#1304213 (10Dzahn) p:5Triage>3Normal
[00:14:43] <mutante>	 oh, the favicon of phab changed, right
[00:14:58] <mutante>	 i guess it was the restart from the other config change 
[00:17:37] <wikibugs>	 6operations: [7a44ef6d] 2015-05-22 11:26:53: Fatal exception of type MWException - https://phabricator.wikimedia.org/T100012#1304215 (10Dzahn) p:5Triage>3High
[00:22:54] <cwdent>	 trying to get permission to +2 something in gerrit.  wondering if anyone here can point me in the right direction?
[00:24:10] <ori>	 cwdent: for which repository?
[00:24:24] <ori>	 also, see http://www.mediawiki.org/wiki/Gerrit/%2B2
[00:25:38] <cwdent>	 i am in fundraising tech, working with mediawiki-vagrant now
[00:26:43] <cwdent>	 and yes i will be careful, i don't even want to +2, but team said i should :)
[00:30:08] <mutante>	 cwdent: did you just start working for WMF?
[00:30:25] <cwdent>	 yep, 2 weeks ago
[00:30:28] <mutante>	 cwdent: would you know of some kind of onboarding ticket?
[00:30:38] <mutante>	 you are probably just lacking the WMF LDAP group
[00:31:01] <jgage>	 that problem happens with disturbing regularity
[00:31:07] <mutante>	 yes, like every time :)
[00:31:13] <jgage>	 belated welcome, cwdent :)
[00:31:15] <cwdent>	 hrm, i'm not sure.  i have LDAP access other places...
[00:31:17] <mutante>	 yea, welcome
[00:31:26] <mutante>	 and sorry about that but we didnt get a notification
[00:31:27] <cwdent>	 thanks!  i'm honored to be here
[00:32:15] <cwdent>	 at least, i think i do...wikitech is the ldap account?
[00:32:28] <jgage>	 yeah
[00:33:06] <cwdent>	 yeah, those creds work various places.
[00:33:14] * cwdent still figuring out logins
[00:33:37] <mutante>	 !log adding cwdent to WMF LDAP group per https://www.mediawiki.org/wiki/User:CDentinger_%28WMF%29
[00:33:43] <morebots>	 Logged the message, Master
[00:34:48] <mutante>	 cwdent: try on a relevant gerrit link and see if the +2 is not greyed out anymore, without hitting submit
[00:36:59] <cwdent>	 mutante: i actually don't see a +2 at all...i'm looking at the radio buttons after i click review, right?
[00:37:30] <mutante>	 cwdent: yes, try logging out and back in 
[00:39:33] <cwdent>	 mutante: ah ok, i see the other buttons on a ticket for the fundraising-dash repo (which i've pushed to if that's relevant) but not mediawiki-vagrant
[00:40:01] <jgage>	 cwdent: mutante's gotta run, i'll take a look
[00:40:34] <cwdent>	 ok, sorry to be bugging you late, this can totally wait till next week
[00:40:35] <mutante>	 so the WMF group is like the default and gives you a bunch of repos but not everything
[00:40:47] <mutante>	 thanks, handing over to jgage
[00:41:15] <cwdent>	 thanks mutante, have a good weekend
[00:42:30] <jgage>	 hm there's an ldap group called vagrant, maybe that's what we need
[00:42:33] * jgage pokes around
[00:43:06] <jgage>	 cwdent: what is the url to the gerrit change you're trying to +2?
[00:44:59] <jgage>	 we had another coloradoan in the office recently, but i didn't catch his name
[00:45:05] <jgage>	 he was from my hometown, co springs
[00:45:19] <cwdent>	 https://gerrit.wikimedia.org/r/#/c/212820/1,publish
[00:45:27] <jgage>	 k
[00:45:42] <cwdent>	 nice yeah there's a couple of us.  i live 3 blocks from thcipriani|afk
[00:45:50] <jgage>	 cool
[00:46:17] <jgage>	 i flew through denver recently. those views of the rockies always make me want to stay.
[00:47:02] <cwdent>	 yeah i love the front range.  i lived in the springs for about a year
[00:50:53] <jgage>	 hmm i wonder how to determine the mapping between projects and ldap groups
[00:51:35] <jgage>	 oh rad you worked for sparkfun
[00:51:49] <jgage>	 i wish i was a competent hardware hacker
[00:51:58] <jgage>	 i know just enough to enjoy browsing their catalog
[00:52:27] <cwdent>	 ha yeah i was there for 7 years
[00:52:57] <cwdent>	 though i'm only borderline competent as a hardware hacker
[01:17:08] <jgage>	 cwdent: if you're still around, try again?
[01:17:24] <jgage>	 (reloading the gerrit url should be sufficient)
[01:18:02] <cwdent>	 that did it!  thanks a ton jgage
[01:18:30] <jgage>	 yay!
[01:18:35] <jgage>	 that took some sleuthing :)
[01:18:37] * jgage takes notes
[02:20:00] <logmsgbot>	 !log l10nupdate Synchronized php-1.26wmf6/cache/l10n: (no message) (duration: 06m 02s)
[02:20:22] <morebots>	 Logged the message, Master
[02:24:40] <logmsgbot>	 !log LocalisationUpdate completed (1.26wmf6) at 2015-05-23 02:23:36+00:00
[02:24:47] <morebots>	 Logged the message, Master
[02:24:54] <icinga-wm>	 PROBLEM - are wikitech and wt-static in sync on silver is CRITICAL: wikitech-static CRIT - wikitech and wikitech-static out of sync (94787s 90000s)
[02:41:25] <logmsgbot>	 !log l10nupdate Synchronized php-1.26wmf7/cache/l10n: (no message) (duration: 05m 56s)
[02:41:38] <morebots>	 Logged the message, Master
[02:45:51] <logmsgbot>	 !log LocalisationUpdate completed (1.26wmf7) at 2015-05-23 02:44:48+00:00
[02:45:57] <morebots>	 Logged the message, Master
[04:13:14] <icinga-wm>	 PROBLEM - puppet last run on mw2042 is CRITICAL puppet fail
[04:27:15] <icinga-wm>	 PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL 21.43% of data above the critical threshold [500.0]
[04:31:45] <icinga-wm>	 RECOVERY - puppet last run on mw2042 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:33:53] <wikibugs>	 6operations, 7Wikimedia-log-errors: [7a44ef6d] 2015-05-22 11:26:53: Fatal exception of type MWException - https://phabricator.wikimedia.org/T100012#1304248 (10Glaisher)
[04:42:34] <icinga-wm>	 RECOVERY - HTTP 5xx req/min on graphite1001 is OK Less than 1.00% above the threshold [250.0]
[04:54:23] <wikibugs>	 6operations: Google Webmaster Tools - 1000 domain limit - https://phabricator.wikimedia.org/T99132#1304250 (10dr0ptp4kt) Update: I've received some guidance. May be a bit until I can take action on it. In a nutshell, though:  * Not possible to increase the sites limit * We could consolidate the multitude of dist...
[05:06:46] <icinga-wm>	 RECOVERY - are wikitech and wt-static in sync on silver is OK: wikitech-static OK - wikitech and wikitech-static in sync (7934 90000s)
[05:13:47] <logmsgbot>	 !log LocalisationUpdate ResourceLoader cache refresh completed at Sat May 23 05:12:44 UTC 2015 (duration 12m 43s)
[05:13:53] <morebots>	 Logged the message, Master
[05:19:44] <icinga-wm>	 PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL 6.67% of data above the critical threshold [500.0]
[05:29:45] <icinga-wm>	 RECOVERY - HTTP 5xx req/min on graphite1001 is OK Less than 1.00% above the threshold [250.0]
[06:32:55] <icinga-wm>	 PROBLEM - puppet last run on db1040 is CRITICAL Puppet has 1 failures
[06:33:04] <icinga-wm>	 PROBLEM - puppet last run on subra is CRITICAL Puppet has 1 failures
[06:33:24] <icinga-wm>	 PROBLEM - puppet last run on cp4003 is CRITICAL Puppet has 1 failures
[06:33:44] <icinga-wm>	 PROBLEM - puppet last run on mw2016 is CRITICAL Puppet has 1 failures
[06:34:16] <icinga-wm>	 PROBLEM - puppet last run on mw2083 is CRITICAL Puppet has 1 failures
[06:34:16] <icinga-wm>	 PROBLEM - puppet last run on mw2123 is CRITICAL Puppet has 1 failures
[06:34:24] <icinga-wm>	 PROBLEM - puppet last run on mw1092 is CRITICAL Puppet has 1 failures
[06:34:44] <icinga-wm>	 PROBLEM - puppet last run on mw2079 is CRITICAL Puppet has 2 failures
[06:34:45] <icinga-wm>	 PROBLEM - puppet last run on mw2022 is CRITICAL Puppet has 1 failures
[06:35:05] <icinga-wm>	 PROBLEM - puppet last run on mw1052 is CRITICAL Puppet has 1 failures
[06:46:14] <icinga-wm>	 RECOVERY - puppet last run on mw2083 is OK Puppet is currently enabled, last run 48 seconds ago with 0 failures
[06:46:15] <icinga-wm>	 RECOVERY - puppet last run on mw1092 is OK Puppet is currently enabled, last run 22 seconds ago with 0 failures
[06:46:25] <icinga-wm>	 RECOVERY - puppet last run on db1040 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:46:36] <icinga-wm>	 RECOVERY - puppet last run on subra is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:46:55] <icinga-wm>	 RECOVERY - puppet last run on cp4003 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:04] <icinga-wm>	 RECOVERY - puppet last run on mw1052 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:15] <icinga-wm>	 RECOVERY - puppet last run on mw2016 is OK Puppet is currently enabled, last run 49 seconds ago with 0 failures
[06:47:54] <icinga-wm>	 RECOVERY - puppet last run on mw2123 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:48:15] <icinga-wm>	 RECOVERY - puppet last run on mw2079 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:48:16] <icinga-wm>	 RECOVERY - puppet last run on mw2022 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:22:06] <grrrit-wm>	 (03CR) 10Filippo Giunchedi: "LGTM, the diamond changes should go in a separate code review though" [puppet] - 10https://gerrit.wikimedia.org/r/212041 (https://phabricator.wikimedia.org/T83580) (owner: 10Ottomata)
[07:25:32] <grrrit-wm>	 (03CR) 10Filippo Giunchedi: initial debian packaging (031 comment) [debs/python-etcd] - 10https://gerrit.wikimedia.org/r/212528 (https://phabricator.wikimedia.org/T99771) (owner: 10Filippo Giunchedi)
[07:34:25] <icinga-wm>	 PROBLEM - Persistent high iowait on labstore1001 is CRITICAL 87.50% of data above the critical threshold [35.0]
[07:41:25] <icinga-wm>	 RECOVERY - Persistent high iowait on labstore1001 is OK Less than 50.00% above the threshold [25.0]
[07:49:45] <icinga-wm>	 PROBLEM - Persistent high iowait on labstore1001 is CRITICAL 55.56% of data above the critical threshold [35.0]
[08:01:36] <icinga-wm>	 RECOVERY - Persistent high iowait on labstore1001 is OK Less than 50.00% above the threshold [25.0]
[08:06:44] <icinga-wm>	 PROBLEM - Persistent high iowait on labstore1001 is CRITICAL 50.00% of data above the critical threshold [35.0]
[08:11:44] <icinga-wm>	 PROBLEM - High load average on labstore1001 is CRITICAL 50.00% of data above the critical threshold [24.0]
[08:13:24] <icinga-wm>	 RECOVERY - Persistent high iowait on labstore1001 is OK Less than 50.00% above the threshold [25.0]
[08:18:52] <YuviPanda|boo>	 hmm
[08:20:54] <icinga-wm>	 PROBLEM - puppet last run on sca1001 is CRITICAL Puppet has 15 failures
[08:25:15] <icinga-wm>	 RECOVERY - High load average on labstore1001 is OK Less than 50.00% above the threshold [16.0]
[08:29:54] <yuvipanda>	 legoktm: we should get UrlShortener deployed somewhere
[08:31:04] <legoktm>	 yuvipanda: yesssssss
[08:31:54] <icinga-wm>	 PROBLEM - Persistent high iowait on labstore1001 is CRITICAL 55.56% of data above the critical threshold [35.0]
[08:33:35] <icinga-wm>	 PROBLEM - High load average on labstore1001 is CRITICAL 50.00% of data above the critical threshold [24.0]
[08:35:35] <yuvipanda>	 legoktm: should we rewrite it to be a noejs service first, tho?
[08:35:44] * legoktm slaps yuvipanda
[08:35:54] <icinga-wm>	 RECOVERY - puppet last run on sca1001 is OK Puppet is currently enabled, last run 20 seconds ago with 0 failures
[08:36:55] <icinga-wm>	 RECOVERY - High load average on labstore1001 is OK Less than 50.00% above the threshold [16.0]
[08:43:54] <icinga-wm>	 PROBLEM - High load average on labstore1001 is CRITICAL 60.00% of data above the critical threshold [24.0]
[08:45:14] <yuvipanda>	 _joe_: https://get.docker.com/
[08:46:26] <legoktm>	 <yuvipanda> https://get.tools.wmflabs.org/
[08:46:37] * yuvipanda swats legoktm
[08:55:35] <icinga-wm>	 RECOVERY - Persistent high iowait on labstore1001 is OK Less than 50.00% above the threshold [25.0]
[08:55:35] <icinga-wm>	 RECOVERY - High load average on labstore1001 is OK Less than 50.00% above the threshold [16.0]
[09:00:44] <icinga-wm>	 PROBLEM - Persistent high iowait on labstore1001 is CRITICAL 62.50% of data above the critical threshold [35.0]
[09:15:46] <icinga-wm>	 RECOVERY - Persistent high iowait on labstore1001 is OK Less than 50.00% above the threshold [25.0]
[09:16:10] <wikibugs>	 6operations: Upgrade sodium to jessie - https://phabricator.wikimedia.org/T82698#1304472 (10fgiunchedi) looks like ~200G used ATM  ``` /dev/mapper/sodium-mailman                       280G  102G  179G  37% /var/lib/mailman ```  so from the spares list, this "Dell PowerEdge R420, single Intel Xeon E5-2450 v2 2.50...
[09:21:34] <grrrit-wm>	 (03PS1) 10Yuvipanda: mesos: Install docker on all slaves [puppet] - 10https://gerrit.wikimedia.org/r/212861 
[09:22:18] <grrrit-wm>	 (03PS2) 10Yuvipanda: mesos: Install docker on all slaves [puppet] - 10https://gerrit.wikimedia.org/r/212861 (https://phabricator.wikimedia.org/T99923) 
[09:22:39] <grrrit-wm>	 (03CR) 10Yuvipanda: [C: 032 V: 032] mesos: Install docker on all slaves [puppet] - 10https://gerrit.wikimedia.org/r/212861 (https://phabricator.wikimedia.org/T99923) (owner: 10Yuvipanda)
[09:24:03] <wikibugs>	 6operations: Upgrade sodium to jessie - https://phabricator.wikimedia.org/T82698#1304486 (10faidon) a:5faidon>3None Why not a (ganeti) VM? In any case, this ticket lacks an owner/assignee. Finding a machine for that is the easy part :)
[09:28:38] <wikibugs>	 6operations, 10ops-eqiad: analytics1036 can't talk cross row? - https://phabricator.wikimedia.org/T99845#1304490 (10faidon) I've rebooted multiple times, also checked BIOS settings, also rebooted the iDRAC a few times just in case it was something related to NIC sharing. I've upgraded all of the firmware on th...
[09:29:18] <wikibugs>	 6operations: Upgrade sodium to jessie - https://phabricator.wikimedia.org/T82698#1304491 (10fgiunchedi) does it make a difference that it needs a public ip? if it doesn't a VM would be a good fit indeed. very true re: owner, cc @mark
[09:38:48] <grrrit-wm>	 (03PS1) 10Yuvipanda: mesos: Use require_package to get docker [puppet] - 10https://gerrit.wikimedia.org/r/212865 
[09:38:50] <grrrit-wm>	 (03PS1) 10Yuvipanda: mesos: Enable docker as containerization mechanism for mesos [puppet] - 10https://gerrit.wikimedia.org/r/212866 (https://phabricator.wikimedia.org/T99923) 
[09:39:14] <grrrit-wm>	 (03CR) 10Yuvipanda: [C: 032 V: 032] mesos: Use require_package to get docker [puppet] - 10https://gerrit.wikimedia.org/r/212865 (owner: 10Yuvipanda)
[09:39:43] <grrrit-wm>	 (03CR) 10Yuvipanda: [C: 032 V: 032] mesos: Enable docker as containerization mechanism for mesos [puppet] - 10https://gerrit.wikimedia.org/r/212866 (https://phabricator.wikimedia.org/T99923) (owner: 10Yuvipanda)
[09:44:54] <grrrit-wm>	 (03PS2) 10Ori.livneh: Re-enable xhprof profiling [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212808 (https://phabricator.wikimedia.org/T66301) 
[09:44:56] <grrrit-wm>	 (03PS1) 10Ori.livneh: Change StatsD port to another value temporarily [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212868 
[09:44:58] <grrrit-wm>	 (03PS1) 10Ori.livneh: Revert "Change StatsD port to another value temporarily" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212869 
[09:45:04] <grrrit-wm>	 (03PS1) 10Yuvipanda: mesos: Increase registry execution timeout to support docker [puppet] - 10https://gerrit.wikimedia.org/r/212870 
[09:45:27] <ori>	 ^ godog
[09:45:36] <grrrit-wm>	 (03CR) 10Yuvipanda: [C: 032 V: 032] mesos: Increase registry execution timeout to support docker [puppet] - 10https://gerrit.wikimedia.org/r/212870 (owner: 10Yuvipanda)
[09:45:53] <grrrit-wm>	 (03CR) 10Filippo Giunchedi: [C: 031] Change StatsD port to another value temporarily [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212868 (owner: 10Ori.livneh)
[09:49:54] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032] Change StatsD port to another value temporarily [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212868 (owner: 10Ori.livneh)
[09:50:00] <grrrit-wm>	 (03Merged) 10jenkins-bot: Change StatsD port to another value temporarily [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212868 (owner: 10Ori.livneh)
[09:51:15] <grrrit-wm>	 (03PS1) 10Yuvipanda: mesos: Puppetize docker config file [puppet] - 10https://gerrit.wikimedia.org/r/212871 
[09:52:07] <logmsgbot>	 !log ori Synchronized wmf-config/CommonSettings.php: I311c989e9: Change StatsD port to another value temporarily (duration: 00m 14s)
[09:52:16] <morebots>	 Logged the message, Master
[09:54:38] <grrrit-wm>	 (03CR) 10Filippo Giunchedi: [C: 031] Re-enable xhprof profiling [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212808 (https://phabricator.wikimedia.org/T66301) (owner: 10Ori.livneh)
[09:56:19] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032] Re-enable xhprof profiling [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212808 (https://phabricator.wikimedia.org/T66301) (owner: 10Ori.livneh)
[09:56:25] <grrrit-wm>	 (03Merged) 10jenkins-bot: Re-enable xhprof profiling [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212808 (https://phabricator.wikimedia.org/T66301) (owner: 10Ori.livneh)
[09:56:35] <icinga-wm>	 PROBLEM - Translation cache space on mw1203 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 92%
[09:57:50] <logmsgbot>	 !log ori Synchronized wmf-config/StartProfiler.php: Ia7549d45: Re-enable xhprof profiling (duration: 00m 14s)
[09:57:56] <morebots>	 Logged the message, Master
[09:58:04] <wikibugs>	 6operations: Upgrade sodium to jessie - https://phabricator.wikimedia.org/T82698#1304560 (10JohnLewis) >>! In T82698#1304491, @fgiunchedi wrote: > does it make a difference that it needs a public ip? if it doesn't a VM would be a good fit indeed. very true re: owner, cc @mark  Mailman handles it's own maiil proc...
[09:59:36] <grrrit-wm>	 (03PS2) 10Yuvipanda: mesos: Puppetize docker config file [puppet] - 10https://gerrit.wikimedia.org/r/212871 
[10:01:16] <godog>	 JohnFLewis: are you at the hackathon?
[10:01:24] <icinga-wm>	 PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: There is one unmerged change in puppet (dir /var/lib/git/operations/puppet).
[10:01:52] <grrrit-wm>	 (03PS1) 10Ori.livneh: *Correctly* set port of $wgStatsdServer [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212873 
[10:01:56] <yuvipanda>	 godog: he isn't
[10:02:30] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032] *Correctly* set port of $wgStatsdServer [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212873 (owner: 10Ori.livneh)
[10:02:36] <grrrit-wm>	 (03Merged) 10jenkins-bot: *Correctly* set port of $wgStatsdServer [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212873 (owner: 10Ori.livneh)
[10:03:11] <logmsgbot>	 !log ori Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 13s)
[10:03:18] <morebots>	 Logged the message, Master
[10:03:34] <icinga-wm>	 PROBLEM - Translation cache space on mw1244 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 92%
[10:03:35] <icinga-wm>	 PROBLEM - Translation cache space on mw1248 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 92%
[10:03:44] <icinga-wm>	 PROBLEM - Translation cache space on mw1211 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 92%
[10:03:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1166 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 92%
[10:03:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1171 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 91%
[10:03:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1243 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 95%
[10:03:46] <icinga-wm>	 PROBLEM - Translation cache space on mw1245 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 96%
[10:03:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1231 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:03:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1163 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 92%
[10:03:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1082 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 91%
[10:03:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1072 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 91%
[10:03:56] <icinga-wm>	 PROBLEM - Translation cache space on mw1179 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 93%
[10:03:56] <icinga-wm>	 PROBLEM - Translation cache space on mw1065 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 91%
[10:03:56] <icinga-wm>	 PROBLEM - Translation cache space on mw1097 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 92%
[10:03:56] <icinga-wm>	 PROBLEM - Translation cache space on mw1089 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 91%
[10:04:03] <greg-g>	 ek
[10:04:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1172 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:04:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1110 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 91%
[10:04:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1230 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 97%
[10:04:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1191 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 96%
[10:04:05] <icinga-wm>	 PROBLEM - Translation cache space on mw1036 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 92%
[10:04:05] <icinga-wm>	 PROBLEM - Translation cache space on mw1045 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 93%
[10:04:05] <icinga-wm>	 PROBLEM - Translation cache space on mw1195 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 95%
[10:04:05] <jgage>	 what
[10:04:05] <icinga-wm>	 PROBLEM - Translation cache space on mw1232 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 98%
[10:04:06] <icinga-wm>	 PROBLEM - Translation cache space on mw1204 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 95%
[10:04:07] <icinga-wm>	 PROBLEM - Translation cache space on mw1147 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 91%
[10:04:07] <icinga-wm>	 PROBLEM - Translation cache space on mw1133 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 91%
[10:04:07] <icinga-wm>	 PROBLEM - Translation cache space on mw1069 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 93%
[10:04:09] <yuvipanda>	 needs a rolling restart
[10:04:10] <yuvipanda>	 i think
[10:04:18] <icinga-wm>	 PROBLEM - Translation cache space on mw1068 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 95%
[10:04:19] <icinga-wm>	 PROBLEM - Translation cache space on mw1126 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:04:19] <icinga-wm>	 PROBLEM - Translation cache space on mw1019 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:04:24] <icinga-wm>	 PROBLEM - Translation cache space on mw1125 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:04:24] <icinga-wm>	 PROBLEM - Translation cache space on mw1049 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 95%
[10:04:24] <icinga-wm>	 PROBLEM - Translation cache space on mw1134 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:04:24] <icinga-wm>	 PROBLEM - Translation cache space on mw1116 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:04:24] <icinga-wm>	 PROBLEM - Translation cache space on mw1122 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 95%
[10:04:24] <icinga-wm>	 PROBLEM - Translation cache space on mw1088 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 96%
[10:04:25] <icinga-wm>	 PROBLEM - Translation cache space on mw1189 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 97%
[10:04:25] <icinga-wm>	 PROBLEM - Translation cache space on mw1073 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:04:26] <icinga-wm>	 PROBLEM - Translation cache space on mw1067 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:04:26] <icinga-wm>	 PROBLEM - Translation cache space on mw1094 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 95%
[10:04:27] <icinga-wm>	 PROBLEM - Translation cache space on mw1135 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:04:27] <icinga-wm>	 PROBLEM - Translation cache space on mw1033 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 95%
[10:04:40] <greg-g>	 _joe_: ^
[10:04:44] <icinga-wm>	 PROBLEM - Translation cache space on mw1058 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 98%
[10:04:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1249 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:04:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1075 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:04:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1041 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:04:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1053 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 98%
[10:04:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1253 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:04:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1106 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 97%
[10:04:46] <icinga-wm>	 PROBLEM - Translation cache space on mw1029 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:04:54] <icinga-wm>	 PROBLEM - Translation cache space on mw1127 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 98%
[10:04:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1084 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:04:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1085 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:04:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1043 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:04:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1061 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:04:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1022 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:04:59] <ori>	 greg-g: it's OK
[10:05:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1144 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:05:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1115 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:05:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1095 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 99%
[10:05:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1164 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1174 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:04] <icinga-wm>	 PROBLEM - Translation cache space on mw1042 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:05] <icinga-wm>	 PROBLEM - Translation cache space on mw1039 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:05] <icinga-wm>	 PROBLEM - Translation cache space on mw1112 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:06] <icinga-wm>	 RECOVERY - Translation cache space on mw1203 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:14] <icinga-wm>	 PROBLEM - Translation cache space on mw1079 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:14] <icinga-wm>	 PROBLEM - Translation cache space on mw1109 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:15] <icinga-wm>	 PROBLEM - Translation cache space on mw1077 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:15] <icinga-wm>	 PROBLEM - Translation cache space on mw1070 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:15] <icinga-wm>	 PROBLEM - Translation cache space on mw1052 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:15] <icinga-wm>	 PROBLEM - Translation cache space on mw1025 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:15] <icinga-wm>	 RECOVERY - Translation cache space on mw1244 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:15] <icinga-wm>	 PROBLEM - Translation cache space on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:16] <icinga-wm>	 PROBLEM - Translation cache space on mw1020 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:16] <icinga-wm>	 PROBLEM - Translation cache space on mw1055 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:17] <icinga-wm>	 PROBLEM - Translation cache space on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:17] <icinga-wm>	 PROBLEM - Translation cache space on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:18] <greg-g>	 ori: :)
[10:05:34] <icinga-wm>	 RECOVERY - Translation cache space on mw1245 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:34] <icinga-wm>	 PROBLEM - Translation cache space on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:34] <icinga-wm>	 RECOVERY - Translation cache space on mw1231 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:35] <icinga-wm>	 RECOVERY - Translation cache space on mw1163 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:35] <icinga-wm>	 PROBLEM - Translation cache space on mw1028 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:35] <icinga-wm>	 PROBLEM - Translation cache space on mw1107 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:35] <icinga-wm>	 PROBLEM - Translation cache space on mw1104 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:35] <icinga-wm>	 RECOVERY - Translation cache space on mw1082 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:36] <icinga-wm>	 RECOVERY - Translation cache space on mw1179 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:36] <icinga-wm>	 RECOVERY - Translation cache space on mw1065 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:37] <icinga-wm>	 RECOVERY - Translation cache space on mw1097 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:44] <icinga-wm>	 RECOVERY - Translation cache space on mw1089 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1172 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1230 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1110 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1191 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1195 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1045 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:46] <icinga-wm>	 RECOVERY - Translation cache space on mw1232 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:46] <icinga-wm>	 RECOVERY - Translation cache space on mw1036 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:47] <icinga-wm>	 RECOVERY - Translation cache space on mw1204 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:47] <icinga-wm>	 RECOVERY - Translation cache space on mw1147 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:58] <icinga-wm>	 RECOVERY - Translation cache space on mw1071 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:59] <icinga-wm>	 RECOVERY - Translation cache space on mw1194 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:05:59] <icinga-wm>	 RECOVERY - Translation cache space on mw1068 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:00] <icinga-wm>	 RECOVERY - Translation cache space on mw1126 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:00] <icinga-wm>	 RECOVERY - Translation cache space on mw1019 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1233 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:04] <jgage>	 what caused that?
[10:06:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1125 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1049 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1134 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1122 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1116 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1088 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:05] <icinga-wm>	 RECOVERY - Translation cache space on mw1189 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:06] <icinga-wm>	 RECOVERY - Translation cache space on mw1161 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:06] <icinga-wm>	 RECOVERY - Translation cache space on mw1073 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:07] <icinga-wm>	 RECOVERY - Translation cache space on mw1135 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:08] <icinga-wm>	 RECOVERY - Translation cache space on mw1033 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:25] <icinga-wm>	 RECOVERY - Translation cache space on mw1058 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:25] <icinga-wm>	 RECOVERY - Translation cache space on mw1075 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:25] <icinga-wm>	 RECOVERY - Translation cache space on mw1041 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:25] <icinga-wm>	 RECOVERY - Translation cache space on mw1053 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:34] <icinga-wm>	 RECOVERY - Translation cache space on mw1061 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:34] <icinga-wm>	 RECOVERY - Translation cache space on mw1106 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:34] <ori>	 jgage: any sync of PHP code is liable to cause the TC cache size to be exceeded
[10:06:34] <icinga-wm>	 RECOVERY - Translation cache space on mw1029 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:34] <icinga-wm>	 RECOVERY - Translation cache space on mw1164 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:35] <icinga-wm>	 RECOVERY - Translation cache space on mw1174 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:35] <icinga-wm>	 RECOVERY - Translation cache space on mw1127 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:35] <icinga-wm>	 RECOVERY - Translation cache space on mw1042 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:35] <icinga-wm>	 RECOVERY - Translation cache space on mw1039 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:36] <icinga-wm>	 RECOVERY - Translation cache space on mw1084 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:36] <icinga-wm>	 RECOVERY - Translation cache space on mw1085 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:44] <icinga-wm>	 RECOVERY - Translation cache space on mw1043 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:44] <icinga-wm>	 RECOVERY - Translation cache space on mw1112 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:44] <icinga-wm>	 RECOVERY - Translation cache space on mw1022 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:44] <icinga-wm>	 RECOVERY - Translation cache space on mw1144 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:44] <icinga-wm>	 RECOVERY - Translation cache space on mw1079 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1115 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1095 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1109 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:46] <icinga-wm>	 RECOVERY - Translation cache space on mw1077 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:48] <icinga-wm>	 RECOVERY - Translation cache space on mw1025 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:48] <icinga-wm>	 RECOVERY - Translation cache space on mw1070 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:48] <icinga-wm>	 RECOVERY - Translation cache space on mw1052 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:06:49] <ori>	 _joe_: we should kill the alert, it just frightens people
[10:06:54] <ori>	 and it's not actionable
[10:06:57] <jgage>	 ori: gotcha
[10:07:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1098 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1024 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:05] <icinga-wm>	 RECOVERY - Translation cache space on mw1032 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:05] <icinga-wm>	 RECOVERY - Translation cache space on mw1114 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:05] <icinga-wm>	 RECOVERY - Translation cache space on mw1028 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:05] <icinga-wm>	 RECOVERY - Translation cache space on mw1104 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:05] <icinga-wm>	 RECOVERY - Translation cache space on mw1107 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:15] <icinga-wm>	 PROBLEM - Translation cache space on mw1012 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:07:15] <icinga-wm>	 PROBLEM - Translation cache space on mw1003 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:07:16] <icinga-wm>	 PROBLEM - Translation cache space on mw1014 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:07:16] <icinga-wm>	 PROBLEM - Translation cache space on mw1008 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:07:24] <icinga-wm>	 RECOVERY - Translation cache space on mw1072 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:25] <icinga-wm>	 PROBLEM - Translation cache space on mw1005 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:07:34] <icinga-wm>	 PROBLEM - Translation cache space on mw1016 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:07:34] <icinga-wm>	 PROBLEM - Translation cache space on mw1009 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:07:34] <icinga-wm>	 RECOVERY - Translation cache space on mw1133 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:35] <icinga-wm>	 PROBLEM - Translation cache space on mw1010 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 93%
[10:07:35] <icinga-wm>	 RECOVERY - Translation cache space on mw1130 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:07:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1006 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 93%
[10:07:54] <icinga-wm>	 PROBLEM - Translation cache space on mw1001 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:07:55] <jgage>	 how do we recognize real TC space problems vs harmless?
[10:07:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1004 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:07:55] <icinga-wm>	 PROBLEM - Translation cache space on mw1011 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:08:25] <icinga-wm>	 PROBLEM - Translation cache space on mw1002 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:08:25] <icinga-wm>	 PROBLEM - Translation cache space on mw1013 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 93%
[10:08:30] <ori>	 jgage: they're all equally real / equally harmless. real, because they shouldn't happen. harmless (relatively) because the process restarts when they happen and that clears the TC cache
[10:08:34] <icinga-wm>	 PROBLEM - Translation cache space on mw1007 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:08:44] <icinga-wm>	 PROBLEM - Translation cache space on mw1015 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 94%
[10:08:52] <jgage>	 ori: ok, cool
[10:08:55] <greg-g>	 ori: the check was added after an outage, though, right?
[10:09:48] <jgage>	 so perhaps a better alert might be "TC space exhausted, HHVM restarted"
[10:10:06] <greg-g>	 well, if it could force the restart
[10:10:17] <icinga-wm>	 PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL 23.08% of data above the critical threshold [500.0]
[10:16:17] <grrrit-wm>	 (03PS1) 10Ori.livneh: Exclude xhprof.run_init from being reported [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212874 
[10:20:14] <grrrit-wm>	 (03PS2) 10Ori.livneh: Revert "Change StatsD port to another value temporarily" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212869 
[10:20:16] <grrrit-wm>	 (03PS2) 10Ori.livneh: Exclude xhprof.run_init from being reported [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212874 
[10:20:37] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032] Exclude xhprof.run_init from being reported [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212874 (owner: 10Ori.livneh)
[10:20:43] <grrrit-wm>	 (03Merged) 10jenkins-bot: Exclude xhprof.run_init from being reported [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212874 (owner: 10Ori.livneh)
[10:21:28] <logmsgbot>	 !log ori Synchronized wmf-config/StartProfiler.php: Exclude xhprof.run_init from being reported (duration: 00m 13s)
[10:21:33] <morebots>	 Logged the message, Master
[10:22:15] <icinga-wm>	 RECOVERY - HTTP 5xx req/min on graphite1001 is OK Less than 1.00% above the threshold [250.0]
[10:22:26] <ori>	 !log Metrics from MediaWiki to graphite are temporarily suspended while xhprof profiling work is ongoing.
[10:22:31] <morebots>	 Logged the message, Master
[10:22:45] <icinga-wm>	 PROBLEM - Translation cache space on mw1017 is CRITICAL: HHVM_TC_SPACE CRITICAL code.main: 91%
[10:24:24] <icinga-wm>	 RECOVERY - Translation cache space on mw1017 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:25:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1012 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:25:45] <icinga-wm>	 RECOVERY - Translation cache space on mw1003 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:25:54] <icinga-wm>	 RECOVERY - Translation cache space on mw1014 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:25:54] <icinga-wm>	 RECOVERY - Translation cache space on mw1008 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:25:55] <icinga-wm>	 RECOVERY - Translation cache space on mw1005 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1016 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1009 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:04] <icinga-wm>	 RECOVERY - Translation cache space on mw1038 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:05] <icinga-wm>	 RECOVERY - Translation cache space on mw1010 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:15] <icinga-wm>	 RECOVERY - Translation cache space on mw1006 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:25] <icinga-wm>	 RECOVERY - Translation cache space on mw1001 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:26] <icinga-wm>	 RECOVERY - Translation cache space on mw1004 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:26] <icinga-wm>	 RECOVERY - Translation cache space on mw1011 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:55] <icinga-wm>	 RECOVERY - Translation cache space on mw1002 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:26:55] <icinga-wm>	 RECOVERY - Translation cache space on mw1013 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:27:05] <icinga-wm>	 RECOVERY - Translation cache space on mw1007 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:27:14] <icinga-wm>	 RECOVERY - Translation cache space on mw1015 is OK: HHVM_TC_SPACE OK TC sizes are OK
[10:32:15] <icinga-wm>	 PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL 14.29% of data above the critical threshold [500.0]
[10:33:20] <grrrit-wm>	 (03PS1) 10Ori.livneh: Fix-up for I388671b [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212876 
[10:33:31] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032] Fix-up for I388671b [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212876 (owner: 10Ori.livneh)
[10:33:37] <grrrit-wm>	 (03Merged) 10jenkins-bot: Fix-up for I388671b [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212876 (owner: 10Ori.livneh)
[10:49:35] <icinga-wm>	 RECOVERY - Router interfaces on cr1-eqiad is OK host 208.80.154.196, interfaces up: 230, down: 0, dormant: 0, excluded: 0, unused: 0
[10:50:54] <icinga-wm>	 RECOVERY - HTTP 5xx req/min on graphite1001 is OK Less than 1.00% above the threshold [250.0]
[10:59:14] <icinga-wm>	 PROBLEM - puppet last run on sca1001 is CRITICAL puppet fail
[11:05:06] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032] Revert "Change StatsD port to another value temporarily" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212869 (owner: 10Ori.livneh)
[11:05:12] <grrrit-wm>	 (03Merged) 10jenkins-bot: Revert "Change StatsD port to another value temporarily" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/212869 (owner: 10Ori.livneh)
[11:16:04] <icinga-wm>	 RECOVERY - puppet last run on sca1001 is OK Puppet is currently enabled, last run 16 seconds ago with 0 failures
[11:16:46] <logmsgbot>	 !log ori Synchronized wmf-config/CommonSettings.php: Ic258d01a7: Revert "Change StatsD port to another value temporarily" (duration: 00m 13s)
[11:16:53] <morebots>	 Logged the message, Master
[11:25:25] <icinga-wm>	 PROBLEM - carbon-cache too many creates on graphite1001 is CRITICAL 1.69% of data above the critical threshold [1000.0]
[11:26:02] <godog>	 expected ^
[11:48:52] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032 V: 032] Initial venv [software/sentry] - 10https://gerrit.wikimedia.org/r/201006 (https://phabricator.wikimedia.org/T84956) (owner: 10Gilles)
[11:50:28] <wikibugs>	 6operations, 6Multimedia: Add monitoring of upload rate on commons to icingia alerts - https://phabricator.wikimedia.org/T92322#1304701 (10ori)
[11:53:12] <wikibugs>	 6operations, 7Graphite, 5Patch-For-Review: enable statsd extended counters - https://phabricator.wikimedia.org/T95703#1304706 (10ori) 5Open>3Resolved
[11:53:14] <wikibugs>	 6operations, 7Graphite, 5Patch-For-Review: replace txstatsd - https://phabricator.wikimedia.org/T90111#1304707 (10ori)
[11:54:44] <yuvipanda>	 _joe_: I'm going to help this guy get set up on toollabs for a bit. I'll be back soon
[11:54:52] <yuvipanda>	 _joe_: but yeah, private registry works
[11:54:59] <yuvipanda>	 https://mesosphere.github.io/marathon/docs/native-docker.html I need to do
[12:21:17] <grrrit-wm>	 (03PS1) 10Ori.livneh: (ori) dotfiles update [puppet] - 10https://gerrit.wikimedia.org/r/212892 
[12:21:19] <grrrit-wm>	 (03PS1) 10Ori.livneh: graphite: set a coarser aggregation policy to relieve storage pressure [puppet] - 10https://gerrit.wikimedia.org/r/212893 
[12:21:31] <grrrit-wm>	 (03PS2) 10Ori.livneh: (ori) dotfiles update [puppet] - 10https://gerrit.wikimedia.org/r/212892 
[12:21:38] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032 V: 032] (ori) dotfiles update [puppet] - 10https://gerrit.wikimedia.org/r/212892 (owner: 10Ori.livneh)
[12:21:47] <grrrit-wm>	 (03PS2) 10Ori.livneh: graphite: set a coarser aggregation policy to relieve storage pressure [puppet] - 10https://gerrit.wikimedia.org/r/212893 
[12:22:04] <ori>	 godog: all yours
[12:22:55] <icinga-wm>	 RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge.
[12:34:04] <yuvipanda>	 ori: thanks!
[12:34:19] <ori>	 yuvipanda: np!
[12:34:24] <wikibugs>	 6operations, 7Graphite: audit graphite retention schemas - https://phabricator.wikimedia.org/T96662#1304809 (10fgiunchedi) so, another proposal after talking with @ori, rationale being that we're most interested in recent data for investigation purposes while older data we should retain less. Difference betwee...
[12:40:04] <icinga-wm>	 PROBLEM - puppet last run on sca1001 is CRITICAL Puppet has 10 failures
[12:47:55] <grrrit-wm>	 (03PS3) 10Filippo Giunchedi: graphite: set a coarser aggregation policy to relieve storage pressure [puppet] - 10https://gerrit.wikimedia.org/r/212893 (https://phabricator.wikimedia.org/T96662) (owner: 10Ori.livneh)
[12:49:36] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 031] graphite: set a coarser aggregation policy to relieve storage pressure [puppet] - 10https://gerrit.wikimedia.org/r/212893 (https://phabricator.wikimedia.org/T96662) (owner: 10Ori.livneh)
[12:50:57] <grrrit-wm>	 (03PS4) 10Filippo Giunchedi: graphite: set a coarser aggregation policy to relieve storage pressure [puppet] - 10https://gerrit.wikimedia.org/r/212893 (https://phabricator.wikimedia.org/T96662) (owner: 10Ori.livneh)
[12:51:21] <grrrit-wm>	 (03CR) 10Filippo Giunchedi: [C: 032 V: 032] graphite: set a coarser aggregation policy to relieve storage pressure [puppet] - 10https://gerrit.wikimedia.org/r/212893 (https://phabricator.wikimedia.org/T96662) (owner: 10Ori.livneh)
[12:52:58] <godog>	 !log bounce carbon on graphite1001 to pick up new retention schema
[12:53:04] <morebots>	 Logged the message, Master
[12:53:57] <godog>	 !log remove MediaWiki.xhprof to pick up new retention schema
[12:54:03] <morebots>	 Logged the message, Master
[13:12:25] <grrrit-wm>	 (03PS1) 10Yuvipanda: ssh: Allow temporary opt out from more secure ssh [puppet] - 10https://gerrit.wikimedia.org/r/212909 
[13:13:29] <grrrit-wm>	 (03CR) 10Merlijn van Deen: "Shouldn't the default be 'true'?" [puppet] - 10https://gerrit.wikimedia.org/r/212909 (owner: 10Yuvipanda)
[13:14:08] <grrrit-wm>	 (03PS2) 10Yuvipanda: ssh: Allow temporary opt out from more secure ssh [puppet] - 10https://gerrit.wikimedia.org/r/212909 
[13:14:09] <yuvipanda>	 valhallasw: yup
[13:14:32] <grrrit-wm>	 (03PS3) 10Yuvipanda: mesos: Puppetize docker config file [puppet] - 10https://gerrit.wikimedia.org/r/212871 
[13:15:15] <grrrit-wm>	 (03CR) 10Yuvipanda: [C: 032] mesos: Puppetize docker config file [puppet] - 10https://gerrit.wikimedia.org/r/212871 (owner: 10Yuvipanda)
[13:16:33] <valhallasw>	 yuvipanda: also <%- vs <%?
[13:16:46] <yuvipanda>	 valhallasw: <%- trims newlines
[13:16:54] <valhallasw>	 ah.
[13:16:54] <icinga-wm>	 RECOVERY - puppet last run on sca1001 is OK Puppet is currently enabled, last run 49 seconds ago with 0 failures
[13:16:55] <yuvipanda>	 hmm, maybe I shouldn't have <%- but only -%>
[13:17:07] <yuvipanda>	 valhallasw: I don't want puppet to diff all the hosts :)
[13:17:14] <grrrit-wm>	 (03CR) 10Merlijn van Deen: [C: 031] ssh: Allow temporary opt out from more secure ssh [puppet] - 10https://gerrit.wikimedia.org/r/212909 (owner: 10Yuvipanda)
[13:17:18] <valhallasw>	 ?
[13:17:32] <yuvipanda>	 valhallasw: as in, I don't want it to insert an empty line on all prod hosts
[13:17:38] <valhallasw>	 ahhh
[13:28:37] <grrrit-wm>	 (03PS1) 10Andrew Bogott: Resurect the old ceph module [puppet] - 10https://gerrit.wikimedia.org/r/212914 
[13:31:48] <grrrit-wm>	 (03CR) 10Giuseppe Lavagetto: [C: 031] "I don't like this at all, but as long as it's temporary, it's ok-ish." [puppet] - 10https://gerrit.wikimedia.org/r/212909 (owner: 10Yuvipanda)
[13:32:06] <grrrit-wm>	 (03PS3) 10Yuvipanda: ssh: Allow temporary opt out from more secure ssh [puppet] - 10https://gerrit.wikimedia.org/r/212909 
[13:33:19] <wikibugs>	 6operations, 10Wikimedia-DNS, 10Wikimedia-Video: Please set up a CNAME for videoserver.wikimedia.org to Video Editing Server - https://phabricator.wikimedia.org/T99216#1304942 (10MarkTraceur) This sounds like a good plan to me - it's the least we can do to support our video creators on Commons...  Are there...
[13:33:28] <grrrit-wm>	 (03CR) 10Yuvipanda: [C: 032] ssh: Allow temporary opt out from more secure ssh [puppet] - 10https://gerrit.wikimedia.org/r/212909 (owner: 10Yuvipanda)
[14:15:52] <grrrit-wm>	 (03PS2) 10Andrew Bogott: Resurect the old ceph module [puppet] - 10https://gerrit.wikimedia.org/r/212914 
[14:20:43] <wikibugs>	 10Ops-Access-Requests, 6operations: Shell and research access for Moushira Elamrawy - https://phabricator.wikimedia.org/T100091#1305120 (10ori) 3NEW
[14:21:34] <wikibugs>	 10Ops-Access-Requests, 6operations: Shell and research access for Moushira Elamrawy - https://phabricator.wikimedia.org/T100091#1305137 (10ori)
[14:23:45] <wikibugs>	 10Ops-Access-Requests, 6operations: Shell and research access for Moushira Elamrawy - https://phabricator.wikimedia.org/T100091#1305144 (10ori)
[14:38:51] <grrrit-wm>	 (03PS3) 10Ori.livneh: Resurrect the old ceph module [puppet] - 10https://gerrit.wikimedia.org/r/212914 (owner: 10Andrew Bogott)
[14:45:36] <grrrit-wm>	 (03PS4) 10Andrew Bogott: Resurect the old ceph module [puppet] - 10https://gerrit.wikimedia.org/r/212914 
[14:45:38] <grrrit-wm>	 (03PS1) 10Andrew Bogott: Revert "Remove role::ceph::*, unused now" [puppet] - 10https://gerrit.wikimedia.org/r/212938 
[14:50:06] <grrrit-wm>	 (03PS5) 10Ori.livneh: Resurrect the old ceph module [puppet] - 10https://gerrit.wikimedia.org/r/212914 (owner: 10Andrew Bogott)
[14:55:35] <icinga-wm>	 PROBLEM - puppet last run on ms-fe3001 is CRITICAL puppet fail
[14:57:35] <icinga-wm>	 PROBLEM - puppet last run on labvirt1008 is CRITICAL Puppet has 1 failures
[14:59:09] <yuvipanda>	 _joe_: https://github.com/gogits/gogs  over the long term maybe :)
[15:12:44] <icinga-wm>	 RECOVERY - puppet last run on labvirt1008 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[15:14:05] <icinga-wm>	 RECOVERY - puppet last run on ms-fe3001 is OK Puppet is currently enabled, last run 54 seconds ago with 0 failures
[15:14:15] <yuvipanda>	 valhallasw: where are you?
[15:17:42] <grrrit-wm>	 (03CR) 10Alex Monk: "It won't merge because we've updated the file being removed here since the parent commit was merged into master. I don't know if we even n" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/188388 (https://phabricator.wikimedia.org/T75905) (owner: 10Reedy)
[15:20:06] <grrrit-wm>	 (03PS2) 10Alex Monk: Don't commit interwiki.cdb anymore [mediawiki-config] - 10https://gerrit.wikimedia.org/r/188388 (https://phabricator.wikimedia.org/T75905) (owner: 10Reedy)
[15:25:20] <grrrit-wm>	 (03PS1) 10Ori.livneh: Add moushira to bastion-only and researchers. [puppet] - 10https://gerrit.wikimedia.org/r/212946 (https://phabricator.wikimedia.org/T100091) 
[15:38:00] <grrrit-wm>	 (03CR) 10Moushira: [C: 031] "Yes, thats my key!" [puppet] - 10https://gerrit.wikimedia.org/r/212946 (https://phabricator.wikimedia.org/T100091) (owner: 10Ori.livneh)
[15:55:34] <icinga-wm>	 PROBLEM - puppet last run on cp3049 is CRITICAL puppet fail
[15:59:55] <icinga-wm>	 PROBLEM - puppet last run on sca1001 is CRITICAL Puppet has 81 failures
[16:00:07] <yuvipanda>	 _joe_: boom
[16:00:21] <yuvipanda>	 _joe_: do dig -t srv @marathon-master-01.eqiad.wmflabs 1 _tool-hello._tcp.marathon.mesos 
[16:00:24] <yuvipanda>	 :D
[16:07:19] <yuvipanda>	 _joe_: and it does multiples well as well - two instances running returns two SRV records
[16:07:23] * yuvipanda likes this
[16:11:25] <grrrit-wm>	 (03PS1) 10Gergő Tisza: Remove .pyc files [software/sentry] - 10https://gerrit.wikimedia.org/r/212958 
[16:11:46] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032 V: 032] Remove .pyc files [software/sentry] - 10https://gerrit.wikimedia.org/r/212958 (owner: 10Gergő Tisza)
[16:14:05] <icinga-wm>	 RECOVERY - puppet last run on cp3049 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:16:44] <icinga-wm>	 RECOVERY - puppet last run on sca1001 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:26:15] <icinga-wm>	 PROBLEM - puppet last run on db2005 is CRITICAL puppet fail
[16:45:04] <icinga-wm>	 RECOVERY - puppet last run on db2005 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:57:12] <yuvipanda>	 _joe_: marathon has built in support for rolling deploys, and you can tweak it to see if you want it to be 2 stage or fully 'rolling'
[16:57:59] <yuvipanda>	 _joe_: so we can restart by basically doing a PUT with a new docker image, and it takes care of everything else by itself :)
[18:00:15] <icinga-wm>	 PROBLEM - puppet last run on sca1001 is CRITICAL puppet fail
[18:37:04] <icinga-wm>	 RECOVERY - puppet last run on sca1001 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[19:41:42] <Krenair>	 ori, did you see that post on wikitech-l?
[19:41:48] <Krenair>	 about page weight
[19:49:48] <MatmaRex>	 i was actually going to respond that ori has been on it already, but couldn't find any pretty graphs or anything
[19:51:19] <Krenair>	 I noticed the comment about lack of gzip compression over TLS
[19:53:46] <Krenair>	 but it definitely works for me
[19:55:39] <grrrit-wm>	 (03PS1) 10Yuvipanda: dynamicproxy: Hygiene fixes [puppet] - 10https://gerrit.wikimedia.org/r/212991 
[19:56:21] <Krenair>	 https://phabricator.wikimedia.org/P677
[19:56:22] <grrrit-wm>	 (03CR) 10jenkins-bot: [V: 04-1] dynamicproxy: Hygiene fixes [puppet] - 10https://gerrit.wikimedia.org/r/212991 (owner: 10Yuvipanda)
[19:57:31] <grrrit-wm>	 (03PS2) 10Yuvipanda: dynamicproxy: Hygiene fixes [puppet] - 10https://gerrit.wikimedia.org/r/212991 
[19:59:00] <grrrit-wm>	 (03PS3) 10Yuvipanda: dynamicproxy: Hygiene fixes [puppet] - 10https://gerrit.wikimedia.org/r/212991 
[19:59:45] <icinga-wm>	 PROBLEM - puppet last run on sca1001 is CRITICAL Puppet has 25 failures
[20:00:17] <Krenair>	 MatmaRex, also you'd expect someone like Rich Farmbrough to know which domain he's talking about by now
[20:01:13] <MatmaRex>	 hmm>
[20:01:14] <MatmaRex>	 ?
[20:02:00] <grrrit-wm>	 (03PS4) 10Yuvipanda: dynamicproxy: Hygiene fixes [puppet] - 10https://gerrit.wikimedia.org/r/212991 
[20:02:09] <Krenair>	 he thought that bits was on wikipedia's domain
[20:02:24] <grrrit-wm>	 (03CR) 10Yuvipanda: [C: 032 V: 032] dynamicproxy: Hygiene fixes [puppet] - 10https://gerrit.wikimedia.org/r/212991 (owner: 10Yuvipanda)
[20:02:33] <Krenair>	 see the thread that was linked to from the email
[20:02:46] <legoktm>	 MatmaRex: performance.wikimedia.org
[20:02:54] <legoktm>	 MatmaRex: fancy graphs!
[20:05:31] <Krenair>	 and ugh, "the Wikipedia developers"
[20:06:14] <Krenair>	 and then of course there is the obligatory no-js person
[20:06:24] <icinga-wm>	 PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL 7.69% of data above the critical threshold [500.0]
[20:07:55] <MatmaRex>	 meh
[20:08:09] <MatmaRex>	 legoktm: thanks
[20:11:52] <grrrit-wm>	 (03PS1) 10Yuvipanda: dynamicproxy: Add redundanturl dynamicproxy [puppet] - 10https://gerrit.wikimedia.org/r/212997 
[20:13:24] <grrrit-wm>	 (03PS2) 10Yuvipanda: dynamicproxy: Add redundanturl dynamicproxy [puppet] - 10https://gerrit.wikimedia.org/r/212997 
[20:16:25] <icinga-wm>	 RECOVERY - HTTP 5xx req/min on graphite1001 is OK Less than 1.00% above the threshold [250.0]
[20:36:45] <icinga-wm>	 RECOVERY - puppet last run on sca1001 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[20:40:55] <grrrit-wm>	 (03PS3) 10Yuvipanda: dynamicproxy: Add redundanturl dynamicproxy [puppet] - 10https://gerrit.wikimedia.org/r/212997 
[21:08:05] <icinga-wm>	 PROBLEM - carbon-cache write error on graphite1001 is CRITICAL 22.22% of data above the critical threshold [8.0]
[21:10:34] <wikibugs>	 6operations, 10Wikimedia-Git-or-Gerrit: git.wikimedia.org replication from gerrit stopped or lags - https://phabricator.wikimedia.org/T99990#1305760 (10Paladox)
[21:10:53] <wikibugs>	 6operations, 10Wikimedia-Git-or-Gerrit: git.wikimedia.org replication from gerrit stopped or lags - https://phabricator.wikimedia.org/T99990#1303220 (10Paladox)
[21:11:11] <wikibugs>	 6operations, 10Wikimedia-Git-or-Gerrit: git.wikimedia.org replication from gerrit stopped or lags - https://phabricator.wikimedia.org/T99990#1305766 (10Paladox) 5duplicate>3Open
[21:11:57] <wikibugs>	 6operations, 10Wikimedia-Git-or-Gerrit: git.wikimedia.org replication from gerrit stopped or lags - https://phabricator.wikimedia.org/T99990#1303220 (10Paladox) Hi I had this patch https://gerrit.wikimedia.org/r/#/c/212813/ review and +2 for code reviewed and it said it was successfully merged but looking on g...
[21:13:55] <wikibugs>	 6operations, 10Wikimedia-Git-or-Gerrit: git.wikimedia.org replication from gerrit stopped or lags - https://phabricator.wikimedia.org/T99990#1305769 (10Paladox) p:5Triage>3Unbreak!
[21:14:19] <wikibugs>	 6operations, 10Wikimedia-Git-or-Gerrit: git.wikimedia.org replication from gerrit stopped or lags - https://phabricator.wikimedia.org/T99990#1303220 (10Paladox) Since gerrit has stoped replicating into gitblit status should be unbreak now.
[21:29:57] <grrrit-wm>	 (03PS1) 10Aaron Schulz: Fixed totally broken runner JSON response code [mediawiki-config] - 10https://gerrit.wikimedia.org/r/213010 
[21:30:05] <AaronSchulz>	 ori ^
[21:30:20] <AaronSchulz>	 next swat would be nice
[21:41:37] <icinga-wm>	 PROBLEM - carbon-cache too many creates on graphite1001 is CRITICAL 1.69% of data above the critical threshold [1000.0]
[21:43:21] <godog>	 expected ^ xhprof
[21:44:20] <NotASpy>	 any idea why I can't properly create/upload protect https://commons.wikimedia.org/w/index.php?title=File:Ssss.jpg ?
[21:44:56] <NotASpy>	 it protects as normal, but when reloading the page after leaving the page protection dialog, the protection vanishes 
[21:47:47] <AaronSchulz>	 https://commons.wikimedia.org/wiki/File:Ssss.jpg?action=edit gives 'This title has been protected from creation by Nick. The reason given is "Protection against re-creation (non-descriptive file name)".' to me
[21:53:01] <NotASpy>	 OK, will try and upload with a non admin account. 
[23:30:30] <logmsgbot>	 !log ori Synchronized php-1.26wmf7/extensions/Gadgets: b592efa5fe: Update Gadgets for I6da3eede0: Conversion to using WAN cache (duration: 00m 13s)
[23:30:37] <morebots>	 Logged the message, Master
[23:37:14] <icinga-wm>	 PROBLEM - puppet last run on mw2007 is CRITICAL puppet fail
[23:52:12] <grrrit-wm>	 (03PS1) 10Yuvipanda: mesos: Setup marathon properly [puppet] - 10https://gerrit.wikimedia.org/r/213189 
[23:55:44] <icinga-wm>	 RECOVERY - puppet last run on mw2007 is OK Puppet is currently enabled, last run 16 seconds ago with 0 failures
[23:56:45] <icinga-wm>	 PROBLEM - Kafka Broker Messages In Per Second on graphite1001 is CRITICAL Anomaly detected: 0 data above and 45 below the confidence bounds