[00:23:15] <icinga-wm>	 PROBLEM - Apache HTTP on mw1197 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.073 second response time
[00:23:45] <icinga-wm>	 PROBLEM - HHVM rendering on mw1197 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.073 second response time
[00:24:15] <icinga-wm>	 RECOVERY - Apache HTTP on mw1197 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 613 bytes in 0.199 second response time
[00:24:45] <icinga-wm>	 RECOVERY - HHVM rendering on mw1197 is OK: HTTP OK: HTTP/1.1 200 OK - 76425 bytes in 1.367 second response time
[00:58:25] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[00:58:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1001 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[00:58:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1003 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[00:58:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1002 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[00:59:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1004 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[01:00:40] <Shangkuanlc>	 Hi, this is from Taiwanese Wikimedians. We need a bureaucrat's help to give User:Koala0090
[01:01:05] <Shangkuanlc>	 a one-day right to create more than 6 accounts
[01:01:25] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy
[01:01:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1003 is OK: All endpoints are healthy
[01:01:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1001 is OK: All endpoints are healthy
[01:01:26] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1002 is OK: All endpoints are healthy
[01:01:40] <Shangkuanlc>	 Because he is hosting a workshop for high school students in Hualien, Taiwan
[01:02:11] <Shangkuanlc>	 See here for the workshop details (in Chinese). Could an administrator please help grant Wikipedia user User:Koala0090 the right to create multiple accounts for one day? For details, see https://zh.m.wikipedia.org/wiki/Wikipedia:%E8%87%BA%E7%81%A3%E6%95%99%E8%82%B2%E5%B0%88%E6%A1%88/%E6%85%88%E4%B8%AD%E7%B6%AD%E5%9F%BA%E7%B7%A8%E8%AD%AF%E5%AF%AB%E4%BD%9C%E5%9D%8A
[01:02:15] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1004 is OK: All endpoints are healthy
[01:04:50] <Shangkuanlc>	 Please, anyone?
[01:06:50] <Shangkuanlc>	 It seems like nobody is here. Thanks anyway!
[02:21:43] <logmsgbot>	 !log l10nupdate@tin scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 30s)
[02:21:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:27:46] <logmsgbot>	 !log l10nupdate@tin ResourceLoader cache refresh completed at Sun May 21 02:27:46 UTC 2017 (duration 6m 3s)
[02:27:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:46:25] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[02:46:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1003 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[02:46:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1002 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[02:46:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1001 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[02:47:25] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy
[02:47:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1003 is OK: All endpoints are healthy
[02:47:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1002 is OK: All endpoints are healthy
[02:47:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1001 is OK: All endpoints are healthy
[02:56:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1004 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[02:56:35] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[02:56:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1003 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[02:56:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1002 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[02:58:15] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1004 is OK: All endpoints are healthy
[02:58:25] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy
[02:58:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1002 is OK: All endpoints are healthy
[02:58:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1003 is OK: All endpoints are healthy
[03:33:05] <icinga-wm>	 PROBLEM - puppet last run on mw2240 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIP2-City.mmdb.gz]
[04:01:05] <icinga-wm>	 RECOVERY - puppet last run on mw2240 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[04:08:45] <icinga-wm>	 PROBLEM - mailman I/O stats on fermium is CRITICAL: CRITICAL - I/O stats: Transfers/Sec=3514.10 Read Requests/Sec=4102.90 Write Requests/Sec=11.80 KBytes Read/Sec=30697.60 KBytes_Written/Sec=71.20
[04:16:45] <icinga-wm>	 RECOVERY - mailman I/O stats on fermium is OK: OK - I/O stats: Transfers/Sec=7.00 Read Requests/Sec=0.60 Write Requests/Sec=5.30 KBytes Read/Sec=2.80 KBytes_Written/Sec=179.20
[04:25:51] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on db1024 - https://phabricator.wikimedia.org/T165934#3280907 (ops-monitoring-bot)
[04:51:35] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[04:51:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1002 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[04:51:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1004 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[04:51:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1003 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[04:51:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1001 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[04:54:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1003 is OK: All endpoints are healthy
[04:54:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1004 is OK: All endpoints are healthy
[04:54:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1001 is OK: All endpoints are healthy
[04:54:25] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy
[04:54:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1002 is OK: All endpoints are healthy
[05:57:16] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2002 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[05:57:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2005 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[05:57:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2001 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[05:57:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2003 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[05:57:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2006 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[05:58:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2004 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[05:58:25] <icinga-wm>	 PROBLEM - Citoid LVS codfw on citoid.svc.codfw.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[06:00:15] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2002 is OK: All endpoints are healthy
[06:00:15] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2005 is OK: All endpoints are healthy
[06:00:15] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2003 is OK: All endpoints are healthy
[06:00:15] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2006 is OK: All endpoints are healthy
[06:00:15] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2001 is OK: All endpoints are healthy
[06:00:15] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2004 is OK: All endpoints are healthy
[06:00:16] <icinga-wm>	 RECOVERY - Citoid LVS codfw on citoid.svc.codfw.wmnet is OK: All endpoints are healthy
[06:39:58] <wikibugs>	 Operations, Wikimedia-SVG-rendering, Upstream: librsvg misinterpret quoted font family names that contain whitespaces - https://phabricator.wikimedia.org/T64987#3280940 (Perhelion) >>! In T64987#3279102, @Aklapper wrote: > @Perhelion: Does that mean https://bugzilla.gnome.org/show_bug.cgi?id=739329 s...
[06:47:05] <icinga-wm>	 PROBLEM - puppet last run on labtestcontrol2001 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[tzdata]
[06:55:11] <wikibugs>	 Operations, Wikimedia-SVG-rendering, Upstream: librsvg misinterpret quoted font family names that contain whitespaces - https://phabricator.wikimedia.org/T64987#3280941 (Perhelion)
[06:58:09] <wikibugs>	 Operations, Wikimedia-SVG-rendering, Upstream: librsvg misinterpret quoted font family names that contain whitespaces - https://phabricator.wikimedia.org/T64987#3280944 (Perhelion)
[07:16:05] <icinga-wm>	 RECOVERY - puppet last run on labtestcontrol2001 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[08:21:29] <wikibugs>	 (CR) Nemo bis: Enable ValidationStatistics log for FlaggedRevs (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/354615 (https://phabricator.wikimedia.org/T163107) (owner: Nemo bis)
[08:31:35] <wikibugs>	 Operations, Ops-Access-Requests: Access to search logs for Jan Dittrich - https://phabricator.wikimedia.org/T165943#3281090 (Jan_Dittrich)
[08:33:11] <Danny_B>	 i'm still getting that 500 on enhanced watchlist. can that be rolled back until it's fixed?
[08:33:35] <Danny_B>	 (or the fix deployed)
[08:35:45] <Reedy>	 Where's the patch again?
[08:37:28] <Reedy>	 https://gerrit.wikimedia.org/r/#/c/354602/ looks related
[08:37:31] <Reedy>	 But wasn't there another?
[08:38:37] <Danny_B>	 https://gerrit.wikimedia.org/r/#/c/350914/ ?
[08:38:59] <wikibugs>	 (PS2) Nemo bis: Remove $wgEnableValidationStatisticsUpdates from FlaggedRevs config [mediawiki-config] - https://gerrit.wikimedia.org/r/354600
[08:39:23] <Danny_B>	 that's what gwicke linked yesterday iianm
[08:41:04] <Reedy>	 Which is CR-1'd
[08:42:38] <Danny_B>	 idk... i just would like if it worked as it worked a week ago normally
[08:44:54] <MatmaRex>	 Danny_B: i don't think there were any changes. it's an ongoing issue for users with large watchlists, i think. perhaps you're hitting a slower database server or something.
[08:45:27] <MatmaRex>	 Danny_B: there was also something about this in tech news a week ago or two, a similar problem for users who had "watch categorization changes" enabled
[08:46:18] <bawolff>	 Reedy: Do you think it'd be sane to just return the empty string there?
[08:46:52] <Reedy>	 The callers use...
[08:46:55] <Reedy>	 $data['historyLink'] = $this->getDiffHistLinks( $rcObj, $query );
[08:47:23] <Reedy>	 The callers use...
[08:47:24] <Reedy>	 $data['historyLink'] = $this->getDiffHistLinks( $rcObj, $query );
[08:47:26] <Reedy>	 and then later
[08:47:27] <Reedy>	 		$line .= implode( '', $data );
[08:49:42] <bawolff>	 There's an if statement for type of rc entry, so in many places it just doesn't have a historyLink entry in the array
[08:51:41] * bawolff made a new version returning an empty string
[08:53:36] <Reedy>	 What does implode do with array keys?
[08:55:07] <bawolff>	 it just ignores them
[08:55:24] <bawolff>	 I think equivalent to implode( '', array_values( $data ) );
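The point made in the exchange above is that PHP's implode() joins only an array's values, in array order, and ignores the keys, so a 'historyLink' entry holding an empty string contributes nothing to the assembled line. A minimal Python sketch of that behaviour is given here as an analogue, not as MediaWiki code: the implode helper is a stand-in written for illustration, and every key and value in data other than 'historyLink' is hypothetical.

```python
# Python analogue of the PHP behaviour discussed above (not MediaWiki code).
# A dict stands in for PHP's ordered associative array $data.

def implode(glue, data):
    """Join the values of an ordered mapping with glue, ignoring its keys."""
    return glue.join(str(value) for value in data.values())

# Hypothetical row: 'historyLink' is an empty string, as in the version
# of the patch that returns an empty string.
data = {
    "timestampLink": "09:13",
    "historyLink": "",
    "separator": " . . ",
    "userLink": "Example",
}

# Keys never appear in the output; the empty value adds nothing.
line = implode("", data)
print(line)
```

Dropping the 'historyLink' key entirely would produce the same line, which matches the observation that many rc entry types never set it.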
[09:06:45] <logmsgbot>	 !log smalyshev@tin Started deploy [wdqs/wdqs@227ab25]: Redeploy GUI due to breakage in T165228
[09:06:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:06:54] <stashbot>	 T165228: Query results are downloaded in wrong encoding - https://phabricator.wikimedia.org/T165228
[09:07:04] <logmsgbot>	 !log smalyshev@tin Finished deploy [wdqs/wdqs@227ab25]: Redeploy GUI due to breakage in T165228 (duration: 00m 19s)
[09:07:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:13:25] <icinga-wm>	 PROBLEM - HHVM rendering on mw2125 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:14:15] <icinga-wm>	 RECOVERY - HHVM rendering on mw2125 is OK: HTTP OK: HTTP/1.1 200 OK - 76275 bytes in 0.176 second response time
[09:19:08] <wikibugs>	 (PS1) Reedy: Print dbname before running update.php [puppet] - https://gerrit.wikimedia.org/r/354919
[09:20:01] <wikibugs>	 (CR) Greg Grossmeier: [C: 1] Print dbname before running update.php [puppet] - https://gerrit.wikimedia.org/r/354919 (owner: Reedy)
[09:21:55] <wikibugs>	 (CR) Rush: [C: 2] Print dbname before running update.php [puppet] - https://gerrit.wikimedia.org/r/354919 (owner: Reedy)
[09:22:00] <wikibugs>	 (CR) Rush: [V: 2 C: 2] "seems to only effect beta and greg gave a +1 seems fine to me" [puppet] - https://gerrit.wikimedia.org/r/354919 (owner: Reedy)
[09:22:16] <greg-g>	 :)
[09:22:27] <greg-g>	 "blame greg if it breaks"
[09:22:52] <Reedy>	 See if we can see which db is brokened
[09:22:57] <Niharika>	 Hey joal. Someone looking for you in the Atrium. 
[09:23:04] <chasemp>	 greg-g: nahhhhhhhhhh but you're mr. beta <straightens tie>
[09:24:09] <greg-g>	 :)
[09:42:06] <Reedy>	 !log force ran puppet on deployment-tin to pickup dbname in wmf-beta-update-database.py
[09:42:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:58:22] <wikibugs>	 (PS1) Reedy: Do the echo when running update.php [puppet] - https://gerrit.wikimedia.org/r/354932
[10:12:28] <wikibugs>	 (PS1) Filippo Giunchedi: Test for unreferenced files introduced by changes [software/puppet-compiler] - https://gerrit.wikimedia.org/r/354939
[10:14:13] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on labstore2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. andrew bogott This box is a WIP
[10:15:01] <wikibugs>	 (CR) jerkins-bot: [V: -1] Test for unreferenced files introduced by changes [software/puppet-compiler] - https://gerrit.wikimedia.org/r/354939 (owner: Filippo Giunchedi)
[10:56:25] <wikibugs>	 (PS3) Mark Bergsma: Use a bytearray to encode prefixes in BGP.encodePrefixes [debs/pybal] - https://gerrit.wikimedia.org/r/354685
[10:56:27] <wikibugs>	 (PS5) Mark Bergsma: Use a bytearray to build IPPrefix [debs/pybal] - https://gerrit.wikimedia.org/r/354711
[10:56:29] <wikibugs>	 (PS2) Mark Bergsma: Create new BGP message classes for incremental construction [debs/pybal] - https://gerrit.wikimedia.org/r/354684
[10:56:31] <wikibugs>	 (PS5) Mark Bergsma: Adapt NaiveBGPPeering to support UPDATE message overflow [debs/pybal] - https://gerrit.wikimedia.org/r/354686
[10:56:33] <wikibugs>	 (PS3) Mark Bergsma: Allow for withdrawals and NLRI to be sent in the same UPDATE [debs/pybal] - https://gerrit.wikimedia.org/r/354723
[11:00:53] <wikibugs>	 (CR) Mark Bergsma: [C: 2] Use a bytearray to encode prefixes in BGP.encodePrefixes [debs/pybal] - https://gerrit.wikimedia.org/r/354685 (owner: Mark Bergsma)
[11:01:39] <wikibugs>	 (Merged) jenkins-bot: Use a bytearray to encode prefixes in BGP.encodePrefixes [debs/pybal] - https://gerrit.wikimedia.org/r/354685 (owner: Mark Bergsma)
[11:02:25] <wikibugs>	 (PS2) Volans: Puppet compiler: automatically sync from all masters [puppet] - https://gerrit.wikimedia.org/r/354105 (https://phabricator.wikimedia.org/T165583)
[11:02:43] <wikibugs>	 (CR) Mark Bergsma: [C: 2] Use a bytearray to build IPPrefix [debs/pybal] - https://gerrit.wikimedia.org/r/354711 (owner: Mark Bergsma)
[11:04:18] <wikibugs>	 (Merged) jenkins-bot: Use a bytearray to build IPPrefix [debs/pybal] - https://gerrit.wikimedia.org/r/354711 (owner: Mark Bergsma)
[11:05:36] <wikibugs>	 (PS3) Giuseppe Lavagetto: Add netlink-based Ipvsmanager implementation [debs/pybal] - https://gerrit.wikimedia.org/r/302882
[11:05:42] <wikibugs>	 (CR) jerkins-bot: [V: -1] Add netlink-based Ipvsmanager implementation [debs/pybal] - https://gerrit.wikimedia.org/r/302882 (owner: Giuseppe Lavagetto)
[11:13:46] <wikibugs>	 Operations, MW-1.30-release-notes, Performance-Team, Thumbor, Patch-For-Review: Thumbor should reject thumbnail requests that are the same size as the original or bigger - https://phabricator.wikimedia.org/T150741#3281679 (Gilles) Assuming the above change works and we only need to run refres...
[11:18:26] <wikibugs>	 (PS5) Volans: Puppet: run-puppet-agent, add --failed-only option [puppet] - https://gerrit.wikimedia.org/r/349416
[11:47:54] <wikibugs>	 (PS1) Dereckson: Add techconduct.wikimedia.orgfor new private wiki [dns] - https://gerrit.wikimedia.org/r/354954 (https://phabricator.wikimedia.org/T165977)
[11:48:23] <wikibugs>	 (PS2) Dereckson: Add techconduct.wikimedia.org for new private wiki [dns] - https://gerrit.wikimedia.org/r/354954 (https://phabricator.wikimedia.org/T165977)
[11:50:19] <wikibugs>	 (PS4) Mark Bergsma: Allow for withdrawals and NLRI to be sent in the same UPDATE [debs/pybal] - https://gerrit.wikimedia.org/r/354723
[11:50:21] <wikibugs>	 (PS1) Mark Bergsma: Add GPLv2 license header to bgp.py [debs/pybal] - https://gerrit.wikimedia.org/r/354955
[12:11:28] <wikibugs>	 (PS1) Dereckson: Apache: add techconduct.wm.o to remnant sites [puppet] - https://gerrit.wikimedia.org/r/354959 (https://phabricator.wikimedia.org/T165977)
[12:11:56] <wikibugs>	 (CR) Mark Bergsma: [C: -2] "bgp.py should not in any way depend on pybal classes" [debs/pybal] (1.13) - https://gerrit.wikimedia.org/r/344659 (owner: Ema)
[12:12:49] <wikibugs>	 (CR) Mark Bergsma: [C: -2] "bgp.py should not in any way depend on pybal classes" [debs/pybal] - https://gerrit.wikimedia.org/r/354677 (owner: Ema)
[12:13:37] <wikibugs>	 (PS1) Dereckson: Don't replicate techconductwiki to labs [puppet] - https://gerrit.wikimedia.org/r/354960 (https://phabricator.wikimedia.org/T165977)
[12:14:19] <wikibugs>	 (CR) Mark Bergsma: [C: -1] "I like moving the IPPrefix classes to a separate module, as long as we keep it separate and independent of pybal." [debs/pybal] - https://gerrit.wikimedia.org/r/354746 (owner: Ema)
[12:14:32] <wikibugs>	 (CR) jerkins-bot: [V: -1] Don't replicate techconductwiki to labs [puppet] - https://gerrit.wikimedia.org/r/354960 (https://phabricator.wikimedia.org/T165977) (owner: Dereckson)
[12:26:35] <icinga-wm>	 PROBLEM - Citoid LVS codfw on citoid.svc.codfw.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:26:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2005 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:26:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2004 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:26:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2006 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:27:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2001 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:27:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2003 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:27:25] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2002 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:28:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2003 is OK: All endpoints are healthy
[12:28:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2002 is OK: All endpoints are healthy
[12:28:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1004 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:28:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1003 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:28:45] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:28:45] <icinga-wm>	 PROBLEM - citoid endpoints health on scb1002 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:29:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2004 is OK: All endpoints are healthy
[12:29:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2006 is OK: All endpoints are healthy
[12:29:25] <icinga-wm>	 RECOVERY - Citoid LVS codfw on citoid.svc.codfw.wmnet is OK: All endpoints are healthy
[12:29:35] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1003 is OK: All endpoints are healthy
[12:29:35] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1002 is OK: All endpoints are healthy
[12:30:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2005 is OK: All endpoints are healthy
[12:30:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2001 is OK: All endpoints are healthy
[12:30:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb1004 is OK: All endpoints are healthy
[12:30:35] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy
[12:36:58] <wikibugs>	 (PS2) Dereckson: Don't replicate techconductwiki to labs [puppet] - https://gerrit.wikimedia.org/r/354960 (https://phabricator.wikimedia.org/T165977)
[12:48:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2001 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:48:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2004 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:48:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2003 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:48:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2002 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:48:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2005 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:48:35] <icinga-wm>	 PROBLEM - Citoid LVS codfw on citoid.svc.codfw.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:48:36] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2006 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[12:50:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2001 is OK: All endpoints are healthy
[12:50:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2002 is OK: All endpoints are healthy
[12:50:26] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2004 is OK: All endpoints are healthy
[12:50:26] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2006 is OK: All endpoints are healthy
[12:50:26] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2003 is OK: All endpoints are healthy
[12:50:26] <icinga-wm>	 RECOVERY - Citoid LVS codfw on citoid.svc.codfw.wmnet is OK: All endpoints are healthy
[12:50:26] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2005 is OK: All endpoints are healthy
[12:58:23] <wikibugs>	 (PS1) Ema: Add unit tests for the Origin and BaseASPath BGP attributes [debs/pybal] - https://gerrit.wikimedia.org/r/354972
[12:58:45] <wikibugs>	 (PS3) Volans: Puppet compiler: automatically sync from all masters [puppet] - https://gerrit.wikimedia.org/r/354105 (https://phabricator.wikimedia.org/T165583)
[13:01:30] <wikibugs>	 (Abandoned) Ema: bgp: log with util.log instead of printing to stdout [debs/pybal] - https://gerrit.wikimedia.org/r/354677 (owner: Ema)
[13:13:02] <wikibugs>	 (PS3) Filippo Giunchedi: prometheus: report puppet agent stats [puppet] - https://gerrit.wikimedia.org/r/354007
[13:13:04] <wikibugs>	 (PS2) Filippo Giunchedi: base: report prometheus agent stats [puppet] - https://gerrit.wikimedia.org/r/354457
[13:13:06] <wikibugs>	 (PS2) Filippo Giunchedi: prometheus: add alertmanager_url to prometheus server [puppet] - https://gerrit.wikimedia.org/r/354459
[13:13:08] <wikibugs>	 (PS2) Filippo Giunchedi: role: use alertmanager in beta prometheus [puppet] - https://gerrit.wikimedia.org/r/354460
[13:13:10] <wikibugs>	 (PS1) Filippo Giunchedi: role: set external url for prometheus beta [puppet] - https://gerrit.wikimedia.org/r/354975
[13:13:12] <wikibugs>	 (PS1) Filippo Giunchedi: WIP prometheus::alertmanager [puppet] - https://gerrit.wikimedia.org/r/354976
[13:15:10] <wikibugs>	 (CR) jerkins-bot: [V: -1] WIP prometheus::alertmanager [puppet] - https://gerrit.wikimedia.org/r/354976 (owner: Filippo Giunchedi)
[13:19:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2002 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[13:19:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2001 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[13:19:35] <icinga-wm>	 PROBLEM - Citoid LVS codfw on citoid.svc.codfw.wmnet is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[13:19:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2005 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[13:19:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2004 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[13:19:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2003 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[13:19:35] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2006 is CRITICAL: /api (open graph via native scraper) timed out before a response was received
[13:21:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2002 is OK: All endpoints are healthy
[13:21:35] <icinga-wm>	 RECOVERY - Citoid LVS codfw on citoid.svc.codfw.wmnet is OK: All endpoints are healthy
[13:22:25] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2004 is OK: All endpoints are healthy
[13:22:27] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2003 is OK: All endpoints are healthy
[13:22:27] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2005 is OK: All endpoints are healthy
[13:22:27] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2001 is OK: All endpoints are healthy
[13:22:27] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2006 is OK: All endpoints are healthy
[13:27:39] <wikibugs>	 06Operations, 10DBA, 10Wikimedia-Site-requests, 13Patch-For-Review: Create CoC committee private wiki - https://phabricator.wikimedia.org/T165977#3282604 (10Dereckson) Adding #DBA for the **PRIVATE** database part and to notify them we're creating a new private wiki. Adding #operations for Apache :80 redir...
[13:30:34] <wikibugs>	 06Operations, 10DBA, 10Wikimedia-Site-requests, 13Patch-For-Review: Create CoC committee private wiki - https://phabricator.wikimedia.org/T165977#3282613 (10Dereckson) a:05Dereckson>03jcrespo Jaime, I assign this task yo you to block it until you give us a green light replication to labs is disabled. I...
[13:42:53] <wikibugs>	 (03PS1) 10Dereckson: Set initial configuration for techconduct.wikimedia.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/354985 (https://phabricator.wikimedia.org/T165977)
[13:44:38] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Set initial configuration for techconduct.wikimedia.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/354985 (https://phabricator.wikimedia.org/T165977) (owner: 10Dereckson)
[13:55:35] <wikibugs>	 (03PS2) 10Dereckson: Set initial configuration for techconduct.wikimedia.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/354985 (https://phabricator.wikimedia.org/T165977)
[14:07:37] <wikibugs>	 06Operations, 10DBA, 10Wikimedia-Site-requests, 13Patch-For-Review: Create CoC committee private wiki - https://phabricator.wikimedia.org/T165977#3282698 (10Dereckson) p:05Triage>03Normal
[14:41:53] <wikibugs>	 (03PS2) 10Ema: Move BGP classes to bgp.bgp, IP classes to bgp.ip [debs/pybal] - 10https://gerrit.wikimedia.org/r/354746
[14:43:19] <wikibugs>	 (03PS3) 10Ema: Move BGP classes to bgp.bgp, IP classes to bgp.ip [debs/pybal] - 10https://gerrit.wikimedia.org/r/354746
[14:45:30] <wikibugs>	 (03PS4) 10Ema: Move BGP classes to bgp.bgp, IP classes to bgp.ip [debs/pybal] - 10https://gerrit.wikimedia.org/r/354746
[14:48:50] <wikibugs>	 (03CR) 10Mark Bergsma: [C: 032] Move BGP classes to bgp.bgp, IP classes to bgp.ip [debs/pybal] - 10https://gerrit.wikimedia.org/r/354746 (owner: 10Ema)
[14:51:53] <wikibugs>	 (03PS2) 10Amire80: [DON'T MERGE] Remove special Math extension settings for hewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353970
[14:57:47] <wikibugs>	 (03PS3) 10Amire80: Remove special Math extension settings for hewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/353970
[15:00:32] <wikibugs>	 (03CR) 10Mark Bergsma: Add netlink-based Ipvsmanager implementation (031 comment) [debs/pybal] (2.0-dev) - 10https://gerrit.wikimedia.org/r/354509 (owner: 10Giuseppe Lavagetto)
[15:04:26] <wikibugs>	 (03Abandoned) 10Ema: Add unit tests for the Origin and BaseASPath BGP attributes [debs/pybal] - 10https://gerrit.wikimedia.org/r/354972 (owner: 10Ema)
[15:05:37] <wikibugs>	 (03PS4) 10Filippo Giunchedi: prometheus: report puppet agent stats [puppet] - 10https://gerrit.wikimedia.org/r/354007
[15:05:39] <wikibugs>	 (03PS3) 10Filippo Giunchedi: base: report prometheus agent stats [puppet] - 10https://gerrit.wikimedia.org/r/354457
[15:05:41] <wikibugs>	 (03PS3) 10Filippo Giunchedi: prometheus: add alertmanager_url to prometheus server [puppet] - 10https://gerrit.wikimedia.org/r/354459
[15:05:43] <wikibugs>	 (03PS3) 10Filippo Giunchedi: role: use alertmanager in beta prometheus [puppet] - 10https://gerrit.wikimedia.org/r/354460
[15:05:45] <wikibugs>	 (03PS2) 10Filippo Giunchedi: role: set external url for prometheus beta [puppet] - 10https://gerrit.wikimedia.org/r/354975
[15:05:47] <wikibugs>	 (03PS2) 10Filippo Giunchedi: WIP prometheus::alertmanager [puppet] - 10https://gerrit.wikimedia.org/r/354976
[15:09:11] <wikibugs>	 (03CR) 10jenkins-bot: [V: 04-1] WIP prometheus::alertmanager [puppet] - 10https://gerrit.wikimedia.org/r/354976 (owner: 10Filippo Giunchedi)
[15:25:31] <wikibugs>	 (03CR) 10Ema: [C: 031] Create new BGP message classes for incremental construction [debs/pybal] - 10https://gerrit.wikimedia.org/r/354684 (owner: 10Mark Bergsma)
[15:32:34] <wikibugs>	 (03CR) 10Ema: [C: 04-1] Adapt NaiveBGPPeering to support UPDATE message overflow (031 comment) [debs/pybal] - 10https://gerrit.wikimedia.org/r/354686 (owner: 10Mark Bergsma)
[15:34:54] <wikibugs>	 (03CR) 10Ema: Allow for withdrawals and NLRI to be sent in the same UPDATE (031 comment) [debs/pybal] - 10https://gerrit.wikimedia.org/r/354723 (owner: 10Mark Bergsma)
[15:35:03] <wikibugs>	 (03CR) 10Ema: [C: 031] Add GPLv2 license header to bgp.py [debs/pybal] - 10https://gerrit.wikimedia.org/r/354955 (owner: 10Mark Bergsma)
[15:35:33] <wikibugs>	 (03PS1) 10Ema: bgp: add a few unit tests [debs/pybal] - 10https://gerrit.wikimedia.org/r/355000
[15:37:06] <madhuvishy>	 moritzm: I am at the hackathon, but jfyi I'll follow up on the NFS ferm rules stuff tomorrow or Tuesday :)
[15:39:26] <wikibugs>	 (03CR) 10Mark Bergsma: Allow for withdrawals and NLRI to be sent in the same UPDATE (031 comment) [debs/pybal] - 10https://gerrit.wikimedia.org/r/354723 (owner: 10Mark Bergsma)
[15:52:35] <wikibugs>	 (03CR) 10Mark Bergsma: Adapt NaiveBGPPeering to support UPDATE message overflow (031 comment) [debs/pybal] - 10https://gerrit.wikimedia.org/r/354686 (owner: 10Mark Bergsma)
[15:53:52] <wikibugs>	 (03PS6) 10Mark Bergsma: Adapt NaiveBGPPeering to support UPDATE message overflow [debs/pybal] - 10https://gerrit.wikimedia.org/r/354686
[15:53:54] <wikibugs>	 (03PS5) 10Mark Bergsma: Allow for withdrawals and NLRI to be sent in the same UPDATE [debs/pybal] - 10https://gerrit.wikimedia.org/r/354723
[15:53:56] <wikibugs>	 (03PS2) 10Mark Bergsma: Add GPLv2 license header to bgp.py [debs/pybal] - 10https://gerrit.wikimedia.org/r/354955
[15:54:51] <wikibugs>	 (03CR) 10Mark Bergsma: Adapt NaiveBGPPeering to support UPDATE message overflow (031 comment) [debs/pybal] - 10https://gerrit.wikimedia.org/r/354686 (owner: 10Mark Bergsma)
[15:56:16] <wikibugs>	 (03CR) 10Mark Bergsma: [C: 032] bgp: add a few unit tests [debs/pybal] - 10https://gerrit.wikimedia.org/r/355000 (owner: 10Ema)
[16:00:15] <wikibugs>	 (03CR) 10Mark Bergsma: [C: 032] Create new BGP message classes for incremental construction [debs/pybal] - 10https://gerrit.wikimedia.org/r/354684 (owner: 10Mark Bergsma)
[16:01:09] <wikibugs>	 (03Merged) 10jenkins-bot: Create new BGP message classes for incremental construction [debs/pybal] - 10https://gerrit.wikimedia.org/r/354684 (owner: 10Mark Bergsma)
[16:01:44] <wikibugs>	 (03CR) 10Mark Bergsma: [C: 032] Add GPLv2 license header to bgp.py [debs/pybal] - 10https://gerrit.wikimedia.org/r/354955 (owner: 10Mark Bergsma)
[16:09:38] <wikibugs>	 (03CR) 10Mark Bergsma: [C: 04-1] Instrumentation fixes (031 comment) [debs/pybal] - 10https://gerrit.wikimedia.org/r/354680 (https://phabricator.wikimedia.org/T103882) (owner: 10Ema)
[16:33:15] <wikibugs>	 (03PS2) 10Filippo Giunchedi: Test for unreferenced files introduced by changes [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/354939
[16:38:04] <wikibugs>	 (03PS3) 10Filippo Giunchedi: WIP prometheus::alertmanager [puppet] - 10https://gerrit.wikimedia.org/r/354976
[16:39:36] <wikibugs>	 (03CR) 10jenkins-bot: [V: 04-1] WIP prometheus::alertmanager [puppet] - 10https://gerrit.wikimedia.org/r/354976 (owner: 10Filippo Giunchedi)
[16:49:56] <icinga-wm>	 ACKNOWLEDGEMENT - HP RAID on ms-be2029 is CRITICAL: CHECK_NRPE: Socket timeout after 50 seconds. nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T166021
[16:50:04] <wikibugs>	 06Operations, 10ops-codfw: Degraded RAID on ms-be2029 - https://phabricator.wikimedia.org/T166021#3283038 (10ops-monitoring-bot)
[16:52:13] <wikibugs>	 (03CR) 10Filippo Giunchedi: [C: 031] Puppet compiler: automatically sync from all masters [puppet] - 10https://gerrit.wikimedia.org/r/354105 (https://phabricator.wikimedia.org/T165583) (owner: 10Volans)
[16:55:10] <wikibugs>	 (03CR) 10Filippo Giunchedi: Puppet: run-puppet-agent, add --failed-only option (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/349416 (owner: 10Volans)
[17:24:45] <icinga-wm>	 PROBLEM - puppet last run on sca1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:52:45] <icinga-wm>	 RECOVERY - puppet last run on sca1003 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[18:05:45] <icinga-wm>	 PROBLEM - puppet last run on db1082 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[18:34:45] <icinga-wm>	 RECOVERY - puppet last run on db1082 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[18:49:13] <wikibugs>	 (03PS6) 10Mark Bergsma: Allow for withdrawals and NLRI to be sent in the same UPDATE [debs/pybal] - 10https://gerrit.wikimedia.org/r/354723
[18:49:15] <wikibugs>	 (03PS3) 10Mark Bergsma: Add GPLv2 license header to bgp.py [debs/pybal] - 10https://gerrit.wikimedia.org/r/354955
[20:48:55] <icinga-wm>	 PROBLEM - puppet last run on cp3007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[21:17:55] <icinga-wm>	 RECOVERY - puppet last run on cp3007 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[22:24:21] <wikibugs>	 06Operations, 10RESTBase, 06Services, 10Wikimedia-Site-requests: Index page https://wikimedia.org/api/ is broken / RESTBase not discoverable - https://phabricator.wikimedia.org/T138848#3283225 (10Krinkle)