[00:26:31] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 10.192.48.44 on port 6479 [00:28:21] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5606672 keys - replication_delay is 0 [01:26:09] PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] [01:26:10] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] [01:27:29] PROBLEM - puppet last run on ms-be2013 is CRITICAL: CRITICAL: Puppet has 1 failures [01:28:09] PROBLEM - Codfw HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] [01:33:07] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [01:33:28] RECOVERY - Ulsfo HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [01:35:08] RECOVERY - Codfw HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [01:41:57] PROBLEM - puppet last run on db2042 is CRITICAL: CRITICAL: puppet fail [01:53:38] RECOVERY - puppet last run on ms-be2013 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [02:08:15] RECOVERY - puppet last run on db2042 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [02:19:29] I thought redirects.dat was an artifact. [02:22:03] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 08m 55s) [02:22:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [02:22:50] Hmm, guess not. 
[02:27:38] !log l10nupdate@tin ResourceLoader cache refresh completed at Sun Jun 5 02:27:38 UTC 2016 (duration 5m 35s) [02:27:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [03:32:05] PROBLEM - BGP status on cr2-esams is CRITICAL: BGP CRITICAL - No response from remote host 91.198.174.244 [03:36:22] PROBLEM - puppet last run on mw1097 is CRITICAL: CRITICAL: Puppet has 1 failures [03:36:43] PROBLEM - puppet last run on mw1248 is CRITICAL: CRITICAL: Puppet has 1 failures [03:42:23] PROBLEM - puppet last run on nescio is CRITICAL: CRITICAL: puppet fail [03:51:44] PROBLEM - puppet last run on mw1179 is CRITICAL: CRITICAL: puppet fail [03:52:23] PROBLEM - puppet last run on mw2176 is CRITICAL: CRITICAL: Puppet has 1 failures [03:52:33] PROBLEM - puppet last run on db2029 is CRITICAL: CRITICAL: Puppet has 1 failures [04:01:33] RECOVERY - puppet last run on mw1097 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [04:02:02] RECOVERY - puppet last run on nescio is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [04:02:03] RECOVERY - puppet last run on mw1248 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [04:05:34] PROBLEM - Start and verify pages via webservices on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - 187 bytes in 10.825 second response time [04:09:23] RECOVERY - Start and verify pages via webservices on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 166 bytes in 7.547 second response time [04:17:43] RECOVERY - puppet last run on mw2176 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [04:19:03] RECOVERY - puppet last run on mw1179 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [04:19:53] RECOVERY - puppet last run on db2029 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [04:27:22] 
PROBLEM - puppet last run on cp3039 is CRITICAL: CRITICAL: puppet fail [04:51:23] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures [04:54:34] RECOVERY - puppet last run on cp3039 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [05:16:43] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:46:12] PROBLEM - puppet last run on mw1147 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:14] PROBLEM - puppet last run on mw1259 is CRITICAL: CRITICAL: Puppet has 1 failures [06:11:43] RECOVERY - puppet last run on mw1147 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [06:21:23] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures [06:30:03] PROBLEM - puppet last run on cp1068 is CRITICAL: CRITICAL: Puppet has 2 failures [06:30:22] PROBLEM - puppet last run on druid1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:30:33] PROBLEM - puppet last run on analytics1047 is CRITICAL: CRITICAL: Puppet has 1 failures [06:30:33] PROBLEM - puppet last run on mw1008 is CRITICAL: CRITICAL: Puppet has 1 failures [06:30:53] PROBLEM - puppet last run on cp3008 is CRITICAL: CRITICAL: Puppet has 2 failures [06:31:43] RECOVERY - puppet last run on mw1259 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:31:53] PROBLEM - puppet last run on mw1158 is CRITICAL: CRITICAL: Puppet has 1 failures [06:33:22] PROBLEM - puppet last run on mw2073 is CRITICAL: CRITICAL: Puppet has 1 failures [06:34:34] PROBLEM - puppet last run on mw2126 is CRITICAL: CRITICAL: Puppet has 1 failures [06:46:53] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:47:22] PROBLEM - tools homepage -admin tool- on tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Not Available - 530 bytes in 0.074 second 
response time [06:55:53] RECOVERY - puppet last run on cp1068 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [06:56:14] RECOVERY - puppet last run on analytics1047 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [06:56:22] RECOVERY - puppet last run on mw1008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:56:42] RECOVERY - puppet last run on cp3008 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [06:57:13] RECOVERY - tools homepage -admin tool- on tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 3669 bytes in 0.057 second response time [06:57:33] RECOVERY - puppet last run on mw1158 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [06:58:03] RECOVERY - puppet last run on druid1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:58:22] RECOVERY - puppet last run on mw2126 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:59:02] RECOVERY - puppet last run on mw2073 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:59:23] PROBLEM - puppet last run on ms-be2020 is CRITICAL: CRITICAL: Puppet has 1 failures [07:25:22] RECOVERY - puppet last run on ms-be2020 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [08:15:09] PROBLEM - Outgoing network saturation on labstore1001 is CRITICAL: CRITICAL: 17.24% of data above the critical threshold [106250000.0] [08:49:35] RECOVERY - Outgoing network saturation on labstore1001 is OK: OK: Less than 10.00% above the threshold [93750000.0] [09:28:25] PROBLEM - Outgoing network saturation on labstore1001 is CRITICAL: CRITICAL: 10.34% of data above the critical threshold [106250000.0] [09:51:36] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 5 failures [10:05:45] RECOVERY - Outgoing network saturation on labstore1001 is 
OK: OK: Less than 10.00% above the threshold [93750000.0] [10:17:05] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:45:13] RECOVERY - Start and verify pages via webservices on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 166 bytes in 15.036 second response time [13:30:24] PROBLEM - Start a job and verify on Precise on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - 187 bytes in 0.213 second response time [13:36:25] RECOVERY - Start a job and verify on Precise on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 166 bytes in 0.484 second response time [13:41:15] PROBLEM - Outgoing network saturation on labstore1001 is CRITICAL: CRITICAL: 17.24% of data above the critical threshold [106250000.0] [14:21:44] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 1 failures [14:47:25] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [14:55:04] !log `mwscript initSiteStats.php --wiki csbwiki --update` (T137060) [14:55:06] T137060: Update statistics count on csbwiki - https://phabricator.wikimedia.org/T137060 [14:55:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [14:58:15] RECOVERY - Outgoing network saturation on labstore1001 is OK: OK: Less than 10.00% above the threshold [93750000.0] [15:59:33] PROBLEM - puppet last run on db1084 is CRITICAL: CRITICAL: Puppet has 1 failures [16:20:26] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures [16:24:46] RECOVERY - puppet last run on db1084 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [16:40:36] PROBLEM - Outgoing network saturation on labstore1001 is CRITICAL: CRITICAL: 13.33% of data above the critical threshold [106250000.0] [16:47:36] RECOVERY - puppet last run on labvirt1010 is OK: OK: 
Puppet is currently enabled, last run 33 seconds ago with 0 failures [17:13:45] RECOVERY - Outgoing network saturation on labstore1001 is OK: OK: Less than 10.00% above the threshold [93750000.0] [17:38:43] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0] [17:46:12] PROBLEM - Codfw HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] [17:50:03] RECOVERY - Codfw HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [17:50:13] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [18:39:39] PROBLEM - Outgoing network saturation on labstore1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [106250000.0] [20:21:54] RECOVERY - Outgoing network saturation on labstore1001 is OK: OK: Less than 10.00% above the threshold [93750000.0] [20:34:34] (PS1) Eranroz: Adding support for some common imports [mediawiki-config] - https://gerrit.wikimedia.org/r/292883 (https://phabricator.wikimedia.org/T137074) [20:39:46] (CR) Dereckson: [C: -1] Adding support for some common imports (2 comments) [mediawiki-config] - https://gerrit.wikimedia.org/r/292883 (https://phabricator.wikimedia.org/T137074) (owner: Eranroz) [21:13:44] (PS2) Eranroz: Adding support for some common imports [mediawiki-config] - https://gerrit.wikimedia.org/r/292883 (https://phabricator.wikimedia.org/T137074) [21:15:37] (CR) Eranroz: [C: -1] "WIP - still need to meet community consensus and validate with the community there are no other important sources for import." 
[mediawiki-config] - https://gerrit.wikimedia.org/r/292883 (https://phabricator.wikimedia.org/T137074) (owner: Eranroz) [21:20:45] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures [21:22:15] PROBLEM - Outgoing network saturation on labstore1001 is CRITICAL: CRITICAL: 27.59% of data above the critical threshold [106250000.0] [21:46:05] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [22:02:31] RECOVERY - Outgoing network saturation on labstore1001 is OK: OK: Less than 10.00% above the threshold [93750000.0] [22:52:28] PROBLEM - Outgoing network saturation on labstore1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [106250000.0]