[00:29:59] PROBLEM - puppet last run on mw2157 is CRITICAL: CRITICAL: Puppet has 1 failures
[00:55:08] RECOVERY - puppet last run on mw2157 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[01:09:48] PROBLEM - puppet last run on cp1047 is CRITICAL: CRITICAL: Puppet has 1 failures
[01:34:58] RECOVERY - puppet last run on cp1047 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[01:41:18] PROBLEM - puppet last run on elastic2016 is CRITICAL: CRITICAL: puppet fail
[01:54:29] PROBLEM - puppet last run on cp3013 is CRITICAL: CRITICAL: puppet fail
[02:03:08] PROBLEM - puppet last run on cp3003 is CRITICAL: CRITICAL: Puppet has 1 failures
[02:08:28] RECOVERY - puppet last run on elastic2016 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[02:20:39] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.10) (duration: 08m 24s)
[02:20:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:21:47] RECOVERY - puppet last run on cp3013 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[02:26:18] PROBLEM - Ensure legal html en.m.wp on en.m.wikipedia.org is CRITICAL: a\shref=\/\/wikimediafoundation\.org\/wiki\/Privacy_policy\sclass=\S+\stitle=wmf:Privacy\spolicyPrivacy/a html not found
[02:26:21] !log l10nupdate@tin ResourceLoader cache refresh completed at Sun Jul 17 02:26:21 UTC 2016 (duration 5m 42s)
[02:26:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:30:08] RECOVERY - puppet last run on cp3003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[03:02:38] PROBLEM - mobileapps endpoints health on scb1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:04:38] RECOVERY - mobileapps endpoints health on scb1001 is OK: All endpoints are healthy
[03:11:38] PROBLEM - puppet last run on cp3004 is CRITICAL: CRITICAL: puppet fail
[03:38:49] RECOVERY - puppet last run on cp3004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[03:47:58] PROBLEM - puppet last run on db2059 is CRITICAL: CRITICAL: puppet fail
[03:52:19] PROBLEM - puppet last run on cp4005 is CRITICAL: CRITICAL: puppet fail
[04:07:28] (PS1) Kharkiv07: Remove unneed permissions from enwiki bureaucrats [mediawiki-config] - https://gerrit.wikimedia.org/r/299354 (https://phabricator.wikimedia.org/T140550)
[04:15:08] RECOVERY - puppet last run on db2059 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[04:19:28] RECOVERY - puppet last run on cp4005 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[05:05:48] (CR) Music1201: [C: 1] Remove unneed permissions from enwiki bureaucrats [mediawiki-config] - https://gerrit.wikimedia.org/r/299354 (https://phabricator.wikimedia.org/T140550) (owner: Kharkiv07)
[05:06:41] PROBLEM - mobileapps endpoints health on scb2001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:08:31] RECOVERY - mobileapps endpoints health on scb2001 is OK: All endpoints are healthy
[05:17:40] (PS2) Kharkiv07: Remove unneed permissions from enwiki bureaucrats [mediawiki-config] - https://gerrit.wikimedia.org/r/299354 (https://phabricator.wikimedia.org/T140550)
[06:06:59] PROBLEM - mobileapps endpoints health on scb1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:08:58] RECOVERY - mobileapps endpoints health on scb1002 is OK: All endpoints are healthy
[06:25:28] PROBLEM - mobileapps endpoints health on scb1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:29:18] RECOVERY - mobileapps endpoints health on scb1001 is OK: All endpoints are healthy
[06:31:10] PROBLEM - puppet last run on mw2208 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:19] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:31:28] PROBLEM - puppet last run on cp2013 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:39] PROBLEM - puppet last run on mw2129 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:31:49] PROBLEM - puppet last run on mw1158 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:31:58] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:20] PROBLEM - puppet last run on mw2207 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:52:31] (PS2) ArielGlenn: move dependent recombine jobs into same invocation of dump script [puppet] - https://gerrit.wikimedia.org/r/297578 (https://phabricator.wikimedia.org/T139449)
[06:56:00] RECOVERY - puppet last run on mw2208 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[06:56:39] RECOVERY - puppet last run on cp2013 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[06:56:59] RECOVERY - puppet last run on cp3017 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[06:57:00] RECOVERY - puppet last run on mw1135 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[06:57:10] RECOVERY - puppet last run on mw2207 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[06:58:00] RECOVERY - puppet last run on mw2129 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:49] RECOVERY - puppet last run on mw1158 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:00:06] (CR) ArielGlenn: [C: 2] move dependent recombine jobs into same invocation of dump script [puppet] - https://gerrit.wikimedia.org/r/297578 (https://phabricator.wikimedia.org/T139449) (owner: ArielGlenn)
[07:03:53] (PS2) ArielGlenn: add batching config setting for xml stubs dumps [puppet] - https://gerrit.wikimedia.org/r/297427
[07:09:41] (CR) ArielGlenn: [C: 2] add batching config setting for xml stubs dumps [puppet] - https://gerrit.wikimedia.org/r/297427 (owner: ArielGlenn)
[07:14:21] (PS2) ArielGlenn: direct output from cron job for full dumps to log files [puppet] - https://gerrit.wikimedia.org/r/289406
[07:28:28] (PS3) ArielGlenn: direct output from cron job for full dumps to log files [puppet] - https://gerrit.wikimedia.org/r/289406
[07:52:23] PROBLEM - puppet last run on serpens is CRITICAL: CRITICAL: puppet fail
[07:53:42] PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS1299/IPv6: Active
[08:01:32] PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS1299/IPv6: Active, AS1299/IPv4: Connect
[08:02:13] RECOVERY - Router interfaces on cr2-eqiad is OK: OK: host 208.80.154.197, interfaces up: 202, down: 0, dormant: 0, excluded: 0, unused: 0
[08:04:02] PROBLEM - IPv4 ping to eqiad on ripe-atlas-eqiad is CRITICAL: CRITICAL - failed 122 probes of 403 (alerts on 19) - https://atlas.ripe.net/measurements/1790945/#!map
[08:08:22] PROBLEM - mobileapps endpoints health on scb2002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:10:01] RECOVERY - IPv4 ping to eqiad on ripe-atlas-eqiad is OK: OK - failed 18 probes of 403 (alerts on 19) - https://atlas.ripe.net/measurements/1790945/#!map
[08:10:21] RECOVERY - mobileapps endpoints health on scb2002 is OK: All endpoints are healthy
[08:20:11] PROBLEM - puppet last run on mw1253 is CRITICAL: CRITICAL: Puppet has 1 failures
[08:43:23] PROBLEM - IPv4 ping to eqiad on ripe-atlas-eqiad is CRITICAL: CRITICAL - failed 32 probes of 403 (alerts on 19) - https://atlas.ripe.net/measurements/1790945/#!map
[08:49:03] RECOVERY - puppet last run on mw1253 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[08:55:12] RECOVERY - IPv4 ping to eqiad on ripe-atlas-eqiad is OK: OK - failed 9 probes of 403 (alerts on 19) - https://atlas.ripe.net/measurements/1790945/#!map
[09:17:13] PROBLEM - mobileapps endpoints health on scb1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:19:11] RECOVERY - mobileapps endpoints health on scb1001 is OK: All endpoints are healthy
[10:03:35] PROBLEM - mobileapps endpoints health on scb2002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:05:16] RECOVERY - mobileapps endpoints health on scb2002 is OK: All endpoints are healthy
[10:18:55] RECOVERY - puppet last run on serpens is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[10:23:35] PROBLEM - Labs LDAP on serpens is CRITICAL: Could not bind to the LDAP server
[10:31:13] !log restart slapd on serpens - T130593
[10:31:14] T130593: investigate slapd memory leak - https://phabricator.wikimedia.org/T130593
[10:31:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[10:31:35] RECOVERY - Labs LDAP on serpens is OK: LDAP OK - 0.112 seconds response time
[11:21:54] PROBLEM - puppet last run on mw1211 is CRITICAL: CRITICAL: Puppet has 1 failures
[11:47:34] RECOVERY - puppet last run on mw1211 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[13:17:51] (PS4) ArielGlenn: direct output from cron job for full dumps to log files [puppet] - https://gerrit.wikimedia.org/r/289406
[13:21:16] (PS5) ArielGlenn: direct output from cron job for full dumps to log files [puppet] - https://gerrit.wikimedia.org/r/289406
[13:23:25] (PS6) ArielGlenn: direct output from cron job for full dumps to log files [puppet] - https://gerrit.wikimedia.org/r/289406
[13:24:56] (CR) ArielGlenn: [C: 2] direct output from cron job for full dumps to log files [puppet] - https://gerrit.wikimedia.org/r/289406 (owner: ArielGlenn)
[13:35:18] (PS3) Florianschmidtwelzow: Remove unneed permissions from enwiki bureaucrats [mediawiki-config] - https://gerrit.wikimedia.org/r/299354 (https://phabricator.wikimedia.org/T140550) (owner: Kharkiv07)
[13:35:30] (CR) Florianschmidtwelzow: [C: 1] "looks ok, technically." [mediawiki-config] - https://gerrit.wikimedia.org/r/299354 (https://phabricator.wikimedia.org/T140550) (owner: Kharkiv07)
[13:46:36] (PS3) ArielGlenn: in configs, allow comma separated list of files of wikis to be skipped [dumps] - https://gerrit.wikimedia.org/r/289567
[13:47:23] (CR) ArielGlenn: [C: 2] in configs, allow comma separated list of files of wikis to be skipped [dumps] - https://gerrit.wikimedia.org/r/289567 (owner: ArielGlenn)
[13:53:34] PROBLEM - puppet last run on bast3001 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:58:11] (Abandoned) ArielGlenn: add option to onallwikis to run from a base wiki, with wikis as args [dumps] (ariel) - https://gerrit.wikimedia.org/r/276956 (owner: ArielGlenn)
[14:03:17] (PS2) ArielGlenn: misc tiny thumb scripts: pylint, pep8 [dumps] (ariel) - https://gerrit.wikimedia.org/r/280107
[14:04:13] (CR) ArielGlenn: [C: 2] misc tiny thumb scripts: pylint, pep8 [dumps] (ariel) - https://gerrit.wikimedia.org/r/280107 (owner: ArielGlenn)
[14:11:02] (PS2) ArielGlenn: thumbFilesSizesCounts: full pylint and pep8 [dumps] (ariel) - https://gerrit.wikimedia.org/r/280105
[14:12:22] (CR) ArielGlenn: [C: 2] thumbFilesSizesCounts: full pylint and pep8 [dumps] (ariel) - https://gerrit.wikimedia.org/r/280105 (owner: ArielGlenn)
[14:17:24] RECOVERY - puppet last run on bast3001 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures
[14:19:21] (PS2) ArielGlenn: thumbPxSize: full pylint and pep8 [dumps] (ariel) - https://gerrit.wikimedia.org/r/280106
[14:19:35] (PS1) MarcoAurelio: Configuration changes for he.wikinews.org [mediawiki-config] - https://gerrit.wikimedia.org/r/299446 (https://phabricator.wikimedia.org/T140544)
[14:20:25] PROBLEM - mobileapps endpoints health on scb1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:21:23] (CR) Luke081515: [C: 1] Configuration changes for he.wikinews.org [mediawiki-config] - https://gerrit.wikimedia.org/r/299446 (https://phabricator.wikimedia.org/T140544) (owner: MarcoAurelio)
[14:24:24] RECOVERY - mobileapps endpoints health on scb1002 is OK: All endpoints are healthy
[14:33:31] (CR) Luke081515: [C: 1] Remove unneed permissions from enwiki bureaucrats [mediawiki-config] - https://gerrit.wikimedia.org/r/299354 (https://phabricator.wikimedia.org/T140550) (owner: Kharkiv07)
[14:42:27] (CR) ArielGlenn: [C: 2] thumbPxSize: full pylint and pep8 [dumps] (ariel) - https://gerrit.wikimedia.org/r/280106 (owner: ArielGlenn)
[15:00:26] Operations, Discovery, Discovery-Search-Backlog, Dumps-Generation, and 2 others: Link "current" to last dump set on cirrussearch get a 404 - https://phabricator.wikimedia.org/T138176#2469135 (ArielGlenn) We'll know if this is fixed on Tues or Wed when the next run completes.
[15:03:46] hi please join #wikimedia-offtopic for off topic messages. This is the official channel now for off topic messages.
[16:19:38] PROBLEM - puppet last run on cp3010 is CRITICAL: CRITICAL: Puppet has 1 failures
[16:41:49] (PS1) Paladox: Add some colors to the site table on changes [puppet] - https://gerrit.wikimedia.org/r/299447
[16:42:23] (PS2) Paladox: Add some colors to the site table on changes [puppet] - https://gerrit.wikimedia.org/r/299447
[16:42:47] (PS3) Paladox: Add some colors to the site table on changes [puppet] - https://gerrit.wikimedia.org/r/299447
[16:44:19] (CR) Paladox: "Please see" [puppet] - https://gerrit.wikimedia.org/r/299447 (owner: Paladox)
[16:45:48] RECOVERY - puppet last run on cp3010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:01:21] (PS1) ArielGlenn: lock wikis for dump runs by date, permitting runs across multiple dates [dumps] - https://gerrit.wikimedia.org/r/299448 (https://phabricator.wikimedia.org/T126341)
[17:09:18] PROBLEM - mobileapps endpoints health on scb1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:11:17] RECOVERY - mobileapps endpoints health on scb1002 is OK: All endpoints are healthy
[17:47:23] Hmm
[17:47:35] Database error on blocking
[17:47:51] "Function: IndexPager::buildQueryInfo (LogPager)"
[17:50:41] Bsadowski1: at which page?
[17:53:53] https://en.wikipedia.org/wiki/Special:Block/The_Unblockable_Wonder
[18:58:59] PROBLEM - mobileapps endpoints health on scb1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[19:04:59] RECOVERY - mobileapps endpoints health on scb1002 is OK: All endpoints are healthy
[19:16:30] PROBLEM - mobileapps endpoints health on scb1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[19:18:29] RECOVERY - mobileapps endpoints health on scb1001 is OK: All endpoints are healthy
[19:54:26] (CR) Kharkiv07: [C: 1] Enable global abuse filters on ptwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/299111 (https://phabricator.wikimedia.org/T140395) (owner: Urbanecm)
[20:14:50] PROBLEM - mobileapps endpoints health on scb1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:16:40] RECOVERY - mobileapps endpoints health on scb1002 is OK: All endpoints are healthy
[20:59:09] PROBLEM - Host db2056 is DOWN: PING CRITICAL - Packet loss = 100%
[21:31:05] PROBLEM - mobileapps endpoints health on scb1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[21:33:05] RECOVERY - mobileapps endpoints health on scb1002 is OK: All endpoints are healthy
[21:35:42] Operations, Commons, Multimedia: Deploy a PHP and HHVM patch (Exif values retrieved incorrectly if they appear before IFD) - https://phabricator.wikimedia.org/T140419#2469609 (matmarex)
[21:58:03] (PS2) ArielGlenn: lock wikis for dump runs by date, permitting runs across multiple dates [dumps] - https://gerrit.wikimedia.org/r/299448 (https://phabricator.wikimedia.org/T126341)
[22:01:41] Operations, Commons, Multimedia: Deploy a PHP and HHVM patch (Exif values retrieved incorrectly if they appear before IFD) - https://phabricator.wikimedia.org/T140419#2464014 (matmarex)
[22:26:27] db2056 seems down
[22:29:26] [14132683.464355] INFO: rcu_sched detected stalls on CPUs/tasks: { 0} (detected by 1, t=1391774 jiffies, g=137737817, c=137737816, q=10779)
[22:30:26] as it is not vital, I will investigate tomorrow morning
[22:35:25] PROBLEM - puppet last run on ms-be2024 is CRITICAL: CRITICAL: puppet fail
[23:03:19] RECOVERY - puppet last run on ms-be2024 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[23:49:40] PROBLEM - puppet last run on cp3010 is CRITICAL: CRITICAL: Puppet has 1 failures