[00:35:42] (PS1) Ladsgroup: dumps: Modernize design of the index page [puppet] - https://gerrit.wikimedia.org/r/334856 (https://phabricator.wikimedia.org/T155697)
[00:42:47] (CR) Ladsgroup: "If you like the design I will make a patch for all other pages" [puppet] - https://gerrit.wikimedia.org/r/334856 (https://phabricator.wikimedia.org/T155697) (owner: Ladsgroup)
[01:55:38] PROBLEM - puppet last run on labtestcontrol2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[02:17:14] !log l10nupdate@tin scap sync-l10n completed (1.29.0-wmf.9) (duration: 06m 06s)
[02:17:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:21:37] !log l10nupdate@tin ResourceLoader cache refresh completed at Sun Jan 29 02:21:37 UTC 2017 (duration 4m 23s)
[02:21:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:23:38] RECOVERY - puppet last run on labtestcontrol2001 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[02:37:18] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1005 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[02:37:58] PROBLEM - Check systemd state on labstore1005 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[02:39:18] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1005 is OK: OK - maintain-dbusers is active
[02:39:58] RECOVERY - Check systemd state on labstore1005 is OK: OK - running: The system is fully operational
[02:47:28] PROBLEM - Redis replication status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 641 600 - REDIS 2.8.17 on 10.192.48.44:6479 has 1 databases (db0) with 3140636 keys, up 89 days 18 hours - replication_delay is 641
[02:47:38] PROBLEM - Redis replication status tcp_6479 on rdb2005 is CRITICAL: CRITICAL: replication_delay is 647 600 - REDIS 2.8.17 on 10.192.32.133:6479 has 1 databases (db0) with 3140678 keys, up 89 days 18 hours - replication_delay is 647
[02:51:38] RECOVERY - Redis replication status tcp_6479 on rdb2005 is OK: OK: REDIS 2.8.17 on 10.192.32.133:6479 has 1 databases (db0) with 3133521 keys, up 89 days 18 hours - replication_delay is 0
[02:53:28] RECOVERY - Redis replication status tcp_6479 on rdb2006 is OK: OK: REDIS 2.8.17 on 10.192.48.44:6479 has 1 databases (db0) with 3133286 keys, up 89 days 18 hours - replication_delay is 0
[03:02:17] Operations, MediaWiki-Database, MediaWiki-General-or-Unknown, Wikimedia-General-or-Unknown: 504 Gateway Time-out on https://de.wikipedia.org/w/index.php?title=Wikipedia:L%C3%B6schkandidaten&action=info - https://phabricator.wikimedia.org/T156537#2979943 (zhuyifei1999)
[03:13:18] PROBLEM - puppet last run on druid1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[03:20:28] PROBLEM - Redis replication status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 660 600 - REDIS 2.8.17 on 10.192.48.44:6479 has 1 databases (db0) with 3134152 keys, up 89 days 18 hours - replication_delay is 660
[03:25:38] PROBLEM - Redis replication status tcp_6479 on rdb2005 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 10.192.32.133 on port 6479
[03:26:38] RECOVERY - Redis replication status tcp_6479 on rdb2005 is OK: OK: REDIS 2.8.17 on 10.192.32.133:6479 has 1 databases (db0) with 3133676 keys, up 89 days 18 hours - replication_delay is 51
[03:30:28] RECOVERY - Redis replication status tcp_6479 on rdb2006 is OK: OK: REDIS 2.8.17 on 10.192.48.44:6479 has 1 databases (db0) with 3133825 keys, up 89 days 19 hours - replication_delay is 0
[03:32:48] PROBLEM - puppet last run on eeden is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 2 minutes ago with 2 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz],File[/usr/share/GeoIP/GeoIPCity.dat.test]
[03:33:18] PROBLEM - puppet last run on mw1286 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz]
[03:33:18] PROBLEM - puppet last run on mw1252 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz]
[03:34:26] (PS5) Juniorsys: Linting fixes (Multiple modules) [puppet] - https://gerrit.wikimedia.org/r/334276 (https://phabricator.wikimedia.org/T93645)
[03:41:18] RECOVERY - puppet last run on druid1003 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[03:44:26] (PS5) Juniorsys: dnsrecursor: Linting fixes [puppet] - https://gerrit.wikimedia.org/r/334279 (https://phabricator.wikimedia.org/T93645)
[03:47:12] (CR) Juniorsys: "@Hashar Should be fixed" [puppet] - https://gerrit.wikimedia.org/r/334279 (https://phabricator.wikimedia.org/T93645) (owner: Juniorsys)
[04:00:18] RECOVERY - puppet last run on mw1252 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[04:00:48] RECOVERY - puppet last run on eeden is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[04:01:18] RECOVERY - puppet last run on mw1286 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[04:20:00] Operations, Labs, wikitech.wikimedia.org: Investigate issues with wikitech-static.wikimedia.org - https://phabricator.wikimedia.org/T156570#2979953 (MZMcBride)
[04:40:25] (PS5) Juniorsys: extdist: Linting fixes [puppet] - https://gerrit.wikimedia.org/r/334284
[04:44:41] (CR) Juniorsys: "@Legoktm I have done this, although it's not recommended by any style guide (comma after last resource is, comma after last array element " [puppet] - https://gerrit.wikimedia.org/r/334284 (owner: Juniorsys)
[05:29:13] (Abandoned) Juniorsys: Linting changes (multiple) [puppet] - https://gerrit.wikimedia.org/r/334295 (https://phabricator.wikimedia.org/T93645) (owner: Juniorsys)
[05:49:18] PROBLEM - puppet last run on db1016 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[05:56:38] (PS6) Juniorsys: extdist: Linting fixes [puppet] - https://gerrit.wikimedia.org/r/334284
[06:18:18] RECOVERY - puppet last run on db1016 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[06:25:08] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1312
[06:30:08] RECOVERY - check_mysql on frdb2001 is OK: Uptime: 223077 Threads: 1 Questions: 2427637 Slow queries: 950 Opens: 1724 Flush tables: 1 Open tables: 556 Queries per second avg: 10.882 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[06:41:18] PROBLEM - Check HHVM threads for leakage on mw1168 is CRITICAL: CRITICAL: HHVM has more than double threads running or queued than apache has busy workers
[07:20:08] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2296
[07:25:08] RECOVERY - check_mysql on frdb2001 is OK: Uptime: 226377 Threads: 1 Questions: 2431538 Slow queries: 950 Opens: 1725 Flush tables: 1 Open tables: 557 Queries per second avg: 10.741 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[07:35:18] Operations, Labs, wikitech.wikimedia.org: Investigate issues with wikitech-static.wikimedia.org - https://phabricator.wikimedia.org/T156570#2979953 (Legoktm) Given I don't think you would have been able to login anyways. Maybe a sitenotic...
[07:48:48] PROBLEM - puppet last run on cp3030 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[07:55:18] RECOVERY - Check HHVM threads for leakage on mw1168 is OK: OK
[08:17:48] RECOVERY - puppet last run on cp3030 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures
[08:31:58] PROBLEM - puppet last run on analytics1039 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[09:00:58] RECOVERY - puppet last run on analytics1039 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[09:10:34] (PS1) Urbanecm: Enable RSS extension at metawiki, enable one feed [mediawiki-config] - https://gerrit.wikimedia.org/r/334864 (https://phabricator.wikimedia.org/T155830)
[09:41:24] (PS2) Ladsgroup: dumps: Modernize design of the index page [puppet] - https://gerrit.wikimedia.org/r/334856 (https://phabricator.wikimedia.org/T155697)
[09:43:21] (PS3) Ladsgroup: dumps: Modernize design of the index page [puppet] - https://gerrit.wikimedia.org/r/334856 (https://phabricator.wikimedia.org/T155697)
[09:58:38] PROBLEM - puppet last run on mc1022 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:27:38] RECOVERY - puppet last run on mc1022 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[12:28:14] (PS10) Paladox: Phabricator: Fix phd init script, also use systemd script if the os is cable of it [puppet] - https://gerrit.wikimedia.org/r/333358
[12:28:26] (PS11) Paladox: Phabricator: Fix phd init script, also use systemd script if the os is cable of it [puppet] - https://gerrit.wikimedia.org/r/333358
[12:29:00] (PS12) Paladox: Phabricator: Fix phd init script, also use systemd script if the os is cable of it [puppet] - https://gerrit.wikimedia.org/r/333358
[12:30:05] (CR) Paladox: [C: 1] "@Giuseppe Lavagetto / @_joe_ hi, could you remove your -2 please?" [puppet] - https://gerrit.wikimedia.org/r/333358 (owner: Paladox)
[12:33:52] (PS13) Paladox: Phabricator: Fix phd init script, also use systemd script if the os is cable of it [puppet] - https://gerrit.wikimedia.org/r/333358
[12:38:20] (PS10) Paladox: Gerrit: Add a systemd init script fro gerrit [debs/gerrit] - https://gerrit.wikimedia.org/r/333475
[12:38:42] (CR) Paladox: Gerrit: Add a systemd init script fro gerrit (4 comments) [debs/gerrit] - https://gerrit.wikimedia.org/r/333475 (owner: Paladox)
[12:39:09] (CR) Paladox: "@Muehlenhoff i wonder how do we add support for restarting gerrit in the systemd file please?" [debs/gerrit] - https://gerrit.wikimedia.org/r/333475 (owner: Paladox)
[12:58:48] PROBLEM - puppet last run on cp3048 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[13:26:48] RECOVERY - puppet last run on cp3048 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[14:03:22] (PS11) Paladox: Gerrit: Add a systemd init script fro gerrit [debs/gerrit] - https://gerrit.wikimedia.org/r/333475
[14:03:50] (CR) Paladox: "I found someone else did a service file for gerrit here https://aur.archlinux.org/cgit/aur.git/tree/gerrit.systemd?h=gerrit :)" [debs/gerrit] - https://gerrit.wikimedia.org/r/333475 (owner: Paladox)
[14:13:14] (CR) Paladox: [C: 1] "Tested it on gerrit-test3, it started and stopped fine." [debs/gerrit] - https://gerrit.wikimedia.org/r/333475 (owner: Paladox)
[14:55:28] PROBLEM - puppet last run on elastic1021 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[14:58:39] Operations, Labs, Tool-Labs, Traffic, HTTPS: Detect tools.wmflabs.org tools which are HTTP-only - https://phabricator.wikimedia.org/T128409#2980398 (zhuyifei1999) >>! In T128409#2233237, @tom29739 wrote: > In Tools the tool never 'sees' http because everything goes through the proxy: They d...
[15:24:28] RECOVERY - puppet last run on elastic1021 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[16:23:32] Operations, OTRS: clean up non-working otrs email addresses - https://phabricator.wikimedia.org/T84044#2980476 (Rjd0060) Open>Resolved a:Rjd0060 These "non-working" email addresses are actually used for internal mail sorting within the system. While you may not be able to directly email thes...
[18:00:47] Operations, Gerrit, Release-Engineering-Team: Enable the git:// protocole on gerrit - https://phabricator.wikimedia.org/T156597#2980539 (Paladox)
[20:07:18] PROBLEM - puppet last run on mw1277 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[20:20:48] PROBLEM - puppet last run on dbmonitor2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[20:25:48] PROBLEM - puppet last run on lvs3001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[20:35:18] RECOVERY - puppet last run on mw1277 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[20:47:48] RECOVERY - puppet last run on dbmonitor2001 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[20:54:48] RECOVERY - puppet last run on lvs3001 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[21:37:38] PROBLEM - puppet last run on mw1255 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[22:05:38] RECOVERY - puppet last run on mw1255 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[22:12:15] Operations, Gerrit, Release-Engineering-Team: Enable the git:// protocole on gerrit - https://phabricator.wikimedia.org/T156597#2980539 (lfaraone) Is this actually something worth implementing? HTTPS should be preferred as a transport in almost every case -- the performance downsides are low, and the...
[22:37:18] PROBLEM - puppet last run on maps1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[23:04:28] PROBLEM - puppet last run on ms-be1026 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[23:06:18] RECOVERY - puppet last run on maps1001 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[23:33:28] RECOVERY - puppet last run on ms-be1026 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures