[00:38:18] Operations, MediaWiki-Cache, MediaWiki-JobQueue, MediaWiki-JobRunner, and 3 others: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan - https://phabricator.wikimedia.org/T124418#2390997 (BBlack) @aaron and @gwicke - both patches sound promising, thanks for digging into this topic!
[01:17:32] (PS1) Tim Landscheidt: labstore: Fix home directory calculation for user accounts [puppet] - https://gerrit.wikimedia.org/r/295029 (https://phabricator.wikimedia.org/T138103)
[01:20:57] (CR) Tim Landscheidt: "I tested the concept with a trimmed-down copy of create-dbusers and pretty confident that it works, but did not test it as is." [puppet] - https://gerrit.wikimedia.org/r/295029 (https://phabricator.wikimedia.org/T138103) (owner: Tim Landscheidt)
[02:00:19] (CR) Yuvipanda: [C: 2] "Thanks for the patch! I'll merge and test this on first run (and revert if it causes problems)" [puppet] - https://gerrit.wikimedia.org/r/295029 (https://phabricator.wikimedia.org/T138103) (owner: Tim Landscheidt)
[02:26:08] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 04s)
[02:26:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:32:26] !log l10nupdate@tin ResourceLoader cache refresh completed at Sat Jun 18 02:32:26 UTC 2016 (duration 6m 18s)
[02:32:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[03:37:36] (CR) Tim Landscheidt: Use paged searches in ldaplist (1 comment) [puppet] - https://gerrit.wikimedia.org/r/262745 (owner: Muehlenhoff)
[05:15:29] PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 37, down: 1, dormant: 0, excluded: 0, unused: 0 xe-0/0/0: down - Core: cr2-codfw:xe-5/2/1 (Telia, IC-314534, 29ms) {#11375} [10Gbps wave]
[05:15:30] PROBLEM - Router interfaces on cr2-codfw is CRITICAL: CRITICAL: host 208.80.153.193, interfaces up: 120, down: 1, dormant: 0, excluded: 0, unused: 0 xe-5/2/1: down - Core: cr1-eqord:xe-0/0/0 (Telia, IC-314534, 24ms) {#10694} [10Gbps wave]
[05:47:07] PROBLEM - Disk space on elastic1024 is CRITICAL: DISK CRITICAL - free space: /var/lib/elasticsearch 80757 MB (15% inode=99%)
[05:51:16] RECOVERY - Disk space on elastic1024 is OK: DISK OK
[06:13:33] PROBLEM - Disk space on elastic1024 is CRITICAL: DISK CRITICAL - free space: /var/lib/elasticsearch 77943 MB (15% inode=99%)
[06:30:13] PROBLEM - puppet last run on cp1054 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:32] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:13] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:13] RECOVERY - Disk space on elastic1024 is OK: DISK OK
[06:33:02] RECOVERY - Router interfaces on cr2-codfw is OK: OK: host 208.80.153.193, interfaces up: 122, down: 0, dormant: 0, excluded: 0, unused: 0
[06:34:12] RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0
[06:34:33] PROBLEM - puppet last run on mw1158 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:43] PROBLEM - puppet last run on mw2073 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:36:12] PROBLEM - puppet last run on mw1110 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:48:52] PROBLEM - puppet last run on relforge1002 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:56:42] RECOVERY - puppet last run on mw1110 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[06:57:02] RECOVERY - puppet last run on mw1158 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures
[06:57:33] RECOVERY - puppet last run on cp1054 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:12] RECOVERY - puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:13] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:13] RECOVERY - puppet last run on mw2073 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[07:13:41] RECOVERY - puppet last run on relforge1002 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[08:38:48] Operations, Commons, media-storage: Update rsvg on the image scalers - https://phabricator.wikimedia.org/T112421#1634588 (Menner) Regression from 2.40.11 has been fixed in recent 2.40.16. I've compiled it on Ubuntu 14.04 and I've seen no issues on a short test. I've used http://ftp.gnome.org/pub/GNO...
[09:20:22] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Puppet has 2 failures
[09:47:42] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[10:55:20] Operations, MediaWiki-extensions-UniversalLanguageSelector, Wikimedia-SVG-rendering, I18n: MB Lateefi Fonts for Sindhi Wikipedia. - https://phabricator.wikimedia.org/T138136#2391310 (Dereckson)
[10:59:25] Operations, MediaWiki-extensions-UniversalLanguageSelector, Wikimedia-SVG-rendering, I18n: MB Lateefi Fonts for Sindhi Wikipedia. - https://phabricator.wikimedia.org/T138136#2391329 (Dereckson) @mehtab.ahmed Font author could be contacted by mail: majidbhurgri@gmail.com
[11:18:22] PROBLEM - Disk space on elastic1012 is CRITICAL: DISK CRITICAL - free space: /var/lib/elasticsearch 80659 MB (15% inode=99%)
[11:22:32] RECOVERY - Disk space on elastic1012 is OK: DISK OK
[11:52:49] Operations, Discovery, Kartographer, Maps, and 3 others: Use https instead of http when linking to openstreetmap.org - https://phabricator.wikimedia.org/T138126#2391352 (Aklapper)
[12:08:13] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 716 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5296830 keys - replication_delay is 716
[12:20:34] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5252560 keys - replication_delay is 0
[12:35:41] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 630 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5252942 keys - replication_delay is 630
[12:50:02] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5253843 keys - replication_delay is 55
[14:41:18] PROBLEM - HHVM rendering on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:41:29] PROBLEM - Apache HTTP on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:42:19] PROBLEM - puppet last run on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:43:10] PROBLEM - SSH on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:43:29] PROBLEM - configured eth on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:43:50] PROBLEM - Check size of conntrack table on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:43:59] PROBLEM - nutcracker port on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:46:08] RECOVERY - nutcracker port on mw1114 is OK: TCP OK - 0.000 second response time on port 11212
[14:52:19] PROBLEM - nutcracker port on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:52:48] PROBLEM - DPKG on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:52:59] PROBLEM - Disk space on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:53:00] PROBLEM - HHVM processes on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:53:00] PROBLEM - salt-minion processes on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:54:39] PROBLEM - nutcracker process on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:54:39] PROBLEM - dhclient process on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:06:19] RECOVERY - DPKG on mw1114 is OK: All packages OK
[15:06:48] RECOVERY - HHVM rendering on mw1114 is OK: HTTP OK: HTTP/1.1 200 OK - 66703 bytes in 9.398 second response time
[15:06:59] RECOVERY - SSH on mw1114 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.7 (protocol 2.0)
[15:07:09] RECOVERY - configured eth on mw1114 is OK: OK - interfaces up
[15:07:28] RECOVERY - dhclient process on mw1114 is OK: PROCS OK: 0 processes with command name dhclient
[15:07:30] RECOVERY - Disk space on mw1114 is OK: DISK OK
[15:07:30] RECOVERY - HHVM processes on mw1114 is OK: PROCS OK: 25 processes with command name hhvm
[15:07:39] RECOVERY - nutcracker port on mw1114 is OK: TCP OK - 0.000 second response time on port 11212
[15:07:48] RECOVERY - salt-minion processes on mw1114 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[15:07:49] RECOVERY - Apache HTTP on mw1114 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 627 bytes in 0.129 second response time
[15:07:50] RECOVERY - nutcracker process on mw1114 is OK: PROCS OK: 1 process with UID = 110 (nutcracker), command name nutcracker
[15:08:08] RECOVERY - Check size of conntrack table on mw1114 is OK: OK: nf_conntrack is 10 % full
[15:10:59] PROBLEM - HHVM rendering on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:12:10] PROBLEM - Apache HTTP on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:12:49] PROBLEM - DPKG on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:13:28] PROBLEM - SSH on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:13:39] PROBLEM - configured eth on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:13:50] PROBLEM - dhclient process on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:13:59] PROBLEM - Disk space on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:13:59] PROBLEM - HHVM processes on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:14:09] PROBLEM - nutcracker port on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:14:10] PROBLEM - salt-minion processes on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:14:19] PROBLEM - nutcracker process on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:14:29] PROBLEM - Check size of conntrack table on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:24:59] RECOVERY - DPKG on mw1114 is OK: All packages OK
[15:25:38] RECOVERY - SSH on mw1114 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.7 (protocol 2.0)
[15:25:49] RECOVERY - configured eth on mw1114 is OK: OK - interfaces up
[15:26:00] RECOVERY - dhclient process on mw1114 is OK: PROCS OK: 0 processes with command name dhclient
[15:26:09] RECOVERY - HHVM processes on mw1114 is OK: PROCS OK: 25 processes with command name hhvm
[15:26:09] RECOVERY - Disk space on mw1114 is OK: DISK OK
[15:26:19] RECOVERY - nutcracker port on mw1114 is OK: TCP OK - 0.000 second response time on port 11212
[15:26:28] RECOVERY - salt-minion processes on mw1114 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[15:26:29] RECOVERY - nutcracker process on mw1114 is OK: PROCS OK: 1 process with UID = 110 (nutcracker), command name nutcracker
[15:26:39] RECOVERY - Check size of conntrack table on mw1114 is OK: OK: nf_conntrack is 0 % full
[15:26:58] RECOVERY - puppet last run on mw1114 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[16:31:22] Operations, Mail, OTRS: E-mail incorrectly forwarded to wm-cz OTRS e-mail - https://phabricator.wikimedia.org/T129743#2391484 (Urbanecm) p:Normal>High At least high priority because this breaks mail which come to info@wikimedia.cz. Please fix it soon.
[17:15:49] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 673 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5283539 keys - replication_delay is 673
[17:28:18] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5264469 keys - replication_delay is 0
[18:22:12] Operations, MediaWiki-extensions-UniversalLanguageSelector, Wikimedia-SVG-rendering, I18n: MB Lateefi Fonts for Sindhi Wikipedia. - https://phabricator.wikimedia.org/T138136#2391546 (mehtab.ahmed) I have sent email to author for permission.
[18:57:44] Operations, MediaWiki-extensions-UniversalLanguageSelector, Wikimedia-SVG-rendering, I18n: MB Lateefi Fonts for Sindhi Wikipedia. - https://phabricator.wikimedia.org/T138136#2391567 (Aklapper) Open>stalled p:Triage>Low Setting task status to STALLED until font is published under a...
[21:03:40] Operations, MediaWiki-extensions-UniversalLanguageSelector, Wikimedia-SVG-rendering, I18n: MB Lateefi Fonts for Sindhi Wikipedia. - https://phabricator.wikimedia.org/T138136#2391617 (Dereckson)
[21:09:19] Operations, MediaWiki-extensions-UniversalLanguageSelector, Wikimedia-SVG-rendering, I18n: MB Lateefi Fonts for Sindhi Wikipedia. - https://phabricator.wikimedia.org/T138136#2391622 (Dereckson) @mehtab.ahmed Could you also check http://software.sil.org/lateef/? There is a code sample at http://s...
[22:10:39] Operations, MediaWiki-extensions-UniversalLanguageSelector, Wikimedia-SVG-rendering, I18n: MB Lateefi Fonts for Sindhi Wikipedia. - https://phabricator.wikimedia.org/T138136#2391629 (mehtab.ahmed) Author is ready to give us permission he has asked for details.
[22:23:39] Operations, MediaWiki-extensions-UniversalLanguageSelector, Wikimedia-SVG-rendering, I18n: MB Lateefi Fonts for Sindhi Wikipedia. - https://phabricator.wikimedia.org/T138136#2391632 (Dereckson) I suggest this as an explanation link: http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=OFL#...
[22:48:18] (PS1) Tim Landscheidt: labstore: Fix LDAP query for project members [puppet] - https://gerrit.wikimedia.org/r/295099 (https://phabricator.wikimedia.org/T138102)
[22:49:47] (CR) Tim Landscheidt: "Tested the query separately successfully, but not the whole script as is." [puppet] - https://gerrit.wikimedia.org/r/295099 (https://phabricator.wikimedia.org/T138102) (owner: Tim Landscheidt)