[00:00:04] twentyafterfour: Dear anthropoid, the time has come. Please deploy Phabricator update (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170727T0000).
[00:01:02] Operations, monitoring, Patch-For-Review: Check for an oversized exim4 queue indicating mail delivery failures - https://phabricator.wikimedia.org/T133110#3476770 (Dzahn) Open>Resolved
[00:20:51] RECOVERY - MariaDB Slave Lag: s2 on dbstore1001 is OK: OK slave_sql_lag Replication lag: 89992.14 seconds
[00:35:27] Operations, monitoring: Monitor hardware thermal issues - https://phabricator.wikimedia.org/T125205#3476831 (faidon) Open>Resolved a:faidon So I thought about it a little bit and think we can resolve this after all. I don't know of any cases where temperatures are an issue but one that the cu...
[00:38:51] Operations, monitoring: Fix Icinga checks for test/decom servers - https://phabricator.wikimedia.org/T151632#3476834 (Dzahn) I have tried this twice using the simplistic approach with "profile::base::monitoring: false" but that does not actually work. by role::spare::system https://gerrit.wikimedia.org/...
[00:43:53] (03PS1) 10Smalyshev: Add setup for https://www.mediawiki.org/ontology [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368120 (https://phabricator.wikimedia.org/T171807) [02:02:28] (03PS1) 10Dzahn: base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 [02:03:32] (03CR) 10jerkins-bot: [V: 04-1] base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 (owner: 10Dzahn) [02:08:15] (03PS2) 10Dzahn: base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 [02:09:22] (03CR) 10jerkins-bot: [V: 04-1] base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 (owner: 10Dzahn) [02:10:14] 3 [02:13:09] (03PS3) 10Dzahn: base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 (https://phabricator.wikimedia.org/T151632) [02:18:04] (03PS4) 10Dzahn: base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 (https://phabricator.wikimedia.org/T151632) [02:20:10] (03PS10) 10Krinkle: varnish: Switch browsersec to use errorpage template [puppet] - 10https://gerrit.wikimedia.org/r/355338 (https://phabricator.wikimedia.org/T113114) [02:22:12] (03PS5) 10Dzahn: base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 (https://phabricator.wikimedia.org/T151632) [02:23:08] (03CR) 10jerkins-bot: [V: 04-1] base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 (https://phabricator.wikimedia.org/T151632) (owner: 10Dzahn) [02:28:11] (03PS6) 10Dzahn: base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 (https://phabricator.wikimedia.org/T151632) [02:39:11] 10Operations, 10Android-app-feature-Compilations, 10Traffic, 
10Wikipedia-Android-App-Backlog, 10Reading-Infrastructure-Team-Backlog (Kanban): Determine where to host zim files for the Android app - https://phabricator.wikimedia.org/T170843#3476960 (10Krinkle) [02:39:42] !log l10nupdate@tin scap sync-l10n completed (1.30.0-wmf.10) (duration: 14m 21s) [02:39:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:45:41] (03PS7) 10Dzahn: base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 (https://phabricator.wikimedia.org/T151632) [02:53:31] (03PS8) 10Dzahn: base::monitoring: make it possible to disable monitoring [puppet] - 10https://gerrit.wikimedia.org/r/368124 (https://phabricator.wikimedia.org/T151632) [02:56:48] (03CR) 10Dzahn: "http://puppet-compiler.wmflabs.org/7174/einsteinium.wikimedia.org/" [puppet] - 10https://gerrit.wikimedia.org/r/368124 (https://phabricator.wikimedia.org/T151632) (owner: 10Dzahn) [03:05:14] !log l10nupdate@tin scap sync-l10n completed (1.30.0-wmf.11) (duration: 06m 00s) [03:05:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:12:30] !log l10nupdate@tin ResourceLoader cache refresh completed at Thu Jul 27 03:12:30 UTC 2017 (duration 7m 16s) [03:12:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:25:41] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 768.00 seconds [03:51:51] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 191.11 seconds [04:28:09] (03CR) 10Krinkle: [C: 04-1] "Move long HTML line to a separate file, like Id77d23a442." 
[puppet] - 10https://gerrit.wikimedia.org/r/355338 (https://phabricator.wikimedia.org/T113114) (owner: 10Krinkle) [04:36:13] 10Operations, 10MediaWiki-Platform-Team, 10Performance-Team, 10Availability (Multiple-active-datacenters), and 4 others: Allow integration of data from etcd into the MediaWiki configuration - https://phabricator.wikimedia.org/T156924#3477012 (10Krinkle) [04:36:21] 10Operations, 10MediaWiki-Platform-Team, 10Performance-Team, 10Availability (Multiple-active-datacenters), and 4 others: Allow integration of data from etcd into the MediaWiki configuration - https://phabricator.wikimedia.org/T156924#3072673 (10Krinkle) [04:40:34] (03Abandoned) 10Krinkle: Revert "Bump cache epoch for Wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367853 (https://phabricator.wikimedia.org/T167784) (owner: 10Krinkle) [04:53:33] (03CR) 10Krinkle: "* https://github.com/wikimedia/operations-mediawiki-config/commits/HEAD/tests/noc-conf/NOCDblistTest.php" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: 10Jdlrobson) [05:59:02] PROBLEM - Check Varnish expiry mailbox lag on cp4015 is CRITICAL: CRITICAL: expiry mailbox lag is 2056358 [07:02:10] !log installing spice secuerity updates on trusty hosts (jessie already fixed) [07:02:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:29:02] RECOVERY - Check Varnish expiry mailbox lag on cp4015 is OK: OK: expiry mailbox lag is 1 [07:37:08] !log upgrading apache on planet1001 [07:37:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:52:24] 10Operations, 10ORES, 10Scap, 10Scoring-platform-team-Backlog, 10Release-Engineering-Team (Watching / External): ORES should use git-fat for wheel deployments - https://phabricator.wikimedia.org/T171619#3477154 (10fgiunchedi) cc'ing #operations too [07:53:27] (03CR) 10Elukey: "I am seeing a lot of DEBUG messages like the following in the syslog of some 
hosts:" [puppet] - 10https://gerrit.wikimedia.org/r/367710 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [07:55:58] (03PS2) 10Ema: VCL: mobile_redirect: unconditional https [puppet] - 10https://gerrit.wikimedia.org/r/367926 (owner: 10BBlack) [07:56:13] (03CR) 10Ema: [V: 032 C: 032] VCL: mobile_redirect: unconditional https [puppet] - 10https://gerrit.wikimedia.org/r/367926 (owner: 10BBlack) [08:10:53] 10Operations, 10Cloud-VPS, 10monitoring, 10Patch-For-Review, 10User-fgiunchedi: Diamond collectors collects NFS statistics on Cloud-VPS - https://phabricator.wikimedia.org/T171583#3477178 (10fgiunchedi) @zhuyifei1999 looks like this is fixed, confirmed? [08:12:57] !log installing apache security updates on graphite* [08:13:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:21:34] !log upload scap 3.6.0-1 - T127762 [08:21:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:21:45] T127762: Update Debian Package for Scap3 - https://phabricator.wikimedia.org/T127762 [08:23:51] PROBLEM - puppet last run on sca1003 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[scap] [08:25:21] PROBLEM - puppet last run on snapshot1005 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. 
Failed resources (up to 3 shown): Package[scap] [08:25:32] (03PS2) 10Filippo Giunchedi: Scap: bump version to 3.6.0-1 [puppet] - 10https://gerrit.wikimedia.org/r/367749 (https://phabricator.wikimedia.org/T127762) (owner: 10Thcipriani) [08:25:40] that's me, fixing as we speak [08:26:05] \O/ [08:26:54] (03PS1) 10Jcrespo: Promote db2048 as the new codfw s1 master, replacing db2016 [puppet] - 10https://gerrit.wikimedia.org/r/368140 (https://phabricator.wikimedia.org/T170662) [08:27:15] (03CR) 10Filippo Giunchedi: [C: 032] Scap: bump version to 3.6.0-1 [puppet] - 10https://gerrit.wikimedia.org/r/367749 (https://phabricator.wikimedia.org/T127762) (owner: 10Thcipriani) [08:28:04] !log disable puppet on db2016 and db2048 to prepare for switchover [08:28:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:29:11] (03PS1) 10Elukey: role::statistics::web: fix geowiki rsync [puppet] - 10https://gerrit.wikimedia.org/r/368143 [08:29:51] PROBLEM - puppet last run on snapshot1001 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[scap] [08:30:21] PROBLEM - puppet last run on mw2246 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. 
Failed resources (up to 3 shown): Package[scap] [08:31:01] RECOVERY - puppet last run on sca1003 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [08:31:22] PROBLEM - WDQS SPARQL on wdqs1001 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Temporarily Unavailable - 387 bytes in 0.001 second response time [08:31:31] PROBLEM - WDQS HTTP on wdqs1001 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Temporarily Unavailable - 387 bytes in 0.001 second response time [08:31:52] RECOVERY - puppet last run on snapshot1001 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [08:32:21] RECOVERY - puppet last run on mw2246 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [08:32:31] RECOVERY - puppet last run on snapshot1005 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [08:36:11] (03CR) 10Elukey: [C: 032] "pcc looks good https://puppet-compiler.wmflabs.org/compiler02/7176/" [puppet] - 10https://gerrit.wikimedia.org/r/368143 (owner: 10Elukey) [08:36:32] PROBLEM - puppet last run on stat1005 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[scap] [08:36:55] paravoid: stat1006 cronspam fixed --^ [08:37:11] didn't see stat1003's one in the past two days, maybe it was temporary? 
[08:39:26] (03PS1) 10Jcrespo: Depool db2048 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368146 (https://phabricator.wikimedia.org/T170662) [08:42:03] (03PS1) 10Jcrespo: mariadb: Promote db2048 as the new codfw master [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368147 (https://phabricator.wikimedia.org/T170662) [08:42:50] (03CR) 10Jcrespo: [C: 032] Depool db2048 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368146 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo) [08:43:01] PROBLEM - IPv4 ping to codfw on ripe-atlas-codfw is CRITICAL: CRITICAL - failed 45 probes of 273 (alerts on 19) - https://atlas.ripe.net/measurements/1791210/#!map [08:43:05] 10Operations, 10monitoring, 10netops: Grafana dashboards for librenms graphite data - https://phabricator.wikimedia.org/T171823#3477254 (10fgiunchedi) [08:43:18] (03CR) 10Gehel: "This change request does have 2 phab references in the commit message, which does not follow the guidelines (https://www.mediawiki.org/wik" [puppet] - 10https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: 10Bearloga) [08:43:50] 10Operations, 10monitoring, 10Patch-For-Review, 10Prometheus-metrics-monitoring, 10User-fgiunchedi: Replace Torrus with Prometheus snmp_exporter for PDUs monitoring - https://phabricator.wikimedia.org/T148541#3477269 (10fgiunchedi) 05Open>03declined We've ultimately gone with pushing librenms data in... 
[08:44:05] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477273 (10Jayprakash12345) Is there any progress in this ticket [08:45:49] (03Merged) 10jenkins-bot: Depool db2048 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368146 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo) [08:47:53] (03CR) 10jenkins-bot: Depool db2048 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368146 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo) [08:47:55] 10Operations, 10monitoring, 10Graphite, 10User-fgiunchedi: Audit groups of metrics in Graphite that allocate a lot of disk space - https://phabricator.wikimedia.org/T1075#3477283 (10fgiunchedi) 05Open>03Resolved Resolving this old task, individual big users are tracked separately [08:48:01] RECOVERY - IPv4 ping to codfw on ripe-atlas-codfw is OK: OK - failed 1 probes of 273 (alerts on 19) - https://atlas.ripe.net/measurements/1791210/#!map [08:48:03] !log jynus@tin Synchronized wmf-config/db-codfw.php: Depool db2048 (duration: 00m 46s) [08:48:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:49:38] (03CR) 10Gehel: [C: 04-1] statistics::discovery: Reconfigure for Golden data retrieval (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: 10Bearloga) [08:53:06] (03PS4) 10Ema: pybal::monitoring: add check_pybal_ipvs_diff [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) [08:54:49] 10Operations, 10Datasets-General-or-Unknown, 10Wikidata, 10HHVM, 10Patch-For-Review: Enable GC for HHVM CLI (at least for dump runners) - https://phabricator.wikimedia.org/T162245#3477289 (10elukey) [08:54:52] !log stopping mysql and restarting db2048 [08:55:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:55:39] 
(03CR) 10Ema: pybal::monitoring: add check_pybal_ipvs_diff (033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) (owner: 10Ema) [08:56:20] (03CR) 10Gehel: "@EBernhardson: I modified your CR a bit, could you review to make sure it still make sense?" [puppet] - 10https://gerrit.wikimedia.org/r/367709 (https://phabricator.wikimedia.org/T169498) (owner: 10EBernhardson) [08:56:51] PROBLEM - puppet last run on netmon1002 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[scap] [08:57:36] (03CR) 10Elukey: [C: 031] Update check_disk_options to check/alert on free inodes [puppet] - 10https://gerrit.wikimedia.org/r/367407 (https://phabricator.wikimedia.org/T129222) (owner: 10Filippo Giunchedi) [09:02:47] !log copy python-conftool to stretch-wikimedia for scap dep [09:02:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:04:54] (03PS2) 10Gehel: Update PYTHONPATH for mjolnir-kafka-daemon [puppet] - 10https://gerrit.wikimedia.org/r/366017 (owner: 10EBernhardson) [09:07:18] (03CR) 10Giuseppe Lavagetto: [C: 031] "Very nice! All comments are minor and are not blockers" (0313 comments) [software/cumin] - 10https://gerrit.wikimedia.org/r/366737 (https://phabricator.wikimedia.org/T170394) (owner: 10Volans) [09:11:06] 10Operations, 10ops-eqiad, 10Analytics, 10User-Elukey: SATA errors for stat1004 in the dmesg - https://phabricator.wikimedia.org/T162770#3477300 (10elukey) 05Open>03Resolved The issue seemed transient, didn't see any trace of it anymore in dmesg. 
Will re-open if necessary :) [09:11:46] (03CR) 10Filippo Giunchedi: [C: 031] pybal::monitoring: add check_pybal_ipvs_diff [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) (owner: 10Ema) [09:18:52] (03CR) 10Giuseppe Lavagetto: [C: 031] Transports: improve Command class [software/cumin] - 10https://gerrit.wikimedia.org/r/367823 (https://phabricator.wikimedia.org/T171679) (owner: 10Volans) [09:22:17] (03CR) 10Giuseppe Lavagetto: [C: 031] CLI: add an option to ignore exit codes of commands [software/cumin] - 10https://gerrit.wikimedia.org/r/367824 (https://phabricator.wikimedia.org/T171679) (owner: 10Volans) [09:23:27] !log starting s1-codfw database topology changes [09:23:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:25:03] (03PS1) 10Filippo Giunchedi: scap: upgrade to 3.6.0-2 [puppet] - 10https://gerrit.wikimedia.org/r/368150 [09:26:43] (03CR) 10Filippo Giunchedi: [C: 032] scap: upgrade to 3.6.0-2 [puppet] - 10https://gerrit.wikimedia.org/r/368150 (owner: 10Filippo Giunchedi) [09:32:51] (03CR) 10Ema: [C: 032] 1.13.10: Add support for One-packet scheduling (OPS) [debs/pybal] - 10https://gerrit.wikimedia.org/r/367924 (https://phabricator.wikimedia.org/T104442) (owner: 10Ema) [09:33:02] (03CR) 10Ema: [C: 032] 1.13.10: Add support for One-packet scheduling (OPS) [debs/pybal] (1.13) - 10https://gerrit.wikimedia.org/r/367925 (https://phabricator.wikimedia.org/T104442) (owner: 10Ema) [09:34:21] RECOVERY - puppet last run on stat1005 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [09:48:49] !log pybal 1.13.10 (one-packet-scheduling) built and uploaded to apt.w.o T104442 [09:48:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:48:59] T104442: Investigate better DNS cache/lookup solutions - https://phabricator.wikimedia.org/T104442 [09:52:21] RECOVERY - Check systemd state on relforge1001 is OK: OK - running: The system is fully 
operational [09:54:41] (03CR) 10Gehel: [C: 032] Update PYTHONPATH for mjolnir-kafka-daemon [puppet] - 10https://gerrit.wikimedia.org/r/366017 (owner: 10EBernhardson) [09:54:49] (03PS3) 10Gehel: Update PYTHONPATH for mjolnir-kafka-daemon [puppet] - 10https://gerrit.wikimedia.org/r/366017 (owner: 10EBernhardson) [09:55:31] RECOVERY - puppet last run on netmon1002 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [10:01:12] RECOVERY - Check systemd state on relforge1002 is OK: OK - running: The system is fully operational [10:01:59] (03CR) 10Jcrespo: [C: 032] mariadb: Promote db2048 as the new codfw master [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368147 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo) [10:02:12] (03PS2) 10Jcrespo: Promote db2048 as the new codfw s1 master, replacing db2016 [puppet] - 10https://gerrit.wikimedia.org/r/368140 (https://phabricator.wikimedia.org/T170662) [10:02:17] (03CR) 10Jcrespo: [C: 032] Promote db2048 as the new codfw s1 master, replacing db2016 [puppet] - 10https://gerrit.wikimedia.org/r/368140 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo) [10:03:19] (03Merged) 10jenkins-bot: mariadb: Promote db2048 as the new codfw master [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368147 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo) [10:03:31] (03CR) 10jenkins-bot: mariadb: Promote db2048 as the new codfw master [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368147 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo) [10:05:30] !log starting actual master failover s1-codfw db2016->db2048 [10:05:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:06:38] !log jynus@tin Synchronized wmf-config/db-codfw.php: Promote db2048 as the new codfw-s1 master (duration: 00m 46s) [10:06:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:07:05] (03CR) 10Volans: [C: 031] "LGTM, a couple 
of optional comments inline." (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) (owner: 10Ema) [10:13:34] (03PS4) 10Gehel: Move R-related code from shiny_server to separate module [puppet] - 10https://gerrit.wikimedia.org/r/366170 (https://phabricator.wikimedia.org/T153856) (owner: 10Bearloga) [10:13:44] (03PS1) 10Jcrespo: mysql-serporter: Change role on prometheus, which is still done manually [puppet] - 10https://gerrit.wikimedia.org/r/368158 (https://phabricator.wikimedia.org/T170662) [10:14:13] <_joe_> !log restarting puppetdb on nihal, T170740 [10:14:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:14:27] T170740: PuppetDB misbehaving on 2017-07-15 - https://phabricator.wikimedia.org/T170740 [10:15:11] (03CR) 10Gehel: [C: 032] Move R-related code from shiny_server to separate module [puppet] - 10https://gerrit.wikimedia.org/r/366170 (https://phabricator.wikimedia.org/T153856) (owner: 10Bearloga) [10:15:40] (03PS1) 10Jcrespo: Alter db2048 and db2016 sequence in s1.dblist [software] - 10https://gerrit.wikimedia.org/r/368159 (https://phabricator.wikimedia.org/T170662) [10:16:41] PROBLEM - puppet last run on nihal is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:16:47] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477434 (10MarcoAurelio) @Jayprakash12345 They're working on this. Wikis can't be created whenever we want. We have to find a suitable time and @Dereckson... [10:16:51] PROBLEM - puppet last run on thumbor2003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:16:51] PROBLEM - puppet last run on mw2215 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [10:17:01] PROBLEM - puppet last run on mw2223 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:01] PROBLEM - puppet last run on cp2007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:11] PROBLEM - puppet last run on maps2002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:11] PROBLEM - puppet last run on mw2214 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:12] PROBLEM - puppet last run on scb2005 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:12] PROBLEM - puppet last run on elastic2006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:41] PROBLEM - puppet last run on ores2003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:41] <_joe_> this is all expected ^^ [10:17:41] PROBLEM - puppet last run on db2081 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:41] PROBLEM - puppet last run on db2078 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:41] PROBLEM - puppet last run on db2034 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:42] PROBLEM - puppet last run on tureis is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:43] mmmh bad gateway again [10:17:51] PROBLEM - puppet last run on mw2146 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:51] PROBLEM - puppet last run on cp2018 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [10:17:51] PROBLEM - puppet last run on ores2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:51] PROBLEM - puppet last run on mw2110 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:52] _joe_: is that you? [10:17:52] PROBLEM - puppet last run on db2047 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:17:57] <_joe_> yes, see my log line [10:18:01] PROBLEM - puppet last run on scb2006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:01] PROBLEM - puppet last run on mc2023 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:01] PROBLEM - puppet last run on restbase2010 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:06] oops missed it sorry [10:18:11] PROBLEM - puppet last run on wtp2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:11] PROBLEM - puppet last run on wdqs2003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:12] PROBLEM - puppet last run on mw2168 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:12] PROBLEM - puppet last run on mw2174 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:12] PROBLEM - puppet last run on db2038 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:22] PROBLEM - puppet last run on mw2236 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:41] PROBLEM - puppet last run on db2084 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [10:18:41] PROBLEM - puppet last run on lvs4001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:51] PROBLEM - puppet last run on mw2186 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:51] PROBLEM - puppet last run on db2037 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:51] PROBLEM - puppet last run on acrab is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:51] PROBLEM - puppet last run on mw2105 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:52] PROBLEM - puppet last run on cp4007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:52] PROBLEM - puppet last run on wtp2018 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:52] PROBLEM - puppet last run on mw2114 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:52] PROBLEM - puppet last run on mc2024 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:18:53] PROBLEM - puppet last run on mw2113 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:01] PROBLEM - puppet last run on mw2242 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:01] PROBLEM - puppet last run on mw2124 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:01] PROBLEM - puppet last run on mw2258 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [10:19:11] PROBLEM - puppet last run on mw2136 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:11] PROBLEM - puppet last run on mw2213 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:11] PROBLEM - puppet last run on lvs2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:12] PROBLEM - puppet last run on conf2002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:18] (03PS1) 10Filippo Giunchedi: prometheus: handle non-existant or empty puppet last run summary [puppet] - 10https://gerrit.wikimedia.org/r/368160 (https://phabricator.wikimedia.org/T170932) [10:19:21] PROBLEM - puppet last run on sca2004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:21] PROBLEM - puppet last run on db2057 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:22] PROBLEM - puppet last run on lvs2003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:31] PROBLEM - puppet last run on rdb2004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:31] PROBLEM - puppet last run on mw2173 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:41] PROBLEM - puppet last run on ganeti2005 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:41] PROBLEM - puppet last run on db2088 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:19:51] PROBLEM - puppet last run on mw2118 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [10:20:01] PROBLEM - puppet last run on mw2229 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:20:21] PROBLEM - puppet last run on mwlog2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:20:31] PROBLEM - puppet last run on restbase2002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:20:31] PROBLEM - puppet last run on labtestvirt2002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:22:31] PROBLEM - puppet last run on notebook1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:23:31] PROBLEM - Check systemd state on bast3002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [10:23:37] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477472 (10MarcoAurelio) p:05Normal>03Low Priority to reflect the current status of the task, not the wishes of the requestor. Actually there will be... 
[10:28:06] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477478 (10MarcoAurelio) [10:29:11] RECOVERY - puppet last run on maps2002 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [10:30:09] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477485 (10MarcoAurelio) [10:31:03] !log shutting down and rebooting db2016 [10:31:05] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477486 (10Urbanecm) @MarcoAurelio Are you going to upload relevant conf patches or should I? [10:31:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:31:54] RECOVERY - puppet last run on nihal is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [10:32:07] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3376126 (10MarcoAurelio) [10:32:56] !log reimaging mw2246 to jessie (T145742) [10:33:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:33:05] T145742: Migrate video scalers to jessie - https://phabricator.wikimedia.org/T145742 [10:33:21] PROBLEM - puppet last run on restbase-test2002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:33:22] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3376126 (10MarcoAurelio) @Urbanecm Feel free to. I've added a to-do list above to keep track of what's still pending or is done. Regards. 
[10:33:30] (03PS1) 10Ema: pybal: one-packet-scheduling for dns_rec_udp [puppet] - 10https://gerrit.wikimedia.org/r/368162 (https://phabricator.wikimedia.org/T104442) [10:33:41] PROBLEM - puppet last run on db2045 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:33:44] (03PS6) 10Giuseppe Lavagetto: puppetdb: Bump Java Heap max size to 6GB [puppet] - 10https://gerrit.wikimedia.org/r/366229 (https://phabricator.wikimedia.org/T170740) (owner: 10Alexandros Kosiaris) [10:33:52] <_joe_> volans, elukey ^^ [10:34:01] PROBLEM - puppet last run on ms-be2022 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:34:01] PROBLEM - puppet last run on db2043 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:34:02] PROBLEM - puppet last run on db2059 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:34:02] PROBLEM - puppet last run on ores2002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:34:41] _joe_: nice :) [10:35:15] (03PS2) 10Jcrespo: prometheus-mysql-exporter: Change role on prometheus [puppet] - 10https://gerrit.wikimedia.org/r/368158 (https://phabricator.wikimedia.org/T170662) [10:35:21] (03CR) 10Jcrespo: [C: 032] Alter db2048 and db2016 sequence in s1.dblist [software] - 10https://gerrit.wikimedia.org/r/368159 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo) [10:35:22] _joe_: looks ok, do you have a compiler result? 
[10:35:26] <_joe_> https://www.freedesktop.org/software/systemd/man/systemd.service.html#Command%20lines is a reference for the issues [10:35:29] I am doing it [10:35:29] <_joe_> nope [10:35:36] (03PS3) 10Jcrespo: prometheus-mysql-exporter: Change role on prometheus [puppet] - 10https://gerrit.wikimedia.org/r/368158 (https://phabricator.wikimedia.org/T170662) [10:35:39] thanks elukey [10:36:02] <_joe_> basically systemd decided to have a parameters expansion that is subtly different from all shells [10:37:02] https://puppet-compiler.wmflabs.org/compiler02/7177/nitrogen.eqiad.wmnet/ [10:37:07] looks good afaics [10:37:42] enwiki-codfw may be a bit unstable for a while in terms of lag [10:37:59] no user impact, just extra logging, etc. [10:38:26] (03CR) 10Volans: [C: 031] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/366229 (https://phabricator.wikimedia.org/T170740) (owner: 10Alexandros Kosiaris) [10:38:31] PROBLEM - puppet last run on notebook1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:38:35] <_joe_> let's merge it? 
[10:38:40] (03CR) 10Elukey: [C: 031] "pcc result: https://puppet-compiler.wmflabs.org/compiler02/7177/nitrogen.eqiad.wmnet/" [puppet] - 10https://gerrit.wikimedia.org/r/366229 (https://phabricator.wikimedia.org/T170740) (owner: 10Alexandros Kosiaris) [10:38:41] _joe_: go for it [10:38:46] <_joe_> then let me run it on nihal [10:38:48] * volans going to fix its own puppetdb [10:38:59] (03CR) 10Giuseppe Lavagetto: [C: 032] puppetdb: Bump Java Heap max size to 6GB [puppet] - 10https://gerrit.wikimedia.org/r/366229 (https://phabricator.wikimedia.org/T170740) (owner: 10Alexandros Kosiaris) [10:39:05] (03PS7) 10Giuseppe Lavagetto: puppetdb: Bump Java Heap max size to 6GB [puppet] - 10https://gerrit.wikimedia.org/r/366229 (https://phabricator.wikimedia.org/T170740) (owner: 10Alexandros Kosiaris) [10:40:37] (03Draft1) 10MarcoAurelio: Incubator sysops to add/remove 'accountcreator' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368164 (https://phabricator.wikimedia.org/T171751) [10:40:58] 10Operations, 10monitoring, 10User-fgiunchedi: Update diamond to latest upstream version - https://phabricator.wikimedia.org/T97635#3477528 (10zhuyifei1999) [10:41:01] 10Operations, 10Cloud-VPS, 10monitoring, 10Patch-For-Review, 10User-fgiunchedi: Diamond collectors collects NFS statistics on Cloud-VPS - https://phabricator.wikimedia.org/T171583#3477525 (10zhuyifei1999) 05Open>03Resolved a:03chasemp Yep, thanks :) [10:43:50] <_joe_> it's working on nihal [10:43:56] \o/ [10:45:00] RECOVERY - puppet last run on tureis is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [10:45:10] RECOVERY - puppet last run on mw2146 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [10:45:11] RECOVERY - puppet last run on mc2023 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [10:45:11] PROBLEM - puppet last run on mc2035 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [10:45:13] one good follow up would be to push JVM metrics to prometheus to monitor what puppetdb's jvm does (GC, heap, etc..) [10:45:16] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477532 (10Urbanecm) [10:45:20] RECOVERY - puppet last run on cp2007 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [10:45:24] elukey: and GC log [10:45:30] RECOVERY - puppet last run on mw2168 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [10:45:30] RECOVERY - puppet last run on mw2214 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [10:45:30] PROBLEM - puppet last run on db2075 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:45:30] RECOVERY - puppet last run on elastic2006 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [10:45:40] PROBLEM - puppet last run on ganeti2007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:45:41] PROBLEM - puppet last run on elastic2016 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:45:42] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3376126 (10Urbanecm) I've removed DNS and Apache as those are not required for language projects (they are covered by shared configuration already). 
[10:45:50] RECOVERY - puppet last run on db2078 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [10:45:51] RECOVERY - puppet last run on db2084 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [10:45:59] volans: +1 [10:46:00] RECOVERY - puppet last run on ores2003 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [10:46:10] RECOVERY - puppet last run on cp2018 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [10:46:10] RECOVERY - puppet last run on mw2186 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [10:46:10] RECOVERY - puppet last run on thumbor2003 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [10:46:10] RECOVERY - puppet last run on ores2001 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [10:46:10] RECOVERY - puppet last run on mw2110 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [10:46:10] RECOVERY - puppet last run on mw2113 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [10:46:10] RECOVERY - puppet last run on mw2215 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [10:46:11] RECOVERY - puppet last run on db2047 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [10:46:20] RECOVERY - puppet last run on mw2258 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [10:46:20] RECOVERY - puppet last run on mw2223 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [10:46:23] RECOVERY - puppet last run on mw2213 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [10:46:30] RECOVERY - puppet last run on wtp2001 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [10:46:30] RECOVERY - puppet last run on wdqs2003 is 
OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [10:46:31] RECOVERY - puppet last run on db2038 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [10:46:31] RECOVERY - puppet last run on scb2005 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [10:46:40] RECOVERY - puppet last run on rdb2004 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [10:47:00] RECOVERY - puppet last run on db2088 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [10:47:00] RECOVERY - puppet last run on db2037 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [10:47:01] RECOVERY - puppet last run on acrab is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [10:47:03] (03PS2) 10MarcoAurelio: Incubator bureaucrats to add/remove 'accountcreator' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368164 (https://phabricator.wikimedia.org/T171751) [10:47:10] RECOVERY - puppet last run on mw2105 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [10:47:10] RECOVERY - puppet last run on mc2024 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [10:47:10] RECOVERY - puppet last run on wtp2018 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [10:47:10] RECOVERY - puppet last run on mw2114 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [10:47:10] <_joe_> we might have some failures in a few [10:47:10] RECOVERY - puppet last run on mw2242 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [10:47:11] RECOVERY - puppet last run on cp4007 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [10:47:20] RECOVERY - puppet last run on mw2124 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [10:47:20] 
RECOVERY - puppet last run on scb2006 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [10:47:20] RECOVERY - puppet last run on restbase2010 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [10:47:30] RECOVERY - puppet last run on mw2174 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [10:47:30] RECOVERY - puppet last run on conf2002 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [10:47:40] RECOVERY - puppet last run on sca2004 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [10:47:40] RECOVERY - puppet last run on db2057 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [10:47:40] RECOVERY - puppet last run on lvs2003 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [10:47:40] RECOVERY - puppet last run on mw2236 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [10:47:40] RECOVERY - puppet last run on mw2173 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [10:48:00] RECOVERY - puppet last run on lvs4001 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [10:48:20] RECOVERY - puppet last run on mw2229 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [10:48:24] RECOVERY - puppet last run on mw2136 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [10:48:24] RECOVERY - puppet last run on lvs2001 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [10:48:31] RECOVERY - puppet last run on mwlog2001 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [10:48:41] RECOVERY - puppet last run on labtestvirt2002 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [10:48:50] RECOVERY - puppet last run on db2081 is OK: OK: Puppet is 
currently enabled, last run 3 minutes ago with 0 failures [10:48:50] RECOVERY - puppet last run on restbase2002 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [10:48:51] RECOVERY - puppet last run on ganeti2005 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [10:49:11] RECOVERY - puppet last run on mw2118 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [10:50:00] PROBLEM - puppet last run on prometheus2004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:50:00] PROBLEM - puppet last run on rdb1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:50:10] PROBLEM - puppet last run on analytics1037 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:50:20] PROBLEM - puppet last run on db1050 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:50:30] PROBLEM - puppet last run on mw1221 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:50:30] PROBLEM - puppet last run on mw1257 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:50:30] PROBLEM - puppet last run on db1063 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:50:40] PROBLEM - puppet last run on analytics1029 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:50:40] RECOVERY - puppet last run on db2034 is OK: OK: Puppet is currently enabled, last run 5 minutes ago with 0 failures [10:50:41] PROBLEM - puppet last run on elastic1037 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [10:51:50] PROBLEM - puppet last run on es2003 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [10:59:27] <_joe_> uhm [10:59:50] RECOVERY - puppet last run on elastic1037 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [11:01:00] RECOVERY - puppet last run on db2045 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [11:01:11] RECOVERY - puppet last run on ms-be2022 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [11:01:20] RECOVERY - puppet last run on ores2002 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [11:01:40] RECOVERY - puppet last run on restbase-test2002 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [11:02:01] (03PS1) 10Urbanecm: Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) [11:02:20] RECOVERY - puppet last run on db2043 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [11:02:20] RECOVERY - puppet last run on db2059 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [11:03:28] (03CR) 10jerkins-bot: [V: 04-1] Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) (owner: 10Urbanecm) [11:05:13] (03PS1) 10Urbanecm: Enable mapframe for cswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368167 (https://phabricator.wikimedia.org/T171588) [11:06:09] (03PS2) 10Urbanecm: Enable mapframe for cswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368167 (https://phabricator.wikimedia.org/T171588) [11:06:25] (03PS2) 10Urbanecm: Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) [11:08:08] (03CR) 10jerkins-bot: [V: 04-1] Initial configuration for hiwikiversity [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) (owner: 10Urbanecm) [11:09:05] (03CR) 10Jcrespo: [C: 032] prometheus-mysql-exporter: Change role on prometheus [puppet] - 10https://gerrit.wikimedia.org/r/368158 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo) [11:09:11] (03PS4) 10Jcrespo: prometheus-mysql-exporter: Change role on prometheus [puppet] - 10https://gerrit.wikimedia.org/r/368158 (https://phabricator.wikimedia.org/T170662) [11:09:41] (03PS3) 10Urbanecm: Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) [11:11:30] (03CR) 10jerkins-bot: [V: 04-1] Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) (owner: 10Urbanecm) [11:13:23] RECOVERY - puppet last run on mc2035 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [11:13:42] RECOVERY - puppet last run on db2075 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:14:02] RECOVERY - puppet last run on ganeti2007 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [11:14:02] RECOVERY - puppet last run on es2003 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [11:14:03] RECOVERY - puppet last run on elastic2016 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [11:14:43] PROBLEM - Check whether ferm is active by checking the default input chain on mw2246 is CRITICAL: Return code of 255 is out of bounds [11:14:46] PROBLEM - puppet last run on mw2246 is CRITICAL: Return code of 255 is out of bounds [11:15:42] PROBLEM - DPKG on mw2246 is CRITICAL: Return code of 255 is out of bounds [11:15:42] PROBLEM - salt-minion processes on mw2246 is CRITICAL: Return code of 255 is out of bounds [11:16:32] PROBLEM - HHVM jobrunner on mw2246 is 
CRITICAL: connect to address 10.192.0.72 and port 9005: Connection refused [11:16:42] PROBLEM - Disk space on mw2246 is CRITICAL: Return code of 255 is out of bounds [11:17:11] checking mw2246 [11:17:45] it has been a while since it froze like this, there is a task for it :( [11:18:12] RECOVERY - puppet last run on rdb1001 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [11:18:42] RECOVERY - puppet last run on mw1221 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [11:18:42] RECOVERY - puppet last run on mw1257 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [11:18:44] ah no the broken one was mw2256 [11:18:46] my bad [11:19:07] this is probably a reimage [11:19:22] RECOVERY - puppet last run on prometheus2004 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [11:19:22] RECOVERY - puppet last run on analytics1037 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [11:19:32] RECOVERY - puppet last run on db1050 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:19:40] videoscaler going to jessie [11:19:52] RECOVERY - puppet last run on analytics1029 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [11:19:52] RECOVERY - puppet last run on db1063 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [11:20:22] PROBLEM - configured eth on mw2246 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [11:21:09] (03CR) 10Gehel: [C: 031] "LGTM" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368167 (https://phabricator.wikimedia.org/T171588) (owner: 10Urbanecm) [11:21:12] PROBLEM - dhclient process on mw2246 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[11:21:52] PROBLEM - mediawiki-installation DSH group on mw2246 is CRITICAL: Host mw2246 is not in mediawiki-installation dsh group [11:22:02] PROBLEM - Check size of conntrack table on mw2246 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [11:23:02] PROBLEM - nutcracker port on mw2246 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [11:23:02] PROBLEM - Check systemd state on mw2246 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [11:23:21] (03PS1) 10Urbanecm: Initial configuration for wikimania2018wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368168 (https://phabricator.wikimedia.org/T155038) [11:23:52] PROBLEM - Check the NTP synchronisation status of timesyncd on mw2246 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [11:23:52] PROBLEM - nutcracker process on mw2246 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [11:24:42] (03CR) 10jerkins-bot: [V: 04-1] Initial configuration for wikimania2018wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368168 (https://phabricator.wikimedia.org/T155038) (owner: 10Urbanecm) [11:24:49] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites, and 2 others: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477694 (10Urbanecm) [11:25:06] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10MW-1.30-release-notes (WMF-deploy-2017-07-25_(1.30.0-wmf.11)), and 2 others: Create Dinka Wikipedia - https://phabricator.wikimedia.org/T168518#3477695 (10Urbanecm) Anything left to do here? [11:25:27] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites, and 2 others: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3376126 (10Urbanecm) Anyone link to SVG version of the logo above please? 
[11:27:38] (03PS4) 10Urbanecm: Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) [11:27:40] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites, and 2 others: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477698 (10Jayprakash12345) >>! In T168765#3477694, @Urbanecm wrote: > Anyone link to SVG version of the logo above please? https://commons... [11:28:16] 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites, and 2 others: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3477701 (10Urbanecm) Thank you! [11:28:48] elukey: yeah, that's all fine, it's the usual race in reimaging, silencing it again [11:29:00] (03CR) 10jerkins-bot: [V: 04-1] Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) (owner: 10Urbanecm) [11:33:53] (03PS5) 10Urbanecm: Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) [11:35:20] (03PS2) 10Urbanecm: Initial configuration for wikimania2018wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368168 (https://phabricator.wikimedia.org/T155038) [11:35:51] (03CR) 10jerkins-bot: [V: 04-1] Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) (owner: 10Urbanecm) [11:40:08] (03PS6) 10Urbanecm: Initial configuration for hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368165 (https://phabricator.wikimedia.org/T168765) [11:40:12] RECOVERY - Check whether ferm is active by checking the default input chain on mw2246 is OK: OK ferm input default policy is set [11:40:12] RECOVERY - Check size of conntrack table on mw2246 is OK: OK: nf_conntrack is 0 % full [11:40:12] 
RECOVERY - dhclient process on mw2246 is OK: PROCS OK: 0 processes with command name dhclient [11:40:23] RECOVERY - configured eth on mw2246 is OK: OK - interfaces up [11:41:02] RECOVERY - Disk space on mw2246 is OK: DISK OK [11:41:02] RECOVERY - salt-minion processes on mw2246 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion [11:43:02] RECOVERY - DPKG on mw2246 is OK: All packages OK [11:47:52] RECOVERY - HHVM jobrunner on mw2246 is OK: HTTP OK: HTTP/1.1 200 OK - 202 bytes in 0.080 second response time [11:53:42] RECOVERY - Check the NTP synchronisation status of timesyncd on mw2246 is OK: OK: synced at Thu 2017-07-27 11:53:34 UTC. [11:57:02] RECOVERY - nutcracker process on mw2246 is OK: PROCS OK: 1 process with UID = 111 (nutcracker), command name nutcracker [11:57:12] RECOVERY - nutcracker port on mw2246 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 11212 [12:01:22] RECOVERY - puppet last run on mw2246 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [12:07:17] (03CR) 10Addshore: [C: 031] Log 'WikibaseQualityConstraints' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367914 (https://phabricator.wikimedia.org/T171281) (owner: 10Lucas Werkmeister (WMDE)) [12:21:51] RECOVERY - mediawiki-installation DSH group on mw2246 is OK: OK [12:40:12] (03PS2) 10Filippo Giunchedi: prometheus: handle non-existant or empty puppet last run summary [puppet] - 10https://gerrit.wikimedia.org/r/368160 (https://phabricator.wikimedia.org/T170932) [12:42:00] (03CR) 10Filippo Giunchedi: [C: 032] prometheus: handle non-existant or empty puppet last run summary [puppet] - 10https://gerrit.wikimedia.org/r/368160 (https://phabricator.wikimedia.org/T170932) (owner: 10Filippo Giunchedi) [12:44:28] (03CR) 10Muehlenhoff: [C: 032] Migrate former Salt minion to standalone tools executed via Cumin (WIP) [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/365263 (owner: 10Muehlenhoff) [12:44:51] (03CR) 
10Muehlenhoff: [C: 032] Add debdeploy client to detect library restarts (WIP) [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/366573 (owner: 10Muehlenhoff) [12:45:37] jouncebot: next [12:45:37] In 0 hour(s) and 14 minute(s): European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170727T1300) [12:46:31] (03CR) 10Ottomata: "Thanks luca! :)" [puppet] - 10https://gerrit.wikimedia.org/r/368143 (owner: 10Elukey) [12:53:10] jouncebot: refresh [12:53:12] I refreshed my knowledge about deployments. [12:53:14] jouncebot: next [12:53:14] In 0 hour(s) and 6 minute(s): European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170727T1300) [12:54:33] (03PS1) 10Gehel: enable mapframe for euwiki, ptwiki and uawikimedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368172 (https://phabricator.wikimedia.org/T167619) [12:55:22] 10Operations, 10Analytics, 10Analytics-Cluster, 10Research-management: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#3477870 (10Ottomata) Oook, I'm going to have to find some time to do a driver dance before we can test this. Nithum is totally willing to run some tensorflow stu... [12:55:32] 10Operations, 10Analytics, 10Analytics-Cluster, 10Research-management: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#3477871 (10Ottomata) p:05Normal>03Low [12:55:48] (03PS1) 10Muehlenhoff: Create a new package debdeploy-client for the Cumin-based client [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/368173 [12:56:11] 10Operations, 10Analytics, 10Analytics-Cluster, 10Research-management: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#2734568 (10Ottomata) FYI, some links from Nithum: - https://www.codeplay.com/portal/03-30-17-setting-up-tensorflow-with-opencl-using-sycl - http://www.amd.com/en... 
[13:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Dear anthropoid, the time has come. Please deploy European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170727T1300). [13:00:07] Urbanecm: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be available during the process. [13:00:12] o/ [13:00:33] hashar: I can take care of swat, unless you insist? :) [13:00:59] zeljkof: arent you having lunch ? :) [13:01:11] hashar: quick lunch, already over [13:01:21] I am looking at it. The thing is that I have no clue what map frame is [13:01:29] :D [13:01:32] let me take a look [13:01:46] looks like gehel +1 ed the idea on the task [13:02:30] hashar: yep, enabling mapframe should be good [13:02:51] hashar: are you deploying it, or should I? [13:02:59] (03CR) 10Hashar: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368167 (https://phabricator.wikimedia.org/T171588) (owner: 10Urbanecm) [13:03:11] I am not sure why it is not enabled everywhere though [13:03:11] ok, looks like he is doing it :) [13:03:14] zeljkof: yeah I will do it :) [13:03:15] I'm here :) [13:05:13] (03Merged) 10jenkins-bot: Enable mapframe for cswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368167 (https://phabricator.wikimedia.org/T171588) (owner: 10Urbanecm) [13:05:41] Urbanecm: deployed on mwdebug1001 [13:05:48] hashar, testin [13:06:20] (03CR) 10jenkins-bot: Enable mapframe for cswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368167 (https://phabricator.wikimedia.org/T171588) (owner: 10Urbanecm) [13:06:54] hashar, works [13:07:17] (03CR) 10Ema: [C: 031] recdns: do not use self in local resolv.conf [puppet] - 10https://gerrit.wikimedia.org/r/367927 (https://phabricator.wikimedia.org/T104442) (owner: 10BBlack) [13:07:21] syncing [13:07:36] hashar, ack [13:07:42] (03PS2) 10Gehel: 
enable mapframe for euwiki, ptwiki and uawikimedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368172 (https://phabricator.wikimedia.org/T167619) [13:08:02] (03CR) 10BBlack: [C: 031] pybal: one-packet-scheduling for dns_rec_udp [puppet] - 10https://gerrit.wikimedia.org/r/368162 (https://phabricator.wikimedia.org/T104442) (owner: 10Ema) [13:08:05] !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Enable mapframe for cswiki - T171588 (duration: 00m 47s) [13:08:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:08:16] T171588: Please turn on mapframe for Czech Wikipedia - https://phabricator.wikimedia.org/T171588 [13:08:19] debt: ^ mapframe deployed on cs [13:08:40] \O/ [13:10:23] (03PS3) 10Gehel: enable mapframe for euwiki, ptwiki and uawikimedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368172 (https://phabricator.wikimedia.org/T167619) [13:11:18] (03CR) 10BBlack: [C: 031] Serve a synth error page when error body is empty in Varnish [puppet] - 10https://gerrit.wikimedia.org/r/365589 (https://phabricator.wikimedia.org/T169683) (owner: 10Gilles) [13:12:22] (03CR) 10Muehlenhoff: [C: 032] Create a new package debdeploy-client for the Cumin-based client [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/368173 (owner: 10Muehlenhoff) [13:16:35] (03CR) 10BBlack: [C: 031] "@Hashar - a similar but simpler option would be to create a define file "secret_file" which wraps "file" and handles show_diff=>false and " [puppet] - 10https://gerrit.wikimedia.org/r/366806 (https://phabricator.wikimedia.org/T79881) (owner: 10Filippo Giunchedi) [13:18:38] (03CR) 10Rush: [C: 031] Update check_disk_options to check/alert on free inodes [puppet] - 10https://gerrit.wikimedia.org/r/367407 (https://phabricator.wikimedia.org/T129222) (owner: 10Filippo Giunchedi) [13:19:08] !log installing apache updates on mendelevium and terbium [13:19:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:19:17] 
hashar, can you have a look at logstash for T171379? [13:19:17] T171379: Enable the user with flood flag to remove themselves from the group on zh.wiki - https://phabricator.wikimedia.org/T171379 [13:20:01] (03CR) 10Rush: "Elukey, pretty sure that was already the case before this change. This should actually reduce the log lines by 1 with 1 less ignore. I t" [puppet] - 10https://gerrit.wikimedia.org/r/367710 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [13:22:12] 10Operations, 10monitoring, 10User-fgiunchedi: Diamond log level set to DEBUG spams syslog - https://phabricator.wikimedia.org/T171580#3477910 (10faidon) p:05Normal>03High [13:23:06] (03PS1) 10Filippo Giunchedi: diamond: ship systemd override file [puppet] - 10https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) [13:23:55] paravoid: ^ [13:24:21] Urbanecm: not sure what to look for [13:24:39] (03CR) 10Giuseppe Lavagetto: [C: 04-1] diamond: ship systemd override file (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) (owner: 10Filippo Giunchedi) [13:24:48] hashar, what does https://phabricator.wikimedia.org/T171379#3469296 mean [13:25:25] godog: not a bad idea to fix it in our puppet, but better file a bug against Debian as well -- this needs to be fixed with a stable update really [13:27:25] Urbanecm: that looks like a bug in the expiry time somehow [13:27:30] paravoid: yeah I'll reopen #854842 and file another one for --log-stdout, I went with puppet first on our side as it is easier to bandaid this way [13:27:36] Urbanecm: looks unrelated to the task [13:27:52] (03CR) 10Muehlenhoff: diamond: ship systemd override file (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) (owner: 10Filippo Giunchedi) [13:28:12] Urbanecm: maybe it is https://phabricator.wikimedia.org/T171345 [13:28:54] godog: nod [13:31:56] !log lvs1009, lvs1010: upgrade to pybal 
1.13.10 (one-packet-scheduling) T104442 [13:32:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:32:06] T104442: Investigate better DNS cache/lookup solutions - https://phabricator.wikimedia.org/T104442 [13:32:14] Thank you hashar! [13:35:43] (03PS2) 10BBlack: recdns: do not use self in local resolv.conf [puppet] - 10https://gerrit.wikimedia.org/r/367927 (https://phabricator.wikimedia.org/T104442) [13:37:19] (03CR) 10BBlack: [C: 032] recdns: do not use self in local resolv.conf [puppet] - 10https://gerrit.wikimedia.org/r/367927 (https://phabricator.wikimedia.org/T104442) (owner: 10BBlack) [13:38:25] (03PS1) 10Urbanecm: Enable WikidataPageBanner for euwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368178 (https://phabricator.wikimedia.org/T171763) [13:40:45] (03PS2) 10Filippo Giunchedi: diamond: ship systemd override file [puppet] - 10https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) [13:41:30] _joe_: ^ using systemd::service [13:42:04] I'm not sure if there's an advantage heh [13:42:11] <_joe_> godog: thanks, context is for example https://wikitech.wikimedia.org/wiki/User:Giuseppe_Lavagetto/PuppetFutureParser#Variable_scope_is_now_respected_in_templates [13:43:22] (03CR) 10Giuseppe Lavagetto: diamond: ship systemd override file (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) (owner: 10Filippo Giunchedi) [13:43:38] <_joe_> godog: you don't need to declare the service if using systemd::service [13:44:22] ah that's right, the ensure_resource [13:44:51] <_joe_> yes, sorry if the doc was misleading, I can partially blame paravoid for making me rethink the whole thing while writing it :P [13:46:38] <_joe_> (I think the abstraction now is decent and general enough to cover basically any type of systemd unit, which is good) [13:47:47] is there a plan to migrate off base::service_unit over time as well? 
maybe after trusty is gone would make it more straightforward [13:48:42] (PS3) Filippo Giunchedi: diamond: ship systemd override file [puppet] - https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) [13:50:02] (CR) Debt: "Please add T171805 to this as well, thanks!" [mediawiki-config] - https://gerrit.wikimedia.org/r/368172 (https://phabricator.wikimedia.org/T167619) (owner: Gehel) [13:52:19] <_joe_> godog: yeah my hope is we'd migrate most things as soon as possible given how annoying it is to maintain upstart scripts and the templates scoping issues with the future puppet parser [13:53:37] (CR) Giuseppe Lavagetto: [C: 1] diamond: ship systemd override file [puppet] - https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) (owner: Filippo Giunchedi) [13:53:54] Operations, Traffic, Community-Liaisons (Jul-Sep 2017), User-Johan: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members - https://phabricator.wikimedia.org/T163251#3478043 (BBlack) @Johan Yeah I've been OoO and catching up slowly too. We als... 
[13:54:49] _joe_: sigh, I just noticed the service is already declared anyways a little before in init.pp, I'll go with service::unit directly [13:55:41] * godog looks at the yak [13:57:46] (PS4) Filippo Giunchedi: diamond: ship systemd override file [puppet] - https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) [13:58:44] (CR) jerkins-bot: [V: -1] diamond: ship systemd override file [puppet] - https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) (owner: Filippo Giunchedi) [13:59:31] (CR) Filippo Giunchedi: "PCC https://puppet-compiler.wmflabs.org/compiler02/7183/" [puppet] - https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) (owner: Filippo Giunchedi) [13:59:56] (PS4) Debt: enable mapframe for euwiki, ptwiki and uawikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/368172 (https://phabricator.wikimedia.org/T167619) (owner: Gehel) [14:01:15] (PS5) Filippo Giunchedi: diamond: ship systemd override file [puppet] - https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) [14:03:35] (PS1) Volans: NOOP: Formatting and minor refactoring [debs/debdeploy] - https://gerrit.wikimedia.org/r/368179 [14:05:55] (CR) Debt: [C: 1] "LGTM, thanks!" 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/368172 (https://phabricator.wikimedia.org/T167619) (owner: 10Gehel) [14:10:10] 10Operations, 10Pybal, 10Traffic: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478121 (10ema) [14:10:26] 10Operations, 10Pybal, 10Traffic: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478137 (10ema) p:05Triage>03Normal [14:10:59] 10Operations, 10OCG-General, 10Reading-Community-Engagement, 10Epic, and 3 others: [EPIC] (Proposal) Replicate core OCG features and sunset OCG service - https://phabricator.wikimedia.org/T150871#3478142 (10ovasileva) [14:17:45] (03PS2) 10Filippo Giunchedi: Update check_disk_options to check/alert on free inodes [puppet] - 10https://gerrit.wikimedia.org/r/367407 (https://phabricator.wikimedia.org/T129222) [14:19:49] (03CR) 10Filippo Giunchedi: [C: 032] Update check_disk_options to check/alert on free inodes [puppet] - 10https://gerrit.wikimedia.org/r/367407 (https://phabricator.wikimedia.org/T129222) (owner: 10Filippo Giunchedi) [14:21:35] 10Operations, 10Patch-For-Review: Tracking and Reducing cron-spam from root@ - https://phabricator.wikimedia.org/T132324#3478207 (10fgiunchedi) [14:21:37] 10Operations, 10monitoring, 10Patch-For-Review, 10User-fgiunchedi: prometheus-puppet-agent-stats cronspam on missing puppet stats - https://phabricator.wikimedia.org/T170932#3478205 (10fgiunchedi) 05Open>03Resolved Resolving, will reopen if cronspam shows up again [14:23:00] 10Operations, 10Pybal, 10Traffic: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478121 (10faidon) stretch has 1.28, so perhaps it's just simpler to upgrade the LVS systems to stretch, which we'll need to do anyway at some point? We're already running the stretch kernel, and they don't have mu... 
[14:24:23] 10Operations, 10Traffic: Backport iproute2 4.x from debian testing -> our jessie - https://phabricator.wikimedia.org/T138591#2404751 (10faidon) This has been open for a while :) What new things that our kernels can do do we need and on which systems? Are these a priority now or can they wait until we upgrade... [14:25:23] (03PS2) 10Giuseppe Lavagetto: Add filters to the future parser [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/367678 [14:25:37] 10Operations, 10Pybal, 10Traffic: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478121 (10BBlack) Yeah, that's not a bad idea. Perhaps we should morph this into a stretch-for-LVS ticket, and start with the always-almost-ready-to-use lvs1007-12? :) [14:26:01] RECOVERY - Check systemd state on mw2246 is OK: OK - running: The system is fully operational [14:26:34] 10Operations, 10Traffic, 10Wikidata, 10wikiba.se, 10Wikidata-Sprint-2016-11-08: [Task] move wikiba.se webhosting to wikimedia misc-cluster - https://phabricator.wikimedia.org/T99531#3478229 (10BBlack) [14:28:05] 10Operations, 10Pybal, 10Traffic: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478231 (10ema) +1 on upgrading to stretch. However, we are probably gonna end up in a similar situation on stretch whenever upgrading to newer kernels, so perhaps it might still make sense to keep this ticket open... 
[14:29:57] (03PS2) 10Ema: pybal: one-packet-scheduling for dns_rec_udp [puppet] - 10https://gerrit.wikimedia.org/r/368162 (https://phabricator.wikimedia.org/T104442) [14:30:29] (03CR) 10Ema: [V: 032 C: 032] pybal: one-packet-scheduling for dns_rec_udp [puppet] - 10https://gerrit.wikimedia.org/r/368162 (https://phabricator.wikimedia.org/T104442) (owner: 10Ema) [14:33:59] (03PS1) 10Dzahn: releases: delete old microsites::releases class [puppet] - 10https://gerrit.wikimedia.org/r/368181 (https://phabricator.wikimedia.org/T164030) [14:37:08] 10Operations, 10Pybal, 10Traffic: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478121 (10MoritzMuehlenhoff) +1 on the stretch upgrade, but I don't think it's very useful to keep the ticket open for future kernel updates, it'll only bitrot and who know's if there's even a new ipsvadm release... [14:37:22] (03PS1) 10Dzahn: cache::misc: add director for releases(1001) [puppet] - 10https://gerrit.wikimedia.org/r/368183 (https://phabricator.wikimedia.org/T164030) [14:38:37] (03PS1) 10Dzahn: cache::misc: switch director for releases to releases1001 [puppet] - 10https://gerrit.wikimedia.org/r/368184 (https://phabricator.wikimedia.org/T164030) [14:40:15] (03PS3) 10Andrew Bogott: labpuppetmaster: more hiera config [puppet] - 10https://gerrit.wikimedia.org/r/368099 [14:40:38] 10Operations, 10Patch-For-Review, 10Release-Engineering-Team (Kanban), 10Security-General: setup releases1001.eqiad.wmnet (was: setup mwreleases1001) - https://phabricator.wikimedia.org/T164030#3478267 (10Dzahn) a:05demon>03Dzahn [14:44:54] (03CR) 10Volans: [C: 031] "LGTM, few minor optional comments." 
(033 comments) [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/367678 (owner: 10Giuseppe Lavagetto) [14:46:15] (03CR) 10Andrew Bogott: [C: 032] labpuppetmaster: more hiera config [puppet] - 10https://gerrit.wikimedia.org/r/368099 (owner: 10Andrew Bogott) [14:52:34] 10Operations, 10ORES, 10Scap, 10Scoring-platform-team-Backlog, 10Release-Engineering-Team (Watching / External): ORES should use git-fat for wheel deployments - https://phabricator.wikimedia.org/T171619#3478291 (10Halfak) [14:52:59] 10Operations, 10ORES, 10Scap, 10Scoring-platform-team-Backlog, 10Release-Engineering-Team (Watching / External): ORES should use git-fat for wheel deployments - https://phabricator.wikimedia.org/T171619#3470967 (10Halfak) [14:53:19] 10Operations, 10ORES, 10Scap, 10Scoring-platform-team-Backlog, 10Release-Engineering-Team (Watching / External): ORES should use git-fat for wheel deployments - https://phabricator.wikimedia.org/T171619#3470967 (10Halfak) [14:54:07] 10Operations, 10ORES, 10Scap, 10Scoring-platform-team-Backlog, 10Release-Engineering-Team (Watching / External): ORES should use git-fat for binaries - https://phabricator.wikimedia.org/T171619#3470967 (10Halfak) [14:54:29] 10Operations, 10ORES, 10Scap, 10Scoring-platform-team-Backlog, 10Release-Engineering-Team (Watching / External): ORES should use git-fat for binaries - https://phabricator.wikimedia.org/T171619#3470967 (10Halfak) 05Open>03stalled p:05Normal>03High [14:56:48] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: Kill manifests/realm.pp - https://phabricator.wikimedia.org/T85459#3478307 (10Joe) This is actually not going to happen: the future parser and puppet 4 have the concept of "directory manifests" that allows us to have a global piece of code to prepend... 
[14:56:59] 10Operations, 10Cloud-Services: wikitech api list=novainstances not returning list of instances - https://phabricator.wikimedia.org/T171280#3478310 (10Andrew) [14:57:01] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: Kill manifests/realm.pp - https://phabricator.wikimedia.org/T85459#3478311 (10Joe) 05Open>03declined [14:58:16] (03PS1) 10Cmjohnson: Adding dns entries for kafka-jumbo100[1-6] T167992 [dns] - 10https://gerrit.wikimedia.org/r/368186 [15:01:36] elukey: ah^ will you comment on ^ [15:01:44] was waiting to comment for you to send that email you wanted [15:01:45] sure! [15:01:53] we may want that kafka-jumbo1012 :( [15:03:05] !log stopping jobrunner/jobchron on mw1260 to investigate a few failing ffmpeg2theora invocations (T145742) [15:03:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:03:17] T145742: Migrate video scalers to jessie - https://phabricator.wikimedia.org/T145742 [15:03:28] (03PS5) 10Jcrespo: prometheus-mysql-exporter: Change role on prometheus [puppet] - 10https://gerrit.wikimedia.org/r/368158 (https://phabricator.wikimedia.org/T170662) [15:05:00] PROBLEM - Check systemd state on mw1260 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. 
[15:05:39] 10Operations, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Firewalls appear to be preventing spark executors from talking to spark driver on stat1005 - https://phabricator.wikimedia.org/T170496#3478332 (10Nuria) 05Open>03Resolved [15:05:42] 10Operations, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Firewalls appear to be preventing spark executors from talking to spark driver on stat1005 - https://phabricator.wikimedia.org/T170496#3478334 (10Ottomata) a:03Ottomata [15:05:48] 10Operations, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Firewalls appear to be preventing spark executors from talking to spark driver on stat1005 - https://phabricator.wikimedia.org/T170496#3433298 (10Ottomata) [15:16:42] PROBLEM - puppet last run on mw1260 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 2 minutes ago with 2 failures. Failed resources (up to 3 shown): Service[jobchron],Service[jobrunner] [15:18:59] (03CR) 10Elukey: [C: 04-1] "Precautionary -1 since we might need different names.. Sorry Chris do you mind to wait a couple of days max before proceeding?" 
[dns] - 10https://gerrit.wikimedia.org/r/368186 (owner: 10Cmjohnson) [15:20:28] ^silencing the alert for mw1260 [15:21:03] 10Operations, 10ops-codfw, 10hardware-requests: reclaim/decom tmh200[12] - https://phabricator.wikimedia.org/T168472#3478415 (10Papaul) [15:22:46] (03PS1) 10Jcrespo: mariadb-package: Fix comment saying 10 minutes [software] - 10https://gerrit.wikimedia.org/r/368189 [15:23:49] (03PS1) 10Muehlenhoff: Adapt debdeploy server components to Cumin (WIP) [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/368190 [15:29:01] (03CR) 10Jcrespo: [C: 032] mariadb-package: Fix comment saying 10 minutes [software] - 10https://gerrit.wikimedia.org/r/368189 (owner: 10Jcrespo) [15:36:12] RECOVERY - puppet last run on labpuppetmaster1001 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [15:38:53] 10Operations, 10ops-eqiad, 10Analytics-Cluster, 10Analytics-Kanban: rack/setup/install druid100[456].eqiad.wmnet - https://phabricator.wikimedia.org/T171626#3478489 (10fdans) [15:39:26] (03PS6) 10Filippo Giunchedi: diamond: ship systemd override file [puppet] - 10https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) [15:41:21] (03CR) 10Filippo Giunchedi: [C: 032] diamond: ship systemd override file [puppet] - 10https://gerrit.wikimedia.org/r/368177 (https://phabricator.wikimedia.org/T171580) (owner: 10Filippo Giunchedi) [15:42:58] (03CR) 10Giuseppe Lavagetto: Add filters to the future parser (033 comments) [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/367678 (owner: 10Giuseppe Lavagetto) [15:43:17] (03PS1) 10Reception123: Added wordmark for Wikipedia Atikamekw [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368194 [15:43:47] Hey folks. We're setting up the new ORES cluster to get ORES off of the SCB nodes. [15:43:55] I've noticed that the machines have Jessie installed. 
[15:44:03] !log stopping mysql, upgrading and restarting labsdb1010 [15:44:09] But I'm thinking that it might be better to switch them to stretch. [15:44:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:44:13] (03PS3) 10Giuseppe Lavagetto: Add filters to the future parser [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/367678 [15:44:32] How much difficulty is involved moving a host from one debian OS to another? [15:44:40] Note, these machines are not yet receiving traffic. [15:45:08] halfak: if they are not in service the easiest is to reimage them [15:45:18] also +1 to starting with stretch while we're at it [15:45:24] (03PS2) 10Reception123: Added wordmark for Wikipedia Atikamekw [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368194 (https://phabricator.wikimedia.org/T168203) [15:45:38] Cool. I'll make a task for that. Mind copying your +1 there in a bit? :) [15:46:33] halfak: for sure! [15:47:47] (03Abandoned) 10Reception123: Added wordmark for Wikipedia Atikamekw [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368194 (https://phabricator.wikimedia.org/T168203) (owner: 10Reception123) [15:48:12] 10Operations, 10ORES, 10Scoring-platform-team: Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3478519 (10Halfak) [15:48:14] godog, https://phabricator.wikimedia.org/T171851 [15:48:16] Thanks :) [15:48:38] 10Operations, 10ORES, 10Scoring-platform-team: Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3478146 (10Halfak) a:05Halfak>03None [15:48:53] 10Operations, 10ORES, 10Scoring-platform-team-Backlog: Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3478146 (10Halfak) [15:49:09] halfak: does ores already run on stretch btw? 
[15:49:22] PROBLEM - Check Varnish expiry mailbox lag on cp1049 is CRITICAL: CRITICAL: expiry mailbox lag is 2052836 [15:49:30] 10Operations, 10ORES, 10Scoring-platform-team-Backlog: Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3478532 (10fgiunchedi) +1 to starting brand new clusters with stretch! [15:49:30] godog, currently on Jessie, but I'm happy to do the work to rebuild our models on stretch [15:51:07] (03Draft1) 10Paladox: Gerrit: Set auth.userNameToLowerCase [puppet] - 10https://gerrit.wikimedia.org/r/368196 [15:51:09] (03PS2) 10Paladox: Gerrit: Set auth.userNameToLowerCase [puppet] - 10https://gerrit.wikimedia.org/r/368196 [15:51:24] halfak: heh I'd imagine the most time consuming part would be to have a machine running stretch and apply the ores puppet and see what breaks, rinse/repeat [15:51:46] though stretch is available in labs too so it can be tested there as well [15:51:59] (03CR) 10Reception123: [C: 031] Gerrit: Set auth.userNameToLowerCase [puppet] - 10https://gerrit.wikimedia.org/r/368196 (owner: 10Paladox) [15:52:27] godog, luckily, we've already got ottomata working on getting ORES stuff running on stretch because we need it for analytics. [15:52:43] He had to port some apt packages, but it seems to be working now :) [15:53:02] Still, a final test of running ORES will be nice. 
[15:53:10] nice, then yeah not a lot of work (famous last words) [15:53:15] lol [15:53:16] :) [15:57:00] (PS1) Papaul: DNS/Decom: Remove mgmt DNS for tmh200[1-2] [dns] - https://gerrit.wikimedia.org/r/368197 [15:57:35] (PS1) Reception123: Added wordmark for Wikipedia Atikamekw [mediawiki-config] - https://gerrit.wikimedia.org/r/368198 [15:59:28] (PS2) Reception123: Added wordmark for Wikipedia Atikamekw [mediawiki-config] - https://gerrit.wikimedia.org/r/368198 [15:59:50] (PS1) Elukey: confluent::kafka.sh: fix kafka-acls command autocompletion [puppet] - https://gerrit.wikimedia.org/r/368199 (https://phabricator.wikimedia.org/T167304) [16:00:04] godog, moritzm, and _joe_: Dear anthropoid, the time has come. Please deploy Puppet SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170727T1600). [16:01:34] (CR) EBernhardson: [C: 1] Decrease elasticsearch search thread pool to 32 for cirrus servers [puppet] - https://gerrit.wikimedia.org/r/367709 (https://phabricator.wikimedia.org/T169498) (owner: EBernhardson) [16:02:57] I see no puppet swat patches https://media3.giphy.com/media/67XNScEjAgtBS/giphy.mp4 [16:03:13] elukey: ^ [16:07:08] (CR) Ottomata: confluent::kafka.sh: fix kafka-acls command autocompletion (2 comments) [puppet] - https://gerrit.wikimedia.org/r/368199 (https://phabricator.wikimedia.org/T167304) (owner: Elukey) [16:09:14] hahahaha [16:09:36] (PS2) RobH: DNS/Decom: Remove mgmt DNS for tmh200[1-2] [dns] - https://gerrit.wikimedia.org/r/368197 (owner: Papaul) [16:09:42] Operations, monitoring, User-fgiunchedi: Update diamond to latest upstream version - https://phabricator.wikimedia.org/T97635#3478648 (fgiunchedi) [16:09:44] Operations, monitoring, Patch-For-Review, User-fgiunchedi: Diamond log level set to DEBUG spams syslog - https://phabricator.wikimedia.org/T171580#3478646 (fgiunchedi) Open>Resolved Logging level has been set to INFO again for 
diamond, following up on parent task [16:10:04] 10Operations, 10monitoring, 10Patch-For-Review: Icinga disk space check should also check inode usage - https://phabricator.wikimedia.org/T129222#3478662 (10fgiunchedi) 05Open>03Resolved All done! [16:11:08] (03CR) 10RobH: [C: 032] DNS/Decom: Remove mgmt DNS for tmh200[1-2] [dns] - 10https://gerrit.wikimedia.org/r/368197 (owner: 10Papaul) [16:11:40] 10Operations, 10Cloud-VPS, 10cloud-services-team (Kanban): Switch to new labs puppetmasters - https://phabricator.wikimedia.org/T171786#3478669 (10Andrew) [16:12:18] 10Operations, 10ops-codfw, 10hardware-requests, 10Patch-For-Review: reclaim/decom tmh200[12] - https://phabricator.wikimedia.org/T168472#3478670 (10RobH) [16:12:38] 10Operations, 10ops-codfw, 10hardware-requests, 10Patch-For-Review: reclaim/decom tmh200[12] - https://phabricator.wikimedia.org/T168472#3365481 (10RobH) 05Open>03Resolved I've merged @papaul's dns change, so this is now resolved (since all steps are now checked off.) [16:13:08] (03PS1) 10Bearloga: r & shiny_server: Fix duplicate xml2 declaration [puppet] - 10https://gerrit.wikimedia.org/r/368200 [16:15:56] (03CR) 10Volans: [C: 031] "LGTM" [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/367678 (owner: 10Giuseppe Lavagetto) [16:16:03] PROBLEM - MariaDB Slave Lag: s2 on db1047 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 630.94 seconds [16:17:02] poor db1047 [16:17:08] maybe downtime finished? 
[16:17:20] it is still executing the eventlogging_cleaner [16:20:09] (03CR) 10Gehel: [C: 032] r & shiny_server: Fix duplicate xml2 declaration [puppet] - 10https://gerrit.wikimedia.org/r/368200 (owner: 10Bearloga) [16:21:03] (03PS2) 10Elukey: confluent::kafka.sh: fix kafka-acls command autocompletion [puppet] - 10https://gerrit.wikimedia.org/r/368199 (https://phabricator.wikimedia.org/T167304) [16:22:44] (03CR) 10Ottomata: [C: 031] confluent::kafka.sh: fix kafka-acls command autocompletion [puppet] - 10https://gerrit.wikimedia.org/r/368199 (https://phabricator.wikimedia.org/T167304) (owner: 10Elukey) [16:23:05] (03CR) 10BryanDavis: [C: 031] "Tested via manual edits on tools-static-11. Redirects and rewrites work as expected. No nginx syntax problems. Ready to merge." [puppet] - 10https://gerrit.wikimedia.org/r/357878 (https://phabricator.wikimedia.org/T110027) (owner: 10Zhuyifei1999) [16:23:29] (03PS3) 10Elukey: confluent::kafka.sh: fix kafka-acls command autocompletion [puppet] - 10https://gerrit.wikimedia.org/r/368199 (https://phabricator.wikimedia.org/T167304) [16:24:12] RECOVERY - MariaDB Slave Lag: s2 on db1047 is OK: OK slave_sql_lag Replication lag: 0.00 seconds [16:24:40] zhuyifei1999_: your gfont proxy is running on https://tools-static-beta.wmflabs.org/fontcdn/... and looks good. 
-- https://tools-static-beta.wmflabs.org/fontcdn/css?family=Raleway:300italic,600italic&subset=latin,latin-ext [16:24:49] cool [16:25:24] I'll see if I can get it merged and live today [16:25:28] k [16:26:18] (03PS4) 10Rush: tools-static: add /fontcdn/ to reverse-proxy to Google Fonts [puppet] - 10https://gerrit.wikimedia.org/r/357878 (https://phabricator.wikimedia.org/T110027) (owner: 10Zhuyifei1999) [16:26:22] (03PS1) 10Bearloga: r: Fix libxml2 dependency [puppet] - 10https://gerrit.wikimedia.org/r/368204 [16:29:15] (03CR) 10Elukey: [C: 032] confluent::kafka.sh: fix kafka-acls command autocompletion [puppet] - 10https://gerrit.wikimedia.org/r/368199 (https://phabricator.wikimedia.org/T167304) (owner: 10Elukey) [16:30:43] (03PS5) 10Rush: tools-static: add /fontcdn/ to reverse-proxy to Google Fonts [puppet] - 10https://gerrit.wikimedia.org/r/357878 (https://phabricator.wikimedia.org/T110027) (owner: 10Zhuyifei1999) [16:32:12] (03CR) 10Rush: [C: 032] tools-static: add /fontcdn/ to reverse-proxy to Google Fonts [puppet] - 10https://gerrit.wikimedia.org/r/357878 (https://phabricator.wikimedia.org/T110027) (owner: 10Zhuyifei1999) [16:32:15] (03PS1) 10Ppchelko: Labs: Enable the debug queue to test JobQueueEventBus. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368206 (https://phabricator.wikimedia.org/T163380) [16:32:26] (03PS2) 10Gehel: r: Fix libxml2 dependency [puppet] - 10https://gerrit.wikimedia.org/r/368204 (owner: 10Bearloga) [16:34:23] (03CR) 10jerkins-bot: [V: 04-1] Labs: Enable the debug queue to test JobQueueEventBus. 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/368206 (https://phabricator.wikimedia.org/T163380) (owner: 10Ppchelko) [16:35:33] (03PS1) 10Elukey: confluent::kafka.sh: fix typo in kafka acls handling [puppet] - 10https://gerrit.wikimedia.org/r/368207 [16:35:35] ottomata: --^ sigh [16:36:05] ah also CR1 is wrong [16:36:09] * elukey cries in a corner [16:37:23] PS2 coming [16:37:28] (03PS2) 10Elukey: confluent::kafka.sh: fix typo in kafka acls handling [puppet] - 10https://gerrit.wikimedia.org/r/368207 [16:38:04] (03CR) 10Elukey: [V: 032 C: 032] confluent::kafka.sh: fix typo in kafka acls handling [puppet] - 10https://gerrit.wikimedia.org/r/368207 (owner: 10Elukey) [16:39:22] RECOVERY - Check Varnish expiry mailbox lag on cp1049 is OK: OK: expiry mailbox lag is 32430 [16:39:28] (03PS2) 10Ppchelko: Labs: Enable the debug queue to test JobQueueEventBus. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368206 (https://phabricator.wikimedia.org/T163380) [16:44:14] 10Operations, 10Pybal, 10Traffic: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478782 (10ema) >>! In T171850#3478261, @MoritzMuehlenhoff wrote: > +1 on the stretch upgrade, but I don't think it's very useful to keep the ticket open for future kernel updates, it'll only bitrot and who know's... [16:46:02] elukey: let me know when you are done with puppet, I have a small patch to merge... [16:46:20] gehel: already merged :) [16:46:30] my turn! 
[16:46:43] (03PS3) 10Gehel: r: Fix libxml2 dependency [puppet] - 10https://gerrit.wikimedia.org/r/368204 (owner: 10Bearloga) [16:47:06] (03PS5) 10Jdlrobson: Update several Wikipedia projects to existing wordmarks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) [16:50:15] (03CR) 10Gehel: [C: 032] r: Fix libxml2 dependency [puppet] - 10https://gerrit.wikimedia.org/r/368204 (owner: 10Bearloga) [16:51:25] bearloga: ^ [16:52:26] gehel: running agent now and OH JEEZ YOU NEED TO THIS OUTPUT [16:53:57] (03PS2) 10Andrew Bogott: puppet-merge: fix a syntax error when there's only one worker [puppet] - 10https://gerrit.wikimedia.org/r/368088 [17:00:05] gwicke, cscott, arlolra, subbu, halfak, and Amir1: Dear anthropoid, the time has come. Please deploy Services – Graphoid / Parsoid / OCG / Citoid / ORES (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170727T1700). [17:00:28] Nothing for ORES today [17:04:00] (03CR) 10Andrew Bogott: [C: 032] puppet-merge: fix a syntax error when there's only one worker [puppet] - 10https://gerrit.wikimedia.org/r/368088 (owner: 10Andrew Bogott) [17:11:43] PROBLEM - puppet last run on thorium is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. 
Failed resources (up to 3 shown): Exec[git_pull_geowiki-data-private] [17:16:59] (03PS1) 10Bearloga: r: Remove libicu dependency [puppet] - 10https://gerrit.wikimedia.org/r/368212 [17:17:47] (03PS2) 10Gehel: r: Remove libicu dependency [puppet] - 10https://gerrit.wikimedia.org/r/368212 (owner: 10Bearloga) [17:18:28] (03PS1) 10BryanDavis: wmcs: Update tools-static reverse proxy rules for Google fonts [puppet] - 10https://gerrit.wikimedia.org/r/368213 (https://phabricator.wikimedia.org/T110027) [17:19:46] (03CR) 10Gehel: [C: 032] r: Remove libicu dependency [puppet] - 10https://gerrit.wikimedia.org/r/368212 (owner: 10Bearloga) [17:25:39] (03CR) 10BryanDavis: [C: 031] "Tested on tools-static-11 by manually patching the config file. No nginx syntax errors and /fontcdn/l is being handled properly now." [puppet] - 10https://gerrit.wikimedia.org/r/368213 (https://phabricator.wikimedia.org/T110027) (owner: 10BryanDavis) [17:27:17] (03PS5) 10Bearloga: contint: role and packages for R language [puppet] - 10https://gerrit.wikimedia.org/r/363337 (https://phabricator.wikimedia.org/T153856) (owner: 10Hashar) [17:27:32] (03PS6) 10Bearloga: contint: role and packages for R language [puppet] - 10https://gerrit.wikimedia.org/r/363337 (https://phabricator.wikimedia.org/T153856) (owner: 10Hashar) [17:28:25] (03PS1) 10Andrew Bogott: labspuppetmaster: Fix some c/p errors [puppet] - 10https://gerrit.wikimedia.org/r/368218 [17:28:27] (03PS1) 10Andrew Bogott: puppetmaster profiles: make allow_from configurable [puppet] - 10https://gerrit.wikimedia.org/r/368219 [17:28:29] (03PS1) 10Andrew Bogott: labs puppetmasters: Add firewall and allow_from rules [puppet] - 10https://gerrit.wikimedia.org/r/368220 [17:28:47] andrewbogott: s/: /: / [17:28:58] * andrewbogott will never learn :( [17:29:00] thanks [17:29:03] (03PS2) 10Rush: wmcs: Update tools-static reverse proxy rules for Google fonts [puppet] - 10https://gerrit.wikimedia.org/r/368213 (https://phabricator.wikimedia.org/T110027) (owner: 
10BryanDavis) [17:29:05] :) :) [17:29:34] (03CR) 10jerkins-bot: [V: 04-1] puppetmaster profiles: make allow_from configurable [puppet] - 10https://gerrit.wikimedia.org/r/368219 (owner: 10Andrew Bogott) [17:30:06] (03CR) 10jerkins-bot: [V: 04-1] labs puppetmasters: Add firewall and allow_from rules [puppet] - 10https://gerrit.wikimedia.org/r/368220 (owner: 10Andrew Bogott) [17:30:50] (03CR) 10Rush: [C: 032] wmcs: Update tools-static reverse proxy rules for Google fonts [puppet] - 10https://gerrit.wikimedia.org/r/368213 (https://phabricator.wikimedia.org/T110027) (owner: 10BryanDavis) [17:32:45] (03PS1) 10Catrope: Temporarily enable experimental views in RCFilters [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368221 [17:35:35] (03PS2) 10Andrew Bogott: labspuppetmaster: Fix some c/p errors [puppet] - 10https://gerrit.wikimedia.org/r/368218 [17:35:37] (03PS2) 10Andrew Bogott: puppetmaster profiles: make allow_from configurable [puppet] - 10https://gerrit.wikimedia.org/r/368219 [17:35:39] (03PS2) 10Andrew Bogott: labs puppetmasters: Add firewall and allow_from rules [puppet] - 10https://gerrit.wikimedia.org/r/368220 [17:36:26] (03PS4) 10Gehel: Decrease elasticsearch search thread pool to 32 for cirrus servers [puppet] - 10https://gerrit.wikimedia.org/r/367709 (https://phabricator.wikimedia.org/T169498) (owner: 10EBernhardson) [17:36:53] (03CR) 10jerkins-bot: [V: 04-1] labs puppetmasters: Add firewall and allow_from rules [puppet] - 10https://gerrit.wikimedia.org/r/368220 (owner: 10Andrew Bogott) [17:38:55] (03PS3) 10Andrew Bogott: labs puppetmasters: Add firewall and allow_from rules [puppet] - 10https://gerrit.wikimedia.org/r/368220 [17:39:05] (03PS3) 10Andrew Bogott: labspuppetmaster: Fix some c/p errors [puppet] - 10https://gerrit.wikimedia.org/r/368218 [17:40:53] (03CR) 10Andrew Bogott: [C: 032] labspuppetmaster: Fix some c/p errors [puppet] - 10https://gerrit.wikimedia.org/r/368218 (owner: 10Andrew Bogott) [17:41:28] (03PS3) 10Andrew Bogott: puppetmaster 
profiles: make allow_from configurable [puppet] - 10https://gerrit.wikimedia.org/r/368219 [17:43:49] 10Operations, 10DBA, 10MediaWiki-extensions-WikibaseClient, 10Performance-Team, and 6 others: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3479046 (10daniel) [17:44:11] (03PS7) 10Bearloga: contint: profile, role, and packages for R language [puppet] - 10https://gerrit.wikimedia.org/r/363337 (https://phabricator.wikimedia.org/T153856) (owner: 10Hashar) [17:44:47] 10Operations, 10ops-eqiad, 10Analytics, 10Analytics-Cluster, 10Patch-For-Review: rack/setup/install new kafka nodes kafka-jumbo100[1-6] - https://phabricator.wikimedia.org/T167992#3479047 (10elukey) I'd ask if possible to pause naming and configurations for these hosts since me and @Ottomata are thinking... [17:44:53] (03CR) 10Paladox: contint: profile, role, and packages for R language (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/363337 (https://phabricator.wikimedia.org/T153856) (owner: 10Hashar) [17:48:15] (03PS1) 10Andrew Bogott: add some stub certs to fix puppet compiler [labs/private] - 10https://gerrit.wikimedia.org/r/368222 [17:48:39] (03CR) 10Andrew Bogott: [C: 032] puppetmaster profiles: make allow_from configurable [puppet] - 10https://gerrit.wikimedia.org/r/368219 (owner: 10Andrew Bogott) [17:49:22] (03CR) 10Rush: [C: 031] labstore block_sync: Use the logging library instead of print [puppet] - 10https://gerrit.wikimedia.org/r/367742 (owner: 10Madhuvishy) [17:49:54] (03PS2) 10Madhuvishy: labstore block_sync: Use the logging library instead of print [puppet] - 10https://gerrit.wikimedia.org/r/367742 [17:50:05] (03CR) 10Madhuvishy: [V: 032 C: 032] labstore block_sync: Use the logging library instead of print [puppet] - 10https://gerrit.wikimedia.org/r/367742 (owner: 10Madhuvishy) [17:50:31] (03CR) 10Andrew Bogott: [V: 032 C: 032] add some stub certs to fix puppet compiler [labs/private] - 
10https://gerrit.wikimedia.org/r/368222 (owner: 10Andrew Bogott) [17:50:51] (03PS4) 10Andrew Bogott: labs puppetmasters: Add firewall and allow_from rules [puppet] - 10https://gerrit.wikimedia.org/r/368220 [17:51:11] (03CR) 10Chad: [C: 031] "We need to get these system users *out* of LDAP. It's only going to continue to cause problems and workarounds otherwise." [puppet] - 10https://gerrit.wikimedia.org/r/365891 (https://phabricator.wikimedia.org/T166013) (owner: 1020after4) [17:51:42] (03PS1) 10Madhuvishy: firstboot: Prevent non-root logins while NFS mounts aren't available [puppet] - 10https://gerrit.wikimedia.org/r/368223 (https://phabricator.wikimedia.org/T171508) [17:51:47] (03CR) 10Gehel: [C: 032] Decrease elasticsearch search thread pool to 32 for cirrus servers [puppet] - 10https://gerrit.wikimedia.org/r/367709 (https://phabricator.wikimedia.org/T169498) (owner: 10EBernhardson) [17:51:57] (03PS5) 10Gehel: Decrease elasticsearch search thread pool to 32 for cirrus servers [puppet] - 10https://gerrit.wikimedia.org/r/367709 (https://phabricator.wikimedia.org/T169498) (owner: 10EBernhardson) [17:53:21] (03CR) 10Andrew Bogott: [C: 032] labs puppetmasters: Add firewall and allow_from rules [puppet] - 10https://gerrit.wikimedia.org/r/368220 (owner: 10Andrew Bogott) [17:54:04] (03PS4) 10Giuseppe Lavagetto: Add filters to the future parser [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/367678 [17:55:02] (03CR) 10Bearloga: statistics::discovery: Reconfigure for Golden data retrieval (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: 10Bearloga) [17:55:05] (03CR) 10jerkins-bot: [V: 04-1] Add filters to the future parser [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/367678 (owner: 10Giuseppe Lavagetto) [17:55:40] (03PS4) 10Bearloga: statistics::discovery: Reconfigure for Golden data retrieval [puppet] - 10https://gerrit.wikimedia.org/r/367930 
(https://phabricator.wikimedia.org/T170494) [17:56:40] (03CR) 10jerkins-bot: [V: 04-1] statistics::discovery: Reconfigure for Golden data retrieval [puppet] - 10https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: 10Bearloga) [17:56:54] (03PS5) 10Giuseppe Lavagetto: Add filters to the future parser [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/367678 [17:57:27] (03PS5) 10Bearloga: statistics::discovery: Reconfigure for Golden data retrieval [puppet] - 10https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) [17:58:00] (03CR) 10jerkins-bot: [V: 04-1] Add filters to the future parser [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/367678 (owner: 10Giuseppe Lavagetto) [17:59:17] (03CR) 10jerkins-bot: [V: 04-1] statistics::discovery: Reconfigure for Golden data retrieval [puppet] - 10https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: 10Bearloga) [18:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Respected human, time to deploy Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170727T1800). Please do the needful. [18:00:04] Pchelolo, Jdlrobson, and RoanKattouw: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be available during the process. [18:00:18] 10Operations, 10Cloud-VPS, 10cloud-services-team (Kanban): Add AAAA records for labpuppetmaster1001 and 1002 - https://phabricator.wikimedia.org/T171880#3479078 (10Andrew) [18:01:03] o/ [18:01:03] 10Operations, 10ops-codfw, 10User-fgiunchedi: ms-be2024 not powering on - https://phabricator.wikimedia.org/T171275#3479097 (10Papaul) @fgiunchedi As usual it is 1pm and no HP tech has show up or called. 
[18:03:16] I can SWAT [18:04:06] (03PS1) 10BryanDavis: wmcs: Fix double Access-Control-Allow-Origin headers for tools-static [puppet] - 10https://gerrit.wikimedia.org/r/368226 (https://phabricator.wikimedia.org/T110027) [18:05:25] chasemp: ^ 3rd time is a charm :) [18:05:44] !log installed libmail-spf-perl on fermium to address spamassassin "module not installed: Mail::SPF ('require' failed)" error [18:05:50] (03PS9) 10Catrope: Limit FeaturedFeed on dewiki to last seven days [mediawiki-config] - 10https://gerrit.wikimedia.org/r/341267 (https://phabricator.wikimedia.org/T159664) (owner: 10BearND) [18:05:53] (03CR) 10Catrope: [C: 032] Limit FeaturedFeed on dewiki to last seven days [mediawiki-config] - 10https://gerrit.wikimedia.org/r/341267 (https://phabricator.wikimedia.org/T159664) (owner: 10BearND) [18:05:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:06:09] (03CR) 10Ottomata: "Stuff from stat1002 was backed up to /srv/stat1002-a on stat1005, so: /srv/stat1002-a/discovery-stats" [puppet] - 10https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: 10Bearloga) [18:07:41] (03CR) 10Rush: [C: 032] wmcs: Fix double Access-Control-Allow-Origin headers for tools-static [puppet] - 10https://gerrit.wikimedia.org/r/368226 (https://phabricator.wikimedia.org/T110027) (owner: 10BryanDavis) [18:08:02] bd808: off you go! 
[18:08:22] thanks chasemp [18:09:08] \o [18:09:13] sorry for delay in my hand up [18:09:46] Hmm why does it suddenly take ~5 mins to merge mw-config patches [18:10:11] (03Merged) 10jenkins-bot: Limit FeaturedFeed on dewiki to last seven days [mediawiki-config] - 10https://gerrit.wikimedia.org/r/341267 (https://phabricator.wikimedia.org/T159664) (owner: 10BearND) [18:10:23] (03CR) 10jenkins-bot: Limit FeaturedFeed on dewiki to last seven days [mediawiki-config] - 10https://gerrit.wikimedia.org/r/341267 (https://phabricator.wikimedia.org/T159664) (owner: 10BearND) [18:11:41] framawiki: Your FeaturedFeed patch is on mwdebug1002, please test [18:12:12] RoanKattouw: i'm on it [18:12:50] (03CR) 10MaxSem: [C: 031] enable mapframe for euwiki, ptwiki and uawikimedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368172 (https://phabricator.wikimedia.org/T167619) (owner: 10Gehel) [18:13:02] (03PS2) 10Catrope: Enable WikidataPageBanner for euwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368178 (https://phabricator.wikimedia.org/T171763) (owner: 10Urbanecm) [18:13:09] (03CR) 10Catrope: [C: 032] Enable WikidataPageBanner for euwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368178 (https://phabricator.wikimedia.org/T171763) (owner: 10Urbanecm) [18:13:51] RoanKattouw: it's good for me [18:14:07] jdlrobson: Yours are coming next, Jenkins volente [18:14:31] framawiki: OK, syncing yours now [18:15:00] (03PS1) 10Chad: Group2 to wmf.11 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368228 [18:15:04] (03Merged) 10jenkins-bot: Enable WikidataPageBanner for euwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368178 (https://phabricator.wikimedia.org/T171763) (owner: 10Urbanecm) [18:15:11] !log catrope@tin Synchronized wmf-config/InitialiseSettings.php: Limit FeaturedFeed on dewiki to 7 days (T159664) (duration: 00m 47s) [18:15:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:15:21] T159664: German TFA 
FeaturedFeed should be limited to last 7 days - https://phabricator.wikimedia.org/T159664 [18:15:44] jdlrobson: WikidataPageBanner on euwiki is now on mwdebug1002 [18:16:16] (03CR) 10jenkins-bot: Enable WikidataPageBanner for euwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368178 (https://phabricator.wikimedia.org/T171763) (owner: 10Urbanecm) [18:16:58] thanks RoanKattouw ! [18:19:19] (03CR) 10Catrope: [C: 04-1] Update several Wikipedia projects to existing wordmarks (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: 10Jdlrobson) [18:19:33] RoanKattouw: testing [18:19:51] jdlrobson: I have a question about your wordmarks patch, see ---^^ [18:19:56] RoanKattouw: sure [18:21:09] Pchelolo: You around for your SWAT patch? [18:21:16] Here RoanKattouw [18:21:17] RoanKattouw: first one is good to go! (eu.wikipedia) [18:21:35] RoanKattouw: my comment is bad [18:21:37] should say "run" [18:21:50] (03PS1) 10Dzahn: add IPv6 record for labpuppetmaster1001/1002 [dns] - 10https://gerrit.wikimedia.org/r/368229 (https://phabricator.wikimedia.org/T171880) [18:22:04] RoanKattouw: fixed [18:22:10] OK, putting Pchelolo's patch and mine on mwdebug1002 in a minute [18:22:11] (03PS6) 10Jdlrobson: Update several Wikipedia projects to existing wordmarks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) [18:22:14] (03PS2) 10Dzahn: add IPv6 records for labpuppetmaster1001/1002 [dns] - 10https://gerrit.wikimedia.org/r/368229 (https://phabricator.wikimedia.org/T171880) [18:22:28] !log catrope@tin Synchronized wmf-config/InitialiseSettings.php: Enable WikidataPageBanner on euwiki (T171763) (duration: 00m 46s) [18:22:32] (03PS7) 10Catrope: Update several Wikipedia projects to existing wordmarks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: 10Jdlrobson) [18:22:33] RoanKattouw: As it's 
code-only and it's not enabled anywhere - I can't really test anything.. [18:22:34] 10Operations, 10Cloud-VPS, 10Patch-For-Review, 10cloud-services-team (Kanban): Add AAAA records for labpuppetmaster1001 and 1002 - https://phabricator.wikimedia.org/T171880#3479152 (10Dzahn) a:05Andrew>03Dzahn [18:22:35] (03CR) 10Catrope: [C: 032] Update several Wikipedia projects to existing wordmarks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: 10Jdlrobson) [18:22:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:22:39] T171763: Add PAGEBANNER to euwiki - https://phabricator.wikimedia.org/T171763 [18:22:43] OK, then I'll just deploy it [18:23:57] (03PS2) 10Catrope: Temporarily enable experimental views in RCFilters [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368221 [18:23:59] (03Merged) 10jenkins-bot: Update several Wikipedia projects to existing wordmarks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: 10Jdlrobson) [18:24:01] (03CR) 10Catrope: [C: 032] Temporarily enable experimental views in RCFilters [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368221 (owner: 10Catrope) [18:26:13] 10Operations, 10ops-eqiad, 10DBA: Degraded RAID on db1068 - https://phabricator.wikimedia.org/T171723#3474264 (10Cmjohnson) Disk swapped...rebuilding Adapter #0 Enclosure Device ID: 32 Slot Number: 0 Drive's position: DiskGroup: 0, Span: 0, Arm: 0 Enclosure position: 1 Device Id: 0 WWN: 5000C50023CDC118 Se... [18:26:14] (03CR) 10jenkins-bot: Update several Wikipedia projects to existing wordmarks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: 10Jdlrobson) [18:26:32] (03CR) 10Dzahn: "the existing hosts i compared to were iron.wikimedia.org (for labpuppetmaster1001. 
.151 vs .158 in the same network, 2620:0:861:2) and ei" [dns] - 10https://gerrit.wikimedia.org/r/368229 (https://phabricator.wikimedia.org/T171880) (owner: 10Dzahn) [18:26:58] (03CR) 10Dzahn: [C: 032] add IPv6 records for labpuppetmaster1001/1002 [dns] - 10https://gerrit.wikimedia.org/r/368229 (https://phabricator.wikimedia.org/T171880) (owner: 10Dzahn) [18:29:21] 10Operations, 10Cloud-VPS, 10Patch-For-Review, 10cloud-services-team (Kanban): Add AAAA records for labpuppetmaster1001 and 1002 - https://phabricator.wikimedia.org/T171880#3479190 (10Dzahn) ``` @labpuppetmaster1001:~# host labpuppetmaster1001 labpuppetmaster1001.wikimedia.org has address 208.80.154.158 la... [18:29:28] !log catrope@tin Synchronized php-1.30.0-wmf.11/includes/: T168501 and T163380 (duration: 01m 31s) [18:29:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:29:40] T168501: When querying for multiple tags, and more than one is in the edit, duplicate results - https://phabricator.wikimedia.org/T168501 [18:29:40] T163380: Support posting Jobs to EventBus simultaneously with normal job processing - https://phabricator.wikimedia.org/T163380 [18:30:51] (03PS1) 10Catrope: Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 [18:30:55] 10Operations, 10Cloud-VPS, 10Patch-For-Review, 10cloud-services-team (Kanban): Add AAAA records for labpuppetmaster1001 and 1002 - https://phabricator.wikimedia.org/T171880#3479200 (10Dzahn) ``` @labpuppetmaster1001:~# ping6 labpuppetmaster1001.wikimedia.org PING labpuppetmaster1001.wikimedia.org(labpuppet... 
[18:31:44] jdlrobson: The wordmarks patch is on mwdebug1002 now, sorry for the delay [18:32:31] (03PS2) 10Catrope: Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 [18:32:38] (03CR) 10Catrope: [C: 032] Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 (owner: 10Catrope) [18:32:42] !log labpuppetmaster1001 - restarted ferm twice, DNS lookup for AAAA worked, error gone on second time. then did same on labpuppetmaster1002 (T171880) [18:32:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:32:53] T171880: Add AAAA records for labpuppetmaster1001 and 1002 - https://phabricator.wikimedia.org/T171880 [18:33:12] 10Operations, 10Cloud-VPS, 10cloud-services-team (Kanban): Switch to new labs puppetmasters - https://phabricator.wikimedia.org/T171786#3479205 (10Dzahn) [18:33:16] 10Operations, 10Cloud-VPS, 10Patch-For-Review, 10cloud-services-team (Kanban): Add AAAA records for labpuppetmaster1001 and 1002 - https://phabricator.wikimedia.org/T171880#3479204 (10Dzahn) 05Open>03Resolved [18:33:23] (03PS1) 10Catrope: Enable experimental RCFilters everywhere [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368231 [18:33:38] (03CR) 10Catrope: [C: 04-2] "Not until after today's train" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368231 (owner: 10Catrope) [18:33:49] RoanKattouw: :-) [18:35:09] (03CR) 10jerkins-bot: [V: 04-1] Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 (owner: 10Catrope) [18:35:41] (03CR) 10Catrope: [C: 032] Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 (owner: 10Catrope) [18:36:42] RoanKattouw: testing 
[18:37:45] RoanKattouw: im not seeing the changes yet but might be caching problem.. [18:38:00] (03CR) 10jerkins-bot: [V: 04-1] Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 (owner: 10Catrope) [18:38:10] I resynced just to be sure [18:38:53] RoanKattouw: dblists dont seem to be working correctly... [18:39:35] RoanKattouw: https://ru.m.wikipedia.org/wiki/Kezd%C5%91lap is showing the english label again... which is weird [18:39:52] RoanKattouw: Did you break RC? [18:40:07] RoanKattouw: https://en.wiktionary.org/wiki/Special:RecentChanges claims there are no changes at all in RC. [18:40:12] English label of what? [18:40:18] (From -tech). [18:40:19] That ru.m. link displays no English for me [18:40:28] RoanKattouw: in mwdebug1002? [18:40:41] Just turned off XWMD but still no English [18:41:13] (03Draft2) 10MarcoAurelio: Path for enwikiquote logo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368233 (https://phabricator.wikimedia.org/T171810) [18:41:14] I did break it, fixing it [18:41:58] RoanKattouw: to be clear i'm talking about the wikipedia wordmark in top left [18:43:00] jdlrobson: Sorry I was responding to James [18:43:19] so there's a problem with my patch.. i just dont understand why. The code looks absolutely fine :/ [18:43:23] RoanKattouw: OK. [18:43:36] probably needs to use wmg im guessing [18:43:53] I'll look in a minute, unbreaking RC right now [18:44:02] RoanKattouw: sure [18:44:15] !log catrope@tin Synchronized php-1.30.0-wmf.11/includes/: Temporarily revert patch for T168501 while I fix it (duration: 01m 27s) [18:44:20] https://meta.wikimedia.org/w/index.php?title=Special:RecentChanges <-- empty? 
[18:44:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:44:27] T168501: When querying for multiple tags, and more than one is in the edit, duplicate results - https://phabricator.wikimedia.org/T168501 [18:44:35] working now again [18:46:05] (03Abandoned) 10Catrope: Temporarily enable experimental views in RCFilters [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368221 (owner: 10Catrope) [18:46:13] (03PS3) 10Catrope: Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 [18:46:15] (03PS2) 10Catrope: Enable experimental RCFilters everywhere [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368231 [18:46:49] jdlrobson: What about the non-dblist-dependent changes like arzwiki, are those working? [18:46:58] RoanKattouw: yup [18:47:05] OK so just the dblists then [18:47:06] ive run into something similar to this before [18:47:13] Maybe scap pull doesn't pull them properly? [18:47:22] I'll just run sync-dblists, can't hurt [18:47:49] (03PS1) 10Jdlrobson: Define custom logos using wmg [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368236 [18:47:54] Huh wtf that command doesn't exist any more [18:48:03] RoanKattouw: yup and if sync doesn't work i suggest we try that ^ [18:48:12] 'wikipedia' definition is an array [18:48:22] Ooh [18:48:27] That may be why? [18:48:29] so it's possible the default merge strategy doesn't handle it [18:48:41] 'wikipedia' and 'wikipedia-e-acute' are both dblists, so they have the same level of specificity [18:48:57] https://www.irccloud.com/pastebin/nBB9AVXQ/ [18:48:59] It knows that individual wiki names should win over dblists, but I don't think it knows which dblists should win over which others [18:49:03] ^ RoanKattouw but note that cawiki works fine [18:49:08] Not sure if your wmg thing would necessarily fix that though [18:49:21] it has for similar array type configs.. 
shouldnt hurt to try [18:49:27] Yes, because cawiki is a wiki name, which is more specific than dblists [18:49:33] I think wg vs wmg is orthogonal to this issue [18:49:47] Let me check in eval.php what the config vars are coming out as [18:49:59] RoanKattouw: oh i see.. mm.. maybe should have a separate dblist that minuses the rest? [18:50:10] FWIW scap pull should pull down dblists [18:50:34] Yeah that could work but would be ugly [18:50:41] I guess it could be a computed dblist though [18:53:03] RoanKattouw: on it [18:53:04] hi has swat finished already? 'cause there's a patch I listed in wikitech pending deployment (smallish one) [18:53:41] jdlrobson: OTOH if you put the new dblists before the 'wikipedia' et al dblists you could probably reverse the specificity order that way, but that would be dark magic [18:53:54] I'm pretty sure that isn't behavior that's documented or promised anywhere [18:54:00] TabbyCat: Yeah I can still fit you in [18:54:08] (03PS1) 10Jdlrobson: Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 [18:54:14] ^ RoanKattouw something like that? [18:54:17] Merci [18:54:21] 10Operations, 10ops-eqiad, 10Analytics, 10Analytics-Cluster, 10Patch-For-Review: rack/setup/install new kafka nodes kafka-jumbo100[1-6] - https://phabricator.wikimedia.org/T167992#3479258 (10RobH) So we had a discussion about this earlier in IRC, and after that agreed to document some of it here. @eluke... [18:54:34] (03CR) 10Catrope: [C: 032] Incubator bureaucrats to add/remove 'accountcreator' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368164 (https://phabricator.wikimedia.org/T171751) (owner: 10MarcoAurelio) [18:55:21] jdlrobson: Ahm, are you sure about the -computed bit in the name there? 
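[Editor's note] The precedence problem discussed above (a wiki name beats a dblist, but two dblists tie) and the proposed fix (a dblist computed by subtracting another) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual MediaWiki or wmf-config code: `resolve_setting`, `subtract_dblists`, `examplewiki`, and the file names are hypothetical, and the "first matching dblist in declaration order wins" rule is an assumption based on the speculation in the chat, which notes that this ordering is not documented or promised anywhere.

```python
"""Illustrative sketch only; names and ordering rule are assumptions."""


def resolve_setting(values, wiki, dblists):
    """Pick the value of one setting for `wiki`.

    values:  dict mapping a wiki name, a dblist name, or "default" to a value.
    dblists: set of dblist names that `wiki` belongs to.
    """
    # An explicit wiki name is the most specific key and always wins.
    if wiki in values:
        return values[wiki]
    # ASSUMPTION: among dblists, the first match in declaration order wins.
    for key, value in values.items():
        if key in dblists:
            return value
    return values.get("default")


def subtract_dblists(base, *others):
    """Return the wikis in `base` that appear in none of `others`, sorted."""
    excluded = set().union(*others)
    return sorted(w for w in base if w not in excluded)


# Hypothetical data: examplewiki is in both dblists, so the two entries tie.
logos = {
    "default": "generic.png",
    "wikipedia": "wordmark-en.png",
    "wikipedia-e-acute": "wordmark-e-acute.png",
    "cawiki": "wordmark-ca.png",
}
memberships = {"wikipedia", "wikipedia-e-acute"}

print(resolve_setting(logos, "cawiki", memberships))       # wordmark-ca.png
print(resolve_setting(logos, "examplewiki", memberships))  # wordmark-en.png

# A computed list removes the overlap, so no wiki is in both lists.
print(subtract_dblists(["cawiki", "enwiki", "examplewiki"], ["examplewiki"]))
# ['cawiki', 'enwiki']
```

With a non-overlapping computed list, each wiki matches at most one dblist key, so the result no longer depends on declaration order.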
[18:55:35] Looks like the names mismatch [18:55:37] (03CR) 10jerkins-bot: [V: 04-1] Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 (owner: 10Jdlrobson) [18:55:48] RoanKattouw: no :) [18:55:53] RoanKattouw: not sure how to compute it.. [18:55:53] 18:55:36 Exception: MWWikiversions::readDbListFile(): unable to read wikipedia-english. [18:55:56] That answers my question [18:56:05] jdlrobson: I think there's no magic to it [18:56:05] notice how computed ones have a computed one.. [18:56:22] so i should just drop '-computed' ? [18:56:47] Dropping -computed would work [18:56:57] But it looks like we have an optimization where all the computed ones are computed explicitly [18:57:30] (03PS2) 10Jdlrobson: Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 [18:58:05] (03CR) 10Dzahn: [C: 032] cache::misc: add director for releases(1001) [puppet] - 10https://gerrit.wikimedia.org/r/368183 (https://phabricator.wikimedia.org/T164030) (owner: 10Dzahn) [18:58:11] https://gerrit.wikimedia.org/r/#/c/334462/ < has some background... [18:58:14] Which there's a script for ... somewhere [18:58:33] "For performance reasons, computed dblists should be expanded before deployment and to avoid bypassing detection, the files should be named as such." [18:58:39] (03CR) 10GWicke: [C: 032] Labs: Enable the debug queue to test JobQueueEventBus. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368206 (https://phabricator.wikimedia.org/T163380) (owner: 10Ppchelko) [18:59:11] dont know where the script is though.. 
:/ [18:59:29] Aha [18:59:37] multiversion/bin/expanddblist [18:59:41] (03Abandoned) 10Jdlrobson: Define custom logos using wmg [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368236 (owner: 10Jdlrobson) [18:59:43] (03PS2) 10Dzahn: cache::misc: add director for releases(1001) [puppet] - 10https://gerrit.wikimedia.org/r/368183 (https://phabricator.wikimedia.org/T164030) [19:00:00] (03PS3) 10Catrope: Incubator bureaucrats to add/remove 'accountcreator' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368164 (https://phabricator.wikimedia.org/T171751) (owner: 10MarcoAurelio) [19:00:04] RainbowSprinkles: Respected human, time to deploy MediaWiki train (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170727T1900). Please do the needful. [19:00:08] (03CR) 10Catrope: Incubator bureaucrats to add/remove 'accountcreator' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368164 (https://phabricator.wikimedia.org/T171751) (owner: 10MarcoAurelio) [19:00:11] (03CR) 10Catrope: [C: 032] Incubator bureaucrats to add/remove 'accountcreator' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368164 (https://phabricator.wikimedia.org/T171751) (owner: 10MarcoAurelio) [19:00:31] RainbowSprinkles: SWAT is going over slightly, sorry about that [19:00:36] (03PS3) 10Dzahn: cache::misc: add director for releases(1001) [puppet] - 10https://gerrit.wikimedia.org/r/368183 (https://phabricator.wikimedia.org/T164030) [19:00:36] (03Merged) 10jenkins-bot: Labs: Enable the debug queue to test JobQueueEventBus. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368206 (https://phabricator.wikimedia.org/T163380) (owner: 10Ppchelko) [19:00:56] RoanKattouw: Tyler's taking it for me, something came up :) [19:01:08] (03CR) 10jenkins-bot: Labs: Enable the debug queue to test JobQueueEventBus. 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/368206 (https://phabricator.wikimedia.org/T163380) (owner: 10Ppchelko) [19:01:37] OK, apologies to thcipriani then [19:01:53] no worries [19:03:03] Combination of Jenkins for mw-config being slower than it's historically been, dblists being confusing, and me accidentally breaking RC on all group0+1 wikis [19:04:55] (03Merged) 10jenkins-bot: Incubator bureaucrats to add/remove 'accountcreator' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368164 (https://phabricator.wikimedia.org/T171751) (owner: 10MarcoAurelio) [19:05:54] !log catrope@tin Synchronized php-1.30.0-wmf.11/includes/: T168501, take two (duration: 01m 27s) [19:06:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:06:05] T168501: When querying for multiple tags, and more than one is in the edit, duplicate results - https://phabricator.wikimedia.org/T168501 [19:06:25] (03PS2) 10Dzahn: cache::misc: switch director for releases to releases1001 [puppet] - 10https://gerrit.wikimedia.org/r/368184 (https://phabricator.wikimedia.org/T164030) [19:06:28] (03CR) 10jenkins-bot: Incubator bureaucrats to add/remove 'accountcreator' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368164 (https://phabricator.wikimedia.org/T171751) (owner: 10MarcoAurelio) [19:06:40] !log catrope@tin Synchronized php-1.30.0-wmf.11/autoload.php: (no justification provided) (duration: 00m 45s) [19:06:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:07:17] (03PS4) 10Catrope: Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 [19:07:23] (03CR) 10Catrope: Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 (owner: 10Catrope) [19:07:26] (03CR) 10Catrope: [C: 032] Enable 
$wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 (owner: 10Catrope) [19:07:53] !log catrope@tin Synchronized wmf-config/InitialiseSettings.php: T171751 (duration: 00m 46s) [19:08:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:08:03] T171751: accountcreator user group to be granted by sysops on Wikimedia Incubator - https://phabricator.wikimedia.org/T171751 [19:08:05] TabbyCat: ---^^ [19:08:21] RoanKattouw: looks good on mwdebug1002 [19:08:25] you can deploy [19:08:34] Sorry I just went straight ahead and deployed it [19:08:43] I'm over time and group/right additions are pretty low risk [19:08:50] (03PS3) 10Jdlrobson: Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 [19:08:50] RoanKattouw: not true, I cannot see it on the live version? [19:08:52] (03Merged) 10jenkins-bot: Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 (owner: 10Catrope) [19:08:59] ^ RoanKattouw guessing this needs to wait till 4pm now? [19:09:01] (03CR) 10jenkins-bot: Enable $wgStructuredChangeFiltersEnableExperimentalViews on group0 and group1 but not group2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368230 (owner: 10Catrope) [19:09:22] RoanKattouw: forget about it, it's live now yep [19:09:31] thanks [19:09:36] jdlrobson: Did you update your patch? 
[19:09:40] RoanKattouw: yup [19:09:44] https://gerrit.wikimedia.org/r/368237 [19:09:46] Then I'm willing to give it a shot [19:09:51] One minute while I test my own patch [19:11:17] (03PS4) 10Catrope: Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 (owner: 10Jdlrobson) [19:11:20] (03CR) 10Catrope: [C: 032] Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 (owner: 10Jdlrobson) [19:11:28] That looks like it should work to me [19:11:34] !log catrope@tin Synchronized wmf-config/InitialiseSettings.php: Enable namespace and tag filters in RCFilters on group0 and group1 (duration: 00m 46s) [19:11:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:12:22] (03PS1) 10Dzahn: admins: add Bash alias to run puppet on cache::misc for myself [puppet] - 10https://gerrit.wikimedia.org/r/368241 [19:13:01] (03CR) 10jerkins-bot: [V: 04-1] Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 (owner: 10Jdlrobson) [19:13:04] (03CR) 10jerkins-bot: [V: 04-1] Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 (owner: 10Jdlrobson) [19:13:14] (03PS2) 10Dzahn: admins: add Bash alias to run puppet on cache::misc for myself [puppet] - 10https://gerrit.wikimedia.org/r/368241 [19:14:03] jdlrobson: Hmph you need to update the tests, they failed [19:14:05] 10Operations, 10ops-codfw, 10User-fgiunchedi: ms-be2024 not powering on - https://phabricator.wikimedia.org/T171275#3479327 (10Papaul) Hello I have your case for your DL380 and wanted to let you know I am currently tied up with another system and will likely have to reschedule your case to another day and l... 
[19:14:08] RoanKattouw: done [19:14:09] (03PS5) 10Jdlrobson: Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 [19:14:10] forgot to run ./docroot/noc/createTxtFileSymlinks.sh [19:14:21] (03PS3) 10Dzahn: admins: add Bash alias to run puppet on cache::misc for myself [puppet] - 10https://gerrit.wikimedia.org/r/368241 [19:14:34] (03CR) 10Catrope: [C: 032] Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 (owner: 10Jdlrobson) [19:15:47] Yes! Filters integration! [19:16:26] wargo: Yeah the RC beta feature now has namespace and tag filter integration [19:16:44] (Not on Wikipedias yet, it'll arrive there on ~4 hours) [19:16:46] *in [19:17:22] (03PS4) 10Dzahn: admins: add Bash alias to run puppet on cache::misc for myself [puppet] - 10https://gerrit.wikimedia.org/r/368241 [19:17:28] Unfoooortunately I did break all non-tagfilter use of RC while trying to roll that out earlier [19:17:33] (03CR) 10Dzahn: [V: 032 C: 032] admins: add Bash alias to run puppet on cache::misc for myself [puppet] - 10https://gerrit.wikimedia.org/r/368241 (owner: 10Dzahn) [19:17:41] So RC *only* worked if you were filtering by tags [19:18:08] (03Merged) 10jenkins-bot: Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 (owner: 10Jdlrobson) [19:18:20] (03CR) 10jenkins-bot: Compute list of English Wikipedia logos [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368237 (owner: 10Jdlrobson) [19:18:47] jdlrobson: OK your new patch is on mwdebug1002 now [19:18:57] RoanKattouw: now it works :) [19:19:06] good to know for future! [19:19:18] thank you! please sync away! 
[19:19:30] OK syncing (in two parts) [19:20:14] !log catrope@tin Synchronized dblists: T171556 (duration: 00m 46s) [19:20:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:20:23] T171556: Deploy common Wordmarks - https://phabricator.wikimedia.org/T171556 [19:21:00] !log catrope@tin Synchronized wmf-config: T171556 (duration: 00m 47s) [19:21:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:22:25] thanks RoanKattouw :) [19:27:30] (03PS1) 10Andrew Bogott: labpuppetmaster: set temporary labs_puppet_master overrides for new puppetmasters [puppet] - 10https://gerrit.wikimedia.org/r/368243 [19:28:23] (03PS1) 10Urbanecm: Make wikiquote.png equivalent to enwikiquote.png [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368244 (https://phabricator.wikimedia.org/T171887) [19:28:39] (03CR) 10jerkins-bot: [V: 04-1] labpuppetmaster: set temporary labs_puppet_master overrides for new puppetmasters [puppet] - 10https://gerrit.wikimedia.org/r/368243 (owner: 10Andrew Bogott) [19:30:33] (03PS2) 10Andrew Bogott: labpuppetmaster: set temporary labs_puppet_master overrides [puppet] - 10https://gerrit.wikimedia.org/r/368243 [19:31:32] (03PS6) 10Bearloga: statistics::discovery: Reconfigure for Golden data retrieval [puppet] - 10https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) [19:32:43] thcipriani: I'm done, it's all yours now [19:33:03] RoanKattouw: thank you [19:33:12] (03CR) 10Andrew Bogott: [C: 032] labpuppetmaster: set temporary labs_puppet_master overrides [puppet] - 10https://gerrit.wikimedia.org/r/368243 (owner: 10Andrew Bogott) [19:34:59] (03PS3) 10Dzahn: cache::misc: switch director for releases to releases1001 [puppet] - 10https://gerrit.wikimedia.org/r/368184 (https://phabricator.wikimedia.org/T164030) [19:37:01] !log cp4021 shutting down for relocation in rack, will put in maint mode for next 2 hours [19:37:07] (03PS1) 10Eevans: Test additional data_file_directories 
[puppet] - https://gerrit.wikimedia.org/r/368247 (https://phabricator.wikimedia.org/T170276) [19:37:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:39:00] (PS1) Fomafix: Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn [puppet] - https://gerrit.wikimedia.org/r/368248 (https://phabricator.wikimedia.org/T117845) [19:39:53] (CR) Bearloga: statistics::discovery: Reconfigure for Golden data retrieval (1 comment) [puppet] - https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: Bearloga) [19:41:49] (CR) Dzahn: [C: 2] cache::misc: switch director for releases to releases1001 [puppet] - https://gerrit.wikimedia.org/r/368184 (https://phabricator.wikimedia.org/T164030) (owner: Dzahn) [19:42:47] (CR) Eevans: [C: 1] "This is a dev-only change, ([puppet compiler output](http://puppet-compiler.wmflabs.org/7188))." [puppet] - https://gerrit.wikimedia.org/r/368247 (https://phabricator.wikimedia.org/T170276) (owner: Eevans) [19:43:21] !log switching https://releases.wikimedia.org backend from bromine to releases1001 - all files have been rsynced (T164030) [19:43:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:43:32] T164030: setup releases1001.eqiad.wmnet (was: setup mwreleases1001) - https://phabricator.wikimedia.org/T164030 [19:45:17] (Abandoned) Thcipriani: Group2 to wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/368228 (owner: Chad) [19:46:12] PROBLEM - IPsec on cp2005 is CRITICAL: Strongswan CRITICAL - ok: 70 not-conn: cp4021_v4, cp4021_v6 [19:46:12] PROBLEM - IPsec on cp1074 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:46:12] PROBLEM - IPsec on cp1049 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:46:13] PROBLEM - IPsec on cp2002 is CRITICAL: Strongswan CRITICAL - ok: 70 not-conn: cp4021_v4, cp4021_v6 [19:46:17] (PS1) Thcipriani: all wikis 
to 1.30.0-wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/368250 [19:46:19] (CR) Thcipriani: [C: 2] all wikis to 1.30.0-wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/368250 (owner: Thcipriani) [19:46:22] PROBLEM - IPsec on cp2008 is CRITICAL: Strongswan CRITICAL - ok: 70 not-conn: cp4021_v4, cp4021_v6 [19:46:22] PROBLEM - IPsec on cp2026 is CRITICAL: Strongswan CRITICAL - ok: 70 not-conn: cp4021_v4, cp4021_v6 [19:46:22] PROBLEM - IPsec on cp1072 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:46:32] PROBLEM - IPsec on cp2014 is CRITICAL: Strongswan CRITICAL - ok: 70 not-conn: cp4021_v4, cp4021_v6 [19:46:32] PROBLEM - IPsec on cp2017 is CRITICAL: Strongswan CRITICAL - ok: 70 not-conn: cp4021_v4, cp4021_v6 [19:46:42] PROBLEM - IPsec on cp1064 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:46:42] PROBLEM - IPsec on cp1048 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:46:42] PROBLEM - IPsec on cp1071 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:46:42] PROBLEM - IPsec on cp2020 is CRITICAL: Strongswan CRITICAL - ok: 70 not-conn: cp4021_v4, cp4021_v6 [19:46:42] PROBLEM - IPsec on cp2024 is CRITICAL: Strongswan CRITICAL - ok: 70 not-conn: cp4021_v4, cp4021_v6 [19:46:53] PROBLEM - IPsec on cp1073 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:47:02] PROBLEM - IPsec on cp2022 is CRITICAL: Strongswan CRITICAL - ok: 70 connecting: (unnamed) not-conn: cp4021_v4, cp4021_v6 [19:47:02] PROBLEM - IPsec on cp1099 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:47:02] PROBLEM - IPsec on cp2011 is CRITICAL: Strongswan CRITICAL - ok: 70 not-conn: cp4021_v4, cp4021_v6 [19:47:02] PROBLEM - IPsec on cp1062 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:47:12] PROBLEM - IPsec on cp1050 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: 
cp4021_v4, cp4021_v6 [19:47:12] PROBLEM - IPsec on cp1063 is CRITICAL: Strongswan CRITICAL - ok: 56 not-conn: cp4021_v4, cp4021_v6 [19:47:29] (PS1) Andrew Bogott: m5-master: allow labspuppet@labpuppetmaster1001 and 1002 to labspuppet [puppet] - https://gerrit.wikimedia.org/r/368251 [19:47:55] cp4021 is the only one that is broken there. but it's also not reachable via mgmt [19:48:04] cant do much [19:48:05] (Merged) jenkins-bot: all wikis to 1.30.0-wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/368250 (owner: Thcipriani) [19:48:14] (CR) jenkins-bot: all wikis to 1.30.0-wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/368250 (owner: Thcipriani) [19:48:37] mutante see robh comment above [19:48:55] paladox: thank you [19:49:01] your welcome :) [19:50:55] !log thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.11 [19:51:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:51:44] if bikeshe can be used as a verb, what is the past tense? [19:51:48] bikeshad? [19:52:22] bikeshedded? [19:52:49] I kind of like bikeshad better [19:52:51] and more importantly, does asking this question invite bikeshedding? [19:52:59] damn. [19:53:01] and if so, is that bad? [19:53:24] thcipriani: bikeshed (third-person singular simple present bikesheds, present participle bikeshedding, simple past and past participle bikeshedded) [19:53:28] https://en.wiktionary.org/wiki/bikeshed [19:53:35] no. way. [19:54:07] that is ... what is it? awesome? awful? [19:54:17] * urandom is so confused [19:54:26] * urandom is so conflicted [19:55:49] english is a hot mess. [19:56:47] or you can bikeshed on the talk page of that wiktionary page of course!!! :) [19:56:56] mutante: don't tempt me! 
[19:56:59] lol [19:58:06] (PS2) Dzahn: releases: delete old microsites::releases class [puppet] - https://gerrit.wikimedia.org/r/368181 (https://phabricator.wikimedia.org/T164030) [19:59:13] Operations, Epic, Goal, Services (later): End of August milestone: Cassandra 3 cluster in production - https://phabricator.wikimedia.org/T169939#3479541 (Eevans) [20:02:09] ok, cp4021 moved it should start coming bakc up [20:04:22] RECOVERY - IPsec on cp2002 is OK: Strongswan OK - 72 ESP OK [20:04:23] RECOVERY - IPsec on cp1072 is OK: Strongswan OK - 58 ESP OK [20:04:32] RECOVERY - IPsec on cp2008 is OK: Strongswan OK - 72 ESP OK [20:04:32] RECOVERY - IPsec on cp2026 is OK: Strongswan OK - 72 ESP OK [20:04:34] RECOVERY - IPsec on cp2014 is OK: Strongswan OK - 72 ESP OK [20:04:34] RECOVERY - IPsec on cp2017 is OK: Strongswan OK - 72 ESP OK [20:04:36] (CR) Dzahn: [C: 2] releases: delete old microsites::releases class [puppet] - https://gerrit.wikimedia.org/r/368181 (https://phabricator.wikimedia.org/T164030) (owner: Dzahn) [20:04:42] RECOVERY - IPsec on cp1064 is OK: Strongswan OK - 58 ESP OK [20:04:42] RECOVERY - IPsec on cp1048 is OK: Strongswan OK - 58 ESP OK [20:04:43] RECOVERY - IPsec on cp1071 is OK: Strongswan OK - 58 ESP OK [20:04:52] RECOVERY - IPsec on cp2020 is OK: Strongswan OK - 72 ESP OK [20:04:52] RECOVERY - IPsec on cp2024 is OK: Strongswan OK - 72 ESP OK [20:05:02] RECOVERY - IPsec on cp1073 is OK: Strongswan OK - 58 ESP OK [20:05:03] RECOVERY - IPsec on cp1099 is OK: Strongswan OK - 58 ESP OK [20:05:03] RECOVERY - IPsec on cp2022 is OK: Strongswan OK - 72 ESP OK [20:05:12] RECOVERY - IPsec on cp1062 is OK: Strongswan OK - 58 ESP OK [20:05:12] RECOVERY - IPsec on cp2011 is OK: Strongswan OK - 72 ESP OK [20:05:12] RECOVERY - IPsec on cp1050 is OK: Strongswan OK - 58 ESP OK [20:05:13] RECOVERY - IPsec on cp1063 is OK: Strongswan OK - 58 ESP OK [20:05:13] RECOVERY - IPsec on cp1074 is OK: Strongswan OK - 58 ESP OK [20:05:22] RECOVERY - 
IPsec on cp1049 is OK: Strongswan OK - 58 ESP OK [20:05:22] RECOVERY - IPsec on cp2005 is OK: Strongswan OK - 72 ESP OK [20:05:55] !log ppchelko@tin Started deploy [restbase/deploy@cfb9c46]: Correctly escape the quotes in PDF names [20:06:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:07:12] (PS3) Dzahn: releases: delete old microsites::releases class [puppet] - https://gerrit.wikimedia.org/r/368181 (https://phabricator.wikimedia.org/T164030) [20:08:01] (CR) Dzahn: [C: 2] "backend has switched away from bromine now" [puppet] - https://gerrit.wikimedia.org/r/368181 (https://phabricator.wikimedia.org/T164030) (owner: Dzahn) [20:09:19] (PS1) Andrew Bogott: labspuppetmaster: fully qualify m5-master [puppet] - https://gerrit.wikimedia.org/r/368255 [20:09:43] (PS2) Andrew Bogott: labspuppetmaster: fully qualify m5-master [puppet] - https://gerrit.wikimedia.org/r/368255 [20:09:58] (PS3) Andrew Bogott: labspuppetmaster: fully qualify m5-master [puppet] - https://gerrit.wikimedia.org/r/368255 [20:12:33] (CR) Andrew Bogott: [C: 2] labspuppetmaster: fully qualify m5-master [puppet] - https://gerrit.wikimedia.org/r/368255 (owner: Andrew Bogott) [20:13:43] !log ppchelko@tin Finished deploy [restbase/deploy@cfb9c46]: Correctly escape the quotes in PDF names (duration: 07m 48s) [20:13:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:15:14] PROBLEM - Apache HTTP on mw1225 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.002 second response time [20:16:14] RECOVERY - Apache HTTP on mw1225 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 612 bytes in 0.037 second response time [20:17:41] Operations, DBA, Wikimedia-Language-setup, Wikimedia-Site-requests, and 2 others: Reopen Wikinews Dutch - https://phabricator.wikimedia.org/T168764#3479585 (Jdforrester-WMF) Resolved>Open Eurgh. 
[20:19:57] (CR) Dzahn: [C: 2] Test additional data_file_directories [puppet] - https://gerrit.wikimedia.org/r/368247 (https://phabricator.wikimedia.org/T170276) (owner: Eevans) [20:20:13] (PS2) Dzahn: Test additional data_file_directories [puppet] - https://gerrit.wikimedia.org/r/368247 (https://phabricator.wikimedia.org/T170276) (owner: Eevans) [20:22:05] !log installing apache security updates on cobalt [20:22:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:24:35] !log Un-dirtying state of /srv/deployment/jobrunner/jobrunner on tin (from T129148). Checking-out https://gerrit.wikimedia.org/r/367743 instead. [20:24:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:24:45] T129148: Deploy jobrunner with scap3 (Trebuchet jobrunner/jobrunner) - https://phabricator.wikimedia.org/T129148 [20:25:36] (Abandoned) Paladox: Gerrit: Add labs specific gerrit.config [puppet] - https://gerrit.wikimedia.org/r/367030 (owner: Paladox) [20:31:38] !log restarting all pdfrender instances on scb in eqiad; one of them was hanging & causing user requests to fail [20:31:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:32:44] (PS1) Ppchelko: JobQueueEventBus: Enable job events in group0 wikis. [mediawiki-config] - https://gerrit.wikimedia.org/r/368258 (https://phabricator.wikimedia.org/T163380) [20:33:44] PROBLEM - pdfrender on scb1002 is CRITICAL: connect to address 10.64.16.21 and port 5252: Connection refused [20:58:44] RECOVERY - MegaRAID on db1068 is OK: OK: optimal, 1 logical, 2 physical, WriteBack policy [21:07:44] RECOVERY - Check systemd state on bast3002 is OK: OK - running: The system is fully operational [21:08:45] !log bast3002 repointed mdadm at null alias to clear systemd degraded state alert [21:08:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:25:01] RainbowSprinkles! 
Daniel_K told me to ping you regarding a backport? [21:28:15] thcipriani: ^ [21:28:37] addshore: re https://phabricator.wikimedia.org/T171370 [21:30:39] Oooh, okay, I should be able to do that in a bit [21:32:04] addshore: yeah, if you can make a backport for that change, ping me and I can get it out. [21:32:51] Acknowledgments! [21:33:07] Haha, wow, that was a great phone auto correct... [21:33:22] no longer blocking train. so less urgent than it was earlier. [21:33:59] thcipriani: so we want that patch to land on .10 of wikidata? [21:34:18] ahh, okay, cool, https://gerrit.wikimedia.org/r/#/c/368224 on .10 of wikibase is already merged :) [21:40:13] !log Restarting Cassandra, restbase-dev1004-{a,b} to apply updated data directories list [21:40:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:41:32] thcipriani: greg-g https://gerrit.wikimedia.org/r/#/c/368312/ [21:43:02] If you ever need to do a backport again updating the wikidata build is pretty simple :) https://github.com/wikimedia/mediawiki-extensions-Wikidata/blob/master/README.md under the "Manually update a build" section should cover everything [21:43:32] So making the above patch was, checkout the repo & correct branch, run "composer update --prefer-dist -o --no-dev wikibase/wikibase", push the changes to gerrit [21:47:46] addshore: that was fast :) [21:48:15] Yeh, it not that hard really, but I appreciate it is different compare with every other wikimedia thing ever :) [21:48:18] *its [21:48:48] indeed, alright, once jenkins checks off I'll merge and get it deployed [21:49:10] awesome! I'll be around / dozing off for the next hour I expect so ping me if you need anything else ! 
[21:49:20] :) [21:49:29] cool, thanks for the build [21:52:14] PROBLEM - Router interfaces on mr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.199, interfaces up: 36, down: 1, dormant: 0, excluded: 0, unused: 0 [21:54:14] RECOVERY - Router interfaces on mr1-eqiad is OK: OK: host 208.80.154.199, interfaces up: 38, down: 0, dormant: 0, excluded: 0, unused: 0 [21:54:26] PROBLEM - Host mr1-eqiad.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100% [21:55:44] addshore: hrm, seems like jenkins is mad :( [21:57:23] hmmmm [21:59:34] RECOVERY - Host mr1-eqiad.oob IPv6 is UP: PING OK - Packet loss = 0%, RTA = 102.38 ms [22:01:42] Operations, Wikimedia-Language-setup, Wikimedia-Site-requests, MW-1.30-release-notes (WMF-deploy-2017-07-25_(1.30.0-wmf.11)), and 2 others: Create Dinka Wikipedia - https://phabricator.wikimedia.org/T168518#3479902 (Dcljr) I don't see anything myself, but then I'm a "din-0" user. Not yet a very a... [22:04:40] hmmm, not sure about that one [22:04:56] my gut tells me it is just a CI issue [22:07:45] hmm, it looks like the same failure is happening on the master build too [22:07:52] hrm, both those tests seem to be freaking out about a missing .git directory in propertysuggestor [22:21:58] thcipriani: yup, although the chage of course has absolutly nothing to do with propertysuggestor or any .git dirs ;) [22:22:55] I expect https://gerrit.wikimedia.org/r/#/c/368315/ will fail too [22:23:40] oh boy :) [22:26:07] yup, that fails with the same issue [22:29:39] (CR) Hoo man: [C: -1] "Looking at this further: The number of dispatchers running is by far enough currently, but the batch-size/ dispatch-interval is to low/ to" [puppet] - https://gerrit.wikimedia.org/r/366887 (https://phabricator.wikimedia.org/T171263) (owner: Ladsgroup) [22:32:06] thcipriani: if it is no longer blocking the train I would say let's leave it to be looked at again in friday EU morning time :) [22:34:06] addshore: sure, that's fine. 
[22:36:14] (PS1) Andrew Bogott: labs puppetmaster: allow new puppetmasters to talk to the enc [puppet] - https://gerrit.wikimedia.org/r/368317 [22:38:01] (CR) Andrew Bogott: [C: 2] labs puppetmaster: allow new puppetmasters to talk to the enc [puppet] - https://gerrit.wikimedia.org/r/368317 (owner: Andrew Bogott) [22:38:14] Puppet, Cloud-VPS: role::puppet::self: Unable to locate package geoipupdate - https://phabricator.wikimedia.org/T171916#3479983 (MaxSem) [22:38:59] Puppet, Cloud-VPS: role::puppet::self: Unable to locate package geoipupdate - https://phabricator.wikimedia.org/T171916#3479996 (MaxSem) [22:51:16] (PS1) BryanDavis: wmcs: remove toollabs::admin_web_updater [puppet] - https://gerrit.wikimedia.org/r/368318 (https://phabricator.wikimedia.org/T140254) [22:53:00] (CR) Andrew Bogott: [C: 2] wmcs: remove toollabs::admin_web_updater [puppet] - https://gerrit.wikimedia.org/r/368318 (https://phabricator.wikimedia.org/T140254) (owner: BryanDavis) [22:53:14] Operations, Release-Engineering-Team (Kanban): setup releases2001.codfw.wmnet - https://phabricator.wikimedia.org/T171917#3480023 (Dzahn) [22:53:50] Operations, Release-Engineering-Team (Watching / External): Provide cross-dc redundancy (active-active or active-passive) to all important misc services - https://phabricator.wikimedia.org/T156937#2990470 (Dzahn) [22:53:53] Operations, Release-Engineering-Team (Kanban): setup releases2001.codfw.wmnet - https://phabricator.wikimedia.org/T171917#3480023 (Dzahn) [22:54:58] Operations, Release-Engineering-Team (Watching / External): Provide cross-dc redundancy (active-active or active-passive) to all important misc services - https://phabricator.wikimedia.org/T156937#2990470 (Dzahn) [22:57:46] Operations, Sentry, vm-requests, Multimedia-Sprint-2015-03-25: Procure hardware for Sentry - https://phabricator.wikimedia.org/T93138#1130243 (Dzahn) This has been sitting since Jan 2016. 
Any updates on what the real blocker is here (nowadays)? [22:58:17] Operations, Sentry, vm-requests, Multimedia-Sprint-2015-03-25: Procure hardware for Sentry - https://phabricator.wikimedia.org/T93138#3480065 (Dzahn) Do you still request a Ganeti VM? [22:59:43] Operations, Cloud-VPS, cloud-services-team (Kanban): Switch to new labs puppetmasters - https://phabricator.wikimedia.org/T171786#3480067 (Andrew) [23:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Respected human, time to deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170727T2300). Please do the needful. [23:00:04] matt_flaschen and RoanKattouw: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be available during the process. [23:03:07] I'll SWAT [23:04:56] (PS3) Catrope: Enable experimental RCFilters everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/368231 [23:05:00] (CR) Catrope: [C: 2] Enable experimental RCFilters everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/368231 (owner: Catrope) [23:06:24] (Merged) jenkins-bot: Enable experimental RCFilters everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/368231 (owner: Catrope) [23:06:34] (CR) jenkins-bot: Enable experimental RCFilters everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/368231 (owner: Catrope) [23:11:47] (PS1) Dzahn: add releases2001.codfw.wmnet [dns] - https://gerrit.wikimedia.org/r/368319 (https://phabricator.wikimedia.org/T171917) [23:11:57] !log catrope@tin Started scap: wmf-config/InitialiseSettings.php Enable experimental RCFilters on group2 too [23:12:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:12:08] (CR) jerkins-bot: [V: -1] add releases2001.codfw.wmnet [dns] - https://gerrit.wikimedia.org/r/368319 
(https://phabricator.wikimedia.org/T171917) (owner: Dzahn) [23:12:49] (PS2) Catrope: Make emails for minor edits always available; keep defaults [mediawiki-config] - https://gerrit.wikimedia.org/r/368113 (https://phabricator.wikimedia.org/T29884) (owner: Mattflaschen) [23:12:53] (CR) Catrope: [C: 2] Make emails for minor edits always available; keep defaults [mediawiki-config] - https://gerrit.wikimedia.org/r/368113 (https://phabricator.wikimedia.org/T29884) (owner: Mattflaschen) [23:14:53] !log catrope@tin Finished scap: wmf-config/InitialiseSettings.php Enable experimental RCFilters on group2 too (duration: 02m 55s) [23:15:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:15:16] Operations, Patch-For-Review, Release-Engineering-Team (Kanban): setup releases2001.codfw.wmnet - https://phabricator.wikimedia.org/T171917#3480100 (Dzahn) [23:15:26] (PS1) Rush: openstack: move openstack::repo to new model [puppet] - https://gerrit.wikimedia.org/r/368321 (https://phabricator.wikimedia.org/T171494) [23:15:38] Operations, vm-requests, Patch-For-Review, Release-Engineering-Team (Kanban): setup releases2001.codfw.wmnet - https://phabricator.wikimedia.org/T171917#3480023 (Dzahn) [23:15:47] (Merged) jenkins-bot: Make emails for minor edits always available; keep defaults [mediawiki-config] - https://gerrit.wikimedia.org/r/368113 (https://phabricator.wikimedia.org/T29884) (owner: Mattflaschen) [23:16:16] (CR) jenkins-bot: Make emails for minor edits always available; keep defaults [mediawiki-config] - https://gerrit.wikimedia.org/r/368113 (https://phabricator.wikimedia.org/T29884) (owner: Mattflaschen) [23:16:32] (PS2) Rush: openstack: move openstack::repo to new model [puppet] - https://gerrit.wikimedia.org/r/368321 (https://phabricator.wikimedia.org/T171494) [23:17:36] (PS2) Dzahn: add releases2001.codfw.wmnet [dns] - https://gerrit.wikimedia.org/r/368319 
(https://phabricator.wikimedia.org/T171917) [23:17:58] (CR) jerkins-bot: [V: -1] openstack: move openstack::repo to new model [puppet] - https://gerrit.wikimedia.org/r/368321 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [23:19:50] matt_flaschen: Are you around to test your enotif change? It's live on mwdebug1002 now [23:20:57] (PS3) Dzahn: add releases2001.codfw.wmnet [dns] - https://gerrit.wikimedia.org/r/368319 (https://phabricator.wikimedia.org/T171917) [23:21:52] Hello I'm Krishna. I"m interested in contributing to wikimedia operations. My background is Computer science Major and handson experience in jenkins,python,linux and knowledge on puppet. Can anybody please guide me where to get started . Thanks. [23:22:34] mkrish: hi Krishna, i can help you. this is the right channel for starters [23:23:51] Yikes, I forgot the window. [23:23:56] RoanKattouw, sorry, testing now. [23:26:18] No worries [23:26:24] I'm having some fun with my patch anyway [23:26:29] * mutante talks to Krishna in PM [23:26:40] https://usercontent.irccloud-cdn.com/file/XlREvBny/rcfilters-50-50.png [23:26:53] RoanKattouw, sorry again. 
It shows in preferences, I need to check the code to see if it's testable, or will trigger the job queue (which would then fail since it won't run on mwdebug1002) [23:28:02] (PS4) Dzahn: add releases2001.codfw.wmnet [dns] - https://gerrit.wikimedia.org/r/368319 (https://phabricator.wikimedia.org/T171917) [23:29:10] (PS3) Rush: openstack: move openstack::repo to new model [puppet] - https://gerrit.wikimedia.org/r/368321 (https://phabricator.wikimedia.org/T171494) [23:29:44] PROBLEM - Router interfaces on pfw-eqiad is CRITICAL: CRITICAL: host 208.80.154.218, interfaces up: 104, down: 1, dormant: 0, excluded: 3, unused: 0 [23:29:58] (PS4) Rush: openstack: move openstack::repo to new model [puppet] - https://gerrit.wikimedia.org/r/368321 (https://phabricator.wikimedia.org/T171494) [23:31:16] RoanKattouw, yeah, I don't think the actual email will work on mwdebug1002 per above (it goes through a job, then within the job it rechecks wgEnotifMinorEdits). [23:31:29] RoanKattouw, so good to release, then I'll test after that. [23:33:44] RECOVERY - Router interfaces on pfw-eqiad is OK: OK: host 208.80.154.218, interfaces up: 105, down: 0, dormant: 0, excluded: 3, unused: 0 [23:33:55] (CR) Dzahn: [C: 2] add releases2001.codfw.wmnet [dns] - https://gerrit.wikimedia.org/r/368319 (https://phabricator.wikimedia.org/T171917) (owner: Dzahn) [23:34:05] matt_flaschen: OK, syncing [23:35:23] !log catrope@tin Synchronized wmf-config/: Enable emails for minor edits everywhere but keep default prefs (T29884, T142727) (duration: 00m 45s) [23:35:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:35:32] T142727: Set $wgEnotifMinorEdits true throughout WMF, and instead use $wgDefaultUserOptions['enotifminoredits'] - https://phabricator.wikimedia.org/T142727 [23:35:32] T29884: enotif doesn't send email if page on watchlist edited following a minor edit and enotif not configured to send minor edits. 
- https://phabricator.wikimedia.org/T29884 [23:36:06] Testing now [23:41:43] RoanKattouw, works. [23:41:46] Mostly... [23:41:47] RoanKattouw: [23:41:49] Editor's summary: forgot to enable emails themselves This is a id="minoredit_helplink">[[Help:Minor edit|minor edit]] [23:42:10] ^ [23:42:19] Starting with "This" is code-generated, with unparsed wikitext. [23:42:24] Other part is my real summary. [23:42:26] Oh good [23:42:37] RoanKattouw, I'll fix that now. [23:42:53] Operations, vm-requests, Patch-For-Review, Release-Engineering-Team (Kanban): setup releases2001.codfw.wmnet - https://phabricator.wikimedia.org/T171917#3480158 (Dzahn) releases2001.codfw.wmnet has address 10.192.16.174 releases2001.codfw.wmnet has IPv6 address 2620:0:860:102:10:192:16:174 [23:42:57] RoanKattouw, we don't even support text email for core notifications... [23:42:58] Did you know I recently took all the code that ran combinations of strip_tags() and html_entity_decode() and what not and made it all use Sanitizer::stripAllTags()? [23:43:26] RoanKattouw, nice. [23:44:02] So it's not like it has to choose whether to send HTML or text. This works for no-one. I guess this is not a popular setting. [23:44:30] Despite defaulting true? [23:44:35] Maybe that isn't the default text. [23:44:54] Indeed [23:48:29] Oh maybe enwiki overrode that message to add wikitext to it? [23:48:39] !log catrope@tin Synchronized php-1.30.0-wmf.11/resources/src/mediawiki.rcfilters/: T171368 (duration: 00m 42s) [23:48:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:48:51] T171368: RC page - default selection of '50 changes in 7 days' is sticky - https://phabricator.wikimedia.org/T171368 [23:53:03] RoanKattouw, yeah, I'm splitting them (same text in core, but different messages). But I don't want to SWAT that since then we'll have untranslated everywhere. After I test and upload, I'll ask Global Collaboration if they can clone the message. [23:54:42] "they"? 
[23:54:48] :P [23:54:49] RoanKattouw, :) [23:57:16] PROBLEM - Host cp3048 is DOWN: PING CRITICAL - Packet loss = 100% [23:57:32] Operations, MediaWiki-JobRunner, Performance-Team, Patch-For-Review: Investigate 30x increase in Jobrunner errors - https://phabricator.wikimedia.org/T171371#3480169 (Krinkle) p:Triage>High [23:58:01] (PS1) Catrope: Remove temporary wgStructuredChangeFiltersEnableExperimentalViews setting [mediawiki-config] - https://gerrit.wikimedia.org/r/368330 [23:58:14] RECOVERY - Host cp3048 is UP: PING WARNING - Packet loss = 80%, RTA = 666.04 ms [23:58:18] (CR) Catrope: [C: -2] "Don't deploy this until wmf.12 has actually been deployed everywhere" [mediawiki-config] - https://gerrit.wikimedia.org/r/368330 (owner: Catrope) [23:58:26] (CR) Rush: [C: 2] openstack: move openstack::repo to new model [puppet] - https://gerrit.wikimedia.org/r/368321 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [23:59:44] PROBLEM - puppet last run on labtestcontrol2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues