[01:04:17] <icinga-wm>	 PROBLEM - Check health of redis instance on 6479 on rdb2005 is CRITICAL: CRITICAL: replication_delay is 1509498249 600 - REDIS 2.8.17 on 127.0.0.1:6479 has 1 databases (db0) with 3826336 keys, up 4 minutes 6 seconds - replication_delay is 1509498249
[01:04:17] <icinga-wm>	 PROBLEM - Check health of redis instance on 6379 on rdb2003 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6379
[01:04:17] <icinga-wm>	 PROBLEM - Check health of redis instance on 6381 on rdb2003 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6381
[01:04:27] <icinga-wm>	 PROBLEM - Check health of redis instance on 6380 on rdb2003 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6380
[01:04:38] <icinga-wm>	 PROBLEM - Check health of redis instance on 6481 on rdb2005 is CRITICAL: CRITICAL: replication_delay is 1509498275 600 - REDIS 2.8.17 on 127.0.0.1:6481 has 1 databases (db0) with 3824473 keys, up 4 minutes 31 seconds - replication_delay is 1509498275
[01:05:08] <icinga-wm>	 PROBLEM - Check health of redis instance on 6480 on rdb2005 is CRITICAL: CRITICAL: replication_delay is 1509498303 600 - REDIS 2.8.17 on 127.0.0.1:6480 has 1 databases (db0) with 3829264 keys, up 5 minutes - replication_delay is 1509498303
[01:05:17] <icinga-wm>	 RECOVERY - Check health of redis instance on 6479 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6479 has 1 databases (db0) with 3817053 keys, up 5 minutes 7 seconds - replication_delay is 0
[01:05:18] <icinga-wm>	 RECOVERY - Check health of redis instance on 6380 on rdb2003 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6380 has 1 databases (db0) with 8523026 keys, up 5 minutes 12 seconds - replication_delay is 0
[01:05:18] <icinga-wm>	 RECOVERY - Check health of redis instance on 6381 on rdb2003 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6381 has 1 databases (db0) with 8418411 keys, up 5 minutes 12 seconds - replication_delay is 0
[01:05:18] <icinga-wm>	 RECOVERY - Check health of redis instance on 6379 on rdb2003 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6379 has 1 databases (db0) with 8520588 keys, up 5 minutes 13 seconds - replication_delay is 0
[01:05:38] <icinga-wm>	 RECOVERY - Check health of redis instance on 6481 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6481 has 1 databases (db0) with 3815296 keys, up 5 minutes 32 seconds - replication_delay is 0
[01:06:08] <icinga-wm>	 RECOVERY - Check health of redis instance on 6480 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6480 has 1 databases (db0) with 3819551 keys, up 6 minutes 1 seconds - replication_delay is 0
[02:06:27] <icinga-wm>	 PROBLEM - puppet last run on cp1045 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[02:36:21] <logmsgbot>	 !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.5) (duration: 09m 46s)
[02:36:27] <icinga-wm>	 RECOVERY - puppet last run on cp1045 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[02:36:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:36:57] <icinga-wm>	 PROBLEM - puppet last run on mw2227 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[03:06:57] <icinga-wm>	 RECOVERY - puppet last run on mw2227 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[03:14:46] <logmsgbot>	 !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.6) (duration: 15m 21s)
[03:14:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:22:02] <logmsgbot>	 !log l10nupdate@tin ResourceLoader cache refresh completed at Wed Nov  1 03:22:02 UTC 2017 (duration 7m 17s)
[03:22:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:27:07] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 838.48 seconds
[03:33:28] <icinga-wm>	 PROBLEM - HHVM rendering on mw2201 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[03:34:18] <icinga-wm>	 RECOVERY - HHVM rendering on mw2201 is OK: HTTP OK: HTTP/1.1 200 OK - 73647 bytes in 0.357 second response time
[03:44:38] <icinga-wm>	 PROBLEM - puppet last run on db1054 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[04:07:37] <icinga-wm>	 PROBLEM - puppet last run on lvs3002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[04:14:38] <icinga-wm>	 RECOVERY - puppet last run on db1054 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[04:18:27] <icinga-wm>	 RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 217.59 seconds
[04:26:30] <wikibugs>	 10Operations, 10Cloud-Services, 10Developer-Relations, 10LDAP: Create a single application to provision and manage developer (LDAP) accounts - https://phabricator.wikimedia.org/T179463#3725515 (10bd808)
[04:32:37] <icinga-wm>	 RECOVERY - puppet last run on lvs3002 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[05:42:17] <icinga-wm>	 PROBLEM - puppet last run on hassium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[06:05:28] <icinga-wm>	 PROBLEM - puppet last run on es2002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[06:07:17] <icinga-wm>	 RECOVERY - puppet last run on hassium is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[06:25:17] <icinga-wm>	 PROBLEM - puppet last run on cp1052 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[06:30:50] <icinga-wm>	 PROBLEM - mysqld processes on labsdb1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld
[06:35:28] <icinga-wm>	 RECOVERY - puppet last run on es2002 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[06:51:00] <icinga-wm>	 RECOVERY - mysqld processes on labsdb1001 is OK: PROCS OK: 1 process with command name mysqld
[06:55:00] <icinga-wm>	 PROBLEM - mysqld processes on labsdb1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld
[06:55:18] <icinga-wm>	 RECOVERY - puppet last run on cp1052 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:00] <icinga-wm>	 RECOVERY - mysqld processes on labsdb1001 is OK: PROCS OK: 1 process with command name mysqld
[06:58:00] <wikibugs>	 10Operations, 10cloud-services-team (Kanban): labsdb1001 crashed - storage issue - https://phabricator.wikimedia.org/T179464#3725553 (10Marostegui)
[06:59:04] <wikibugs>	 10Operations, 10cloud-services-team (Kanban): labsdb1001 crashed - storage issue - https://phabricator.wikimedia.org/T179464#3725567 (10Marostegui) I am trying to start MySQL but it is failing on storage, so I think this server is no longer available: ``` 171101  6:57:13 [Note] InnoDB: Starting an apply batch of l...
[07:00:08] <wikibugs>	 10Operations, 10DBA, 10cloud-services-team, 10Patch-For-Review, 10Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3725569 (10Marostegui) Please check: T179464 labsdb1001 has crashed and the storage looks totally broken, hard to say if it is...
[07:00:25] <wikibugs>	 (03PS1) 10Madhuvishy: Revert "Revert "labsdb: Switchover dns for labsdb1001 shards to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464)
[07:01:00] <wikibugs>	 (03PS2) 10Madhuvishy: Revert "Revert "labsdb: Switchover dns for labsdb1001 shards to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464)
[07:01:02] <wikibugs>	 (03CR) 10Marostegui: [C: 031] Revert "Revert "labsdb: Switchover dns for labsdb1001 shards to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464) (owner: 10Madhuvishy)
[07:01:03] <icinga-wm>	 PROBLEM - mysqld processes on labsdb1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld
[07:01:09] <wikibugs>	 (03CR) 10jenkins-bot: [V: 04-1] Revert "Revert "labsdb: Switchover dns for labsdb1001 shards to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464) (owner: 10Madhuvishy)
[07:01:31] <marostegui>	 going to downtime labsdb1001
[07:02:23] <wikibugs>	 (03PS3) 10Madhuvishy: Revert "Revert "labsdb: Switch dns for labsdb1001 shards to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464)
[07:02:53] <wikibugs>	 (03CR) 10jenkins-bot: [V: 04-1] Revert "Revert "labsdb: Switch dns for labsdb1001 shards to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464) (owner: 10Madhuvishy)
[07:03:19] <wikibugs>	 (03PS4) 10Madhuvishy: Revert "Revert "labsdb: Switch dns for labsdb1001 to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464)
[07:03:48] <wikibugs>	 (03CR) 10jenkins-bot: [V: 04-1] Revert "Revert "labsdb: Switch dns for labsdb1001 to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464) (owner: 10Madhuvishy)
[07:04:30] <wikibugs>	 10Operations, 10DBA, 10cloud-services-team, 10Patch-For-Review, 10Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3725571 (10Marostegui) We should consider labsdb1001 broken for good and decommission it -  we need to decide whether we want...
[07:04:41] <wikibugs>	 (03PS5) 10Madhuvishy: Revert "Revert "labsdb: Switch dns for labsdb1001 to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464)
[07:05:17] <wikibugs>	 (03CR) 10Madhuvishy: [C: 032] Revert "Revert "labsdb: Switch dns for labsdb1001 to labsdb1003"" [puppet] - 10https://gerrit.wikimedia.org/r/387772 (https://phabricator.wikimedia.org/T179464) (owner: 10Madhuvishy)
[07:07:55] <wikibugs>	 10Operations, 10Patch-For-Review, 10cloud-services-team (Kanban): labsdb1001 crashed - storage issue - https://phabricator.wikimedia.org/T179464#3725577 (10Marostegui) btw, the RAID keeps saying Optimal :-) ``` Number of Virtual Disks: 1 Virtual Drive: 0 (Target Id: 0) Name                : RAID Level...
[07:12:47] <icinga-wm>	 PROBLEM - HHVM rendering on mw2151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:13:38] <icinga-wm>	 RECOVERY - HHVM rendering on mw2151 is OK: HTTP OK: HTTP/1.1 200 OK - 73028 bytes in 0.296 second response time
[07:25:37] <wikibugs>	 10Operations, 10Patch-For-Review, 10cloud-services-team (Kanban): labsdb1001 crashed - storage issue - https://phabricator.wikimedia.org/T179464#3725595 (10Marostegui) I have disabled notifications and downtimed labsdb1001
[08:17:34] <wikibugs>	 (03PS2) 10Ema: varnish child started: avoid illegal characters [puppet] - 10https://gerrit.wikimedia.org/r/387242
[08:17:39] <wikibugs>	 (03CR) 10Ema: [V: 032 C: 032] varnish child started: avoid illegal characters [puppet] - 10https://gerrit.wikimedia.org/r/387242 (owner: 10Ema)
[08:19:27] <wikibugs>	 (03PS3) 10Ema: puppet: fix trailing slash on file resource /usr/share/varnish/tests [puppet] - 10https://gerrit.wikimedia.org/r/387584 (https://phabricator.wikimedia.org/T179396) (owner: 10Herron)
[08:19:32] <wikibugs>	 (03CR) 10Ema: [V: 032 C: 032] puppet: fix trailing slash on file resource /usr/share/varnish/tests [puppet] - 10https://gerrit.wikimedia.org/r/387584 (https://phabricator.wikimedia.org/T179396) (owner: 10Herron)
[08:24:53] <wikibugs>	 10Operations, 10ops-ulsfo, 10Traffic: setup bast4001/WMF7218 - https://phabricator.wikimedia.org/T179050#3711643 (10MoritzMuehlenhoff) This is currently installed with jessie, but if we setup a new box, let's use stretch from the start?
[08:29:39] <wikibugs>	 10Operations, 10DBA, 10cloud-services-team, 10Patch-For-Review, 10Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3725662 (10MoritzMuehlenhoff) >>! In T168584#3725571, @Marostegui wrote: > We should consider labsdb1001 broken for good and d...
[08:31:31] <wikibugs>	 10Operations, 10DBA, 10cloud-services-team, 10Patch-For-Review, 10Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3725667 (10Marostegui) We are aiming for 13th Dec to retire these two hosts: T142807 and https://wikitech.wikimedia.org/wiki/W...
[08:34:42] <wikibugs>	 10Operations, 10DBA, 10cloud-services-team, 10Patch-For-Review, 10Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3725668 (10MoritzMuehlenhoff) Let's just keep 1003 running w/o reboot then.
[08:35:37] <icinga-wm>	 RECOVERY - Disk space on stat1005 is OK: DISK OK
[08:35:57] <elukey>	 !log forced umount/mount for /mnt/hdfs on stat1005 (not working after repeated oom kill actions)
[08:36:03] <elukey>	 apergos: --^ 
[08:36:04] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:47:37] <wikibugs>	 10Operations, 10Ops-Access-Requests: Requesting access to RESOURCE for USER[S] - https://phabricator.wikimedia.org/T179452#3725688 (10Framawiki) a:05Mehrdadbot>03None Hello @Mehrdadbot and welcome! What "RESOURCE" do you want to access?
[08:49:06] <apergos>	 elukey: thanks, saw the emails yesterday
[08:58:57] <wikibugs>	 (03CR) 10Ema: "Couple of inline comments, the rest looks good." (033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/386895 (owner: 10BBlack)
[09:23:18] <wikibugs>	 10Operations, 10Ops-Access-Requests: Requesting access to RESOURCE for USER[S] - https://phabricator.wikimedia.org/T179452#3725752 (10Mehrdadbot) yes. thanks.
[09:26:26] <wikibugs>	 (03PS3) 10DCausse: Properly check for cluster existence prior setting TTM mirrors [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387281 (https://phabricator.wikimedia.org/T179270)
[09:28:05] <wikibugs>	 (03PS1) 10Alexandros Kosiaris: k8s::controller: Notify service on config changes [puppet] - 10https://gerrit.wikimedia.org/r/387775
[09:29:51] <wikibugs>	 (03CR) 10Alexandros Kosiaris: [C: 032] k8s::controller: Notify service on config changes [puppet] - 10https://gerrit.wikimedia.org/r/387775 (owner: 10Alexandros Kosiaris)
[09:33:52] <wikibugs>	 (03PS2) 10Alexandros Kosiaris: Remove $cluster_cidr from k8s::controller [puppet] - 10https://gerrit.wikimedia.org/r/386753
[09:33:54] <wikibugs>	 (03PS2) 10Alexandros Kosiaris: k8s::controller: support service account token signing [puppet] - 10https://gerrit.wikimedia.org/r/386754 (https://phabricator.wikimedia.org/T177393)
[09:33:56] <wikibugs>	 (03PS2) 10Alexandros Kosiaris: Enable k8s::controller manager ServiceAccount signing [puppet] - 10https://gerrit.wikimedia.org/r/386755 (https://phabricator.wikimedia.org/T177393)
[09:35:25] <wikibugs>	 (03CR) 10Alexandros Kosiaris: [C: 032] Remove $cluster_cidr from k8s::controller [puppet] - 10https://gerrit.wikimedia.org/r/386753 (owner: 10Alexandros Kosiaris)
[09:37:03] <wikibugs>	 (03CR) 10Alexandros Kosiaris: [C: 032] k8s::controller: support service account token signing [puppet] - 10https://gerrit.wikimedia.org/r/386754 (https://phabricator.wikimedia.org/T177393) (owner: 10Alexandros Kosiaris)
[09:39:51] <wikibugs>	 (03CR) 10Alexandros Kosiaris: [C: 032] Enable k8s::controller manager ServiceAccount signing [puppet] - 10https://gerrit.wikimedia.org/r/386755 (https://phabricator.wikimedia.org/T177393) (owner: 10Alexandros Kosiaris)
[09:42:51] <wikibugs>	 10Operations, 10Scap, 10Release-Engineering-Team (Watching / External): Scap: Standardize git version - https://phabricator.wikimedia.org/T179353#3725775 (10MoritzMuehlenhoff) silver will be replaced by the new labweb* hosts using stretch soon, so that should be resolved soon. Is that the only one deployment...
[09:44:33] <yannf>	 Hi, I have a problem uploading a PDF (this book https://commons.wikimedia.org/wiki/File:Tolsto%C3%AF_-_%C5%92uvres_compl%C3%A8tes,_vol10.djvu ), it says the file is corrupted, I made it again -> same error, it is a big file (174 MB), but it's the first time I get this
[09:46:06] <yannf>	 I am trying to upload over https://commons.wikimedia.org/wiki/File:Tolsto%C3%AF_-_%C5%92uvres_compl%C3%A8tes,_vol10.pdf with chunked upload
[09:52:29] <icinga-wm>	 PROBLEM - puppet last run on mw1214 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[09:55:49] <icinga-wm>	 PROBLEM - puppet last run on ms-fe2006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[09:59:08] <wikibugs>	 (03CR) 10GoranSMilovanovic: "> Yeah, until WMF/WMDE has a CRAN mirror we can't install packages" [puppet] - 10https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: 10Addshore)
[10:03:00] <wikibugs>	 10Operations, 10Ops-Access-Requests: Requesting access to RESOURCE for USER[S] - https://phabricator.wikimedia.org/T179452#3725800 (10Mehrdadbot) shell access(tool forge)...
[10:03:30] <wikibugs>	 (03CR) 10Nikerabbit: [C: 031] Properly check for cluster existence prior setting TTM mirrors [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387281 (https://phabricator.wikimedia.org/T179270) (owner: 10DCausse)
[10:07:58] <icinga-wm>	 PROBLEM - puppet last run on mw2199 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:17:29] <icinga-wm>	 RECOVERY - puppet last run on mw1214 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[10:20:32] <wikibugs>	 10Operations, 10wikidiff2, 10Patch-For-Review, 10User-Addshore, and 2 others: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3725819 (10Tobi_WMDE_SW)
[10:20:49] <icinga-wm>	 RECOVERY - puppet last run on ms-fe2006 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[10:22:09] <icinga-wm>	 PROBLEM - puppet last run on elastic1034 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:32:04] <yannf>	 that's the file: https://www.dropbox.com/s/c25hri2mfhbkmop/Tolsto%C3%AF%20-%20OC%20-%20tome%2010%20-%20GP%2C%204.pdf?dl=0
[10:35:18] <icinga-wm>	 PROBLEM - puppet last run on cp1049 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:37:58] <icinga-wm>	 RECOVERY - puppet last run on mw2199 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[10:51:14] <wikibugs>	 (03CR) 10jenkins-bot: Enable Unicode section links on mediawiki.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386710 (https://phabricator.wikimedia.org/T175725) (owner: 10MaxSem)
[10:51:16] <wikibugs>	 (03CR) 10jenkins-bot: Setup CirrusSearch AB test on dbn group sizing [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387586 (owner: 10EBernhardson)
[10:51:18] <wikibugs>	 (03CR) 10jenkins-bot: Scap prep: check reference directory exists [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387743 (owner: 10Thcipriani)
[10:51:20] <wikibugs>	 (03CR) 10jenkins-bot: cirrus interleave config should not be wg prefixed [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387752 (owner: 10EBernhardson)
[10:52:09] <icinga-wm>	 RECOVERY - puppet last run on elastic1034 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[10:55:04] <wikibugs>	 (03PS2) 10Muehlenhoff: Ship a dummy config since IncludeOptional isn't really optional [puppet] - 10https://gerrit.wikimedia.org/r/386617
[11:03:03] <apergos>	 oh good, I just got a batch of labsdb1001 pages from hours ago >_<
[11:04:33] <wikibugs>	 10Operations, 10Ops-Access-Requests: Requesting access to RESOURCE for USER[S] - https://phabricator.wikimedia.org/T179452#3725973 (10Framawiki) 05Open>03Invalid Hello @Mehrdadbot, all the steps to ask for a shell account on toolforge are present on https://tools.wmflabs.org/, I let you follow these guide...
[11:05:18] <icinga-wm>	 RECOVERY - puppet last run on cp1049 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[11:07:16] <wikibugs>	 (03PS1) 10ArielGlenn: convert script generating lists of dumps for rsync, to use config overrides [puppet] - 10https://gerrit.wikimedia.org/r/387781
[11:10:26] <wikibugs>	 (03CR) 10Muehlenhoff: [C: 032] Ship a dummy config since IncludeOptional isn't really optional [puppet] - 10https://gerrit.wikimedia.org/r/386617 (owner: 10Muehlenhoff)
[11:12:18] <icinga-wm>	 RECOVERY - HHVM rendering on labweb1002 is OK: HTTP OK: HTTP/1.1 200 OK - 72972 bytes in 7.692 second response time
[11:12:19] <icinga-wm>	 RECOVERY - Apache HTTP on labweb1002 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 620 bytes in 0.072 second response time
[11:12:38] <icinga-wm>	 RECOVERY - Check systemd state on labweb1002 is OK: OK - running: The system is fully operational
[11:13:59] <wikibugs>	 (03PS2) 10ArielGlenn: convert script generating lists of dumps for rsync, to use config overrides [puppet] - 10https://gerrit.wikimedia.org/r/387781
[11:14:39] <icinga-wm>	 RECOVERY - puppet last run on labweb1002 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[11:15:10] <wikibugs>	 (03CR) 10ArielGlenn: [C: 032] convert script generating lists of dumps for rsync, to use config overrides [puppet] - 10https://gerrit.wikimedia.org/r/387781 (owner: 10ArielGlenn)
[11:20:38] <icinga-wm>	 PROBLEM - puppet last run on wdqs2003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[11:22:24] <wikibugs>	 (03PS6) 10ArielGlenn: use separate path for public/other datasets [puppet] - 10https://gerrit.wikimedia.org/r/386161 (https://phabricator.wikimedia.org/T178888)
[11:30:29] <icinga-wm>	 PROBLEM - puppet last run on elastic1030 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[11:50:38] <icinga-wm>	 RECOVERY - puppet last run on wdqs2003 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[12:00:29] <icinga-wm>	 RECOVERY - puppet last run on elastic1030 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[12:07:23] <logmsgbot>	 !log kartik@tin Started deploy [cxserver/deploy@10651e2]: Update cxserver to 0227acb
[12:07:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:10:30] <logmsgbot>	 !log kartik@tin Finished deploy [cxserver/deploy@10651e2]: Update cxserver to 0227acb (duration: 03m 07s)
[12:10:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:37:38] <icinga-wm>	 PROBLEM - IPv6 ping to eqiad on ripe-atlas-eqiad IPv6 is CRITICAL: Traceback (most recent call last)
[12:40:43] <wikibugs>	 10Operations, 10media-storage, 10User-fgiunchedi: Deleting file on Commons "Error deleting file: An unknown error occurred in storage backend "local-multiwrite"." - https://phabricator.wikimedia.org/T173374#3726148 (10Aklapper) Should this task get closed as `resolved`?
[12:42:38] <icinga-wm>	 RECOVERY - IPv6 ping to eqiad on ripe-atlas-eqiad IPv6 is OK: OK - failed 7 probes of 285 (alerts on 19) - https://atlas.ripe.net/measurements/1790947/#!map
[12:44:54] <moritzm>	 !log installing libdatetime-timezone-perl stable updates on Debian
[12:45:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:48:08] <moritzm>	 !log installing libav security updates
[12:48:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:00:04] <jouncebot>	 addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: #bothumor My software never has bugs. It just develops random features. Rise for European Mid-day SWAT(Max 8 patches). (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171101T1300).
[13:00:04] <jouncebot>	 dcausse: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[13:00:20] <dcausse>	 o/
[13:26:58] <icinga-wm>	 PROBLEM - puppet last run on restbase2012 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[13:29:17] <logmsgbot>	 !log ppchelko@tin Started deploy [restbase/deploy@2321c4c]: Update hyperswitch dependency
[13:29:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:29:49] <icinga-wm>	 PROBLEM - restbase endpoints health on restbase1012 is CRITICAL: /en.wikipedia.org/v1/feed/onthisday/{type}/{mm}/{dd} (Retrieve selected the events for Jan 01) timed out before a response was received
[13:29:49] <icinga-wm>	 PROBLEM - mobileapps endpoints health on scb1002 is CRITICAL: /{domain}/v1/page/media/{title} (retrieve images and videos of en.wp Cat page via media route) timed out before a response was received: /{domain}/v1/page/most-read/{yyyy}/{mm}/{dd} (retrieve the most-read articles for January 1, 2016 (with aggregated=true)) timed out before a response was received
[13:30:48] <icinga-wm>	 RECOVERY - mobileapps endpoints health on scb1002 is OK: All endpoints are healthy
[13:30:48] <icinga-wm>	 RECOVERY - restbase endpoints health on restbase1012 is OK: All endpoints are healthy
[13:33:06] <logmsgbot>	 !log ppchelko@tin Finished deploy [restbase/deploy@2321c4c]: Update hyperswitch dependency (duration: 03m 50s)
[13:33:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:34:57] <hasharAway>	 dcausse: have you managed to deploy your changes?
[13:35:10] <hashar>	 I was at the restaurant with family and it has taken ages ..
[13:35:13] <logmsgbot>	 !log ppchelko@tin Started deploy [restbase/deploy@2321c4c]: Update hyperswitch dependency. Take 2
[13:35:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:35:22] <dcausse>	 hashar: np, I can deploy if it helps
[13:35:39] <wikibugs>	 (03CR) 10Ottomata: "Depends on what data you are accessing in Hadoop.  If you need to access things like webrequest logs, the user accessing the data needs to" [puppet] - 10https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: 10Addshore)
[13:36:18] <hashar>	 dcausse: if that is for labs, you can override the settings in CommonSettings-labs.php 
[13:36:28] <icinga-wm>	 PROBLEM - puppet last run on cp1065 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[13:37:03] <dcausse>	 hashar: yes mostly for labs but I did not want to redo everything in the -labs.php file
[13:37:14] <hashar>	 ;D
[13:37:34] <wikibugs>	 (03CR) 10Hashar: [C: 032] Properly check for cluster existence prior setting TTM mirrors [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387281 (https://phabricator.wikimedia.org/T179270) (owner: 10DCausse)
[13:37:50] <hashar>	 fair :)
[13:38:08] <wikibugs>	 (03CR) 10Hashar: [C: 032] Enable blocking feature of abuse filter in fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384252 (https://phabricator.wikimedia.org/T178227) (owner: 10Ladsgroup)
[13:38:41] <wikibugs>	 (03CR) 10Hashar: [C: 032] Enable NewUserMessage on fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387741 (https://phabricator.wikimedia.org/T179442) (owner: 10Ladsgroup)
[13:38:43] <wikibugs>	 (03Merged) 10jenkins-bot: Properly check for cluster existence prior setting TTM mirrors [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387281 (https://phabricator.wikimedia.org/T179270) (owner: 10DCausse)
[13:38:54] <wikibugs>	 (03CR) 10jenkins-bot: Properly check for cluster existence prior setting TTM mirrors [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387281 (https://phabricator.wikimedia.org/T179270) (owner: 10DCausse)
[13:38:59] <icinga-wm>	 PROBLEM - mobileapps endpoints health on scb1003 is CRITICAL: /{domain}/v1/page/most-read/{yyyy}/{mm}/{dd} (retrieve the most-read articles for January 1, 2016) timed out before a response was received: /{domain}/v1/page/most-read/{yyyy}/{mm}/{dd} (retrieve the most-read articles for January 1, 2016 (with aggregated=true)) timed out before a response was received
[13:39:34] <hashar>	 dcausse: is that testable?  I pulled it on mwdebug1001
[13:39:50] <dcausse>	 hashar: if it's possible to pull on terbium I can test
[13:40:08] <hashar>	 I guess it is all about running "scap pull" on terbium
[13:40:15] <hashar>	 if you wanna give it a try
[13:40:20] <dcausse>	 ok testing
[13:40:39] <Amir1>	 hashar: I'm here now
[13:40:46] <hashar>	 it would sync terbium /srv/mediawiki with whatever has been fetched on tin.eqiad.wmnet
[13:40:52] <Amir1>	 to be honest I forgot I put stuff in SWAT
[13:41:03] <logmsgbot>	 !log ppchelko@tin Finished deploy [restbase/deploy@2321c4c]: Update hyperswitch dependency. Take 2 (duration: 05m 50s)
[13:41:06] <Amir1>	 sorry
[13:41:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:41:18] <wikibugs>	 (03PS4) 10Hashar: Enable blocking feature of abuse filter in fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384252 (https://phabricator.wikimedia.org/T178227) (owner: 10Ladsgroup)
[13:41:20] <hashar>	 grblblb
[13:41:23] <hashar>	 stupid merge conflicts
[13:41:33] <wikibugs>	 (03CR) 10Hashar: [C: 032] Enable blocking feature of abuse filter in fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384252 (https://phabricator.wikimedia.org/T178227) (owner: 10Ladsgroup)
[13:41:53] <Amir1>	 I can't test it but it looks straightforward I guess
[13:41:58] <icinga-wm>	 RECOVERY - mobileapps endpoints health on scb1003 is OK: All endpoints are healthy
[13:42:05] <hashar>	 Amir1: yeah I will just sync them
[13:42:14] <hashar>	 I am not worried about those fawikiquote patches
[13:42:19] <dcausse>	 hashar: looks good, config is unchanged in prod
[13:42:25] <hashar>	 cool
[13:43:50] <logmsgbot>	 !log hashar@tin Synchronized wmf-config/CommonSettings.php: Properly check for cluster existence prior setting TTM mirrors - T179270 (duration: 01m 05s)
[13:43:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:43:57] <stashbot>	 T179270: TTMServerMessageUpdateJob fails in labs - https://phabricator.wikimedia.org/T179270
[13:44:44] <Amir1>	 hashar: thank you
[13:44:55] <wikibugs>	 (03Merged) 10jenkins-bot: Enable blocking feature of abuse filter in fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384252 (https://phabricator.wikimedia.org/T178227) (owner: 10Ladsgroup)
[13:45:36] <dcausse>	 hashar: thanks!
[13:45:53] <wikibugs>	 (03PS2) 10Hashar: Enable NewUserMessage on fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387741 (https://phabricator.wikimedia.org/T179442) (owner: 10Ladsgroup)
[13:45:57] <moritzm>	 !log installing quagga security updates
[13:46:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:46:03] <hashar>	 dcausse: and don't forget la Toussaint (All Saints' Day) :]
[13:46:14] <dcausse>	 heh :)
[13:46:19] <logmsgbot>	 !log hashar@tin Synchronized wmf-config/abusefilter.php: Enable blocking feature of abuse filter in fawikiquote - T178227 (duration: 00m 50s)
[13:46:23] <wikibugs>	 (03CR) 10GoranSMilovanovic: "> Depends on what data you are accessing in Hadoop.  If you need to" [puppet] - 10https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: 10Addshore)
[13:46:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:46:25] <stashbot>	 T178227: Enable blocking feature of abuse filter in fawikiquote - https://phabricator.wikimedia.org/T178227
[13:46:26] <wikibugs>	 (03CR) 10jenkins-bot: Enable blocking feature of abuse filter in fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384252 (https://phabricator.wikimedia.org/T178227) (owner: 10Ladsgroup)
[13:46:40] <wikibugs>	 (03CR) 10Hashar: [C: 032] Enable NewUserMessage on fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387741 (https://phabricator.wikimedia.org/T179442) (owner: 10Ladsgroup)
[13:47:15] <hashar>	 Notice: Undefined index: enwiki in /srv/mediawiki/php-1.31.0-wmf.5/extensions/ORES/includes/Cache.php on line 52
[13:47:15] <hashar>	 Warning: Invalid argument supplied for foreach() in /srv/mediawiki/php-1.31.0-wmf.5/extensions/ORES/includes/Cache.php on line 56
[13:47:29] <hashar>	 Amir1: unrelated to SWAT but ORES got some notice/warning :)
[13:47:45] <wikibugs>	 (03Merged) 10jenkins-bot: Enable NewUserMessage on fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387741 (https://phabricator.wikimedia.org/T179442) (owner: 10Ladsgroup)
[13:49:11] <logmsgbot>	 !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Enable NewUserMessage on fawikiquote - T179442 (duration: 00m 50s)
[13:49:15] <wikibugs>	 (03CR) 10jenkins-bot: Enable NewUserMessage on fawikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387741 (https://phabricator.wikimedia.org/T179442) (owner: 10Ladsgroup)
[13:49:17] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:49:18] <stashbot>	 T179442: Enable NewUserMessage on fawikiquote - https://phabricator.wikimedia.org/T179442
[13:49:19] <hashar>	 Amir1: both changes deployed
[13:49:43] <wikibugs>	 10Operations, 10ops-ulsfo, 10Traffic: setup bast4001/WMF7218 - https://phabricator.wikimedia.org/T179050#3726257 (10BBlack) >>! In T179050#3725651, @MoritzMuehlenhoff wrote: > This is currently installed with jessie, but if we setup a new box, let's use stretch from the start?  +1 We may as well move to stre...
[13:51:02] <Amir1>	 hashar: thanks
[13:51:29] <Amir1>	 hashar: Regarding ORES, Adam is one it
[13:51:32] <Amir1>	 *on
[13:51:41] <hashar>	 cool
[13:51:46] <wikibugs>	 (03Abandoned) 10Gehel: wdqs: cleanup JVM options [puppet] - 10https://gerrit.wikimedia.org/r/384663 (https://phabricator.wikimedia.org/T175919) (owner: 10Gehel)
[13:55:37] <wikibugs>	 10Operations, 10ops-ulsfo, 10Traffic: setup bast4001/WMF7218 - https://phabricator.wikimedia.org/T179050#3726279 (10MoritzMuehlenhoff) >>! In T179050#3726257, @BBlack wrote: > +1 We may as well move to stretch here.  For the bastion/installserver role it should be pretty simple?  I wouldn't expect any proble...
[13:56:05] <wikibugs>	 10Operations, 10Discovery, 10Discovery-Search, 10Elasticsearch, 10Epic: EPIC: Cultivating the Elasticsearch garden (operational lessons from 1.7.1 upgrade) - https://phabricator.wikimedia.org/T109089#3726284 (10Gehel)
[13:56:07] <wikibugs>	 10Operations, 10Discovery, 10Discovery-Search, 10Elasticsearch: Investigate tweaking of the "wait for me" parameter for upgrades / restarts - https://phabricator.wikimedia.org/T109091#3726281 (10Gehel) 05Open>03Resolved a:03Gehel We have tuned a bit this part. The main issue is that as soon as writes...
[13:56:58] <icinga-wm>	 RECOVERY - puppet last run on restbase2012 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[13:59:57] <wikibugs>	 10Operations, 10Ops-Access-Requests: Requesting access to RESOURCE for USER[S] - https://phabricator.wikimedia.org/T179452#3726299 (10Mehrdadbot) thanks.
[14:01:28] <icinga-wm>	 RECOVERY - puppet last run on cp1065 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[14:03:33] <wikibugs>	 (03PS3) 10Gehel: logstash: update logstash_syslog common hiera parameter to point to LVS. [puppet] - 10https://gerrit.wikimedia.org/r/383146 (https://phabricator.wikimedia.org/T175242)
[14:09:44] <bblack>	 !log cp*: disabling puppet to test strongswan change...
[14:09:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:10:38] <wikibugs>	 (03PS2) 10BBlack: strongswan: turn on fragmentation of IKE [puppet] - 10https://gerrit.wikimedia.org/r/387648
[14:11:17] <wikibugs>	 (03CR) 10BBlack: [C: 032] strongswan: turn on fragmentation of IKE [puppet] - 10https://gerrit.wikimedia.org/r/387648 (owner: 10BBlack)
[14:23:51] <wikibugs>	 10Operations, 10Traffic, 10Services (watching): restbase.svc.eqiad.wmnet directs requests to staging if the origin is staging too - https://phabricator.wikimedia.org/T179494#3726448 (10mobrovac)
[14:24:39] <icinga-wm>	 PROBLEM - puppet last run on db1068 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[14:40:02] <wikibugs>	 10Operations, 10Traffic, 10Services (watching): restbase.svc.eqiad.wmnet directs requests to staging if the origin is staging too - https://phabricator.wikimedia.org/T179494#3726448 (10mark) The staging hosts have the LVS service IP (restbase.svc.eqiad.wmnet, 10.2.2.17) bound to their loopback IP - as every...
[14:41:45] <moritzm>	 !log installing poppler security updates
[14:41:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:44:51] <wikibugs>	 (03PS1) 10BBlack: Revert "strongswan: turn on fragmentation of IKE" [puppet] - 10https://gerrit.wikimedia.org/r/387803
[14:45:09] <wikibugs>	 (03CR) 10BBlack: [V: 032 C: 032] Revert "strongswan: turn on fragmentation of IKE" [puppet] - 10https://gerrit.wikimedia.org/r/387803 (owner: 10BBlack)
[14:47:16] <wikibugs>	 (03PS1) 10BBlack: Remove borked cp4024 from ipsec nodelists [puppet] - 10https://gerrit.wikimedia.org/r/387805 (https://phabricator.wikimedia.org/T174891)
[14:47:35] <wikibugs>	 (03CR) 10BBlack: [V: 032 C: 032] Remove borked cp4024 from ipsec nodelists [puppet] - 10https://gerrit.wikimedia.org/r/387805 (https://phabricator.wikimedia.org/T174891) (owner: 10BBlack)
[14:49:28] <wikibugs>	 (03PS1) 10EBernhardson: Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387806
[14:50:04] <ebernhardson>	 can i sneak a mediawiki-config deploy in? It's a one line change to update the cirrus ranking model, there are some significant deficiencies with the one rolled out monday
[14:51:38] <icinga-wm>	 RECOVERY - IPsec on kafka1022 is OK: Strongswan OK - 112 ESP OK
[14:53:06] <wikibugs>	 10Operations, 10Traffic, 10Services (watching): restbase.svc.eqiad.wmnet directs requests to staging if the origin is staging too - https://phabricator.wikimedia.org/T179494#3726448 (10BBlack) I don't think they're //currently// puppetized for lvs::realserver, but it looks like the machines had such a config...
[14:54:18] <icinga-wm>	 RECOVERY - IPsec on cp1049 is OK: Strongswan OK - 54 ESP OK
[14:54:38] <icinga-wm>	 RECOVERY - IPsec on cp1099 is OK: Strongswan OK - 54 ESP OK
[14:54:39] <icinga-wm>	 RECOVERY - puppet last run on db1068 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[14:54:48] <icinga-wm>	 RECOVERY - IPsec on cp2026 is OK: Strongswan OK - 68 ESP OK
[14:54:48] <icinga-wm>	 RECOVERY - IPsec on kafka1018 is OK: Strongswan OK - 112 ESP OK
[14:54:49] <icinga-wm>	 RECOVERY - IPsec on cp1073 is OK: Strongswan OK - 54 ESP OK
[14:54:58] <icinga-wm>	 RECOVERY - IPsec on cp1050 is OK: Strongswan OK - 54 ESP OK
[14:55:08] <icinga-wm>	 RECOVERY - IPsec on cp2020 is OK: Strongswan OK - 68 ESP OK
[14:55:08] <icinga-wm>	 RECOVERY - IPsec on cp1071 is OK: Strongswan OK - 54 ESP OK
[14:55:09] <icinga-wm>	 RECOVERY - IPsec on cp2005 is OK: Strongswan OK - 68 ESP OK
[14:55:09] <icinga-wm>	 RECOVERY - IPsec on cp1072 is OK: Strongswan OK - 54 ESP OK
[14:55:17] <bblack>	 !log strongswan experiment done, cp* back to puppet-agent-enabled
[14:55:18] <icinga-wm>	 RECOVERY - IPsec on cp2014 is OK: Strongswan OK - 68 ESP OK
[14:55:18] <icinga-wm>	 RECOVERY - IPsec on cp2002 is OK: Strongswan OK - 68 ESP OK
[14:55:18] <icinga-wm>	 RECOVERY - IPsec on cp2011 is OK: Strongswan OK - 68 ESP OK
[14:55:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:55:28] <icinga-wm>	 RECOVERY - IPsec on cp1062 is OK: Strongswan OK - 54 ESP OK
[14:55:29] <icinga-wm>	 RECOVERY - IPsec on cp1048 is OK: Strongswan OK - 54 ESP OK
[14:55:29] <icinga-wm>	 RECOVERY - IPsec on cp1074 is OK: Strongswan OK - 54 ESP OK
[14:55:48] <icinga-wm>	 RECOVERY - IPsec on cp1063 is OK: Strongswan OK - 54 ESP OK
[14:55:48] <icinga-wm>	 RECOVERY - IPsec on cp2008 is OK: Strongswan OK - 68 ESP OK
[14:55:48] <icinga-wm>	 RECOVERY - IPsec on cp2022 is OK: Strongswan OK - 68 ESP OK
[14:55:48] <icinga-wm>	 RECOVERY - IPsec on cp2024 is OK: Strongswan OK - 68 ESP OK
[14:55:48] <icinga-wm>	 RECOVERY - IPsec on cp2017 is OK: Strongswan OK - 68 ESP OK
[14:55:58] <icinga-wm>	 RECOVERY - IPsec on cp1064 is OK: Strongswan OK - 54 ESP OK
[14:57:10] <wikibugs>	 (03PS2) 10EBernhardson: Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387806
[14:59:11] <wikibugs>	 (03CR) 10DCausse: [C: 031] Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387806 (owner: 10EBernhardson)
[14:59:44] <wikibugs>	 (03CR) 10EBernhardson: [C: 032] Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387806 (owner: 10EBernhardson)
[15:00:59] <mobrovac>	 !log restbase: removing wikimedia-lvs-realserver from staging hosts T179494
[15:01:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:01:05] <stashbot>	 T179494: restbase.svc.eqiad.wmnet directs requests to staging if the origin is staging too - https://phabricator.wikimedia.org/T179494
[15:01:45] <wikibugs>	 (03Merged) 10jenkins-bot: Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387806 (owner: 10EBernhardson)
[15:01:58] <icinga-wm>	 RECOVERY - IPsec on kafka1012 is OK: Strongswan OK - 112 ESP OK
[15:01:59] <wikibugs>	 (03CR) 10jenkins-bot: Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387806 (owner: 10EBernhardson)
[15:05:59] <icinga-wm>	 RECOVERY - IPsec on kafka1013 is OK: Strongswan OK - 112 ESP OK
[15:07:09] <icinga-wm>	 RECOVERY - IPsec on kafka1020 is OK: Strongswan OK - 112 ESP OK
[15:07:12] <logmsgbot>	 !log ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: Update CirrusSearch MLR model on enwiki (duration: 00m 51s)
[15:07:17] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:07:18] <icinga-wm>	 PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [50.0]
[15:07:35] <ebernhardson>	 looking
[15:07:41] <logmsgbot>	 !log awight@tin Started deploy [ores/deploy@9f361d2]: revscoring 2 -> ores* (non-production)
[15:07:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:08:10] <ebernhardson>	 looks like the mediawiki exceptions were unrelated to my push, they are mostly: [{exception_id}] {exception_url} Wikimedia\Rdbms\DBReplicationWaitError from line 372 of /srv/mediawiki/php-1.31.0-wmf.5/includes/libs/rdbms/lbfactory/LBFactory.php: Could not wait for replica DBs to catch up to db1062
[15:08:50] <awight>	 akosiaris: This might be in your domain?  I’m trying to deploy to the new ORES cluster, and getting this error from ores2003 and ores2007: > 15:07:44 ['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'ores/deploy', '-g', 'cluster', 'fetch', '--refresh-config'] on ores2003.codfw.wmnet returned [255]: Permission denied (publickey,keyboard-interactive).
[15:09:18] <icinga-wm>	 RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 70.00% above the threshold [25.0]
[15:09:37] <logmsgbot>	 !log awight@tin Finished deploy [ores/deploy@9f361d2]: revscoring 2 -> ores* (non-production) (duration: 01m 57s)
[15:09:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:10:32] <logmsgbot>	 !log awight@tin Started deploy [ores/deploy@9f361d2]: revscoring 2 -> ores1002 (non-production)
[15:10:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:12:06] <awight>	 akosiaris: thcipriani and I think there’s something wrong with those two machines.
[15:12:58] <logmsgbot>	 !log awight@tin Finished deploy [ores/deploy@9f361d2]: revscoring 2 -> ores1002 (non-production) (duration: 02m 25s)
[15:13:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:13:03] <thcipriani>	 I just don't see them in puppet manifests/site.pp, but I could be missing it
[15:13:47] <wikibugs>	 10Operations, 10Traffic, 10Services (done): restbase.svc.eqiad.wmnet directs requests to staging if the origin is staging too - https://phabricator.wikimedia.org/T179494#3726709 (10mobrovac) 05Open>03Resolved a:03mobrovac Ok, after a round of `apt-get remove --purge wikimedia-lvs-realserver && ip addr...
[15:15:40] <moritzm>	 !log repo reorg: moved ftpsync from thirdparty to main and docker-engine from thirdparty to thirdparty/k8s
[15:15:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:16:38] <icinga-wm>	 RECOVERY - IPsec on kafka1014 is OK: Strongswan OK - 112 ESP OK
[15:17:21] <wikibugs>	 (03PS9) 10Muehlenhoff: Use new repository layout for stretch onwards [puppet] - 10https://gerrit.wikimedia.org/r/357559 (https://phabricator.wikimedia.org/T158583)
[15:17:33] <wikibugs>	 (03CR) 10Zoranzoki21: "> Per task description, "@jhsoby-WMNO will ask the (very small)" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387077 (https://phabricator.wikimedia.org/T179241) (owner: 10Zoranzoki21)
[15:17:37] <wikibugs>	 (03PS3) 10Zoranzoki21: Enable the ArticlePlaceholder for Northern Sami (sewiki) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387077 (https://phabricator.wikimedia.org/T179241)
[15:30:10] <wikibugs>	 (03CR) 10Chad: "Yes, that's my plan" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384193 (https://phabricator.wikimedia.org/T104148) (owner: 10Chad)
[15:30:25] <wikibugs>	 (03PS3) 10Chad: Get rid of squid-file-labs in favor of new reverse-proxy-staging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384193 (https://phabricator.wikimedia.org/T104148)
[15:31:00] <wikibugs>	 (03PS7) 10ArielGlenn: use separate path for public/other datasets [puppet] - 10https://gerrit.wikimedia.org/r/386161 (https://phabricator.wikimedia.org/T178888)
[15:35:28] <wikibugs>	 10Operations, 10media-storage, 10User-fgiunchedi: Deleting file on Commons "Error deleting file: An unknown error occurred in storage backend "local-multiwrite"." - https://phabricator.wikimedia.org/T173374#3726757 (10Jcb) Yes, I think the task is resolved now the two files are gone and I haven't seen any ot...
[15:36:20] <akosiaris>	 awight: I am around, what seems to be the problem?
[15:37:33] <awight>	 akosiaris: thcipriani dug up more info than I have.  tl;dr, scap can’t push to those machines.  He says, > awight: so yesterday you had 2 errors for ores2003 and ores2007 but the thing is I don't see those in puppet anywhere https://github.com/wikimedia/puppet/blob/production/manifests/site.pp#L1976-L1988 and the deploy-service user can't ssh there from tin so there's some problem (I think) with the setup of those machines
[15:38:08] <awight>	 Not urgent, btw.
[15:41:22] <akosiaris>	 awight: https://phabricator.wikimedia.org/T165170 those boxes have never been pooled into service intentionally
[15:41:37] <akosiaris>	 why are you trying to deploy code to them ?
[15:42:08] <awight>	 akosiaris: aha, thanks for the info.  Just out of ignorance.  I’ll fix our scap to ignore those for now.
[15:42:14] <akosiaris>	 task is stalled btw per https://phabricator.wikimedia.org/T165170#3566244 on https://phabricator.wikimedia.org/T169246
[15:43:13] <akosiaris>	 awight: actually this is information scap should not be having locally. It's best solved on the deployment server. Parsoid had the same issue and it's now fixed in a better way
[15:43:18] <akosiaris>	 lemme find the changes
[15:43:35] <wikibugs>	 10Operations, 10Patch-For-Review, 10cloud-services-team (Kanban): labsdb1001 crashed - storage issue - https://phabricator.wikimedia.org/T179464#3725553 (10bd808) Announced on cloud-announce: https://lists.wikimedia.org/pipermail/cloud-announce/2017-November/000007.html
[15:43:52] <awight>	 I’m confused.  I was going to work around the issue with https://gerrit.wikimedia.org/r/387811
[15:44:33] <akosiaris>	 awight: https://gerrit.wikimedia.org/r/377966
[15:45:10] <akosiaris>	 granted there is no ores dsh group yet
[15:45:30] <akosiaris>	 cause we are blocked on the stresstesting and haven't yet enabled that cluster
[15:45:49] <akosiaris>	 but that's something we should do
[15:45:59] <awight>	 akosiaris: Interesting.  I was getting advice to use environments rather than groups, but was going to hold off until we deprecate the old scb* deployments anyway.
[15:46:17] <awight>	 But this external node list looks nice.  Shall I make a task to do that?
[15:46:45] <akosiaris>	 I'd say yes. Feel free to block it on getting the new cluster up and running though
[15:46:51] <akosiaris>	 it does make sense
[15:49:02] <wikibugs>	 10Operations, 10ORES, 10Scoring-platform-team: Use external dsh group to list pooled ORES nodes - https://phabricator.wikimedia.org/T179501#3726795 (10awight)
[15:49:21] <wikibugs>	 10Operations, 10ORES, 10Scoring-platform-team: Use external dsh group to list pooled ORES nodes - https://phabricator.wikimedia.org/T179501#3726808 (10awight)
[15:50:27] <wikibugs>	 (03CR) 10Jforrester: [C: 031] Get rid of squid-file-labs in favor of new reverse-proxy-staging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384193 (https://phabricator.wikimedia.org/T104148) (owner: 10Chad)
[16:01:06] <bblack>	 !log lvs1003 - puppet disabled, testing experimental ethtool ringbuffer change
[16:01:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:09:45] <wikibugs>	 10Operations, 10ops-eqiad: Decommission stat1003.eqiad.wmnet - https://phabricator.wikimedia.org/T175150#3726847 (10Ottomata)
[16:12:09] <icinga-wm>	 PROBLEM - Eqiad HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[16:12:09] <icinga-wm>	 PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [1000.0]
[16:12:18] <icinga-wm>	 PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0]
[16:13:06] <bblack>	 ^ probably me, very thin spike, sorry!
[16:13:22] <wikibugs>	 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3726869 (10herron)
[16:13:24] <wikibugs>	 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: puppet4: Catalog failed: Catalog has broken references: varnish::wikimedia_vcl[/usr/share/varnish/tests/wikimedia-common_upload-backend.inc.vcl](/etc/puppet/modules/varnish/manifests/instance.pp:98 - https://phabricator.wikimedia.org/T179396#3726866 (1...
[16:14:08] <icinga-wm>	 PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [1000.0]
[16:14:10] <wikibugs>	 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3711053 (10herron)
[16:15:47] <wikibugs>	 (03PS8) 10ArielGlenn: use separate path for public/other datasets [puppet] - 10https://gerrit.wikimedia.org/r/386161 (https://phabricator.wikimedia.org/T178888)
[16:18:04] <wikibugs>	 (03CR) 10ArielGlenn: [C: 032] use separate path for public/other datasets (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/386161 (https://phabricator.wikimedia.org/T178888) (owner: 10ArielGlenn)
[16:21:18] <icinga-wm>	 RECOVERY - Eqiad HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[16:21:18] <icinga-wm>	 RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[16:21:18] <icinga-wm>	 RECOVERY - Ulsfo HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[16:22:09] <icinga-wm>	 RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[16:23:32] <wikibugs>	 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3726914 (10herron) Notes and observations from upgrading puppetmaster2001 via `apt-get install puppetmaster` puppet packages.  1. The p...
[16:27:33] <wikibugs>	 (03PS2) 10Gehel: use the logstash LVS endpoint [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383355 (https://phabricator.wikimedia.org/T175242)
[16:37:39] <wikibugs>	 (03PS1) 10Ema: VCL: add layer information to X-Cache-Status [puppet] - 10https://gerrit.wikimedia.org/r/387817
[16:43:40] <wikibugs>	 (03PS1) 10Chad: group1 to wmf.6 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387818
[16:43:42] <wikibugs>	 (03CR) 10Chad: [C: 04-2] group1 to wmf.6 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387818 (owner: 10Chad)
[16:46:03] <wikibugs>	 (03CR) 10Hashar: "check experimental" [debs/pybal] - 10https://gerrit.wikimedia.org/r/384483 (https://phabricator.wikimedia.org/T178149) (owner: 10Ema)
[16:47:21] <wikibugs>	 (03CR) 10jenkins-bot: 1.14.2: do not crash on empty runcommand.arguments [debs/pybal] - 10https://gerrit.wikimedia.org/r/384483 (https://phabricator.wikimedia.org/T178149) (owner: 10Ema)
[16:48:20] <no_justification>	 jouncebot: next
[16:48:21] <jouncebot>	 In 1 hour(s) and 11 minute(s): Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171101T1800)
[16:48:41] * no_justification thwacks jouncebot over the head
[16:48:46] <wikibugs>	 (03CR) 10Chad: [C: 032] Get rid of squid-file-labs in favor of new reverse-proxy-staging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384193 (https://phabricator.wikimedia.org/T104148) (owner: 10Chad)
[16:49:54] <wikibugs>	 (03Merged) 10jenkins-bot: Get rid of squid-file-labs in favor of new reverse-proxy-staging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384193 (https://phabricator.wikimedia.org/T104148) (owner: 10Chad)
[16:50:04] <wikibugs>	 (03CR) 10jenkins-bot: Get rid of squid-file-labs in favor of new reverse-proxy-staging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384193 (https://phabricator.wikimedia.org/T104148) (owner: 10Chad)
[16:54:58] <wikibugs>	 10Operations, 10ORES, 10Scap, 10Scoring-platform-team: Use external dsh group to list pooled ORES nodes - https://phabricator.wikimedia.org/T179501#3727071 (10Halfak) p:05Triage>03Low
[17:00:57] <wikibugs>	 (03CR) 10DCausse: [C: 031] use the logstash LVS endpoint [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383355 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel)
[17:01:58] <wikibugs>	 (03CR) 10EBernhardson: [C: 031] "For the code, this is certainly correct. For if all the services that use it will work appropriately ... probably? Will have to monitor ro" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383355 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel)
[17:06:19] <wikibugs>	 10Operations, 10Patch-For-Review, 10cloud-services-team (Kanban): labsdb1001 crashed - storage issue - https://phabricator.wikimedia.org/T179464#3725553 (10madhuvishy) Disk setup for labsdb1001  * `/dev/sda -> 3.271TB after Hardware RAID 10 (H800, External shelf, 12 Disks 558.911 GB each)` * `/dev/sd[b,c,d,e...
[17:09:06] <logmsgbot>	 !log demon@tin Synchronized docroot/noc/: dropped squid-labs.php (duration: 00m 51s)
[17:09:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:09:42] <wikibugs>	 (03CR) 10Hashar: "check experimental" [wikimedia/bots/jouncebot] - 10https://gerrit.wikimedia.org/r/149387 (owner: 10Hashar)
[17:10:21] <wikibugs>	 (03CR) 10jenkins-bot: Jenkins job validation (DO NOT SUBMIT) [wikimedia/bots/jouncebot] - 10https://gerrit.wikimedia.org/r/149387 (owner: 10Hashar)
[17:10:40] <logmsgbot>	 !log demon@tin Synchronized wmf-config/: dropped squid-labs, no-op in prod (duration: 00m 52s)
[17:10:44] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:14:38] <wikibugs>	 (03CR) 10Jayprakash12345: "@Reedy, Dereckson, Hashar Can you create the SQL table for shorturl. So that I can Schedule this patch." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386779 (https://phabricator.wikimedia.org/T178919) (owner: 10Jayprakash12345)
[17:17:07] <wikibugs>	 10Operations, 10Gerrit, 10Readers-Web-Backlog, 10Patch-For-Review, and 2 others: [subtask] Temporarily allow pushing large objects - https://phabricator.wikimedia.org/T178189#3727197 (10Niedzielski)
[17:26:03] <wikibugs>	 10Operations, 10Patch-For-Review, 10cloud-services-team (Kanban): labsdb1001 crashed - storage issue - https://phabricator.wikimedia.org/T179464#3725553 (10Superyetkin) Any chance to recover data?
[17:30:04] <wikibugs>	 (03PS1) 10Jforrester: Get rid of squid.php in favor of new reverse-proxy.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387832 (https://phabricator.wikimedia.org/T104148)
[17:30:27] <wikibugs>	 (03CR) 10Jforrester: "Non-staging version: I3ceac441" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384193 (https://phabricator.wikimedia.org/T104148) (owner: 10Chad)
[17:39:01] <wikibugs>	 (03PS1) 10ArielGlenn: add new dumpsgen user to dataset1001 and ms1001 [puppet] - 10https://gerrit.wikimedia.org/r/387834 (https://phabricator.wikimedia.org/T178893)
[17:39:12] <logmsgbot>	 !log demon@tin Synchronized php-1.31.0-wmf.6/includes/specials/pagers/ContribsPager.php: fix missing page_is_new error (duration: 00m 51s)
[17:39:17] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:44:17] <no_justification>	 James_F: Well, the labs one went out and the world didn't explode :p
[17:44:56] <wikibugs>	 (03PS2) 10ArielGlenn: add new dumpsgen user to dataset1001 and ms1001 [puppet] - 10https://gerrit.wikimedia.org/r/387834 (https://phabricator.wikimedia.org/T178893)
[17:52:02] <wikibugs>	 (03CR) 10ArielGlenn: [C: 032] add new dumpsgen user to dataset1001 and ms1001 [puppet] - 10https://gerrit.wikimedia.org/r/387834 (https://phabricator.wikimedia.org/T178893) (owner: 10ArielGlenn)
[18:00:04] <jouncebot>	 addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Your horoscope predicts another unfortunate Morning SWAT (Max 8 patches) deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171101T1800).
[18:00:04] <jouncebot>	 Jayprakash12345: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[18:00:55] <Jayprakash12345>	 yah i am ready
[18:02:41] <moritzm>	 !log installing openjpeg2 security updates
[18:02:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:07:44] <Jayprakash12345>	 who will swat?
[18:09:40] <wikibugs>	 (03CR) 10Chad: [C: 032] New logo for se.wikimedia.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/385165 (https://phabricator.wikimedia.org/T178550) (owner: 10SimmeD)
[18:11:12] <wikibugs>	 (03PS1) 10Ottomata: Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610)
[18:12:02] <wikibugs>	 (03Merged) 10jenkins-bot: New logo for se.wikimedia.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/385165 (https://phabricator.wikimedia.org/T178550) (owner: 10SimmeD)
[18:12:08] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[18:12:16] <wikibugs>	 (03CR) 10jenkins-bot: New logo for se.wikimedia.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/385165 (https://phabricator.wikimedia.org/T178550) (owner: 10SimmeD)
[18:13:30] <logmsgbot>	 !log demon@tin Synchronized static/images/project-logos/sewikimedia.png: new logo (duration: 00m 50s)
[18:13:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:13:37] <no_justification>	 Jayprakash12345: Done.
[18:14:32] <wikibugs>	 (03PS2) 10Ottomata: Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610)
[18:14:41] <Jayprakash12345>	 yah I checked the patch on mwdebug1002. Everything is fine. Thank you
[18:15:05] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[18:16:12] <wikibugs>	 (03PS3) 10Ottomata: Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610)
[18:16:41] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[18:17:54] <wikibugs>	 (03PS4) 10Ottomata: Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610)
[18:18:27] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[18:19:15] <wikibugs>	 (03PS5) 10Ottomata: Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610)
[18:19:20] <wikibugs>	 10Operations, 10wikidiff2, 10Patch-For-Review, 10User-Addshore, and 2 others: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3727488 (10Legoktm) >>! In T177891#3719471, @Addshore wrote: > So, I just tried deploying the above change. > While testing on mwdebug1...
[18:20:04] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[18:21:53] <wikibugs>	 (03CR) 10Zoranzoki21: [C: 031] Get rid of squid.php in favor of new reverse-proxy.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387832 (https://phabricator.wikimedia.org/T104148) (owner: 10Jforrester)
[18:22:09] <Jayprakash12345>	 Chad: Can you run "Synchronized static/images/project-logos/sewikimedia.png: new logo" again, with a task number like TXXXX? Because the change is disappearing after the off mwdebug1002.
[18:22:34] <wikibugs>	 (03CR) 10Chad: [C: 032] Get rid of squid.php in favor of new reverse-proxy.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387832 (https://phabricator.wikimedia.org/T104148) (owner: 10Jforrester)
[18:23:43] <no_justification>	 Disappearing after the off?
[18:23:45] <no_justification>	 I don't understand
[18:25:48] <wikibugs>	 (03Merged) 10jenkins-bot: Get rid of squid.php in favor of new reverse-proxy.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387832 (https://phabricator.wikimedia.org/T104148) (owner: 10Jforrester)
[18:26:18] <icinga-wm>	 PROBLEM - Router interfaces on cr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.196, interfaces up: 232, down: 1, dormant: 0, excluded: 0, unused: 0
[18:26:29] <wikibugs>	 (03CR) 10jenkins-bot: Get rid of squid.php in favor of new reverse-proxy.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387832 (https://phabricator.wikimedia.org/T104148) (owner: 10Jforrester)
[18:28:02] <Jayprakash12345>	 When I switched off mwdebug1002, the old logo came back. And I saw that you Synchronized static/images/project-logos/sewikimedia.png: new logo. So why did the new logo not appear?
[18:29:15] <bblack>	 probably cached :)
[18:29:48] <Jayprakash12345>	 no_justification:  Can you synchronize static/images/project-logos/sewikimedia.png: new logo with a task number like TXXXX again?
[18:29:48] <bblack>	 what's the public URL to see the logo?
[18:30:01] <bblack>	 I don't think he needs to sync it again
[18:31:11] <Jayprakash12345>	 please go to https://se.wikimedia.org and tell me what the color of the logo is.
[18:31:25] <Jayprakash12345>	 the old logo is green
[18:32:08] <no_justification>	 Yep, looks cached
[18:32:11] <no_justification>	 Lemme force a purge
[18:32:11] <bblack>	 what's the public URL of the logo file itself?
[18:32:33] <no_justification>	 https://se.wikimedia.org/static/images/project-logos/sewikimedia.png
[18:33:16] <no_justification>	 Cache-busting it with like ?poop works and gives me the new one
[18:33:18] <icinga-wm>	 RECOVERY - Router interfaces on cr1-eqiad is OK: OK: host 208.80.154.196, interfaces up: 234, down: 0, dormant: 0, excluded: 0, unused: 0
[18:33:26] <no_justification>	 Jayprakash12345: It's live, just gonna take a bit for caches to all clear out
[18:33:35] <no_justification>	 (mwdebug looks right because it skips cache layer)
[18:33:51] <bblack>	 I can purge it, or the script can I guess
[18:34:02] <bblack>	 but static images all have to have the hostname rewritten for purging to work right
[18:34:48] <hasharDinner>	 no_justification: https://wikitech.wikimedia.org/wiki/SWAT_deploys/Deployers#Image_Cache_Purges
[18:34:56] <hasharDinner>	 eg:   echo "https://en.wikipedia.org/static/images/project-logos/newikibooks.png" | mwscript purgeList.php --wiki=enwiki
[18:35:23] <logmsgbot>	 !log demon@tin Synchronized docroot/: dropping squid.php (duration: 00m 52s)
[18:35:27] <hasharDinner>	 just replace newikibooks.png with the logo name
[18:35:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:36:29] * hasharDinner eats more
[18:36:59] <logmsgbot>	 !log demon@tin Synchronized wmf-config/CommonSettings.php: use reverse-proxy.php no more squid.php (duration: 00m 50s)
[18:37:04] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:37:19] <no_justification>	 hasharDinner: I know :)
[18:38:02] <no_justification>	 bblack: It's also just a logo update -- if it takes a bit to start showing then we'll live :)
[18:39:33] <logmsgbot>	 !log demon@tin Synchronized wmf-config/: Dropping squid.php (hang on to your pants folks, this could be fun) (duration: 00m 51s)
[18:39:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:40:14] <bblack>	 getting rid of "squid" naming?
[18:40:23] <bblack>	 nice
[18:41:53] <wikibugs>	 (03PS1) 10BBlack: Revert "Global: runtime disable ethernet flow on fresh install" [puppet] - 10https://gerrit.wikimedia.org/r/387844
[18:42:07] <wikibugs>	 (03PS2) 10BBlack: Revert "Global: runtime disable ethernet flow on fresh install" [puppet] - 10https://gerrit.wikimedia.org/r/387844
[18:42:21] <wikibugs>	 (03CR) 10BBlack: [V: 032 C: 032] Revert "Global: runtime disable ethernet flow on fresh install" [puppet] - 10https://gerrit.wikimedia.org/r/387844 (owner: 10BBlack)
[18:43:12] <Jayprakash12345_>	 no_justification: anything else for me
[18:43:19] <no_justification>	 No
[18:43:44] <no_justification>	 bblack: Yeah, there was some bikeshedding over the name, went with squid.php -> reverse-proxy.php
[18:44:21] <bblack>	 of course reverse-proxy.php still contains $wgSquidServersNoPurge :)
[18:44:41] <Jayprakash12345_>	 no_justification: thank you. Can you tell me how much time it will take for the new logo to go live?
[18:44:53] <no_justification>	 It is live :)
[18:44:57] <no_justification>	 Just cached ;-)
[18:45:03] <no_justification>	 bblack: Well, blame MediaWiki :p
[18:46:36] <wikibugs>	 (03PS1) 10ArielGlenn: make dumps snapshot host roles more role-like [puppet] - 10https://gerrit.wikimedia.org/r/387846 (https://phabricator.wikimedia.org/T175528)
[18:47:09] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] make dumps snapshot host roles more role-like [puppet] - 10https://gerrit.wikimedia.org/r/387846 (https://phabricator.wikimedia.org/T175528) (owner: 10ArielGlenn)
[18:48:03] <apergos>	 yeah we knew that.  now I get to see how much I have to really fix
[18:51:50] <wikibugs>	 (03PS2) 10ArielGlenn: make dumps snapshot host roles more role-like [puppet] - 10https://gerrit.wikimedia.org/r/387846 (https://phabricator.wikimedia.org/T175528)
[18:59:08] <wikibugs>	 (03PS6) 10Ottomata: Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610)
[18:59:39] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[19:00:04] <jouncebot>	 no_justification: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for MediaWiki train. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171101T1900).
[19:00:04] <jouncebot>	 No GERRIT patches in the queue for this window AFAICS.
[19:05:06] <logmsgbot>	 !log otto@tin Started deploy [analytics/refinery@6d11d67]: Deploying refinery-source artifacts for 0.0.54 for JsonRefine job, T162610
[19:05:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:05:12] <stashbot>	 T162610: Implement EventLogging Hive refinement - https://phabricator.wikimedia.org/T162610
[19:05:37] <wikibugs>	 (03PS3) 10ArielGlenn: make dumps snapshot host roles more role-like [puppet] - 10https://gerrit.wikimedia.org/r/387846 (https://phabricator.wikimedia.org/T175528)
[19:05:49] <James_F>	 bblack: See https://phabricator.wikimedia.org/T104148#3727257 :-)
[19:08:51] <logmsgbot>	 !log otto@tin Finished deploy [analytics/refinery@6d11d67]: Deploying refinery-source artifacts for 0.0.54 for JsonRefine job, T162610 (duration: 03m 45s)
[19:08:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:09:05] <wikibugs>	 (03PS1) 10BBlack: Revert "Global: Turn off ethernet flow for all interfaces at boot time" [puppet] - 10https://gerrit.wikimedia.org/r/387849
[19:09:12] <wikibugs>	 (03PS2) 10BBlack: Revert "Global: Turn off ethernet flow for all interfaces at boot time" [puppet] - 10https://gerrit.wikimedia.org/r/387849
[19:09:16] <wikibugs>	 (03CR) 10BBlack: [V: 032 C: 032] Revert "Global: Turn off ethernet flow for all interfaces at boot time" [puppet] - 10https://gerrit.wikimedia.org/r/387849 (owner: 10BBlack)
[19:12:30] <awight>	 no_justification: I think I have a fix for T179430, how urgent is deployment?  I’m happy to ask for a window now, or it can wait 4hr until the next SWAT.
[19:12:30] <stashbot>	 T179430: ORES extension failing to parse scoring response - https://phabricator.wikimedia.org/T179430
[19:12:36] <wikibugs>	 (03PS1) 10BBlack: lvs1001-6: increase bnx2 rx ring buffer [puppet] - 10https://gerrit.wikimedia.org/r/387850
[19:12:51] <wikibugs>	 (03CR) 10Ottomata: [V: 032 C: 032] "Let's try it! :o" [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[19:12:59] <wikibugs>	 (03CR) 10Ottomata: [V: 032 C: 032] "https://puppet-compiler.wmflabs.org/compiler02/8589/analytics1003.eqiad.wmnet/" [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[19:13:01] <awight>	 AFAIK, it’s the logspam plus API?oresscores requests are failing.  Which I think are 3rd-party tools for now.
[19:13:01] <wikibugs>	 (03PS7) 10Ottomata: Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610)
[19:13:03] <wikibugs>	 (03CR) 10Ottomata: [V: 032 C: 032] Refine Eventlogging analytics and eventbus data into Hive tables [puppet] - 10https://gerrit.wikimedia.org/r/387838 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[19:14:30] <bblack>	 !log all hosts: manual cumin+sed removal of ethernet autoneg params from /e/n/i to match https://gerrit.wikimedia.org/r/#/c/387849/
[19:14:33] <awight>	 (no_justification: oops, moving discussion to -releng)
[19:14:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:16:54] <wikibugs>	 (03CR) 10Bearloga: "> @Bearloga Got it. @Addshore: We will do the same for our analytics-wmde user." [puppet] - 10https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: 10Addshore)
[19:20:13] <wikibugs>	 (03PS4) 10ArielGlenn: make dumps snapshot host roles more role-like [puppet] - 10https://gerrit.wikimedia.org/r/387846 (https://phabricator.wikimedia.org/T175528)
[19:21:27] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] make dumps snapshot host roles more role-like [puppet] - 10https://gerrit.wikimedia.org/r/387846 (https://phabricator.wikimedia.org/T175528) (owner: 10ArielGlenn)
[19:22:05] <wikibugs>	 (03PS1) 10Ottomata: Fix refinery_job_jar var [puppet] - 10https://gerrit.wikimedia.org/r/387853
[19:23:22] <wikibugs>	 (03PS5) 10ArielGlenn: make dumps snapshot host roles more role-like [puppet] - 10https://gerrit.wikimedia.org/r/387846 (https://phabricator.wikimedia.org/T175528)
[19:23:53] <wikibugs>	 (03CR) 10Ottomata: [C: 032] Fix refinery_job_jar var [puppet] - 10https://gerrit.wikimedia.org/r/387853 (owner: 10Ottomata)
[19:26:52] <wikibugs>	 (03PS1) 10Ottomata: Fix (again) refinery_job_jar var [puppet] - 10https://gerrit.wikimedia.org/r/387854
[19:28:06] <wikibugs>	 (03PS2) 10Ottomata: Fix (again) refinery_job_jar var and run refines at 20 and 30 mins [puppet] - 10https://gerrit.wikimedia.org/r/387854
[19:28:30] <logmsgbot>	 !log awight@tin Synchronized php-1.31.0-wmf.6/extensions/ORES: Fix for API=oresscores, T179430 (duration: 00m 52s)
[19:28:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:28:37] <stashbot>	 T179430: ORES extension failing to parse scoring response - https://phabricator.wikimedia.org/T179430
[19:28:57] <wikibugs>	 (03CR) 10Ottomata: [C: 032] Fix (again) refinery_job_jar var and run refines at 20 and 30 mins [puppet] - 10https://gerrit.wikimedia.org/r/387854 (owner: 10Ottomata)
[19:31:16] <wikibugs>	 (03PS1) 10Ottomata: opt name is --database, not --output-database [puppet] - 10https://gerrit.wikimedia.org/r/387855
[19:31:29] <wikibugs>	 (03CR) 10Ottomata: [V: 032 C: 032] opt name is --database, not --output-database [puppet] - 10https://gerrit.wikimedia.org/r/387855 (owner: 10Ottomata)
[19:31:56] <logmsgbot>	 !log awight@tin Synchronized php-1.31.0-wmf.5/extensions/ORES: Fix for API=oresscores, T179430 (duration: 00m 50s)
[19:32:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:37:49] <wikibugs>	 (03PS1) 10Ottomata: Include job_name when checking if json refine job is running [puppet] - 10https://gerrit.wikimedia.org/r/387857
[19:40:10] <wikibugs>	 (03CR) 10Ottomata: [C: 032] Include job_name when checking if json refine job is running [puppet] - 10https://gerrit.wikimedia.org/r/387857 (owner: 10Ottomata)
[19:41:35] <logmsgbot>	 !log demon@tin Pruned MediaWiki: 1.31.0-wmf.4 [keeping static files] (duration: 01m 52s)
[19:41:41] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:42:42] <Krinkle>	 no_justification: Once the train is done, I'd like to roll out demon as soon as possible.
[19:42:50] <Krinkle>	 err, copypaste fail. I mean https://gerrit.wikimedia.org/r/#/c/387856/
[19:43:02] <Krinkle>	 Unbreak IE10 JS
[19:43:50] <no_justification>	 Krinkle: Go ahead now if you want, I haven't done my wikiversions.json bump yet, and I wanna eat first anyway
[19:43:59] <Krinkle>	 Got it. Thanks!
[19:53:05] <logmsgbot>	 !log demon@tin Synchronized php-1.31.0-wmf.5/extensions/CentralNotice: fix weird git rebasing issue (duration: 00m 53s)
[19:53:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:55:41] <wikibugs>	 (03CR) 10Chad: [C: 032] group1 to wmf.6 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387818 (owner: 10Chad)
[19:56:08] <icinga-wm>	 PROBLEM - puppet last run on thumbor2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[19:56:30] <wikibugs>	 10Operations, 10wikidiff2, 10Patch-For-Review, 10User-Addshore, and 2 others: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3727766 (10Addshore) @Legoktm Just make a change that moves a paragraph. Here is one of my tests on testwiki not working while I was te...
[19:57:10] <logmsgbot>	 !log krinkle@tin Synchronized php-1.31.0-wmf.6/resources/src/startup.js: T178943 (duration: 00m 51s)
[19:57:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:57:17] <stashbot>	 T178943: startUp() callback sometimes happen before 'mw' is defined in IE10 - https://phabricator.wikimedia.org/T178943
[19:59:32] <wikibugs>	 (03Merged) 10jenkins-bot: group1 to wmf.6 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387818 (owner: 10Chad)
[19:59:41] <wikibugs>	 (03CR) 10jenkins-bot: group1 to wmf.6 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387818 (owner: 10Chad)
[20:00:04] <jouncebot>	 gwicke, cscott, arlolra, subbu, bearND, halfak, and Amir1: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for Services – Parsoid / OCG / Citoid / Mobileapps / ORES / …. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171101T2000).
[20:00:04] <jouncebot>	 No GERRIT patches in the queue for this window AFAICS.
[20:00:30] <subbu>	 no parsoid deploy today
[20:03:37] <wikibugs>	 (03CR) 10Zoranzoki21: "> Removed reviewer Zoranzoki21 with the following votes:" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387832 (https://phabricator.wikimedia.org/T104148) (owner: 10Jforrester)
[20:04:55] <wikibugs>	 (03CR) 10Zoranzoki21: [C: 031] labs: use new redis servers for locks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387570 (https://phabricator.wikimedia.org/T179371) (owner: 10Filippo Giunchedi)
[20:05:35] <logmsgbot>	 !log demon@tin Synchronized php: symlink swap (duration: 00m 49s)
[20:05:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:06:24] <wikibugs>	 (03PS6) 10ArielGlenn: make dumps snapshot host roles more role-like [puppet] - 10https://gerrit.wikimedia.org/r/387846 (https://phabricator.wikimedia.org/T175528)
[20:07:53] <wikibugs>	 (03CR) 10ArielGlenn: [C: 032] make dumps snapshot host roles more role-like [puppet] - 10https://gerrit.wikimedia.org/r/387846 (https://phabricator.wikimedia.org/T175528) (owner: 10ArielGlenn)
[20:09:28] <wikibugs>	 (03PS2) 10BBlack: lvs1001-6: increase bnx2 rx ring buffer [puppet] - 10https://gerrit.wikimedia.org/r/387850
[20:10:19] <wikibugs>	 (03CR) 10BBlack: [C: 032] lvs1001-6: increase bnx2 rx ring buffer [puppet] - 10https://gerrit.wikimedia.org/r/387850 (owner: 10BBlack)
[20:10:26] <logmsgbot>	 !log demon@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.6
[20:10:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:10:46] <wikibugs>	 (03PS1) 10Ottomata: Run JsonRefine job in yarn deploy mode cluster and provide hive-site.xml [puppet] - 10https://gerrit.wikimedia.org/r/387860 (https://phabricator.wikimedia.org/T162610)
[20:11:04] <wikibugs>	 (03PS2) 10Ottomata: Run JsonRefine job in yarn deploy mode cluster and provide hive-site.xml [puppet] - 10https://gerrit.wikimedia.org/r/387860 (https://phabricator.wikimedia.org/T162610)
[20:11:51] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Run JsonRefine job in yarn deploy mode cluster and provide hive-site.xml [puppet] - 10https://gerrit.wikimedia.org/r/387860 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[20:12:47] <wikibugs>	 (03PS3) 10Ottomata: Run JsonRefine job in yarn deploy mode cluster and provide hive-site.xml [puppet] - 10https://gerrit.wikimedia.org/r/387860 (https://phabricator.wikimedia.org/T162610)
[20:13:00] <wikibugs>	 (03PS4) 10Ottomata: Run JsonRefine job in yarn deploy mode cluster and provide hive-site.xml [puppet] - 10https://gerrit.wikimedia.org/r/387860 (https://phabricator.wikimedia.org/T162610)
[20:13:07] <wikibugs>	 (03CR) 10Ottomata: [V: 032 C: 032] Run JsonRefine job in yarn deploy mode cluster and provide hive-site.xml [puppet] - 10https://gerrit.wikimedia.org/r/387860 (https://phabricator.wikimedia.org/T162610) (owner: 10Ottomata)
[20:16:16] <wikibugs>	 (03PS1) 10Ayounsi: Netbox scap3 initial commit [software/netbox-deploy] - 10https://gerrit.wikimedia.org/r/387861
[20:18:36] <wikibugs>	 (03PS1) 10BBlack: LVS+Caches: disable Ethernet flowcontrol [puppet] - 10https://gerrit.wikimedia.org/r/387863
[20:18:38] <wikibugs>	 (03PS1) 10BBlack: interface::noflow - runtime disable on fresh install [puppet] - 10https://gerrit.wikimedia.org/r/387864
[20:18:40] <wikibugs>	 (03PS1) 10ArielGlenn: fix role name in snapshot motds [puppet] - 10https://gerrit.wikimedia.org/r/387865
[20:19:39] <wikibugs>	 (03CR) 10ArielGlenn: [C: 032] fix role name in snapshot motds [puppet] - 10https://gerrit.wikimedia.org/r/387865 (owner: 10ArielGlenn)
[20:20:35] <wikibugs>	 10Operations, 10Patch-For-Review, 10cloud-services-team (Kanban): labsdb1001 crashed - storage issue - https://phabricator.wikimedia.org/T179464#3727841 (10bd808) >>! In T179464#3727235, @Superyetkin wrote: > Any chance to recover data?  We are working right now to see how many bad blocks/sectors there are o...
[20:21:08] <icinga-wm>	 RECOVERY - puppet last run on thumbor2001 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[20:28:03] <wikibugs>	 10Operations, 10fundraising-tech-ops, 10netops: bonded/redundant network connections for fundraising hosts - https://phabricator.wikimedia.org/T171962#3727851 (10Jgreen)
[20:28:06] <wikibugs>	 10Operations, 10ops-eqiad, 10fundraising-tech-ops, 10netops: connect second interface for each frack to opposite switch for each eqiad host - https://phabricator.wikimedia.org/T176975#3727849 (10Jgreen) 05Open>03Resolved a:03Jgreen
[20:28:16] <wikibugs>	 10Operations, 10wikidiff2, 10Patch-For-Review, 10User-Addshore, and 2 others: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3727852 (10Tobi_WMDE_SW) Wondering why it worked on beta, it should have been broken since https://gerrit.wikimedia.org/r/#/c/377804/ I...
[20:31:22] <wikibugs>	 (03PS1) 10Ottomata: Camus imports eventbus data into /wmf/data/raw/event [puppet] - 10https://gerrit.wikimedia.org/r/387869
[20:34:51] <wikibugs>	 (03CR) 10Ottomata: [C: 032] Camus imports eventbus data into /wmf/data/raw/event [puppet] - 10https://gerrit.wikimedia.org/r/387869 (owner: 10Ottomata)
[20:34:58] <wikibugs>	 (03PS2) 10Ottomata: Camus imports eventbus data into /wmf/data/raw/event [puppet] - 10https://gerrit.wikimedia.org/r/387869
[20:35:00] <wikibugs>	 (03CR) 10Ottomata: [V: 032 C: 032] Camus imports eventbus data into /wmf/data/raw/event [puppet] - 10https://gerrit.wikimedia.org/r/387869 (owner: 10Ottomata)
[20:43:51] <wikibugs>	 (03CR) 10Chad: "Because you're constantly +1ing my changes without any context or idea if they're even ok. You +1'd a change I explicitly -2'd last week. " [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387832 (https://phabricator.wikimedia.org/T104148) (owner: 10Jforrester)
[20:43:59] <wikibugs>	 (03PS1) 10ArielGlenn: add dumpsgen user to the snapshots hosts [puppet] - 10https://gerrit.wikimedia.org/r/387875
[20:50:46] <wikibugs>	 (03CR) 10ArielGlenn: [C: 032] add dumpsgen user to the snapshots hosts [puppet] - 10https://gerrit.wikimedia.org/r/387875 (owner: 10ArielGlenn)
[20:53:03] <wikibugs>	 (03PS2) 10BBlack: LVS+Caches: disable Ethernet flowcontrol [puppet] - 10https://gerrit.wikimedia.org/r/387863
[20:53:50] <wikibugs>	 (03CR) 10BBlack: [C: 032] LVS+Caches: disable Ethernet flowcontrol [puppet] - 10https://gerrit.wikimedia.org/r/387863 (owner: 10BBlack)
[20:57:35] <wikibugs>	 (03PS1) 10Ayounsi: Add fake keys for Netbox deployment [labs/private] - 10https://gerrit.wikimedia.org/r/387878
[20:58:09] <James_F>	 no_justification: Your thoughts on https://gerrit.wikimedia.org/r/#/c/387877/ (death to $wg…Squid… variables) would be appreciated.
[21:01:21] <wikibugs>	 (03PS1) 10Ayounsi: Netbox: initial puppet commit [puppet] - 10https://gerrit.wikimedia.org/r/387880
[21:01:58] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Netbox: initial puppet commit [puppet] - 10https://gerrit.wikimedia.org/r/387880 (owner: 10Ayounsi)
[21:07:39] <icinga-wm>	 PROBLEM - puppet last run on db2012 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[21:08:47] <no_justification>	 James_F: tldr right now: I don't think it's worth the dang effort
[21:08:56] <no_justification>	 Back-compat shims that will sit around /forever/
[21:10:02] <no_justification>	 wmf-config, fine whatever it's easy to fix our stu
[21:10:04] <no_justification>	 *stuff
[21:12:58] <icinga-wm>	 PROBLEM - Router interfaces on cr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.196, interfaces up: 232, down: 1, dormant: 0, excluded: 0, unused: 0
[21:15:58] <icinga-wm>	 RECOVERY - Router interfaces on cr1-eqiad is OK: OK: host 208.80.154.196, interfaces up: 234, down: 0, dormant: 0, excluded: 0, unused: 0
[21:18:59] <icinga-wm>	 PROBLEM - Router interfaces on cr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.196, interfaces up: 232, down: 1, dormant: 0, excluded: 0, unused: 0
[21:23:54] <bblack>	 plus Squid is a cute name, and reminds us of our infrastructural legacy :)
[21:24:38] <wikibugs>	 (03PS2) 10BBlack: interface::noflow - runtime disable on fresh install [puppet] - 10https://gerrit.wikimedia.org/r/387864
[21:24:44] <wikibugs>	 (03CR) 10BBlack: [C: 032] interface::noflow - runtime disable on fresh install [puppet] - 10https://gerrit.wikimedia.org/r/387864 (owner: 10BBlack)
[21:24:59] <icinga-wm>	 PROBLEM - Long running screen/tmux on mwlog1001 is CRITICAL: CRIT: Long running SCREEN process. (PID: 56811, 1729810s 1728000s).
[21:27:50] <Zppix>	 bblack: +1
[21:29:08] <icinga-wm>	 RECOVERY - Router interfaces on cr1-eqiad is OK: OK: host 208.80.154.196, interfaces up: 234, down: 0, dormant: 0, excluded: 0, unused: 0
[21:29:37] <wikibugs>	 (03CR) 10BBlack: "Only thing I'm really worried about here, is we do send X-Cache-Status to analytics webrequest stream in modules/profile/manifests/cache/k" [puppet] - 10https://gerrit.wikimedia.org/r/387817 (owner: 10Ema)
[21:32:07] <wikibugs>	 10Operations, 10MediaWiki-General-or-Unknown, 10TechCom-RfC: Bump PHP requirement to 5.6 in 1.31 - https://phabricator.wikimedia.org/T178538#3695340 (10Krinkle) Continuing from T178538#3699577, the Last Call period for this RFC has expired and TechCom has decided to cancel its proposed "Approval" based on t...
[21:34:29] <wikibugs>	 (03PS1) 10ArielGlenn: add dumpsgen to sudo rules for the appropriate admin groups [puppet] - 10https://gerrit.wikimedia.org/r/387917
[21:37:39] <icinga-wm>	 RECOVERY - puppet last run on db2012 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[21:42:46] <wikibugs>	 (03PS2) 10Ayounsi: Netbox: initial puppet commit [puppet] - 10https://gerrit.wikimedia.org/r/387880
[21:43:53] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Netbox: initial puppet commit [puppet] - 10https://gerrit.wikimedia.org/r/387880 (owner: 10Ayounsi)
[21:46:58] <wikibugs>	 (03CR) 10ArielGlenn: [C: 032] add dumpsgen to sudo rules for the appropriate admin groups [puppet] - 10https://gerrit.wikimedia.org/r/387917 (owner: 10ArielGlenn)
[21:57:01] <wikibugs>	 10Operations, 10wikidiff2, 10Patch-For-Review, 10User-Addshore, and 2 others: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3728033 (10Addshore) >>! In T177891#3727852, @Tobi_WMDE_SW wrote: > Wondering why it worked on beta, it should have been broken since h...
[22:01:53] <wikibugs>	 10Operations, 10MediaWiki-General-or-Unknown, 10TechCom-RfC: Bump PHP requirement to 5.6 in 1.31 - https://phabricator.wikimedia.org/T178538#3728041 (10Smalyshev) Does this proposal mean we'd have to migrate all PHP 5.x services to hhvm, with knowledge that we'll have to migrate them to PHP 7 at some later p...
[22:08:43] <wikibugs>	 (03PS1) 10Legoktm: Disable REL1_28 in ExtensionDistributor [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387936
[22:09:57] <legoktm>	 no_justification: ^
[22:10:10] <wikibugs>	 (03CR) 10Chad: [C: 032] Disable REL1_28 in ExtensionDistributor [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387936 (owner: 10Legoktm)
[22:11:44] <wikibugs>	 10Operations, 10Scap, 10Release-Engineering-Team (Watching / External): Scap: Standardize git version - https://phabricator.wikimedia.org/T179353#3721967 (10greg) From moritz: P6242 (machines still running trusty)
[22:13:42] <wikibugs>	 (03Merged) 10jenkins-bot: Disable REL1_28 in ExtensionDistributor [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387936 (owner: 10Legoktm)
[22:17:31] <logmsgbot>	 !log demon@tin Synchronized wmf-config/CommonSettings.php: rel1_28 is dead (duration: 00m 50s)
[22:17:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:18:04] <wikibugs>	 (03PS3) 10Ayounsi: Netbox: initial puppet commit [puppet] - 10https://gerrit.wikimedia.org/r/387880
[22:18:23] <no_justification>	 legoktm: Done ^
[22:19:06] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Netbox: initial puppet commit [puppet] - 10https://gerrit.wikimedia.org/r/387880 (owner: 10Ayounsi)
[22:19:08] <wikibugs>	 (03CR) 10jenkins-bot: Disable REL1_28 in ExtensionDistributor [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387936 (owner: 10Legoktm)
[22:19:23] <wikibugs>	 10Operations, 10MediaWiki-General-or-Unknown, 10TechCom-RfC: Bump PHP requirement to 5.6 in 1.31 - https://phabricator.wikimedia.org/T178538#3728097 (10daniel) >>! In T178538#3728041, @Smalyshev wrote: > Does this proposal mean we'd have to migrate all PHP 5.x services to hhvm, with knowledge that we'll have...
[22:23:24] <wikibugs>	 (03CR) 10Hashar: [C: 04-1] "00:00:20.214 modules/netbox/templates/ldap_config.py.erb:23:# heirarchy." [puppet] - 10https://gerrit.wikimedia.org/r/387880 (owner: 10Ayounsi)
[22:23:36] <hashar>	 XioNoX: ^^
[22:23:42] <wikibugs>	 (03PS4) 10Ayounsi: Netbox: initial puppet commit [puppet] - 10https://gerrit.wikimedia.org/r/387880
[22:24:10] <wikibugs>	 10Operations, 10MediaWiki-General-or-Unknown, 10TechCom-RfC: Bump PHP requirement to 5.6 in 1.31 - https://phabricator.wikimedia.org/T178538#3728124 (10Krinkle) >>! In T178538#3728041, @Smalyshev wrote: > Does this proposal mean we'd have to migrate all PHP 5.x services to hhvm, with knowledge that we'll hav...
[22:24:16] <XioNoX>	 hashar: what's up?
[22:24:17] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Netbox: initial puppet commit [puppet] - 10https://gerrit.wikimedia.org/r/387880 (owner: 10Ayounsi)
[22:24:19] <hashar>	 XioNoX: and in theory you could add the test command in a git hook locally :]
[22:25:08] <hashar>	 XioNoX: I gave you some hint on https://gerrit.wikimedia.org/r/#/c/387880/  to reproduce the tests locally :]  should be faster than sending to gerrit / waiting for CI
[22:25:31] <hashar>	 bundle install && bundle exec rake --jobs 1 test  :D
[22:25:49] <XioNoX>	 thx!
[22:25:52] <XioNoX>	 I'll do that
[22:27:01] <XioNoX>	 /usr/bin/ruby2.3: No such file or directory -- /usr/share/rubygems-integration/all/gems/rake-12.0.0/exe/rake (LoadError)
[22:27:15] <XioNoX>	 Gem::Ext::BuildError: ERROR: Failed to build gem native extension.
[22:28:01] <wikibugs>	 (03PS1) 10ArielGlenn: mount nfs share from dumpsdata host on snapshots [puppet] - 10https://gerrit.wikimedia.org/r/387951
[22:28:20] <XioNoX>	 is there something like virtualenv for ruby?
[22:28:45] <wikibugs>	 10Operations, 10Scap, 10Release-Engineering-Team (Watching / External): Scap: Standardize git version - https://phabricator.wikimedia.org/T179353#3728176 (10greg) From that paste and https://phabricator.wikimedia.org/source/operations-puppet/browse/production/hieradata/common/scap/dsh.yaml  * snapshot hosts...
[22:30:10] <XioNoX>	 hashar: ^ :)
[22:30:42] <hashar>	 XioNoX: yeah bundler
[22:31:03] <hashar>	 apt-get install bundler
[22:31:19] <hashar>	 it would use gems to download/install gems somewhere in your home
[22:31:26] <hashar>	 then when you do:   bundle exec FOO
[22:31:39] <XioNoX>	 bundler install doesn't work
[22:31:42] <hashar>	 it mangles the RUBYPATH and PATH to point to your gems in the home
[22:31:44] <hashar>	 hmm
[22:31:49] <hashar>	 try  "bundle update" ?
[22:31:55] <hashar>	 or maybe it is  "bundle install"
[22:32:04] <hashar>	 bundle vs bundler
[22:32:31] <XioNoX>	 same issue with both bundle/bundler install/update
[22:32:41] <hashar>	 :(
[22:33:00] <XioNoX>	 https://www.irccloud.com/pastebin/QX3NAsh0/
[22:33:28] <hashar>	 ERROR: Failed to build gem native extension.
[22:33:35] <hashar>	 there is one of the extensions that requires some compilation
[22:33:41] <Krinkle>	 !log group2(all-wikidata) wikis to wmf.5 from 24 hours ago seems to have caused a 60% drop in navigation timing metric report count (100/min => 40/min)
[22:33:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:33:48] <Krinkle>	 Need to go, but will investigate when I return
[22:33:57] <Krinkle>	 https://grafana.wikimedia.org/dashboard/db/navigation-timing-by-browser?var-metric=mediaWikiLoadComplete&panelId=6&fullscreen&orgId=1&from=1509412097543&to=1509511254065&refresh=5m
[22:34:11] <Krinkle>	 https://grafana.wikimedia.org/dashboard/db/navigation-timing?refresh=5m&panelId=12&fullscreen&orgId=1&from=now-2d&to=now&var-metric=mediaWikiLoadComplete
[22:34:54] <hashar>	 XioNoX: /usr/bin/ruby2.3: No such file or directory -- /usr/share/rubygems-integration/all/gems/rake-12.0.0/exe/rake (LoadError)   . I am 100% sure I had the issue before
[22:35:08] <paladox>	 XioNoX hi, try apt-get install ruby-dev
[22:35:12] <paladox>	 https://stackoverflow.com/questions/22544754/failed-to-build-gem-native-extension-installing-compass
[22:36:42] <wikibugs>	 (03CR) 10ArielGlenn: [C: 032] mount nfs share from dumpsdata host on snapshots [puppet] - 10https://gerrit.wikimedia.org/r/387951 (owner: 10ArielGlenn)
[22:37:46] <paladox>	 and
[22:37:47] <paladox>	 gem install rake
[22:38:07] <wikibugs>	 Operations, wikidiff2, Patch-For-Review, User-Addshore, and 2 others: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3728207 (Tobi_WMDE_SW) @Addshore right! as long as it is > 0.3.0 it should work.
[22:39:22] <hashar>	 XioNoX: can't find any note sorry :(
[22:40:53] <wikibugs>	 (PS6) Paladox: Gerrit: Replace certificates with tokens for its-phabricator [puppet] - https://gerrit.wikimedia.org/r/384901 (https://phabricator.wikimedia.org/T178385)
[22:41:26] <wikibugs>	 (PS7) Paladox: Gerrit: Replace certificates with tokens for its-phabricator [puppet] - https://gerrit.wikimedia.org/r/384901 (https://phabricator.wikimedia.org/T178385)
[22:44:45] <wikibugs>	 (PS1) ArielGlenn: remove hiera keys for snapshots that we no longer need [puppet] - https://gerrit.wikimedia.org/r/387955
[22:50:04] <wikibugs>	 (CR) ArielGlenn: [C: 2] remove hiera keys for snapshots that we no longer need [puppet] - https://gerrit.wikimedia.org/r/387955 (owner: ArielGlenn)
[22:55:48] <icinga-wm>	 PROBLEM - puppet last run on wtp2013 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[23:00:04] <jouncebot>	 addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Your horoscope predicts another unfortunate Evening SWAT (Max 8 patches) deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171101T2300).
[23:00:04] <jouncebot>	 Smalyshev and legoktm: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[23:00:19] <legoktm>	 hi
[23:00:24] <SMalyshev>	 here
[23:02:36] <ebernhardson>	 i added stuff as well that doesn't seem to have made it into jouncebot
[23:03:00] <paladox>	 jouncebot: reload
[23:03:06] <paladox>	 jouncebot: refresh
[23:03:10] <jouncebot>	 I refreshed my knowledge about deployments.
[23:03:21] <thcipriani>	 I can SWAT
[23:03:44] <wikibugs>	 (CR) Thcipriani: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/386554 (https://phabricator.wikimedia.org/T148411) (owner: Smalyshev)
[23:05:23] <wikibugs>	 (Merged) jenkins-bot: Revert "Revert "Add negative weight to disambig entities"" [mediawiki-config] - https://gerrit.wikimedia.org/r/386554 (https://phabricator.wikimedia.org/T148411) (owner: Smalyshev)
[23:06:24] <thcipriani>	 SMalyshev:  Revert "Revert "Add negative weight to disambig entities"" is on mwdebug1002, if there's anything to check there
[23:06:33] <wikibugs>	 (CR) jenkins-bot: Revert "Revert "Add negative weight to disambig entities"" [mediawiki-config] - https://gerrit.wikimedia.org/r/386554 (https://phabricator.wikimedia.org/T148411) (owner: Smalyshev)
[23:06:39] <SMalyshev>	 thcipriani: checking
[23:07:35] <SMalyshev>	 thcipriani: yep, seems to be working fine!
[23:07:43] <thcipriani>	 SMalyshev: ok, going live
[23:09:42] <logmsgbot>	 !log thcipriani@tin Synchronized wmf-config/Wikibase-production.php: SWAT: [[gerrit:386554|Revert "Revert "Add negative weight to disambig entities""]] T148411 (duration: 00m 51s)
[23:09:46] <thcipriani>	 ^ SMalyshev live now
[23:09:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:09:50] <stashbot>	 T148411: Item search for statements ranks disambiguation items too highly - https://phabricator.wikimedia.org/T148411
[23:10:04] <SMalyshev>	 thcipriani: thanks, it's working!
[23:10:16] <thcipriani>	 awesome :)
[23:11:59] <SMalyshev>	 thcipriani: for the second one, it's Wikidata, so the wikidata extension patch is the one that does the work, the other one is to keep wikibase repo in sync 
[23:12:13] <SMalyshev>	 (that's what Amir1 told me to do :)
[23:12:41] <thcipriani>	 okie doke, makes sense
[23:18:16] <wikibugs>	 Operations, Patch-For-Review, cloud-services-team (Kanban): labsdb1001 crashed - storage issue - https://phabricator.wikimedia.org/T179464#3728323 (madhuvishy) It looks like it may be time to say goodbye to this server. I've spent some time today looking at the state of the storage configuration, and...
[23:18:20] <thcipriani>	 SMalyshev: Wikidata extension update is live on mwdebug1002, check please
[23:19:13] <SMalyshev>	 checking
[23:19:59] <SMalyshev>	 thcipriani: yep, works
[23:20:04] <thcipriani>	 ok, going live
[23:23:26] <logmsgbot>	 !log thcipriani@tin Synchronized php-1.31.0-wmf.6/extensions/Wikidata/extensions/Wikibase/repo/Wikibase.hooks.php: SWAT: [[gerrit:387749|Allow turning Cirrus usage off from query]] T179428 (duration: 00m 51s)
[23:23:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:23:33] <stashbot>	 T179428: Can not enable old SQL prefix search mode on wikidata - https://phabricator.wikimedia.org/T179428
[23:25:01] <logmsgbot>	 !log thcipriani@tin Synchronized php-1.31.0-wmf.6/extensions/Wikibase/repo/Wikibase.hooks.php: SWAT: [[gerrit:387662|Allow turning Cirrus usage off from query]] T179428 (duration: 00m 49s)
[23:25:06] <thcipriani>	 ^ SMalyshev all live
[23:25:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:25:48] <icinga-wm>	 RECOVERY - puppet last run on wtp2013 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[23:26:27] <thcipriani>	 legoktm: your namespace fix is live on mwdebug1002, check please
[23:26:29] <SMalyshev>	 thcipriani: thank you! everything seems to be fine
[23:26:51] <thcipriani>	 SMalyshev: yw :) glad to hear it!
[23:27:30] <legoktm>	 thcipriani: lgtm
[23:27:35] <thcipriani>	 going live
[23:29:47] <logmsgbot>	 !log thcipriani@tin Synchronized php-1.31.0-wmf.6/extensions/ParserMigration/includes/ApiParserMigration.php: SWAT: [[gerrit:387954|API: Fix WikiPage namespace]] (duration: 00m 52s)
[23:29:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:29:56] <thcipriani>	 ^ legoktm live everywhere
[23:31:28] <legoktm>	 thanks!
[23:31:31] <thcipriani>	 ebernhardson: WikimediaEvents update is live for both wmf.{5,6} on mwdebug1002, check please
[23:31:44] <thcipriani>	 yw, thanks for the patch :)
[23:33:11] <ebernhardson>	 thcipriani: seems reasonable enough. not a whole lot that can be tested
[23:33:27] <thcipriani>	 okie doke, going live wmf.6 first
[23:35:42] <logmsgbot>	 !log thcipriani@tin Synchronized php-1.31.0-wmf.6/extensions/WikimediaEvents: SWAT: [[gerrit:387957|Turn on Cirrus AB test for DBN group sizing]] (duration: 00m 51s)
[23:35:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:37:18] <logmsgbot>	 !log thcipriani@tin Synchronized php-1.31.0-wmf.5/extensions/WikimediaEvents: SWAT: [[gerrit:387956|Turn on Cirrus AB test for DBN group sizing]] (duration: 00m 50s)
[23:37:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:37:25] <thcipriani>	 ^ ebernhardson all live
[23:38:06] <ebernhardson>	 thcipriani: thanks! keeping an eye on event counts, will take a few minutes to actually get to users
[23:38:40] <wikibugs>	 Operations, Deployments, Beta-Cluster-reproducible, HHVM, and 2 others: Switch mwscript from Zend PHP5 to default php alternative (e.g. HHVM or PHP7) - https://phabricator.wikimedia.org/T146285#3728364 (hashar) I am fine with https://gerrit.wikimedia.org/r/#/c/358896/  would want to schedule it a...
[23:38:53] <thcipriani>	 cool :)
[23:39:48] <wikibugs>	 Operations, MediaWiki-General-or-Unknown, TechCom-RfC: Bump PHP requirement to 5.6 in 1.31 - https://phabricator.wikimedia.org/T178538#3728368 (hashar) //Stop forcing php5 in `mwscript`// (https://gerrit.wikimedia.org/r/#/c/358896/). Well we just have to do the switch and see what happens I guess, it...
[23:48:29] <icinga-wm>	 PROBLEM - Router interfaces on cr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.196, interfaces up: 232, down: 1, dormant: 0, excluded: 0, unused: 0
[23:52:38] <icinga-wm>	 RECOVERY - Router interfaces on cr1-eqiad is OK: OK: host 208.80.154.196, interfaces up: 234, down: 0, dormant: 0, excluded: 0, unused: 0
[23:59:38] <icinga-wm>	 PROBLEM - Router interfaces on cr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.196, interfaces up: 232, down: 1, dormant: 0, excluded: 0, unused: 0