[00:00:04] addshore, hashar, anomie, no_justification, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Dear deployers, time to do the Evening SWAT (Max 8 patches) deploy. Dont look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180125T0000).
[00:00:04] No GERRIT patches in the queue for this window AFAICS.
[00:01:56] (PS6) Eevans: cassandra: create parent data directories with exec [puppet] - https://gerrit.wikimedia.org/r/404705 (https://phabricator.wikimedia.org/T175284)
[00:01:58] (CR) Eevans: cassandra: create parent data directories with exec (1 comment) [puppet] - https://gerrit.wikimedia.org/r/404705 (https://phabricator.wikimedia.org/T175284) (owner: Eevans)
[00:11:44] (PS9) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[00:12:11] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[00:13:30] (PS10) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[00:13:56] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[00:19:09] RECOVERY - cassandra-c CQL 10.192.16.178:9042 on restbase2007 is OK: TCP OK - 0.036 second response time on 10.192.16.178 port 9042
[00:24:54] !log bootstrapping restbase2008-a - T184100
[00:25:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:25:09] T184100: Reprovision
legacy Cassandra nodes into new cluster - https://phabricator.wikimedia.org/T184100
[00:28:20] (CR) Mobrovac: [C: 1] cassandra: create parent data directories with exec [puppet] - https://gerrit.wikimedia.org/r/404705 (https://phabricator.wikimedia.org/T175284) (owner: Eevans)
[00:37:17] (PS1) Jcrespo: mariadb: Reenable notifications on es2011 after reimage [puppet] - https://gerrit.wikimedia.org/r/406135
[00:44:23] spike of errors on commons db
[00:45:47] seems like the query killer is working hard on some api call
[00:47:17] it is a POST, so I cannot see the parameters
[00:48:02] (PS11) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[00:48:02] only can see: Function: LocalRepo::findFiles
[00:48:29] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[00:48:43] the query doesn't seem like heavy
[01:00:04] twentyafterfour: I, the Bot under the Fountain, allow thee, The Deployer, to do Phabricator update deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180125T0100).
[01:00:04] No GERRIT patches in the queue for this window AFAICS.
[01:07:29] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3922665 (matmarex) MediaWiki still has basic support for IE 6, but I think Wikipedia no longer allows IE 6 to conne...
[01:36:15] (CR) Jcrespo: [C: 2] Revert "mariadb: Depool es2011 for reimage" [mediawiki-config] - https://gerrit.wikimedia.org/r/406132 (owner: Jcrespo)
[01:36:27] (CR) Jcrespo: [C: 2] mariadb: Reenable notifications on es2011 after reimage [puppet] - https://gerrit.wikimedia.org/r/406135 (owner: Jcrespo)
[01:37:53] (Merged) jenkins-bot: Revert "mariadb: Depool es2011 for reimage" [mediawiki-config] - https://gerrit.wikimedia.org/r/406132 (owner: Jcrespo)
[01:38:03] (CR) jenkins-bot: Revert "mariadb: Depool es2011 for reimage" [mediawiki-config] - https://gerrit.wikimedia.org/r/406132 (owner: Jcrespo)
[01:41:55] !log jynus@tin Synchronized wmf-config/db-codfw.php: Repool es2011 (duration: 00m 57s)
[01:42:00] (PS1) Chad: Gerrit: Shut up completely useless WARN-level spam from EventUtil [puppet] - https://gerrit.wikimedia.org/r/406137
[01:42:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:43:00] !log jynus@tin Synchronized wmf-config/db-eqiad.php: Repool es2011 (duration: 00m 56s)
[01:43:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:47:16] Operations, ops-eqiad, User-Eevans: Degraded RAID on restbase-dev1006 - https://phabricator.wikimedia.org/T185494#3922682 (Eevans)
[01:47:35] (CR) Paladox: [C: 1] Gerrit: Shut up completely useless WARN-level spam from EventUtil [puppet] - https://gerrit.wikimedia.org/r/406137 (owner: Chad)
[01:52:56] (PS1) Chad: Gerrit: Set changeCleanup.startTime [puppet] - https://gerrit.wikimedia.org/r/406138
[01:54:45] (CR) Paladox: [C: 1] Gerrit: Set changeCleanup.startTime [puppet] - https://gerrit.wikimedia.org/r/406138 (owner: Chad)
[01:57:40] (PS1) Chad: Gerrit: Set gc.aggressive = true [puppet] - https://gerrit.wikimedia.org/r/406139
[01:58:27] no_justification +1 ^^ :). jgit should be stable in 2.14 or 2.15. As the bug was fixed in a release i forgot :).
[01:59:35] (PS1) Chad: Gerrit: Set groups.newGroupsVisibleToAll = true [puppet] - https://gerrit.wikimedia.org/r/406140
[01:59:52] yay +1.
[02:00:05] (CR) Paladox: [C: 1] ":) :) :) :) :)" [puppet] - https://gerrit.wikimedia.org/r/406140 (owner: Chad)
[02:01:14] (CR) Paladox: [C: 1] Gerrit: Set groups.newGroupsVisibleToAll = true (1 comment) [puppet] - https://gerrit.wikimedia.org/r/406140 (owner: Chad)
[02:08:15] (CR) TerraCodes: [C: 1] Bureaucrats on WMF wikis to add and remove 'accountcreator' by default [mediawiki-config] - https://gerrit.wikimedia.org/r/406025 (https://phabricator.wikimedia.org/T185417) (owner: MarcoAurelio)
[02:10:10] (PS1) Chad: Gerrit: Add ldap.connectTimeout [puppet] - https://gerrit.wikimedia.org/r/406143
[02:12:02] (CR) Paladox: Gerrit: Add ldap.connectTimeout (1 comment) [puppet] - https://gerrit.wikimedia.org/r/406143 (owner: Chad)
[02:15:32] PROBLEM - HHVM rendering on mw1290 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.001 second response time
[02:15:42] PROBLEM - Nginx local proxy to apache on mw1290 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.007 second response time
[02:16:32] RECOVERY - HHVM rendering on mw1290 is OK: HTTP OK: HTTP/1.1 200 OK - 75096 bytes in 1.000 second response time
[02:16:42] RECOVERY - Nginx local proxy to apache on mw1290 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.030 second response time
[02:21:55] !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 32s)
[02:22:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:23:45] (PS1) Chad: Gerrit: Remove hardcoding of the smtp port [puppet] - https://gerrit.wikimedia.org/r/406144
[02:31:59] Operations, Mail, Patch-For-Review: mail.wikimedia.org SSL cert expiring Mon 23 Oct 2017 - https://phabricator.wikimedia.org/T174081#3922810 (herron)
[02:32:01]
Operations, Mail, Traffic: convert mail servers from GS to LE certificates - https://phabricator.wikimedia.org/T159346#3922809 (herron)
[02:32:58] (PS1) Chad: Gerrit: Allow enabling of tls/ssl (keep default of none for now) [puppet] - https://gerrit.wikimedia.org/r/406145
[02:36:11] Operations, Traffic: Letsencrypt all the prod things we can - planning - https://phabricator.wikimedia.org/T133717#3922815 (herron)
[02:36:14] Operations, Mail, Traffic: convert mail servers from GS to LE certificates - https://phabricator.wikimedia.org/T159346#3922813 (herron) Open>Resolved Resolving as MX were migrated to LE certificates during renewal task T159346
[02:50:50] Operations, Gerrit, Traffic, Patch-For-Review: Switch on http/2 in apache for gerrit - https://phabricator.wikimedia.org/T180978#3922822 (demon) Open>declined Declining outright per what I said in T185645#3922220: > Also, I'd rather move it behind LVS: T165631
[03:04:32] PROBLEM - Nginx local proxy to apache on mw1262 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.007 second response time
[03:04:43] PROBLEM - HHVM rendering on mw1262 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.001 second response time
[03:05:03] PROBLEM - Apache HTTP on mw1262 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.001 second response time
[03:05:32] RECOVERY - Nginx local proxy to apache on mw1262 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.087 second response time
[03:05:42] RECOVERY - HHVM rendering on mw1262 is OK: HTTP OK: HTTP/1.1 200 OK - 75112 bytes in 0.194 second response time
[03:06:03] RECOVERY - Apache HTTP on mw1262 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.055 second response time
[03:26:03] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 730.10 seconds
[03:54:12] RECOVERY - MariaDB Slave
Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 54.75 seconds
[06:33:52] RECOVERY - puppet last run on labtestnet2001 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[06:39:53] RECOVERY - Disk space on labtestnet2001 is OK: DISK OK
[06:40:13] PROBLEM - Router interfaces on cr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.196, interfaces up: 232, down: 1, dormant: 0, excluded: 0, unused: 0
[06:52:13] PROBLEM - HHVM jobrunner on mw1336 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 473 bytes in 0.001 second response time
[06:53:13] RECOVERY - HHVM jobrunner on mw1336 is OK: HTTP OK: HTTP/1.1 200 OK - 206 bytes in 0.001 second response time
[07:00:23] RECOVERY - Router interfaces on cr1-eqiad is OK: OK: host 208.80.154.196, interfaces up: 234, down: 0, dormant: 0, excluded: 0, unused: 0
[07:40:22] PROBLEM - Restbase edge esams on text-lb.esams.wikimedia.org is CRITICAL: /api/rest_v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received
[07:42:22] RECOVERY - Restbase edge esams on text-lb.esams.wikimedia.org is OK: All endpoints are healthy
[07:44:23] !log bootstrapping restbase2008-b - T184100
[07:44:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:44:36] T184100: Reprovision legacy Cassandra nodes into new cluster - https://phabricator.wikimedia.org/T184100
[08:08:23] PROBLEM - HHVM jobrunner on mw1334 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 473 bytes in 0.001 second response time
[08:09:23] RECOVERY - HHVM jobrunner on mw1334 is OK: HTTP OK: HTTP/1.1 200 OK - 206 bytes in 0.001 second response time
[08:47:23] PROBLEM - HHVM rendering on mw2145 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:48:13] RECOVERY - HHVM rendering on mw2145 is OK: HTTP OK: HTTP/1.1 200 OK - 75162 bytes in 0.300 second response time
[09:12:53] PROBLEM - Apache HTTP on mw1345 is CRITICAL:
HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.001 second response time
[09:13:53] RECOVERY - Apache HTTP on mw1345 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.089 second response time
[10:19:43] PROBLEM - Disk space on analytics1045 is CRITICAL: DISK CRITICAL - free space: / 756 MB (1% inode=97%)
[10:21:42] RECOVERY - Disk space on analytics1045 is OK: DISK OK
[10:54:29] Hi, https://phabricator.wikimedia.org is down for me
[10:54:37] And Wikipedia too
[10:54:44] But google works and other sites
[10:55:09] This is in the UK on three mobile using 3G and 4g
[10:57:23] gerrit.wikimedia.org works for me
[11:11:57] It's up for me, but status.wikimedia is reporting performance degradation.
[11:12:47] And anecdotally we just had a user in -en complain about intermittent enwiki timeouts.
[11:13:32] * paladox won’t be able to create a task for this
[11:13:41] Can’t access phab
[11:15:46] 11:12 TheDragonFire: And anecdotally we just had a user in -en complain about intermittent enwiki timeouts.
[11:15:49] Uh whoops
[11:16:13] TheDragonFire: would you be able to create a task for this please?
[11:16:20] And subscribe me please ? :)
[11:16:44] Sure, gimme a second.
[11:16:49] It seems this may be varnish related as phab is behind varnish so I carnt access it and gerrit is using apache and I can access it
[11:16:49] Thanks
[11:16:50] Ben
[11:16:57] Ben = brb
[11:26:11] paladox: https://phabricator.wikimedia.org/T185687
[11:26:45] paladox: Confirmed through a second report. Three mobile is experiencing possible peering issues.
[11:31:09] paladox: Your IP wouldn't happen to be close to 92.40.248.37 would it?
[11:36:42] Operations: Wikipedia access in the United Kingdom is experiencing connection issues. - https://phabricator.wikimedia.org/T185687#3923081 (TheDragonFire)
[12:19:12] Operations, Traffic, netops: Wikipedia access in the United Kingdom is experiencing connection issues.
- https://phabricator.wikimedia.org/T185687#3923101 (TheDragonFire)
[12:23:41] Operations, Traffic, netops: Wikipedia access in the United Kingdom is experiencing connection issues. - https://phabricator.wikimedia.org/T185687#3923102 (TheDragonFire) benoliver999 reports that `text-lb.esams.wikimedia.org` times out, but `ulsfo`, `eqiad` and `codfw` are good.
[12:45:46] TheDragonFire I will check
[12:46:22] Mine is 94.197.121.100
[12:47:03] If eqiad works then could it be esams?
[12:47:29] paladox: Okay, it's the same ASN (60339) as the other user.
[12:48:54] paladox: From what I can tell there's been a peering/transit failure somewhere between AS60339 (Hutchison 3G) and AS14907 (esams).
[12:51:44] paladox: I've pinged noc@wikimedia. Ben pinged noc@three and has talked to Live Support at Three, but didn't get very far. I'm not sure who to raise netops issues with at WMF though.
[12:52:54] Ah I see
[12:53:28] TheDragonFire maybe set the task to unbreak?
[12:53:54] This dosent feel like a three network internal network problem as I can access eqiad fine :).
[12:56:07] TheDragonFire I guess there’s no reply since all of ops are in all hands on in the us
[12:56:16] So it’s still very early morning for them
[12:58:09] paladox: Aren't they entirely different address blocks?
[12:58:33] TheDragonFire for eqiad?
[12:59:26] Possibly
[12:59:43] text-lb.eqiad.wikimedia.org = 208.80.154.224, text-lb.esams.wikimedia.org = 91.198.174.192
[13:00:18] Same ASN, but announced from different locations I think. I could be very wrong on that though.
[13:00:51] Yep
[13:03:18] Backhaul ISP for three responded over Twitter, I'm DMing them.
[13:19:07] Operations, Traffic, netops: Wikipedia access in the United Kingdom is experiencing connection issues. - https://phabricator.wikimedia.org/T185687#3923212 (TheDragonFire) Open>Resolved a:TheDragonFire Cautiously closing as connectivity is back up. Somewhat speculative, but circumstantial...
[13:58:52] PROBLEM - Varnish HTTP text-backend - port 3128 on cp4030 is CRITICAL: connect to address 10.128.0.130 and port 3128: Connection refused
[13:59:53] RECOVERY - Varnish HTTP text-backend - port 3128 on cp4030 is OK: HTTP OK: HTTP/1.1 200 OK - 218 bytes in 0.157 second response time
[14:06:42] PROBLEM - HHVM rendering on mw1229 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.002 second response time
[14:07:43] RECOVERY - HHVM rendering on mw1229 is OK: HTTP OK: HTTP/1.1 200 OK - 74869 bytes in 1.330 second response time
[14:44:10] PROBLEM - High CPU load on API appserver on mw1316 is CRITICAL: CRITICAL - load average: 135.63, 42.62, 23.99
[14:46:10] RECOVERY - High CPU load on API appserver on mw1316 is OK: OK - load average: 30.72, 33.03, 22.72
[14:52:42] !log bootstrapping restbase2008-c - T184100
[14:52:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:53:02] T184100: Reprovision legacy Cassandra nodes into new cluster - https://phabricator.wikimedia.org/T184100
[15:10:21] PROBLEM - HHVM rendering on mw1276 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.001 second response time
[15:10:30] PROBLEM - Apache HTTP on mw1276 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.001 second response time
[15:10:41] PROBLEM - Nginx local proxy to apache on mw1276 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.007 second response time
[15:11:30] RECOVERY - HHVM rendering on mw1276 is OK: HTTP OK: HTTP/1.1 200 OK - 74851 bytes in 0.130 second response time
[15:11:31] RECOVERY - Apache HTTP on mw1276 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.050 second response time
[15:11:41] RECOVERY - Nginx local proxy to apache on mw1276 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.106 second response time
[15:31:00] Operations, Traffic, netops: Wikipedia access in the United
Kingdom is experiencing connection issues. - https://phabricator.wikimedia.org/T185687#3923353 (Aklapper) Resolved>declined a:TheDragonFire>None
[17:02:22] (PS1) Ayounsi: Add PCCW side PTR [dns] - https://gerrit.wikimedia.org/r/406178
[17:03:13] (CR) Ayounsi: [C: 2] Add PCCW side PTR [dns] - https://gerrit.wikimedia.org/r/406178 (owner: Ayounsi)
[17:53:05] Operations, Continuous-Integration-Infrastructure, MediaWiki-Core-Tests, HHVM: HHVM 3.18.5+dfsg-1+wmf3 changes parse_url causing unit tests to fail - https://phabricator.wikimedia.org/T185024#3903647 (fred) https://gist.github.com/fredemmott/39c9abef4571f1e337d339fd8355da60 should resolve this, a...
[17:56:50] PROBLEM - IPv4 ping to ulsfo on ripe-atlas-ulsfo is CRITICAL: Traceback (most recent call last)
[17:58:43] Operations, Continuous-Integration-Infrastructure, MediaWiki-Core-Tests, HHVM: HHVM 3.18.5+dfsg-1+wmf3 changes parse_url causing unit tests to fail - https://phabricator.wikimedia.org/T185024#3923453 (fred) Also: - From Facebook's perspective, HHVM 3.18 is unsupported as of 2018-01-16; that said...
[18:00:50] RECOVERY - IPv4 ping to ulsfo on ripe-atlas-ulsfo is OK: OK - failed 1 probes of 309 (alerts on 19) - https://atlas.ripe.net/measurements/1791307/#!map
[18:13:20] <_joe_> !log restart hhvm on a few api appservers, high cpu load
[18:13:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:15:15] Operations, Continuous-Integration-Infrastructure, MediaWiki-Core-Tests, HHVM: HHVM 3.18.5+dfsg-1+wmf3 changes parse_url causing unit tests to fail - https://phabricator.wikimedia.org/T185024#3923459 (MoritzMuehlenhoff) >>! In T185024#3923453, @fred wrote: > Also: > - From Facebook's perspective...
[18:23:21] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3923484 (Cameron11598) They actually were receiving a different error and were just able to locate it. > "Browse...
[18:24:53] Operations, Continuous-Integration-Infrastructure, MediaWiki-Core-Tests, HHVM: HHVM 3.18.5+dfsg-1+wmf3 changes parse_url causing unit tests to fail - https://phabricator.wikimedia.org/T185024#3923487 (fred) If it's practical, I'm going to do it even if it's not necessary for Wikipedia - breaking...
[18:31:03] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3923504 (Reedy) They might be able to improve the situation by changing the SSL certificates. Beyond that, there's...
[18:31:15] Operations, Analytics: setup/install eventlog1002.eqiad.wmnet - https://phabricator.wikimedia.org/T185667#3923505 (faidon)
[18:31:50] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3920199 (Joe) I can confirm that our current SSL settings do not work with IE8 on windows XP. This is due to inhere...
[18:40:08] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3923523 (Joe) @Cameron11598 I couldn't find on the product page for that device any information on the OS/browser i...
[18:43:15] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3923538 (Joe) >>!
In T185582#3922270, @Reedy wrote: > Error seems to be related to SSL settings. > > https://suppo...
[18:43:47] (PS2) Alexandros Kosiaris: kubernetes: Enable IPv6 forwarding and accept_ra SLAAC [puppet] - https://gerrit.wikimedia.org/r/404281
[18:45:07] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3923539 (Reedy) Right. But there should be TLS settings to enable (if they're not enabled already), ala https://hel...
[18:49:24] (CR) Dzahn: [C: 2] Gerrit: Shut up completely useless WARN-level spam from EventUtil [puppet] - https://gerrit.wikimedia.org/r/406137 (owner: Chad)
[18:51:01] (PS3) Alexandros Kosiaris: kubernetes: Enable IPv6 forwarding and accept_ra SLAAC [puppet] - https://gerrit.wikimedia.org/r/404281
[18:55:33] (PS4) Alexandros Kosiaris: kubernetes: Enable IPv6 forwarding and accept_ra SLAAC [puppet] - https://gerrit.wikimedia.org/r/404281
[18:56:09] (CR) Alexandros Kosiaris: [C: 2] kubernetes: Enable IPv6 forwarding and accept_ra SLAAC [puppet] - https://gerrit.wikimedia.org/r/404281 (owner: Alexandros Kosiaris)
[18:56:37] (CR) Alexandros Kosiaris: [C: 2] "Change amended, we are now also enabling IPv6 forwarding globally. Tested this and it should work fine" [puppet] - https://gerrit.wikimedia.org/r/404281 (owner: Alexandros Kosiaris)
[18:56:46] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3923572 (Volker_E) @Joe There's a product specification page (I linked another article I've found but wanted to lin...
[19:07:35] elukey: https://phabricator.wikimedia.org/T185715 (context: dh-make-golang is not present in Debian Jessie)
[19:13:52] arturo: uffffff sorry! Let's chat about it later!
[19:14:09] sure :-) thanks elukey
[20:30:25] phabricator feels slow for me
[20:31:13] This is on a different isp then this mornning
[20:32:20] * paladox files a task
[20:32:30] as it is very slower then usual.
[20:32:57] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3920199 (TheDJ) I've thrown a tweet at HIMS Inc. Maybe they can help. https://twitter.com/dj_hartman/status/9566256...
[20:35:43] Operations, Phabricator: Phabricator is slow on uk ISP's - https://phabricator.wikimedia.org/T185718#3923680 (Paladox)
[20:38:27] Operations, Phabricator, Traffic: Phabricator is slow on uk ISP's - https://phabricator.wikimedia.org/T185718#3923666 (Zoranzoki21) Vip mobile Serbia too Telenor Serbia work ok P. S. 😄 Deploy this https://gerrit.wikimedia.org/r/#/c/404095/
[20:38:40] Operations, Phabricator, Traffic: Phabricator is slow on uk ISP's - https://phabricator.wikimedia.org/T185718#3923685 (Paladox)
[20:39:16] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's - https://phabricator.wikimedia.org/T185718#3923686 (Zoranzoki21)
[20:39:38] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923687 (Paladox)
[20:41:04] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923666 (Paladox) Maybe this should be UBN status. The slowness is really hard to do anything on here.
[20:43:21] I doin't think it's my isp this time
[20:43:26] as it happends on two of them
[20:44:18] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923689 (Zoranzoki21) I think to we first should contact these providers. I will contact vip mobile to ask.
[20:45:34] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923690 (Paladox) As it happends to bt and three in the uk. I doin't think there's a point unless they managed to have the same problem at the same time which is unlikely.
[20:47:25] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923691 (Zoranzoki21) >>! In T185718#3923690, @Paladox wrote: > As it happends to bt and three in the uk. I doin't think there's a point unless they managed to have the sam...
[20:50:28] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923693 (Zoranzoki21) Tech support from vip mobile told me to you check logs.
[20:53:18] (PS1) Bartosz Dziewoński: Set wgCategoryCollation for abwiki (Abkhaz Wikipedia) [mediawiki-config] - https://gerrit.wikimedia.org/r/406185 (https://phabricator.wikimedia.org/T183430)
[20:55:34] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923697 (Paladox)
[20:57:20] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923698 (Zoranzoki21) >>! In T185718#3923693, @Zoranzoki21 wrote: > Tech support from vip mobile told me to you check logs. I asked my friend which work in telenor to chec...
[20:57:57] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923699 (Zoranzoki21)
[20:58:28] hmm ping ae2.cr2-esams.wikimedia.org times out.
[20:58:29] 92.242.132.15
[20:58:36] PING 92.242.132.15 (92.242.132.15): 56 data bytes
[20:58:36] Request timeout for icmp_seq 0
[20:58:48] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923666 (Legoktm) Please follow the instructions at and provide the requested information so ops can di...
[20:59:02] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923704 (Zoranzoki21) I tested too on mts network. I can not open phabricator how much is slow. I updated task description to problems too happening in Serbia.
[21:00:38] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923706 (Zoranzoki21) >>! In T185718#3923702, @Legoktm wrote: > Please follow the instructions at and p...
[21:02:37] adding onto what paladox said, I cant ping ae2.cr2.esams.wikimedia.org from my ISP in USA (comcast) the error is: Ping request could not find host ae2.cr2-esams.wikimedia.org. Please check the name and try again.
[21:04:33] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923707 (Zoranzoki21) Curl for phabricator tested from my test server: ``` mediawiki@server:~$ curl -v https://phabricator.wikimedia.org...
[21:04:46] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923708 (Paladox) Pinging ae2.cr2-esams.wikimedia.org returns timeout errors (including it's ip). ``` curl -v phabricator.wikimedia.org * Rebuilt URL to: phabricator.wikim...
[21:11:09] (PS1) Elukey: package_builder: require dh-make-golang only from stretch onward [puppet] - https://gerrit.wikimedia.org/r/406191 (https://phabricator.wikimedia.org/T185715)
[21:11:23] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923709 (Zoranzoki21) >>! In T185718#3923708, @Paladox wrote: > Pinging ae2.cr2-esams.wikimedia.org returns timeout errors (including it's ip). > > ``` > curl -v phabricat...
[21:13:10] PROBLEM - HHVM rendering on mw1298 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:14:00] RECOVERY - HHVM rendering on mw1298 is OK: HTTP OK: HTTP/1.1 200 OK - 74859 bytes in 0.213 second response time
[21:15:39] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3923717 (Tbayer) >>! In T185350#3919078, @faidon wrote: > Interesting! So with a ratio in:out of approximately 25:1 (based on January's figure...
[21:20:40] (CR) Zoranzoki21: [C: 1] Set category collation for nowikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/406022 (https://phabricator.wikimedia.org/T185630) (owner: Jon Harald Søby)
[21:21:45] (CR) Zoranzoki21: [C: 1] Add 3 namespaces to wawiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/405258 (https://phabricator.wikimedia.org/T185289) (owner: Jon Harald Søby)
[21:24:10] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923730 (Paladox) This is ping data ``` ping -c 4 phabricator.wikimedia.org PING phabricator.wikimedia.org (91.198.174.217): 56 data bytes 64 bytes from 91.198.174.217: ic...
[21:35:42] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923740 (Paladox) Here's my recording of it http://recordit.co/SCP6EtgIKK
[21:38:33] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3923741 (Volker_E) Thanks @TheDJ – I've also sent them an email about the cause with request for participation on t...
[21:39:36] (PS2) Dzahn: Gerrit: Remove hardcoding of the smtp port [puppet] - https://gerrit.wikimedia.org/r/406144 (owner: Chad)
[21:39:41] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3923751 (Reedy) If it's running CE 6.0... I bet it's basically going to be a case of "the OS doesn't support it, bu...
[21:40:19] (CR) Dzahn: [C: 2] Gerrit: Remove hardcoding of the smtp port [puppet] - https://gerrit.wikimedia.org/r/406144 (owner: Chad)
[21:42:25] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923757 (Paladox) Here's how long it takes for it to load phab https://phabricator.wikimedia.org/F12791592
[21:45:17] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923759 (Zoranzoki21) >>! In T185718#3923757, @Paladox wrote: > Here's how long it takes for it to load phab https://phabricator.wikimedia.org/F12791592 Heh, I can not see...
[21:46:36] (CR) Dzahn: Gerrit: Add ldap.connectTimeout (1 comment) [puppet] - https://gerrit.wikimedia.org/r/406143 (owner: Chad) [21:46:44] (PS2) Dzahn: Gerrit: Add ldap.connectTimeout [puppet] - https://gerrit.wikimedia.org/r/406143 (owner: Chad) [21:48:02] (CR) Dzahn: [C: 2] Gerrit: Add ldap.connectTimeout [puppet] - https://gerrit.wikimedia.org/r/406143 (owner: Chad) [21:49:35] (PS2) Dzahn: Gerrit: Set groups.newGroupsVisibleToAll = true [puppet] - https://gerrit.wikimedia.org/r/406140 (owner: Chad) [21:50:54] (CR) Dzahn: [C: 2] Gerrit: Set groups.newGroupsVisibleToAll = true [puppet] - https://gerrit.wikimedia.org/r/406140 (owner: Chad) [21:52:36] mutante i think those changes require gerrit be restarted. [21:54:17] yes, i will restart it, but only once [21:56:01] (PS2) Dzahn: Gerrit: Set changeCleanup.startTime [puppet] - https://gerrit.wikimedia.org/r/406138 (owner: Chad) [21:56:28] one more and restart to confirm [21:56:36] not doing the gc-related one, just the others [21:56:58] (CR) Dzahn: [C: 2] Gerrit: Set changeCleanup.startTime [puppet] - https://gerrit.wikimedia.org/r/406138 (owner: Chad) [22:06:29] !log bootstrapping restbase2009-a - T184100 [22:06:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:06:53] T184100: Reprovision legacy Cassandra nodes into new cluster - https://phabricator.wikimedia.org/T184100 [22:07:14] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923781 (Paladox) tcptraceroute shows: ``` sudo tcptraceroute phabricator.wikimedia.org Password: Selected device en0, address 192.168.1.231, port 58435 for outgoing pack...
[22:07:33] !log restarting apache on phabricator server [22:07:44] yay it fast again [22:07:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:07:47] thanks mutante :) [22:08:00] lol, well, nice [22:09:23] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923666 (Dzahn) I restarted Apache on the Phabricator server and apparently that fixed it. [22:10:15] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923786 (Zoranzoki21) >>! In T185718#3923784, @Dzahn wrote: > I restarted Apache on the Phabricator server and apparently that fixed it. Now is faster loading phabricator... [22:10:30] Operations, Phabricator, Traffic: Phabricator is slow on mobile ISP's and home ISP's - https://phabricator.wikimedia.org/T185718#3923787 (Dzahn) Open>Resolved a:Dzahn 17:07 < mutante> !log restarting apache on phabricator server 17:07 < paladox> yay it fast again [22:24:34] !log restarting gerrit service to apply a few small config changes https://gerrit.wikimedia.org/r/#/q/topic:gerrit-trivial-tweaks+(status:open+OR+status:merged) [22:24:47] Operations, Traffic, Accessibility, Browser-Support-Internet-Explorer: Wikipedia no longer accessible to those using some braille devices - https://phabricator.wikimedia.org/T185582#3923793 (TheDJ) Mainstream support for it ended in 2013.. https://support.microsoft.com/en-us/lifecycle/search?alph... [22:24:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:28:25] (Draft1) Paladox: phabricator: Switch from apache to nginx [puppet] - https://gerrit.wikimedia.org/r/406243 [22:28:29] (PS2) Paladox: phabricator: Switch from apache to nginx [puppet] - https://gerrit.wikimedia.org/r/406243 [22:28:31] PROBLEM - puppet last run on stat1005 is CRITICAL: CRITICAL: Puppet has 6 failures.
Last run 3 minutes ago with 6 failures. Failed resources (up to 3 shown): Exec[git_pull_wmde/scripts],Exec[git_pull_wmde/toolkit-analyzer-build],Exec[git_pull_mediawiki/event-schemas],Exec[git_pull_statistics_mediawiki] [22:28:38] (PS3) Paladox: WIP: phabricator: Switch from apache to nginx [puppet] - https://gerrit.wikimedia.org/r/406243 [22:58:31] RECOVERY - puppet last run on stat1005 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [23:03:45] (CR) Arturo Borrero Gonzalez: [C: 2] package_builder: require dh-make-golang only from stretch onward [puppet] - https://gerrit.wikimedia.org/r/406191 (https://phabricator.wikimedia.org/T185715) (owner: Elukey) [23:03:51] Operations, Performance-Team, Traffic: load.php response taking 160s (of which only 0.031s in Apache) - https://phabricator.wikimedia.org/T181315#3923835 (Imarlier) Couple of things that I noticed about this, which seem to generally support Gilles' conclusion: //[[ https://logstash.wikimedia.org/app... [23:03:57] (PS2) Arturo Borrero Gonzalez: package_builder: require dh-make-golang only from stretch onward [puppet] - https://gerrit.wikimedia.org/r/406191 (https://phabricator.wikimedia.org/T185715) (owner: Elukey) [23:15:38] Operations, Analytics: setup/install eventlog1002.eqiad.wmnet - https://phabricator.wikimedia.org/T185667#3923839 (Ottomata) [23:44:49] Operations, Patch-For-Review, Release-Engineering-Team (Watching / External), Scoring-platform-team (Current), Wikimedia-Incident: Cache ORES virtualenv within versioned source - https://phabricator.wikimedia.org/T181071#3923850 (akosiaris) Thanks for letting me know. I 've +1ed it. We should...
[23:55:56] (PS4) Paladox: WIP: phabricator: Switch from apache to nginx [puppet] - https://gerrit.wikimedia.org/r/406243 [23:57:10] (PS5) Paladox: WIP: phabricator: Switch from apache to nginx [puppet] - https://gerrit.wikimedia.org/r/406243 (https://phabricator.wikimedia.org/T185644)