[00:07:35] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[00:52:26] PROBLEM - puppet last run on mw1179 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[01:10:20] (CR) Paladox: Enable JVM heap log to debug gerrit slowing down [puppet] - https://gerrit.wikimedia.org/r/316622 (https://phabricator.wikimedia.org/T148478) (owner: Paladox)
[01:18:26] RECOVERY - puppet last run on mw1179 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[01:41:45] PROBLEM - puppet last run on cp3022 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[02:07:26] RECOVERY - puppet last run on cp3022 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[03:36:06] PROBLEM - puppet last run on mw1248 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 5 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIP2-City.mmdb.gz]
[04:01:47] RECOVERY - puppet last run on mw1248 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[05:14:37] weird, i keep getting broken images when there are a lot of them. E.g. https://commons.wikimedia.org/wiki/Copyright - is that just me?
[05:35:10] yurik: what does your browser console say?
[05:35:50] p858snake|L2, Failed to load resource: net::ERR_CONNECTION_RESET
[05:56:36] PROBLEM - Router interfaces on cr1-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 66, down: 1, dormant: 0, excluded: 0, unused: 0; xe-1/2/0: down - Core: cr1-eqord:xe-0/0/1 (Telia, IC-313592, 51ms) {#1502} [10Gbps wave]
[05:57:45] PROBLEM - Router interfaces on cr2-codfw is CRITICAL: CRITICAL: host 208.80.153.193, interfaces up: 120, down: 1, dormant: 0, excluded: 0, unused: 0; xe-5/2/1: down - Core: cr1-eqord:xe-0/0/0 (Telia, IC-314534, 24ms) {#10694} [10Gbps wave]
[05:58:17] PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 35, down: 2, dormant: 0, excluded: 0, unused: 0; xe-0/0/0: down - Core: cr2-codfw:xe-5/2/1 (Telia, IC-314534, 29ms) {#11375} [10Gbps wave]; xe-0/0/1: down - Core: cr1-ulsfo:xe-1/2/0 (Telia, IC-313592, 51ms) {#11372} [10Gbps wave]
[06:00:44] now that doesn't look good
[06:54:18] RECOVERY - Router interfaces on cr2-codfw is OK: OK: host 208.80.153.193, interfaces up: 122, down: 0, dormant: 0, excluded: 0, unused: 0
[06:54:48] RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0
[06:55:38] RECOVERY - Router interfaces on cr1-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 68, down: 0, dormant: 0, excluded: 0, unused: 0
[07:02:07] PROBLEM - tools homepage -admin tool- on tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Not Available - 531 bytes in 0.056 second response time
[07:03:59] Is tools crashing
[07:04:01] ??
[07:12:22] PROBLEM - Router interfaces on cr1-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 66, down: 1, dormant: 0, excluded: 0, unused: 0; xe-1/2/0: down - Core: cr1-eqord:xe-0/0/1 (Telia, IC-313592, 51ms) {#1502} [10Gbps wave]
[07:13:20] PROBLEM - Router interfaces on cr2-codfw is CRITICAL: CRITICAL: host 208.80.153.193, interfaces up: 120, down: 1, dormant: 0, excluded: 0, unused: 0; xe-5/2/1: down - Core: cr1-eqord:xe-0/0/0 (Telia, IC-314534, 24ms) {#10694} [10Gbps wave]
[07:14:01] PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 35, down: 2, dormant: 0, excluded: 0, unused: 0; xe-0/0/0: down - Core: cr2-codfw:xe-5/2/1 (Telia, IC-314534, 29ms) {#11375} [10Gbps wave]; xe-0/0/1: down - Core: cr1-ulsfo:xe-1/2/0 (Telia, IC-313592, 51ms) {#11372} [10Gbps wave]
[07:18:31] RECOVERY - tools homepage -admin tool- on tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 3670 bytes in 0.063 second response time
[07:21:22] telia is in a maintenance window right now so that's what those are
[07:22:59] It looks like tools exploded to me
[07:27:20] it's working fine for me
[07:38:08] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [50.0]
[07:40:27] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0]
[08:00:45] RECOVERY - Router interfaces on cr2-codfw is OK: OK: host 208.80.153.193, interfaces up: 122, down: 0, dormant: 0, excluded: 0, unused: 0
[08:04:36] RECOVERY - Router interfaces on cr1-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 68, down: 0, dormant: 0, excluded: 0, unused: 0
[08:06:12] RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0
[08:12:19] PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0]
[08:13:39] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0]
[08:20:39] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[08:23:56] RECOVERY - Ulsfo HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[09:30:32] PROBLEM - MariaDB Slave SQL: s5 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:30:32] PROBLEM - MariaDB Slave Lag: m3 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:30:32] PROBLEM - MariaDB Slave IO: s5 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:30:32] PROBLEM - MariaDB Slave IO: m2 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:30:32] PROBLEM - MariaDB Slave SQL: s3 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:30:32] PROBLEM - MariaDB Slave IO: s4 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:30:32] PROBLEM - MariaDB Slave Lag: s6 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:30:33] PROBLEM - MariaDB Slave Lag: s4 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:30:33] PROBLEM - MariaDB Slave SQL: s6 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:30:34] PROBLEM - MariaDB Slave Lag: s5 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:31:24] PROBLEM - MariaDB Slave Lag: s7 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:31:24] PROBLEM - MariaDB Slave IO: x1 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:31:24] PROBLEM - MariaDB Slave SQL: x1 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:31:25] PROBLEM - MariaDB Slave IO: s6 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:31:25] PROBLEM - MariaDB Slave IO: s7 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:31:25] PROBLEM - MariaDB Slave SQL: s7 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:31:25] PROBLEM - MariaDB Slave SQL: m3 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:31:32] PROBLEM - mysqld processes on dbstore2001 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld
[09:31:43] PROBLEM - MariaDB Slave Lag: x1 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:31:43] PROBLEM - MariaDB Slave IO: s1 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:31:43] PROBLEM - MariaDB Slave Lag: s1 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:31:43] PROBLEM - MariaDB Slave Lag: m2 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:31:43] PROBLEM - MariaDB Slave SQL: s1 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:31:43] PROBLEM - MariaDB Slave SQL: m2 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:31:43] PROBLEM - MariaDB Slave SQL: s2 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:31:44] PROBLEM - MariaDB Slave IO: m3 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:31:44] PROBLEM - MariaDB Slave Lag: s2 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:31:55] <_joe_> woha
[09:32:13] <_joe_> dbstore2001 crashed I'd say :P
[09:32:24] PROBLEM - MariaDB Slave Lag: s3 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag could not connect
[09:32:24] PROBLEM - MariaDB Slave IO: s2 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:32:24] PROBLEM - MariaDB Slave SQL: s4 on dbstore2001 is CRITICAL: CRITICAL slave_sql_state could not connect
[09:33:04] PROBLEM - MariaDB Slave IO: s3 on dbstore2001 is CRITICAL: CRITICAL slave_io_state could not connect
[09:34:57] _joe_: no, it expired downtime
[09:35:01] silencing again
[09:35:05] <_joe_> volans: ha
[09:35:21] ( marostegui pinged me :) )
[09:35:37] <_joe_> ok
[10:17:11] RECOVERY - MariaDB Slave IO: s6 on dbstore2001 is OK: OK slave_io_state not a slave
[10:17:19] RECOVERY - MariaDB Slave Lag: s7 on dbstore2001 is OK: OK slave_sql_lag not a slave
[10:17:19] RECOVERY - MariaDB Slave SQL: x1 on dbstore2001 is OK: OK slave_sql_state not a slave
[10:17:19] RECOVERY - MariaDB Slave IO: x1 on dbstore2001 is OK: OK slave_io_state not a slave
[10:17:20] RECOVERY - MariaDB Slave SQL: s7 on dbstore2001 is OK: OK slave_sql_state not a slave
[10:17:20] RECOVERY - MariaDB Slave SQL: m3 on dbstore2001 is OK: OK slave_sql_state not a slave
[10:17:20] RECOVERY - MariaDB Slave IO: s7 on dbstore2001 is OK: OK slave_io_state not a slave
[10:17:29] RECOVERY - MariaDB Slave Lag: x1 on dbstore2001 is OK: OK slave_sql_lag not a slave
[10:17:30] RECOVERY - MariaDB Slave IO: s1 on dbstore2001 is OK: OK slave_io_state Slave_IO_Running: Yes
[10:17:30] RECOVERY - MariaDB Slave Lag: s1 on dbstore2001 is OK: OK slave_sql_lag Replication lag: 10396.41 seconds
[10:17:30] RECOVERY - MariaDB Slave SQL: s1 on dbstore2001 is OK: OK slave_sql_state Slave_SQL_Running: Yes
[10:17:30] RECOVERY - MariaDB Slave SQL: m2 on dbstore2001 is OK: OK slave_sql_state not a slave
[10:17:30] RECOVERY - MariaDB Slave Lag: m2 on dbstore2001 is OK: OK slave_sql_lag not a slave
[10:17:30] RECOVERY - MariaDB Slave SQL: s2 on dbstore2001 is OK: OK slave_sql_state not a slave
[10:17:31] RECOVERY - MariaDB Slave IO: m3 on dbstore2001 is OK: OK slave_io_state not a slave
[10:17:31] RECOVERY - MariaDB Slave Lag: s2 on dbstore2001 is OK: OK slave_sql_lag not a slave
[10:18:10] RECOVERY - MariaDB Slave SQL: s4 on dbstore2001 is OK: OK slave_sql_state not a slave
[10:18:10] RECOVERY - MariaDB Slave IO: s2 on dbstore2001 is OK: OK slave_io_state not a slave
[10:18:10] RECOVERY - MariaDB Slave Lag: s3 on dbstore2001 is OK: OK slave_sql_lag Replication lag: 10401.88 seconds
[10:18:41] RECOVERY - MariaDB Slave IO: s3 on dbstore2001 is OK: OK slave_io_state Slave_IO_Running: Yes
[10:18:41] RECOVERY - MariaDB Slave SQL: s5 on dbstore2001 is OK: OK slave_sql_state not a slave
[10:18:41] RECOVERY - MariaDB Slave IO: m2 on dbstore2001 is OK: OK slave_io_state not a slave
[10:18:49] RECOVERY - MariaDB Slave Lag: m3 on dbstore2001 is OK: OK slave_sql_lag not a slave
[10:18:49] RECOVERY - MariaDB Slave IO: s5 on dbstore2001 is OK: OK slave_io_state not a slave
[10:18:49] RECOVERY - MariaDB Slave SQL: s6 on dbstore2001 is OK: OK slave_sql_state not a slave
[10:18:50] RECOVERY - MariaDB Slave Lag: s6 on dbstore2001 is OK: OK slave_sql_lag not a slave
[10:18:50] RECOVERY - MariaDB Slave Lag: s4 on dbstore2001 is OK: OK slave_sql_lag not a slave
[10:18:50] RECOVERY - MariaDB Slave IO: s4 on dbstore2001 is OK: OK slave_io_state not a slave
[10:18:50] RECOVERY - MariaDB Slave SQL: s3 on dbstore2001 is OK: OK slave_sql_state Slave_SQL_Running: Yes
[10:18:50] RECOVERY - MariaDB Slave Lag: s5 on dbstore2001 is OK: OK slave_sql_lag not a slave
[11:08:55] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[11:34:19] PROBLEM - puppet last run on mw1258 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[12:00:17] RECOVERY - puppet last run on mw1258 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[12:13:11] PROBLEM - puppet last run on cp3033 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[12:38:26] Hi, will adding NS_PORTAL as content namespace (by adding to wgContentNamespaces) fulfil T127748?
[12:38:28] T127748: Add portal namespace to ZIM - https://phabricator.wikimedia.org/T127748
[12:39:17] RECOVERY - puppet last run on cp3033 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[12:39:20] I don't think we can add a namespace as content one for specific project which isn't controlled by WMF.
[12:42:50] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736599 (Josve05a)
[12:44:22] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736615 (Josve05a) {F4645975} Example on https://en.wikipedia.org/wiki/Getty_Images
[12:51:19] (PS1) Urbanecm: Add all edited pages by myself to watchlist by default (for new users) [mediawiki-config] - https://gerrit.wikimedia.org/r/317387 (https://phabricator.wikimedia.org/T148328)
[13:00:34] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0]
[13:21:45] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[13:36:08] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[13:47:50] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[13:47:52] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[13:52:25] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[13:59:45] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0]
[14:04:35] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[14:09:19] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[14:11:42] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[14:16:24] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0]
[14:18:45] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[14:23:27] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0]
[14:34:49] ^ those are very small spikes it's making noise about. the root seems to be that the hhvm restart checker has been doing lots of hhvm restarts the past few hours as they all reached their uptime cutoff
[14:35:07] (the one that's still going every 10 mins in screen on neodymium)
[15:25:19] PROBLEM - puppet last run on lvs1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[15:46:53] (PS1) Dereckson: Revert "Let's disable l10nupdate completely until we have /srv/mediawiki-staging back" [puppet] - https://gerrit.wikimedia.org/r/317390
[15:51:10] RECOVERY - puppet last run on lvs1001 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[15:55:25] (PS2) Dereckson: Revert "Let's disable l10nupdate … until … /srv/mediawiki-staging back" [puppet] - https://gerrit.wikimedia.org/r/317390
[15:56:46] (PS3) Dereckson: Revert "Let's disable l10nupdate … until … /srv/mediawiki-staging back" [puppet] - https://gerrit.wikimedia.org/r/317390 (https://phabricator.wikimedia.org/T148571)
[15:59:42] PROBLEM - puppet last run on graphite1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:17:52] Hi, will adding NS_PORTAL as content namespace (by adding to wgContentNamespaces) fulfil T127748? I don't think we can add a namespace as content one for specific project which isn't controlled by WMF.
[16:17:52] T127748: Add portal namespace to ZIM - https://phabricator.wikimedia.org/T127748
[16:25:24] RECOVERY - puppet last run on graphite1003 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[16:25:50] Hi there seems to be an Thumbnails problem that users are reporting here https://phabricator.wikimedia.org/T148917
[16:25:57] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736599 (Yann) I have this issue on the English and French WP.
[16:26:05] a user has reported it in -tech
[16:26:49] paladox: I can't see any problem with the image linked.
[16:27:03] Urbanecm seems to be happening to random images
[16:27:16] Happends to me on https://en.wikipedia.org/wiki/IOS
[16:27:20] one image is have loaded
[16:27:42] But some time ago I reported similar problem. It was because the server said the image is gziped but it wasn't.
[16:27:48] Something like purge helped.
[16:28:04] jynus would know more I think
[16:28:08] Oh, how can we purge the images?
[16:28:29] I don't know, ask jynus. I'll try to search in logs, wait a moment...
[16:28:44] BTW I can't see a problem at iOS page.
[16:28:56] Ok thanks
[16:29:02] I will do a screenshot
[16:29:39] https://phabricator.wikimedia.org/F4646554
[16:29:49] Urbanecm ^^ notice the ipad image
[16:30:24] I can see full image. It isn't croped at my side.
[16:31:11] [13:50:49] jynus: Where was the problem?
[16:31:12] [13:51:11] for the 500px, it said it returned a png gziped
[16:31:14] [13:51:28] it didn't- it was just a plain png
[16:31:15] [13:51:50] I purged the cache and now it works as it should
[16:31:25] oh
[16:31:52] I guess the problem has came back.
[16:32:00] The same problem?
[16:32:32] i think so
[16:32:58] But he just wrote in -tech, saying this https://upload.wikimedia.org/wikipedia/commons/thumb/e/e5/CFK_y_Angela_Merkel_2.jpg/220px-CFK_y_Angela_Merkel_2.jpg is broken
[16:32:58] but it loads for me
[16:33:09] so it seems to be depending on the image and wont show on the same image for everyone
[16:33:10] For me too.
[16:33:20] oh
[16:33:46] But the problem which I reported it was broken for more users than only one. Strange...
[16:34:00] About the purging, maybe https://wikitech.wikimedia.org/wiki/Multicast_HTCP_purging can help...
[16:34:12] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736773 (Paladox) Happends for me too on en.wikipedia.org {F4646554}
[16:34:49] Oh
[16:35:12] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736599 (Urbanecm) I can't see any problem from my side.
[16:35:34] paladox: Or maybe https://wikitech.wikimedia.org/wiki/Varnish can help too.
[16:35:45] I'm not sure...
[16:35:52] Oh
[16:36:14] I doint visit en.wp.org often, so not sure why images doint load fully for me
[16:37:22] Did you try to reload the page?
[16:37:25] Yes
[16:37:47] Okay.
[16:37:53] Urbanecm i tryed to look at some more pages, sky go works, android works, but https://en.wikipedia.org/wiki/Northampton dosen't
[16:38:25] https://phabricator.wikimedia.org/F4646571
[16:38:57] I can see full image. Could you visit https://upload.wikimedia.org/wikipedia/commons/thumb/1/16/Northampton_UK_locator_map.svg/220px-Northampton_UK_locator_map.svg.png ? What do you see?
[16:39:24] That loads
[16:39:25] for me
[16:39:36] For me too. But why it doesn't load at the page?
[16:40:30] Urbanecm i also scrolled down and it shows this https://phabricator.wikimedia.org/F4646573
[16:41:21] I can't see any problem...
[16:42:23] Urbanecm apparently other users are reporting it in #wikipedia-en
[16:42:34] Going to join there.
[16:42:52] Ok thanks, try to search some pages and scroll down, and see if you catch the problem
[16:43:02] Okay.
[16:43:08] Thanks
[16:43:44] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736801 (Paladox) >>! In T148917#2736774, @Urbanecm wrote: > I can't see any problem from my side. I use Google Chrome, Windows 10. There is no problem...
[16:44:15] Urbanecm lol, in edge it goes green
[16:44:23] Green?
[16:45:37] Urbanecm https://phabricator.wikimedia.org/F4646582
[16:46:08] I see. Strange...
[16:46:18] Urbanecm on chrome, no image is showing
[16:46:40] Is there space for it?
[16:47:41] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736803 (Paladox) It goes green in Microsoft edge. {F4646582} In chrome, it doesn't display at all. {F4646590}
[16:47:57] Urbanecm what do you mean by space for it?
[16:47:58] https://phabricator.wikimedia.org/F4646590
[16:48:17] and yeh i have storage space if that's what you mean
[16:48:23] 1tb, with 700gb left
[16:48:26] 6gb ram
[16:48:28] No, I don't mean this.
[16:48:33] ok
[16:49:10] I mean if the text is instead image or not. But browser knows there should be something but it can't load it.
[16:49:34] Oh
[16:49:34] Can i set the task to unbreak please? Since other users are reporting it and it is a production wiki?
[16:49:48] I think you can...
[16:49:55] Thanks
[16:50:15] You're welcome.
[16:52:08] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736806 (Paladox) p:Triage>Unbreak! Changing to ubreak since users are reporting it in #wikimedia-tech, #wikipedia-en I have also confirmed the...
[16:52:10] Urbanecm ^^
[16:52:11] done
[16:52:13] :)
[16:52:19] Thanks
[16:52:41] Your welcome :)
[17:01:46] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736809 (Aklapper)
[17:03:06] I tried to load from different locations using proxy and I can't see any problem...
[17:04:23] Oh
[17:04:37] Urbanecm try internet explorer / edge
[17:04:41] Okay
[17:04:47] Thanks
[17:07:24] Operations, Multimedia, User-Josve05a: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736825 (Aklapper) p:Unbreak!>High Can someone please provide the error message (and the browser, browser version, operating system) shown and t...
[17:08:17] paladox: ^^
[17:08:23] Ok
[17:08:24] thanks
[17:08:34] Maybe you can help.
[17:08:38] You're welcome :)
[17:08:50] Operations, Multimedia, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736827 (Urbanecm)
[17:09:41] Operations, Multimedia, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736829 (Paladox) @Aklapper hi, I'm using windows 10, Internet Explorer, Microsoft Edge and chrome. Links are https://en.wikipedia....
[17:09:54] Yep :)
[17:10:39] paladox: Aklapper meant links to the image itself (to its thumbnail exactly), not to the page.
[17:10:52] Oh
[17:10:55] Woops sorry
[17:11:07] Nothing happened, only notifying you :)
[17:14:16] Ok
[17:14:20] Urbanecm done
[17:14:25] ive amended the comment
[17:14:26] :)
[17:14:53] paladox: Sorry to bothering you but Aklapper wants links to thumbnail. Please right-click on the image, choose something which copy address of the image and paste it to the task.
[17:15:20] Oh
[17:15:32] I thought i did
[17:15:44] Urbanecm is that https://en.wikipedia.org/wiki/File:Danes_Camp_Earthworks_Northampton.jpg one?
[17:15:48] No.
[17:15:53] Oh
[17:16:00] The URL begin with upload.wikimedia.org
[17:16:23] oh
[17:16:58] How do i get the upload.wikimedia.org link?
[17:17:03] Urbanecm ^^
[17:17:14] Load the wiki-page where the problem is, right-click on the problematic image, choose Copy Image URL and you should have the link in your clipboard.
[17:17:15] Is it clear?
[17:17:22] (for you)
[17:17:40] Oh yes, but i right click the image, but it gives me the link to commons
[17:18:00] I mean to en.wikipedia.org
[17:18:07] not a upload.wikimedia.org link
[17:18:40] Because you click at another option. There is something like Copy Link Address and Copy Image Address.
[17:18:47] Oh
[17:18:51] https://upload.wikimedia.org/wikipedia/commons/1/16/Northampton_UK_locator_map.svg ?
[17:19:30] Yes, this can be the correct link.
[17:19:52] Ok
[17:19:53] Done
[17:19:57] Thanks a lot.
[17:20:02] Your welcome
[17:20:33] Strange why i picked northampton
[17:21:17] :)
[17:21:24] :)
[17:44:41] PROBLEM - puppet last run on cp3043 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:51:26] Puppet, Beta-Cluster-Infrastructure, Labs: After starting new deployment-prep instance, have to delete /var/lib/puppet/ssl before puppet will function - https://phabricator.wikimedia.org/T148929#2736876 (AlexMonk-WMF)
[18:10:37] RECOVERY - puppet last run on cp3043 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[18:24:22] Operations, Commons: Thumbnails of some specific images show unwanted black lines - https://phabricator.wikimedia.org/T140536#2736941 (Aklapper)
[18:38:32] Operations, Multimedia, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED) - https://phabricator.wikimedia.org/T148917#2736948 (Josve05a) | {F4646819} | On https://commons.wikimedia.org/w/index.php?title=Category:CC-PD-Mark&filefrom=%22Be+a+victory+far...
[18:39:07] Operations, Multimedia, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2736949 (Josve05a)
[18:39:50] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0]
[18:44:19] Operations, Multimedia, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2736954 (Aklapper) @Paladox: Please see T148917#2736825 for information that would be helpful here (w...
[18:46:05] Operations, Multimedia, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2736956 (Paladox) Oh, the images show when I go to upload.wikimedia.org. But on the wiki them selfs i...
[18:47:01] Operations, Multimedia, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2736957 (Urbanecm) I talked about this with Paladox. He says that thumbnails shows correctly when he...
[18:48:10] Operations, Multimedia, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2736599 (BBlack) So, one of the possible factors here is various HTTPS-interfering proxies or filters...
[18:48:40] Operations, Multimedia, Traffic, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2736960 (BBlack)
[18:50:15] Operations, Multimedia, Traffic, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2736966 (Josve05a) >>! In T148917#2736958, @BBlack wrote: > So, one of the possible fact...
[18:52:20] Operations, Multimedia, Traffic, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2736968 (Paladox) >>! In T148917#2736958, @BBlack wrote: > So, one of the possible facto...
[19:09:23] Operations, Multimedia, Traffic, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2736970 (Josve05a) I changed WiFi to my phones personal hotspot, and that didn't chnage...
[19:16:40] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 660 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 2997928 keys - replication_delay is 660
[19:21:25] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 2975681 keys - replication_delay is 0
[19:22:42] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0]
[19:46:39] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0]
[19:52:02] (CR) QChris: [C: -1] "Hi Paladox," [puppet] - https://gerrit.wikimedia.org/r/308753 (https://phabricator.wikimedia.org/T85002) (owner: Paladox)
[19:58:32] PROBLEM - Disk space on cp4006 is CRITICAL: DISK CRITICAL - free space: / 348 MB (3% inode=86%)
[20:22:57] Operations, Multimedia, Traffic, User-Josve05a, User-Urbanecm: Thumbnails failing to render sporadically (ERR_CONNECTION_CLOSED or ERR_SSL_BAD_RECORD_MAC_ALERT) - https://phabricator.wikimedia.org/T148917#2737059 (BBlack) I've reproduced this now, at least once. I took a lot of retries. I u...
[20:51:07] (PS1) BBlack: cache frontends: 8x local ports 3120-3127 [puppet] - https://gerrit.wikimedia.org/r/317404
[20:51:09] (PS1) BBlack: tlsproxy: use 8x FE ports to balance [puppet] - https://gerrit.wikimedia.org/r/317405
[20:52:15] (CR) jenkins-bot: [V: -1] tlsproxy: use 8x FE ports to balance [puppet] - https://gerrit.wikimedia.org/r/317405 (owner: BBlack)
[20:52:37] bblack: are you able to review changes for grrrit-wm ??
[20:53:22] (PS1) BBlack: tlsproxy: raise worker connection limits, too [puppet] - https://gerrit.wikimedia.org/r/317414
[20:53:29] Zppix: not really, no
[20:53:41] Damn ok
[20:53:44] (CR) Hashar: "I have absolutely no idea how that would impact the PoolCounter system :(" [mediawiki-config] - https://gerrit.wikimedia.org/r/316356 (https://phabricator.wikimedia.org/T123734) (owner: Filippo Giunchedi)
[20:55:06] (PS2) BBlack: tlsproxy: use 8x FE ports to balance [puppet] - https://gerrit.wikimedia.org/r/317405
[20:55:08] (PS2) BBlack: tlsproxy: raise worker connection limits, too [puppet] - https://gerrit.wikimedia.org/r/317414
[21:02:35] Zppix i can
[21:02:41] Chris
[21:02:44] Did
[21:03:43] oh
[21:04:44] Zppix https://gerrit.wikimedia.org/r/#/c/317380/ ?
[21:04:50] You need to remove a whitespace
[22:15:13] PROBLEM - puppet last run on mx2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[22:41:14] RECOVERY - puppet last run on mx2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[22:48:39] (PS14) Paladox: Add support for searching gerrit using bug:T1 [puppet] - https://gerrit.wikimedia.org/r/308753 (https://phabricator.wikimedia.org/T85002)
[22:48:43] (PS15) Paladox: Add support for searching gerrit using bug:T1 [puppet] - https://gerrit.wikimedia.org/r/308753 (https://phabricator.wikimedia.org/T85002)
[22:49:04] (CR) Paladox: "@QChris done, I have tested this on our test install https://gerrit-test.wmflabs.org/" [puppet] - https://gerrit.wikimedia.org/r/308753 (https://phabricator.wikimedia.org/T85002) (owner: Paladox)
[23:16:36] PROBLEM - puppet last run on analytics1015 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[23:19:06] RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[23:30:15] (PS3) Alex Monk: redirects.dat - split non-canonical to separate section [puppet] - https://gerrit.wikimedia.org/r/292785 (https://phabricator.wikimedia.org/T133548) (owner: BBlack)
[23:30:17] (PS1) Alex Monk: POC: Secure redirect service [puppet] - https://gerrit.wikimedia.org/r/317450 (https://phabricator.wikimedia.org/T133548)
[23:31:36] (CR) jenkins-bot: [V: -1] POC: Secure redirect service [puppet] - https://gerrit.wikimedia.org/r/317450 (https://phabricator.wikimedia.org/T133548) (owner: Alex Monk)
[23:36:59] (PS2) Alex Monk: POC: Secure redirect service [puppet] - https://gerrit.wikimedia.org/r/317450 (https://phabricator.wikimedia.org/T133548)
[23:38:11] (CR) jenkins-bot: [V: -1] POC: Secure redirect service [puppet] - https://gerrit.wikimedia.org/r/317450 (https://phabricator.wikimedia.org/T133548) (owner: Alex Monk)
[23:41:08] (PS3) Alex Monk: POC: Secure redirect service [puppet] - https://gerrit.wikimedia.org/r/317450 (https://phabricator.wikimedia.org/T133548)
[23:42:23] (CR) jenkins-bot: [V: -1] POC: Secure redirect service [puppet] - https://gerrit.wikimedia.org/r/317450 (https://phabricator.wikimedia.org/T133548) (owner: Alex Monk)
[23:42:25] RECOVERY - puppet last run on analytics1015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[23:44:37] (PS4) Alex Monk: POC: Secure redirect service [puppet] - https://gerrit.wikimedia.org/r/317450 (https://phabricator.wikimedia.org/T133548)