[00:00:04] Deploy window No Deploys - US Holiday (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20190902T0000) [00:06:51] PROBLEM - IPv6 ping to ulsfo on ripe-atlas-ulsfo IPv6 is CRITICAL: CRITICAL - failed 37 probes of 451 (alerts on 35) - https://atlas.ripe.net/measurements/1791309/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts [00:12:23] RECOVERY - IPv6 ping to ulsfo on ripe-atlas-ulsfo IPv6 is OK: OK - failed 25 probes of 451 (alerts on 35) - https://atlas.ripe.net/measurements/1791309/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts [00:52:43] RECOVERY - Memory correctable errors -EDAC- on thumbor1004 is OK: (C)4 ge (W)2 ge 1 https://wikitech.wikimedia.org/wiki/Monitoring/Memory%23Memory_correctable_errors_-EDAC- https://grafana.wikimedia.org/dashboard/db/host-overview?orgId=1&var-server=thumbor1004&var-datasource=eqiad+prometheus/ops [04:29:37] RECOVERY - snapshot of s6 in codfw on db1115 is OK: snapshot for s6 at codfw taken less than 4 days ago and larger than 90 GB: Last one 2019-09-02 03:28:01 from db2097.codfw.wmnet:3316 (502 GB) https://wikitech.wikimedia.org/wiki/MariaDB/Backups [04:50:57] 10Operations, 10ops-codfw: Degraded RAID on db2059 - https://phabricator.wikimedia.org/T231690 (10Marostegui) 05Open→03Invalid This host is waiting to be decommissioned {T230884} [04:51:27] 10Operations, 10ops-codfw, 10DC-Ops, 10decommission: Decommission db2059.codfw.wmnet - https://phabricator.wikimedia.org/T230884 (10Marostegui) [04:53:21] (03CR) 10Marostegui: [C: 03+1] "Thank you!" 
[software/conftool] - 10https://gerrit.wikimedia.org/r/533556 (https://phabricator.wikimedia.org/T231629) (owner: 10CDanis) [04:56:43] 10Operations, 10DBA: Decommission db2046.codfw.wmnet - https://phabricator.wikimedia.org/T231767 (10Marostegui) [04:57:03] 10Operations, 10DBA: Decommission db2046.codfw.wmnet - https://phabricator.wikimedia.org/T231767 (10Marostegui) p:05Triage→03Normal [04:58:01] (03PS1) 10Marostegui: db-eqiad,db-codfw.php: Decommission db2046 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533777 (https://phabricator.wikimedia.org/T231767) [04:59:17] (03CR) 10Marostegui: [C: 03+2] db-eqiad,db-codfw.php: Decommission db2046 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533777 (https://phabricator.wikimedia.org/T231767) (owner: 10Marostegui) [05:00:21] (03Merged) 10jenkins-bot: db-eqiad,db-codfw.php: Decommission db2046 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533777 (https://phabricator.wikimedia.org/T231767) (owner: 10Marostegui) [05:00:40] (03CR) 10jenkins-bot: db-eqiad,db-codfw.php: Decommission db2046 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533777 (https://phabricator.wikimedia.org/T231767) (owner: 10Marostegui) [05:01:53] !log marostegui@deploy1001 Synchronized wmf-config/db-codfw.php: Remove db2046 from config T231767 (duration: 00m 55s) [05:01:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:01:56] T231767: Decommission db2046.codfw.wmnet - https://phabricator.wikimedia.org/T231767 [05:02:58] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Remove db2046 from config T231767 (duration: 00m 53s) [05:03:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:03:24] 10Operations, 10DBA, 10Patch-For-Review: Decommission db2046.codfw.wmnet - https://phabricator.wikimedia.org/T231767 (10Marostegui) [06:36:39] (03PS12) 10DannyS712: General cleanup of initialize settings [mediawiki-config] - 10https://gerrit.wikimedia.org/r/532280 
(https://phabricator.wikimedia.org/T231178) [06:36:52] (03CR) 10jerkins-bot: [V: 04-1] General cleanup of initialize settings [mediawiki-config] - 10https://gerrit.wikimedia.org/r/532280 (https://phabricator.wikimedia.org/T231178) (owner: 10DannyS712) [06:53:17] RECOVERY - snapshot of x1 in codfw on db1115 is OK: snapshot for x1 at codfw taken less than 4 days ago and larger than 90 GB: Last one 2019-09-02 06:24:05 from db2101.codfw.wmnet:3320 (136 GB) https://wikitech.wikimedia.org/wiki/MariaDB/Backups [06:54:24] (03Abandoned) 10DannyS712: General cleanup of initialize settings [mediawiki-config] - 10https://gerrit.wikimedia.org/r/532280 (https://phabricator.wikimedia.org/T231178) (owner: 10DannyS712) [07:17:00] !log Drop filejournal table on s6 - T51195 [07:17:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:17:03] T51195: Drop filejournal table from WMF - https://phabricator.wikimedia.org/T51195 [07:21:42] (03PS21) 10Mathew.onipe: Add maps reboot cookbook [cookbooks] - 10https://gerrit.wikimedia.org/r/511819 (https://phabricator.wikimedia.org/T224072) [07:22:19] (03CR) 10Mathew.onipe: Add maps reboot cookbook (036 comments) [cookbooks] - 10https://gerrit.wikimedia.org/r/511819 (https://phabricator.wikimedia.org/T224072) (owner: 10Mathew.onipe) [07:26:08] !log Drop filejournal table on s5 - T51195 [07:26:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:26:11] T51195: Drop filejournal table from WMF - https://phabricator.wikimedia.org/T51195 [07:31:42] (03PS2) 10Giuseppe Lavagetto: k8s::worker: switch to profile::lvs::realserver [puppet] - 10https://gerrit.wikimedia.org/r/532662 [07:34:24] !log Drop filejournal table on s4 - T51195 [07:34:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:34:26] T51195: Drop filejournal table from WMF - https://phabricator.wikimedia.org/T51195 [07:40:41] (03CR) 10Giuseppe Lavagetto: [C: 03+1] 
"https://puppet-compiler.wmflabs.org/compiler1001/18132/kubernetes1002.eqiad.wmnet/ this will also remove the "depool/pool" scripts from th" [puppet] - 10https://gerrit.wikimedia.org/r/532662 (owner: 10Giuseppe Lavagetto) [07:41:03] <_joe_> akosiaris: interested in your opinion about ^^ [07:41:25] <_joe_> the TL;DR is I'm removing all the conftool-related scripts from the k8s workers [07:46:03] heh, never used them over there, [07:46:09] they don't really make that much sense [07:46:27] <_joe_> so while the "pool/depool" general ones make *some* sense, I don't think they're strictly needed [07:46:30] (03CR) 10Alexandros Kosiaris: [C: 03+1] "nice!" [puppet] - 10https://gerrit.wikimedia.org/r/532662 (owner: 10Giuseppe Lavagetto) [07:46:44] the general (host level ones) do have some mild sense [07:47:05] <_joe_> we can re-add them later, but I think we'd be better served by writing cookbooks [07:47:15] (03CR) 10Giuseppe Lavagetto: [C: 03+2] k8s::worker: switch to profile::lvs::realserver [puppet] - 10https://gerrit.wikimedia.org/r/532662 (owner: 10Giuseppe Lavagetto) [07:47:19] yeah, fully agreed on that [07:47:24] I was about to point it out [07:47:50] <_joe_> btw, my patches to spicerack are in place and now there is an easy way to deal with lvs pools in spicerack [07:48:02] <_joe_> as in, doing things coordinated with etcd pools/depools [07:51:26] (03PS2) 10Giuseppe Lavagetto: elasticsearch::cirrus: convert to profile::lvs::realserver [puppet] - 10https://gerrit.wikimedia.org/r/532663 [08:08:51] <_joe_> onimisionipe, gehel do you use the "depool-search" and "pool-search" scripts on the elastic nodes? 
[08:09:12] <_joe_> I hope not, they weren't doing anything correct :P [08:10:42] (03CR) 10Giuseppe Lavagetto: [C: 03+1] "https://puppet-compiler.wmflabs.org/compiler1001/18133/elastic1032.eqiad.wmnet/" [puppet] - 10https://gerrit.wikimedia.org/r/532663 (owner: 10Giuseppe Lavagetto) [08:11:16] <_joe_> added you both as reviewers to ^^ [08:12:04] _joe_: I don't think we do. We use `/usr/local/bin/pool` and `/usr/local/bin/depool` resp [08:12:16] dunno if there are related [08:12:21] !log update amd-rocm debian repository gpg key (same id, new expiration) [08:12:22] <_joe_> those are not going away [08:12:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:12:22] *they are [08:12:39] <_joe_> onimisionipe: I'm about to write some text on wikitech to clarify what each script does [08:12:52] alright. that'd be great [08:13:28] thanks! [08:15:36] !log upgrade grafana to 5.4.5 on grafana1001 [08:15:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:17:11] (03CR) 10Vgutierrez: [C: 03+1] "pcc seems happy on LVS https://puppet-compiler.wmflabs.org/compiler1001/18134/, overall the CR seems sane, please ensure than the ferm rul" [puppet] - 10https://gerrit.wikimedia.org/r/529053 (https://phabricator.wikimedia.org/T176875) (owner: 10Mathew.onipe) [08:18:42] (03CR) 10Mathew.onipe: [C: 03+1] "let's wait for gehel's opinion on this." 
[puppet] - 10https://gerrit.wikimedia.org/r/532663 (owner: 10Giuseppe Lavagetto) [08:20:52] (03PS1) 10DCausse: [cirrus] Reenable sanity checks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533842 (https://phabricator.wikimedia.org/T231194) [08:22:04] (03PS1) 10Vgutierrez: Release 0.21 [software/acme-chief] - 10https://gerrit.wikimedia.org/r/533856 (https://phabricator.wikimedia.org/T219765) [08:23:27] _joe_: did not even know they existed :) [08:23:38] <_joe_> good then :P [08:25:33] !log Drop filejournal table on s2 - T51195 [08:25:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:25:35] T51195: Drop filejournal table from WMF - https://phabricator.wikimedia.org/T51195 [08:27:39] !log Drop filejournal table on labtestwiki - T51195 [08:27:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:27:53] _joe_: just to make sure I understand, the /usr/local/bin/pool and /usr/local/bin/depool scripts are unchanged and fine to be used? [08:29:29] <_joe_> gehel: correct [08:29:38] all good then! [08:29:46] <_joe_> gehel: but you might not want to, once you read the page I'm writing :P [08:30:04] * gehel is waiting... 
[08:37:56] (03PS2) 10Filippo Giunchedi: monitoring: alert on availability over two minutes [puppet] - 10https://gerrit.wikimedia.org/r/532335 (https://phabricator.wikimedia.org/T228379) [08:38:23] (03CR) 10Filippo Giunchedi: [C: 03+2] monitoring: alert on availability over two minutes [puppet] - 10https://gerrit.wikimedia.org/r/532335 (https://phabricator.wikimedia.org/T228379) (owner: 10Filippo Giunchedi) [08:39:13] (03PS1) 10Muehlenhoff: Switch Stas to volunteer account [puppet] - 10https://gerrit.wikimedia.org/r/533859 [08:45:50] !log Drop filejournal table on s8 - T51195 [08:45:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:45:53] T51195: Drop filejournal table from WMF - https://phabricator.wikimedia.org/T51195 [08:48:53] (03CR) 10Daimona Eaytoy: [C: 04-1] Change configuration of AbuseFilter extension for enwikisource (032 comments) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533747 (https://phabricator.wikimedia.org/T231750) (owner: 10Zoranzoki21) [08:53:53] (03CR) 10Gehel: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/533859 (owner: 10Muehlenhoff) [09:05:47] (03PS3) 10Gehel: icinga: add old JVM GC check for elastic [puppet] - 10https://gerrit.wikimedia.org/r/533189 (https://phabricator.wikimedia.org/T231516) (owner: 10Mathew.onipe) [09:08:31] (03CR) 10Gehel: [C: 03+2] icinga: add old JVM GC check for elastic [puppet] - 10https://gerrit.wikimedia.org/r/533189 (https://phabricator.wikimedia.org/T231516) (owner: 10Mathew.onipe) [09:12:15] RECOVERY - snapshot of s3 in codfw on db1115 is OK: snapshot for s3 at codfw taken less than 4 days ago and larger than 90 GB: Last one 2019-09-02 05:29:42 from db2098.codfw.wmnet:3313 (772 GB) https://wikitech.wikimedia.org/wiki/MariaDB/Backups [09:13:41] onimisionipe: ^^ merged, I'll let you check if it looks good [09:14:01] gehel: alright. Thanks! 
[09:14:38] (03PS2) 10Zoranzoki21: Change configuration of AbuseFilter extension for enwikisource [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533747 (https://phabricator.wikimedia.org/T231750) [09:14:43] (03CR) 10Zoranzoki21: Change configuration of AbuseFilter extension for enwikisource (032 comments) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533747 (https://phabricator.wikimedia.org/T231750) (owner: 10Zoranzoki21) [09:14:50] (03PS3) 10Zoranzoki21: Change configuration of AbuseFilter extension for enwikisource [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533747 (https://phabricator.wikimedia.org/T231750) [09:15:24] !log Drop filejournal table on s1 - T51195 [09:15:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:15:27] T51195: Drop filejournal table from WMF - https://phabricator.wikimedia.org/T51195 [09:21:49] !log Drop filejournal table on s7 - T51195 [09:21:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:21:51] T51195: Drop filejournal table from WMF - https://phabricator.wikimedia.org/T51195 [09:26:21] (03CR) 10Filippo Giunchedi: prometheus: aggregate systemd failed metrics (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/533282 (https://phabricator.wikimedia.org/T230570) (owner: 10Herron) [09:29:16] (03CR) 10Mobrovac: [C: 03+1] [cirrus] Reenable sanity checks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533842 (https://phabricator.wikimedia.org/T231194) (owner: 10DCausse) [09:37:08] <_joe_> gehel, onimisionipe https://wikitech.wikimedia.org/wiki/Load_Balanced_Services_And_Conftool [09:37:16] * gehel is looking [09:38:53] <_joe_> changes/suggestions are welcome [09:39:00] we might want to review our elasticsearch depool script to maybe use some of that more intelligently [09:39:57] in the elasticsearch case, we don't really ever work on a single service, we always treat the server as the smallest work unit [09:40:10] * gehel needs to do some 
more reading / thinking [09:41:06] <_joe_> gehel: for cookbooks, you should use the new LBRemoteCluster class I created [09:41:30] <_joe_> still needs some refinements - namely adding the verification phase we have in those restart scripts [09:41:48] that was my thinking (but I haven't checked that code recently) [09:41:49] <_joe_> heh I should add a specific section to that page I guess [09:43:20] looks interesting! [09:43:41] almost time to get the kids from daycare, I'll try to have another look tonigut [09:43:58] (03PS1) 10Ladsgroup: Enable WRITE_BOTH for items term store for wikidata [mediawiki-config] - 10https://gerrit.wikimedia.org/r/533882 (https://phabricator.wikimedia.org/T225055) [10:07:08] !log installing subversion security updates on jessie [10:07:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:09:12] (03CR) 10Giuseppe Lavagetto: [C: 03+2] elasticsearch::cirrus: convert to profile::lvs::realserver [puppet] - 10https://gerrit.wikimedia.org/r/532663 (owner: 10Giuseppe Lavagetto) [10:09:24] (03PS3) 10Giuseppe Lavagetto: elasticsearch::cirrus: convert to profile::lvs::realserver [puppet] - 10https://gerrit.wikimedia.org/r/532663 [10:20:12] (03PS2) 10Giuseppe Lavagetto: Removing hiera file for role::eventbus::eventbus, unused [puppet] - 10https://gerrit.wikimedia.org/r/532664 [10:21:01] (03CR) 10Giuseppe Lavagetto: [C: 03+2] Removing hiera file for role::eventbus::eventbus, unused [puppet] - 10https://gerrit.wikimedia.org/r/532664 (owner: 10Giuseppe Lavagetto) [10:25:20] PROBLEM - Mobileapps LVS codfw on mobileapps.svc.codfw.wmnet is CRITICAL: /{domain}/v1/feed/onthisday/{type}/{month}/{day} (retrieve all events on January 15) timed out before a response was received https://wikitech.wikimedia.org/wiki/Mobileapps_%28service%29 [10:25:29] !log installing libav security updates [10:25:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:26:52] RECOVERY - Mobileapps LVS codfw on 
mobileapps.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Mobileapps_%28service%29 [10:36:25] (03CR) 10Ema: "> Help? Cherry-picking on beta puppet master causes a compilation" [puppet] - 10https://gerrit.wikimedia.org/r/530773 (https://phabricator.wikimedia.org/T158837) (owner: 10Krinkle) [11:00:02] (03CR) 10Ema: [C: 03+1] hiera: Move nginx from port 443 to 4443 on cp2002 [puppet] - 10https://gerrit.wikimedia.org/r/532984 (https://phabricator.wikimedia.org/T231433) (owner: 10Vgutierrez) [11:00:10] (03CR) 10Ema: [C: 03+1] hiera: Move ats-tls from port 8443 to port 443 on cp2002 [puppet] - 10https://gerrit.wikimedia.org/r/532985 (https://phabricator.wikimedia.org/T231433) (owner: 10Vgutierrez) [11:06:08] PROBLEM - Apache HTTP on mwdebug1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers [11:07:34] RECOVERY - Apache HTTP on mwdebug1001 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 596 bytes in 0.101 second response time https://wikitech.wikimedia.org/wiki/Application_servers [11:18:46] !log imported apache2 2.4.10-10+deb8u15+wmf1 to apt.wikimedia.org/jessie-wikimedia (rebuild of latest Jessie update against our patches) [11:18:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:20:49] (03PS1) 10Ema: ATS: add X-ATS-Timestamp [puppet] - 10https://gerrit.wikimedia.org/r/533891 (https://phabricator.wikimedia.org/T227432) [11:23:38] PROBLEM - mobileapps endpoints health on scb2005 is CRITICAL: /{domain}/v1/transform/html/to/mobile-html/{title} (Get preview mobile HTML for test page) is CRITICAL: Test Get preview mobile HTML for test page returned the unexpected status 504 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/mobileapps [11:23:57] !log installing apache2 security updates on jessie [11:23:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:27:15] (03PS2) 10Ema: ATS: add 
X-ATS-Timestamp [puppet] - 10https://gerrit.wikimedia.org/r/533891 (https://phabricator.wikimedia.org/T227432) [11:28:24] RECOVERY - mobileapps endpoints health on scb2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/mobileapps [11:30:59] (03CR) 10Vgutierrez: [C: 03+1] ATS: add X-ATS-Timestamp [puppet] - 10https://gerrit.wikimedia.org/r/533891 (https://phabricator.wikimedia.org/T227432) (owner: 10Ema) [11:34:25] (03CR) 10Ema: [C: 03+2] ATS: add X-ATS-Timestamp [puppet] - 10https://gerrit.wikimedia.org/r/533891 (https://phabricator.wikimedia.org/T227432) (owner: 10Ema) [11:41:49] jouncebot: now [11:41:49] For the next 12 hour(s) and 18 minute(s): No Deploys - US Holiday (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20190902T0000) [11:41:55] oh [11:42:02] * Urbanecm thought there's a SWAT :( [11:42:03] tomorrow [11:42:14] * Urbanecm waves to Lucas_WMDE [11:42:14] Labor Day, apparently [11:42:21] * Lucas_WMDE waves back [11:42:26] seems so [11:42:30] A US holiday that isn't in the middle of the week! [11:42:40] is that an exception, Reedy ? :-) [11:42:45] * Urbanecm waves to Reedy as well [11:42:54] Americans quite often have holidays in the middle of the week [11:43:11] Whereas the sane europeans try and snap them to the start/end of the week [11:43:47] like election day, on Tuesday? 
[11:43:50] oh wait [11:45:04] (03PS1) 10Vgutierrez: package_builder: Install debhelper from backports [puppet] - 10https://gerrit.wikimedia.org/r/533892 [11:45:25] Reedy: so Spaniards aren't sane europeans, noted :) [11:45:50] * Reedy looks around shiftily [11:48:40] (03CR) 10Muehlenhoff: package_builder: Install debhelper from backports (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/533892 (owner: 10Vgutierrez) [11:49:04] (03CR) 10Muehlenhoff: package_builder: Install debhelper from backports (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/533892 (owner: 10Vgutierrez) [11:50:47] the uk, where you have things like: "early may bank holiday, spring bank holiday, summer bank holiday and late summer bank holiday"..... [11:51:50] or as i call them: "f'it, we just don't feel like working today"-days ;) [11:52:15] Banks need holidays for all their hard work screwing up the economy [11:52:33] I wonder what politicians need then [11:53:06] Ours specifically? [11:53:20] hmmm nah, ours suck as well [11:54:14] Reedy: i think they just need to be treated for an addiction to mushrooms or something... [11:54:17] Reedy: I really wonder why (inter)bank transfers aren't immediate [11:54:31] however, it just saved me yesterday, entered wrong amount and was able to cancel&fix :D [11:54:33] Urbanecm: Because then they can't make money on it [11:54:46] well how they make money on it is the another question [11:54:52] in CZ, domestic bank transfers are free [11:54:54] no bank charges [11:55:06] Urbanecm: because the computers calculating the interest weren't fast enough to do that multiple times per day back in the 80s [11:55:14] :D [11:55:26] but they are in 2019, aren't they [11:55:26] and interbank is still based on the 80s systems (though this is cleaning up fast now). 
[11:55:31] (PS2) Vgutierrez: package_builder: Install debhelper from backports [puppet] - https://gerrit.wikimedia.org/r/533892 [11:55:36] yeah, indeed thedj [11:55:46] there are three banks beta-testing immediate transfers, granted [11:55:54] (PS3) Vgutierrez: package_builder: Install debhelper from backports [puppet] - https://gerrit.wikimedia.org/r/533892 [11:56:59] * Urbanecm still doesn't fully understand how (Czech) banks make money on bank transfers [11:57:07] I mean, it's free for me to transfer 0,01 CZK [11:57:13] .nl already has immediate transfers now. they are pushing the rest of europe to follow (i believe there is a 2021 deadline) [11:57:26] that's 0.00042 USD, btw [11:57:38] it's free, and supported, did that once, to test :D [11:57:50] so.. SEPA immediate transfers are already a thing, but the rollout is quite slow [11:58:09] SEPA being immediate is...less important, I don't use it often [11:58:19] though I'm happy it's free in my bank :-) [11:58:24] having made multiple mobile apps for banks, i've been exposed to their systems a bit. you have no idea how far behind they were until they needed to innovate when mobile banking made its introduction [11:58:48] thedj: I can't describe, but...well, guess I can imagine :-) [11:58:54] feel free to explain though [11:59:42] well my bank (which was pretty forward looking already in 2010).. [12:00:08] their api would crash if customers would request their transactions more than twice a day on average. [12:00:37] with mobile they saw a 35x increase in those endpoints being hit.
[12:01:11] and so they built a gigantic caching layer on top that could be queried more often, because replacing the old one was too dangerous [12:01:38] was just going to ask what would happen if I transfer 100 eurocents, each with an individual transaction [12:01:44] (CR) Muehlenhoff: [C: +1] "Looks good" [puppet] - https://gerrit.wikimedia.org/r/533892 (owner: Vgutierrez) [12:01:50] so guess it would be cached at that stage [12:02:07] (CR) Vgutierrez: [C: +2] package_builder: Install debhelper from backports [puppet] - https://gerrit.wikimedia.org/r/533892 (owner: Vgutierrez) [12:02:09] and then they learned that they should have done that anyways, because then push notifications, and limits on nfc transfers etc etc.. [12:02:16] (PS4) Vgutierrez: package_builder: Install debhelper from backports [puppet] - https://gerrit.wikimedia.org/r/533892 [12:02:35] Urbanecm: you probably would be blocked at some point for suspicious activity [12:03:17] get that, although I'd probably say that it might look suspicious only because the bank has little knowledge about why I make them :D [12:05:04] you are not allowed to use your account for anything that doesn't represent "normal consumer behavior" [12:05:11] yeah, banks seem to have some sanity checks for surges of transfers, in the early 2000s some people from my university were chatting via 1 ct transfers and the "reason for payment" line and after 1-2 weeks the bank called and threatened to shut down the bank account :-) [12:05:27] exactly [12:06:45] a) can I, as a customer, read somewhere what is considered "normal"? In other words, is making transfers frequently "abnormal" by definition? b) what if I would want to get all the money and go somewhere else, when/after blocked? [12:07:45] lol, no [12:08:03] guessed so, but...then I can't be expected to be "normal" :-) [12:08:05] a) no. these are usually captured under some sort of "fair dealing" clause in the terms of use.
Basically if the bank thinks it's abnormal and you can't convince them otherwise, then it's abnormal. b) tough luck [12:08:22] moritzm: I seem to recall people doing that more recently... Possibly with a fee, but it was still cheaper than an SMS [12:08:52] or rather, be prepared for many bureaucratic hours of getting b done, would waste many of your hours [12:09:56] same if you have a bank account and ask for the full balance in cash.. you'd think that is possible, but it's not. [12:10:26] well, depends [12:10:34] with a bank account balance of 3 USD? [12:10:40] :) [12:10:52] or 3000 USD? Or something else? :-) [12:11:24] there is like a max withdrawal of 10k eur a day usually, and if you want bigger amounts you need to pre-announce and you can expect some scrutiny. [12:11:36] and if there is a bank run, you can forget about it altogether. [12:12:01] anyway, it's definitely possible to switch banks, either by telling the bank "I'd like to have my full balance available in cash" ahead of time, or by transferring the money digitally [12:12:23] oh sure [12:12:48] but i remember i once tried to open a secondary bank account, and that was hella problematic. [12:13:06] when was it? [12:13:14] 2nd savings no problem, 2nd bank account at same bank, no problem. 2nd bank account at a different bank... [12:13:22] (PS1) Vgutierrez: package_builder: Fix debhelper dependencies on stretch [puppet] - https://gerrit.wikimedia.org/r/533897 [12:13:26] it can be country-different [12:13:41] but I have three bank accounts at three different banks [12:13:48] well this was mostly because the account was gonna cost them more than i would bring in ;) [12:13:58] guess so :-) [12:14:12] I mean, it was perfectly doable in my case [12:14:14] and fully digital [12:14:32] i just wanted a backup account, with a backup card with like a 1000 bucks on it. static. wasn't possible.
[12:14:52] filling in a form and transferring initial balance from an account in my name, and proving that to the bank by sending them a bank statement [12:15:14] needed to bring either my salary over (dynamic) or considerably more money (say 5000 at the very least). [12:15:47] yeah, that's kinda why I have three accounts, no one complained about that, to me [12:15:49] then again, i was a student, so i hardly had a credit score and stuff like that back then. [12:15:59] yeah, get that [12:16:25] * Urbanecm is a student and has a backup card with static balance exactly because of that :-) [12:17:03] anyway, no one complained, yet [12:17:15] perhaps the caching infrastructure is capable of handling the issues of old core? [12:17:42] or because no one put a question "Why do you want an account" in that form :D [12:17:50] it's become easier now. you have full digital banks now, where you can open an account in a day. [12:17:56] yeah :-) [12:18:17] if you are a resident of that country, i guess [12:18:52] (it required a copy of govt-issued identification document, so I imagine wanting an account elsewhere would be more difficult) [12:18:53] at least [12:20:02] thedj: however, I wonder about the difficulty of another account at a different bank [12:20:13] I mean, was it more difficult than getting your first account? [12:20:57] * Urbanecm is sorry for abusing -operations for this purpose [12:21:06] but it's a really interesting debate for me :-) [12:22:16] BTW, there are "digital banks" like Revolut (UK based) that allow you to open an account as long as you're an EU citizen [12:22:28] same for N26 (based in Germany) [12:22:30] vgutierrez: until the 31st of october ;) [12:22:45] thedj: why? [12:22:52] and Revolut isn't a bank, is it? [12:22:57] Urbanecm: yes it was more difficult (but it was like 15 years ago, sooo) [12:23:12] that explains it :-) [12:23:37] vgutierrez: is there a way to start that process other than filling in my phone number at https://www.revolut.com/en-CZ/?
[12:23:48] I mean, I don't want to copy the link it might send there :D [12:24:19] revolut is a bank. like bunq and n26 are. they all carry european banking licenses. [12:24:34] Urbanecm: AFAIK revolut requires a smartphone [12:24:48] so just download the app from your smartphone's app market [12:25:06] hmm, is there a way to explore/manage revolut on desktop? [12:25:09] More convenient for me [12:25:13] nope [12:25:19] then it's probably not for me :-) [12:25:19] and you only need some form of id (even another bank account number suffices) to open them and do limited transfers. often when you want to do more complex or bigger amounts at those banks, you still need to provide a passport in some way. [12:26:00] well providing a passport is not a problem in principle [12:26:06] the desktop/mobile is :-) [12:27:27] I mean, I trust banks enough to provide them a passport :-) [12:27:49] you shouldn't honestly ;) [12:27:59] could you spend more words on that? [12:28:27] I shouldn't trust them? [12:28:29] Or provide passport? [12:28:31] or both? [12:29:58] you shouldn't have illusions that banks will protect your personal data any better than most companies. Usually they focus on keeping your account secure and the rest sort of gets dropped off the priority list. [12:30:23] anyway, i have to go now, as fun as this is..
[12:30:36] thanks for your time [12:30:55] thedj: I mean, I trust them similary to how I trust big corps, including govt, somehow [12:31:10] I don't think data protection is the top priority [12:31:21] (that's to make money, principle of commercial companies) [12:31:44] but I trust them enough to not providing other's passports to me, when I ask :-) [12:36:54] (03CR) 10Vgutierrez: [C: 03+2] package_builder: Fix debhelper dependencies on stretch [puppet] - 10https://gerrit.wikimedia.org/r/533897 (owner: 10Vgutierrez) [12:38:04] (03CR) 10CDanis: [C: 03+2] dbctl: document config commit --batch [software/conftool] - 10https://gerrit.wikimedia.org/r/533556 (https://phabricator.wikimedia.org/T231629) (owner: 10CDanis) [12:40:13] !log installing freetype security updates on jessie (stretch/buster already fixed) [12:40:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:41:17] (03Merged) 10jenkins-bot: dbctl: document config commit --batch [software/conftool] - 10https://gerrit.wikimedia.org/r/533556 (https://phabricator.wikimedia.org/T231629) (owner: 10CDanis) [12:41:19] (03PS1) 10Muehlenhoff: Add library hint for freetype [puppet] - 10https://gerrit.wikimedia.org/r/533906 [12:46:55] !log uploaded prometheus-trafficserver-exporter 0.3.2 to apt.wikimedia.org (stretch) - T231533 [12:46:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:46:58] T231533: Improve ATS prometheus metrics - https://phabricator.wikimedia.org/T231533 [12:47:24] (03PS2) 10Muehlenhoff: Add library hint for freetype [puppet] - 10https://gerrit.wikimedia.org/r/533906 [12:49:28] (03CR) 10Muehlenhoff: [C: 03+2] Add library hint for freetype [puppet] - 10https://gerrit.wikimedia.org/r/533906 (owner: 10Muehlenhoff) [12:50:23] (03CR) 10Alexandros Kosiaris: [C: 04-1] "Note that in https://puppet-compiler.wmflabs.org/compiler1001/261/mw1223.eqiad.wmnet/ it's clear that the preamble php tags that " (032 comments) [puppet] - 
10https://gerrit.wikimedia.org/r/511078 (https://phabricator.wikimedia.org/T113114) (owner: 10Ladsgroup) [12:53:35] (03CR) 10Vgutierrez: prometheus: Ship a custom metrics file for trafficserver_exporter (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/533178 (https://phabricator.wikimedia.org/T231533) (owner: 10Vgutierrez) [12:58:00] !log upgrading prometheus-trafficserver-exporter to version 0.3.2 on cp5001 - T231533 [12:58:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:58:12] T231533: Improve ATS prometheus metrics - https://phabricator.wikimedia.org/T231533 [12:58:48] (03PS6) 10Marostegui: mariadb: Promote db1133 to m5 master [puppet] - 10https://gerrit.wikimedia.org/r/529331 (https://phabricator.wikimedia.org/T229657) [13:09:39] !log upgrading prometheus-trafficserver-exporter to version 0.3.2 on the cache cluster - T231533 [13:09:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:09:42] T231533: Improve ATS prometheus metrics - https://phabricator.wikimedia.org/T231533 [13:12:07] (03CR) 10Ema: [C: 03+1] prometheus: Ship a custom metrics file for trafficserver_exporter (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/533178 (https://phabricator.wikimedia.org/T231533) (owner: 10Vgutierrez) [13:20:13] (03CR) 10Alexandros Kosiaris: [C: 04-1] mediawiki: Use mediawiki::errorpage instead of a hhvm-fatal-error.php.erb (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/511078 (https://phabricator.wikimedia.org/T113114) (owner: 10Ladsgroup) [13:22:05] (03CR) 10Alexandros Kosiaris: [C: 04-1] "@Krinkle, do you know perhaps why in the original file (hhvm-fatal-error.php.erb), headers_sent() is returning FALSE, and there by what is" [puppet] - 10https://gerrit.wikimedia.org/r/511078 (https://phabricator.wikimedia.org/T113114) (owner: 10Ladsgroup) [13:29:23] (03PS1) 10Muehlenhoff: Add a new command to combine a deployment with a library restart check [debs/debdeploy] - 
https://gerrit.wikimedia.org/r/533911
[13:32:44] (CR) Vgutierrez: [C: +2] prometheus: Ship a custom metrics file for trafficserver_exporter [puppet] - https://gerrit.wikimedia.org/r/533178 (https://phabricator.wikimedia.org/T231533) (owner: Vgutierrez)
[13:33:00] (PS3) Vgutierrez: prometheus: Ship a custom metrics file for trafficserver_exporter [puppet] - https://gerrit.wikimedia.org/r/533178 (https://phabricator.wikimedia.org/T231533)
[13:40:07] !log @ helmfile [STAGING] Ran 'sync' command on namespace 'sessionstore' for release 'staging' .
[13:40:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:43:11] !log @ helmfile [STAGING] Ran 'sync' command on namespace 'sessionstore' for release 'staging' .
[13:43:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:44:21] !log resync the sessionstore staging release as there was wrong port mapping (port 8080 instead of 8081) for both netpol and service
[13:44:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:10:27] !log @ helmfile [STAGING] Ran 'apply' command on namespace 'sessionstore' for release 'staging' .
[14:10:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:11:25] (PS2) Giuseppe Lavagetto: maps: convert to profile::lvs::realserver [puppet] - https://gerrit.wikimedia.org/r/532665
[14:17:32] (CR) Giuseppe Lavagetto: [C: +2] "https://puppet-compiler.wmflabs.org/compiler1002/18139/" [puppet] - https://gerrit.wikimedia.org/r/532665 (owner: Giuseppe Lavagetto)
[14:21:28] !log @ helmfile [STAGING] Ran 'apply' command on namespace 'sessionstore' for release 'staging' .
[14:21:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:24:00] (PS2) Giuseppe Lavagetto: wdqs: convert to profile::lvs::realserver [puppet] - https://gerrit.wikimedia.org/r/532666
[14:24:43] !log @ helmfile [STAGING] Ran 'apply' command on namespace 'sessionstore' for release 'staging' .
[14:24:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:28:07] (PS1) Muehlenhoff: Revert "package_builder: Fix debhelper dependencies on stretch" [puppet] - https://gerrit.wikimedia.org/r/533918
[14:29:26] (CR) Muehlenhoff: [C: +2] Revert "package_builder: Fix debhelper dependencies on stretch" [puppet] - https://gerrit.wikimedia.org/r/533918 (owner: Muehlenhoff)
[14:31:06] (PS1) Muehlenhoff: Revert "package_builder: Install debhelper from backports" [puppet] - https://gerrit.wikimedia.org/r/533919
[14:31:27] (CR) Giuseppe Lavagetto: [C: +2] "https://puppet-compiler.wmflabs.org/compiler1002/18140/wdqs1003.eqiad.wmnet/" [puppet] - https://gerrit.wikimedia.org/r/532666 (owner: Giuseppe Lavagetto)
[14:32:15] (CR) Muehlenhoff: [C: +2] Revert "package_builder: Install debhelper from backports" [puppet] - https://gerrit.wikimedia.org/r/533919 (owner: Muehlenhoff)
[14:36:45] !log installing ghostscript updates on thumbor1001
[14:36:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:39:40] (PS1) Ema: varnish: fix labs static_host [puppet] - https://gerrit.wikimedia.org/r/533920
[14:41:11] (CR) Ema: [C: +2] varnish: fix labs static_host [puppet] - https://gerrit.wikimedia.org/r/533920 (owner: Ema)
[15:00:49] (PS3) Giuseppe Lavagetto: wdqs: convert to profile::lvs::realserver [puppet] - https://gerrit.wikimedia.org/r/532666
[15:08:03] !log installing libssh2 security updates
[15:08:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:10:03] (PS1) Alexandros Kosiaris: sessionstore: Bump
limits and requests [deployment-charts] - https://gerrit.wikimedia.org/r/533922 (https://phabricator.wikimedia.org/T229697)
[15:12:40] (CR) Krinkle: "The headers_sent() check has to do with how the file is executed in MediaWiki context. If an exception happens after MW has started produc" [puppet] - https://gerrit.wikimedia.org/r/511078 (https://phabricator.wikimedia.org/T113114) (owner: Ladsgroup)
[15:14:42] (CR) Krinkle: "Looking at it now, headers_sent() should always be true, thus !headers_sent() always false, because there is literal HTML output before th" [puppet] - https://gerrit.wikimedia.org/r/511078 (https://phabricator.wikimedia.org/T113114) (owner: Ladsgroup)
[15:14:45] (CR) Vgutierrez: [C: +2] prometheus: Add basic ATS network and ssl metrics [puppet] - https://gerrit.wikimedia.org/r/533193 (https://phabricator.wikimedia.org/T231533) (owner: Vgutierrez)
[15:14:55] (PS3) Vgutierrez: prometheus: Add basic ATS network and ssl metrics [puppet] - https://gerrit.wikimedia.org/r/533193 (https://phabricator.wikimedia.org/T231533)
[15:15:04] (PS2) Andrew Bogott: openstack configs: forward some mitaka updates to newton [puppet] - https://gerrit.wikimedia.org/r/533549
[15:15:06] (PS1) Andrew Bogott: glance: add Newton config files [puppet] - https://gerrit.wikimedia.org/r/533923 (https://phabricator.wikimedia.org/T212302)
[15:15:08] (PS1) Andrew Bogott: keystone: forward mitaka config to newton [puppet] - https://gerrit.wikimedia.org/r/533924 (https://phabricator.wikimedia.org/T212302)
[15:15:10] (PS1) Andrew Bogott: keystone: update policy.json for Newton [puppet] - https://gerrit.wikimedia.org/r/533925 (https://phabricator.wikimedia.org/T212302)
[15:15:12] (PS1) Andrew Bogott: Designate: add Newton config files and resources [puppet] - https://gerrit.wikimedia.org/r/533926 (https://phabricator.wikimedia.org/T212302)
[15:15:14] (PS1) Andrew Bogott: Openstack Neutron: added config files and
templates for version Newton [puppet] - https://gerrit.wikimedia.org/r/533927 (https://phabricator.wikimedia.org/T212302)
[15:16:51] (PS14) Mathew.onipe: lvs: allow access to wdqs lvs on port 8888 [puppet] - https://gerrit.wikimedia.org/r/529053 (https://phabricator.wikimedia.org/T176875)
[15:16:53] (PS1) Mathew.onipe: elasticsearch: add syslog logging option [puppet] - https://gerrit.wikimedia.org/r/533928 (https://phabricator.wikimedia.org/T225125)
[15:18:07] (CR) jerkins-bot: [V: -1] Openstack Neutron: added config files and templates for version Newton [puppet] - https://gerrit.wikimedia.org/r/533927 (https://phabricator.wikimedia.org/T212302) (owner: Andrew Bogott)
[15:18:36] (CR) Mathew.onipe: [C: -1] "We should rebase this on https://gerrit.wikimedia.org/r/c/operations/puppet/+/533928" (2 comments) [puppet] - https://gerrit.wikimedia.org/r/531922 (https://phabricator.wikimedia.org/T225125) (owner: Mathew.onipe)
[15:24:23] (CR) Alexandros Kosiaris: [C: -1] "> Looking at it now, headers_sent() should always be true, thus !headers_sent() always false, because there is literal HTML output before " [puppet] - https://gerrit.wikimedia.org/r/511078 (https://phabricator.wikimedia.org/T113114) (owner: Ladsgroup)
[15:25:17] (PS3) Andrew Bogott: openstack configs: forward some mitaka updates to newton [puppet] - https://gerrit.wikimedia.org/r/533549
[15:25:19] (PS2) Andrew Bogott: glance: add Newton config files [puppet] - https://gerrit.wikimedia.org/r/533923 (https://phabricator.wikimedia.org/T212302)
[15:25:21] (PS2) Andrew Bogott: keystone: forward mitaka config to newton [puppet] - https://gerrit.wikimedia.org/r/533924 (https://phabricator.wikimedia.org/T212302)
[15:25:23] (PS2) Andrew Bogott: keystone: update policy.json for Newton [puppet] - https://gerrit.wikimedia.org/r/533925 (https://phabricator.wikimedia.org/T212302)
[15:25:25] (PS2) Andrew Bogott: Designate: add Newton config files
and resources [puppet] - https://gerrit.wikimedia.org/r/533926 (https://phabricator.wikimedia.org/T212302)
[15:25:27] (PS2) Andrew Bogott: Openstack Neutron: added config files and templates for version Newton [puppet] - https://gerrit.wikimedia.org/r/533927 (https://phabricator.wikimedia.org/T212302)
[15:25:51] (PS1) Pmiazga: Bump MobileWebUIActionsTracking sampling rate to 1 percent [mediawiki-config] - https://gerrit.wikimedia.org/r/533930 (https://phabricator.wikimedia.org/T220016)
[15:27:13] (CR) Mathew.onipe: "changes are expected according to PCC: https://puppet-compiler.wmflabs.org/compiler1002/18141/" [puppet] - https://gerrit.wikimedia.org/r/533928 (https://phabricator.wikimedia.org/T225125) (owner: Mathew.onipe)
[15:35:42] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 73.33% of data above the critical threshold [140.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[15:42:28] (PS5) Lucas Werkmeister (WMDE): dologmsg: add manpage [puppet] - https://gerrit.wikimedia.org/r/513759 (https://phabricator.wikimedia.org/T222244)
[15:42:50] (PS1) Vgutierrez: prometheus: Fix trafficserver_response_classes_total metric [puppet] - https://gerrit.wikimedia.org/r/533935 (https://phabricator.wikimedia.org/T231533)
[15:48:52] (CR) Vgutierrez: [C: +2] prometheus: Fix trafficserver_response_classes_total metric [puppet] - https://gerrit.wikimedia.org/r/533935 (https://phabricator.wikimedia.org/T231533) (owner: Vgutierrez)
[15:54:34] (CR) DCausse: elasticsearch: add syslog logging option (1 comment) [puppet] - https://gerrit.wikimedia.org/r/533928 (https://phabricator.wikimedia.org/T225125) (owner: Mathew.onipe)
[16:08:23] (PS1) Ema: ATS: log Cookie in labs too [puppet] - https://gerrit.wikimedia.org/r/533938 (https://phabricator.wikimedia.org/T227432)
[16:13:59] (CR)
Nuria: [C: +1] Bump MobileWebUIActionsTracking sampling rate to 1 percent [mediawiki-config] - https://gerrit.wikimedia.org/r/533930 (https://phabricator.wikimedia.org/T220016) (owner: Pmiazga)
[16:24:23] (CR) Gehel: [C: -1] "see inline" (3 comments) [puppet] - https://gerrit.wikimedia.org/r/533928 (https://phabricator.wikimedia.org/T225125) (owner: Mathew.onipe)
[16:27:10] (CR) Mathew.onipe: elasticsearch: add syslog logging option (1 comment) [puppet] - https://gerrit.wikimedia.org/r/533928 (https://phabricator.wikimedia.org/T225125) (owner: Mathew.onipe)
[17:02:26] PROBLEM - Varnish traffic drop between 30min ago and now at eqsin on icinga1001 is CRITICAL: 58.86 le 60 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[17:05:38] RECOVERY - Varnish traffic drop between 30min ago and now at eqsin on icinga1001 is OK: (C)60 le (W)70 le 80.28 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[17:49:24] (PS12) MacFan4000: Set wgNoticeProjects for wikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/471663 (https://phabricator.wikimedia.org/T208694)
[17:51:26] (PS7) Krinkle: CommonSettings: Store mtime inside wmf-config cache file [mediawiki-config] - https://gerrit.wikimedia.org/r/528447 (https://phabricator.wikimedia.org/T217830)
[17:57:13] !log regenerating tiles from z0 to z9 in eqiad and codfw- T231691, T230511
[17:57:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:57:18] T230511: Lake missing on map for some zoom values - https://phabricator.wikimedia.org/T230511
[17:57:18] T231691: Lake Huron missing due to apparent OSM vandalism - https://phabricator.wikimedia.org/T231691
[18:02:43] (PS1) Mforns: analytics::refinery::job::data_purge.pp Add
skip-trash to timers [puppet] - https://gerrit.wikimedia.org/r/533955 (https://phabricator.wikimedia.org/T229436)
[18:05:39] (CR) Mforns: "I tested each one of the jobs after adding --skip-trash, and they look good!" [puppet] - https://gerrit.wikimedia.org/r/533955 (https://phabricator.wikimedia.org/T229436) (owner: Mforns)
[18:17:44] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[19:13:10] PROBLEM - CirrusSearch eqiad 95th percentile latency on graphite1004 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0] https://wikitech.wikimedia.org/wiki/Search%23Health/Activity_Monitoring https://grafana.wikimedia.org/dashboard/db/elasticsearch-percentiles?panelId=19&fullscreen&orgId=1&var-cluster=eqiad&var-smoothing=1
[19:31:16] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1004 is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [50.0] https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=2&fullscreen
[19:36:02] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1004 is OK: OK: Less than 70.00% above the threshold [25.0] https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=2&fullscreen
[19:37:22] PROBLEM - Nginx local proxy to apache on mw1288 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers
[19:38:50] RECOVERY - Nginx local proxy to apache on mw1288 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 591 bytes in 0.057 second response time https://wikitech.wikimedia.org/wiki/Application_servers
[19:49:32] PROBLEM - IPv6 ping to ulsfo on
ripe-atlas-ulsfo IPv6 is CRITICAL: CRITICAL - failed 39 probes of 451 (alerts on 35) - https://atlas.ripe.net/measurements/1791309/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[19:53:02] PROBLEM - CirrusSearch eqiad 95th percentile latency on graphite1004 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0] https://wikitech.wikimedia.org/wiki/Search%23Health/Activity_Monitoring https://grafana.wikimedia.org/dashboard/db/elasticsearch-percentiles?panelId=19&fullscreen&orgId=1&var-cluster=eqiad&var-smoothing=1
[19:54:44] !log restart elasticsearch_6@production-search-eqiad on elastic1027
[19:54:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:55:08] RECOVERY - IPv6 ping to ulsfo on ripe-atlas-ulsfo IPv6 is OK: OK - failed 24 probes of 451 (alerts on 35) - https://atlas.ripe.net/measurements/1791309/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[20:00:31] (PS1) MSantos: Make osm-pbf source private [puppet] - https://gerrit.wikimedia.org/r/533972 (https://phabricator.wikimedia.org/T231842)
[20:04:20] RECOVERY - CirrusSearch eqiad 95th percentile latency on graphite1004 is OK: OK: Less than 20.00% above the threshold [500.0] https://wikitech.wikimedia.org/wiki/Search%23Health/Activity_Monitoring https://grafana.wikimedia.org/dashboard/db/elasticsearch-percentiles?panelId=19&fullscreen&orgId=1&var-cluster=eqiad&var-smoothing=1
[20:15:43] (CR) Gehel: [C: +2] Make osm-pbf source private [puppet] - https://gerrit.wikimedia.org/r/533972 (https://phabricator.wikimedia.org/T231842) (owner: MSantos)
[20:25:03] (PS1) Gehel: maps: cleanup unused template [puppet] - https://gerrit.wikimedia.org/r/533974
[20:31:04] !log mbsantos@deploy1001 Started deploy [kartotherian/deploy@453ee8a]: Make osm-pbf source private (T231842)
[20:31:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:33:13] !log mbsantos@deploy1001
Finished deploy [kartotherian/deploy@453ee8a]: Make osm-pbf source private (T231842) (duration: 02m 09s)
[20:33:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:34:36] PROBLEM - CirrusSearch eqiad 95th percentile latency on graphite1004 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0] https://wikitech.wikimedia.org/wiki/Search%23Health/Activity_Monitoring https://grafana.wikimedia.org/dashboard/db/elasticsearch-percentiles?panelId=19&fullscreen&orgId=1&var-cluster=eqiad&var-smoothing=1
[20:43:12] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1004 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [50.0] https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=2&fullscreen
[20:46:06] PROBLEM - recommendation_api endpoints health on scb1004 is CRITICAL: /{domain}/v1/article/creation/translation/{source}{/seed} (article.creation.translation - bad seed) is CRITICAL: Test article.creation.translation - bad seed returned the unexpected status 500 (expecting: 404) https://wikitech.wikimedia.org/wiki/Services/Monitoring/recommendation_api
[20:47:46] RECOVERY - recommendation_api endpoints health on scb1004 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/recommendation_api
[20:48:08] recommendation apis are likely because something is generating extra search load...
[20:48:23] !log restart production-search-eqiad on elastic1027 again
[20:48:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:56:00] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1004 is OK: OK: Less than 70.00% above the threshold [25.0] https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=2&fullscreen
[21:08:42] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1004 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [50.0] https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=2&fullscreen
[22:08:02] !log ban elastic1027 from production-search-chi
[22:08:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:10:29] (CR) Awight: [C: +1] Set wgNoticeProjects for wikimedia (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/471663 (https://phabricator.wikimedia.org/T208694) (owner: MacFan4000)
[22:13:32] PROBLEM - recommendation_api endpoints health on scb2005 is CRITICAL: /{domain}/v1/article/creation/translation/{source}{/seed} (article.creation.translation - bad seed) is CRITICAL: Test article.creation.translation - bad seed returned the unexpected status 500 (expecting: 404) https://wikitech.wikimedia.org/wiki/Services/Monitoring/recommendation_api
[22:15:08] RECOVERY - recommendation_api endpoints health on scb2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/recommendation_api
[22:15:54] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1004 is OK: OK: Less than 70.00% above the threshold [25.0] https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=2&fullscreen
[22:35:56] PROBLEM - CirrusSearch eqiad 95th
percentile latency on graphite1004 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0] https://wikitech.wikimedia.org/wiki/Search%23Health/Activity_Monitoring https://grafana.wikimedia.org/dashboard/db/elasticsearch-percentiles?panelId=19&fullscreen&orgId=1&var-cluster=eqiad&var-smoothing=1
[22:56:38] RECOVERY - CirrusSearch eqiad 95th percentile latency on graphite1004 is OK: OK: Less than 20.00% above the threshold [500.0] https://wikitech.wikimedia.org/wiki/Search%23Health/Activity_Monitoring https://grafana.wikimedia.org/dashboard/db/elasticsearch-percentiles?panelId=19&fullscreen&orgId=1&var-cluster=eqiad&var-smoothing=1