[00:00:41] bd808, ToAruShiroiNeko: Depends on the type of wiki
[00:01:24] If you're after wikimania2014wiki, there's a ticket open for that - waiting for the local organisers to comment
[00:02:19] I am bothered by all closed wikis
[00:02:22] There is https://meta.wikimedia.org/wiki/Closing_projects_policy for the standard wikis
[00:02:30] I wish they remained semi-editable
[00:02:44] closed wikis are semi-editable
[00:02:49] very few people can edit them
[00:03:17] SUL unification, username renames, deletion of copyrighted content etc.
[00:03:43] They still run global user renames
[00:03:50] I am told username renames aren't possible
[00:03:56] local user renames sure
[00:04:05] they were locked before SUL
[00:06:30] ToAruShiroiNeko, and didn't ever have SUL finalisation?
[00:06:46] unfortunately no
[00:07:55] which wikis are these?
[00:07:56] legoktm, ^
[00:08:12] Krenair it's a long list
[00:08:35] huh?
[00:08:41] http://tools.wmflabs.org/meta/userpages/White+Cat
[00:08:41] ToAruShiroiNeko: what wikis?
[00:08:47] any page that isn't a redirect
[00:09:58] https://simple.wikibooks.org/wiki/Special:Log/Maintenance_script looks fine to me
[00:13:33] so how can I get the White Cat accounts merged to my current user?
[00:15:14] lol still hoping
[00:16:17] I imagine you'll have to wait for the user merge tool?
[00:19:19] I have been waiting for this for over four years :p
[00:19:28] I can wait four more if need be
[00:19:31] but not more :p
[00:21:41] nemo_bis it's in progress! legoktm is hard at work merging my accounts! :D
[00:32:21] (PS1) BBlack: VCL: remove fqdn comment line [puppet] - https://gerrit.wikimedia.org/r/228584
[00:32:24] (PS1) BBlack: VCL: remove restrict_access from text/upload backends [puppet] - https://gerrit.wikimedia.org/r/228585
[00:32:26] (PS1) BBlack: network::constants::all_networks(_lo)? via flatten() [puppet] - https://gerrit.wikimedia.org/r/228586
[00:32:28] (PS1) BBlack: VCL: use network::constants::all_networks_lo for ssl_proxies [puppet] - https://gerrit.wikimedia.org/r/228587
[00:32:30] (PS1) BBlack: VCL: remove unused probes "swift", "options" [puppet] - https://gerrit.wikimedia.org/r/228588
[00:32:32] (PS1) BBlack: VCL: define vcl_config "layer" for parsoidcache [puppet] - https://gerrit.wikimedia.org/r/228589
[00:32:34] (PS1) BBlack: vhtcpd: /etc/init/varnishhtcpd.conf is long-gone now [puppet] - https://gerrit.wikimedia.org/r/228590
[00:32:36] (PS1) BBlack: varnish: get rid of some pre-systemd cruft [puppet] - https://gerrit.wikimedia.org/r/228591
[01:22:17] PROBLEM - puppet last run on analytics1044 is CRITICAL Puppet last ran 6 hours ago
[01:55:26] really? the patch gets rejected because my registered email address has a capital locally to start rather than a lowercase on gerrit? /grumbles grumbles/
[01:57:05] Jamesofur: the username part of email addresses is case-sensitive :)
[01:57:29] * Jamesofur glares
[01:57:48] * Jamesofur also apparently has to unstage the commit to get it to realize he changed the email address
[01:57:59] Jamesofur: git commit --amend --reset-author
[01:58:08] :)
[01:59:37] (PS1) Jalexander: Replace ssh key for jamesur [puppet] - https://gerrit.wikimedia.org/r/228597
[02:20:02] !log l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 06m 11s)
[02:20:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:23:09] !log @tin LocalisationUpdate completed (1.26wmf16) at 2015-08-02 02:23:09+00:00
[02:23:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:33:48] PROBLEM - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
[02:34:16] hmm, site loads for me
[02:34:39] logged in and logged out
[02:34:43] what's going on
[02:35:48] oh
[02:35:50] ipv6
[02:35:53] Yeah.
[02:36:33] that's been flapping now and then
[02:38:07] RECOVERY - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 301 TLS Redirect - 497 bytes in 0.018 second response time
[02:54:57] PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
[02:56:57] RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 18558 bytes in 3.060 second response time
[03:36:17] PROBLEM - puppet last run on mw1050 is CRITICAL Puppet has 1 failures
[03:36:48] PROBLEM - puppet last run on mw1109 is CRITICAL Puppet has 1 failures
[03:37:16] PROBLEM - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
[03:38:17] blahhhhh
[03:38:29] stop alerting ipv6 my new alert tone is horrible
[03:39:36] Pagerduty is doing a good job of training me to ignore it
[03:43:06] RECOVERY - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 301 TLS Redirect - 497 bytes in 0.003 second response time
[03:46:12] andrewbogott: well, don't blame pagerduty
[03:46:15] it's really our fault
[03:46:17] it's our check ;D
[03:46:53] if you don't use the pagerduty app it's less annoying but meh, it's supposed to annoy us really.
[03:47:45] ALERT #86, #87 on ops-gmtminus. Replay 154: Ack all, 156: Resolv all
[03:48:01] It’s hard for me to take cryptic texts like that seriously. Does every page amount to ‘check your email’?
[03:48:07] And if so, maybe they should just say that :)
[03:48:11] nope, but i'm using the app
[03:48:23] i'll change over to the normal sms on monday and try to make them more useful
[03:48:33] ok, I’ll give the app a try tomorrow.
[03:48:50] In the meantime… should I actually try to fix that ipv6 flap? I have no idea where that is or what it means :)
[03:53:24] I think it's just an oversensitive check
[03:53:32] but it's been ongoing for a month+
[03:53:53] and now everyone is getting them and being annoyed.... i imagine we'll discuss in ops meeting now ;D
[03:55:13] yeah, maybe that counts as the system working :)
[03:58:59] I cleared a few earlier but count catch it in the act, not sure what to do ATM but family stuff going on here so time is limited
[03:59:13] Couldn't catch I mean
[04:00:32] yea when i hit the computer i happen to login to pagerduty dashboard and ack them for both 'zones' i hate the zones shit too
[04:00:39] so i'm trying out a competitor on monday
[04:00:54] similar feature set but unlimited # of contacts in a single page event
[04:01:04] unlike pagerduty's 10 (which leads to odd shit in config)
[04:01:29] cuz now when we ack it
[04:01:34] it doesn't ack it for the gmt+ folks
[04:01:35] it sucks.
[04:01:52] so i click on all teams in app and ack for both when i do it
[04:01:57] but it's annoying as hell.
[04:02:26] RECOVERY - puppet last run on mw1050 is OK Puppet is currently enabled, last run 36 seconds ago with 0 failures
[04:02:47] RECOVERY - puppet last run on mw1109 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:56:29] !log @tin ResourceLoader cache refresh completed at Sun Aug 2 04:56:29 UTC 2015 (duration 56m 28s)
[04:56:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[05:44:37] PROBLEM - HTTP error ratio anomaly detection on graphite1001 is CRITICAL Anomaly detected: 10 data above and 0 below the confidence bounds
[05:55:47] PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL 16.67% of data above the critical threshold [500.0]
[06:29:48] PROBLEM - puppet last run on mw1177 is CRITICAL puppet fail
[06:30:56] PROBLEM - puppet last run on db2044 is CRITICAL puppet fail
[06:30:57] PROBLEM - puppet last run on db1028 is CRITICAL puppet fail
[06:31:06] PROBLEM - puppet last run on mc2007 is CRITICAL Puppet has 1 failures
[06:31:17] PROBLEM - puppet last run on db1067 is CRITICAL puppet fail
[06:31:37] PROBLEM - puppet last run on mw1110 is CRITICAL Puppet has 2 failures
[06:31:48] PROBLEM - puppet last run on mw1158 is CRITICAL Puppet has 1 failures
[06:31:56] PROBLEM - puppet last run on mw2158 is CRITICAL Puppet has 1 failures
[06:32:57] PROBLEM - puppet last run on db1045 is CRITICAL Puppet has 1 failures
[06:32:58] PROBLEM - puppet last run on wtp2017 is CRITICAL Puppet has 1 failures
[06:32:58] PROBLEM - puppet last run on mw2045 is CRITICAL Puppet has 1 failures
[06:32:58] PROBLEM - puppet last run on mw2050 is CRITICAL Puppet has 1 failures
[06:33:07] PROBLEM - puppet last run on mw2016 is CRITICAL Puppet has 1 failures
[06:33:56] PROBLEM - puppet last run on mw2207 is CRITICAL Puppet has 1 failures
[06:33:57] PROBLEM - puppet last run on mw2018 is CRITICAL Puppet has 1 failures
[06:55:56] RECOVERY - puppet last run on mw1158 is OK Puppet is currently enabled, last run 13 seconds ago with 0 failures
[06:56:57] RECOVERY - puppet last run on db1045 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:56:58] RECOVERY - puppet last run on db1028 is OK Puppet is currently enabled, last run 25 seconds ago with 0 failures
[06:57:06] RECOVERY - puppet last run on db2044 is OK Puppet is currently enabled, last run 25 seconds ago with 0 failures
[06:57:07] RECOVERY - puppet last run on wtp2017 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:07] RECOVERY - puppet last run on mw2045 is OK Puppet is currently enabled, last run 13 seconds ago with 0 failures
[06:57:07] RECOVERY - puppet last run on mw2050 is OK Puppet is currently enabled, last run 18 seconds ago with 0 failures
[06:57:08] RECOVERY - puppet last run on mw2016 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:16] RECOVERY - puppet last run on mc2007 is OK Puppet is currently enabled, last run 55 seconds ago with 0 failures
[06:57:17] RECOVERY - puppet last run on db1067 is OK Puppet is currently enabled, last run 55 seconds ago with 0 failures
[06:57:37] RECOVERY - puppet last run on mw1110 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:57] RECOVERY - puppet last run on mw2207 is OK Puppet is currently enabled, last run 55 seconds ago with 0 failures
[06:57:57] RECOVERY - puppet last run on mw2158 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:57] RECOVERY - puppet last run on mw1177 is OK Puppet is currently enabled, last run 27 seconds ago with 0 failures
[06:58:06] RECOVERY - puppet last run on mw2018 is OK Puppet is currently enabled, last run 59 seconds ago with 0 failures
[06:58:17] RECOVERY - HTTP 5xx req/min on graphite1001 is OK Less than 1.00% above the threshold [250.0]
[07:19:57] PROBLEM - check if wikidata.org dispatch lag is higher than 2 minutes on wikidata is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1419 bytes in 0.306 second response time
[07:39:16] RECOVERY - HTTP error ratio anomaly detection on graphite1001 is OK No anomaly detected
[08:21:56] RECOVERY - check if wikidata.org dispatch lag is higher than 2 minutes on wikidata is OK: HTTP OK: HTTP/1.1 200 OK - 1413 bytes in 0.118 second response time
[08:40:29] (PS1) Legoktm: Set an explicit 'wgLanguageCode' entry for metawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/228618 (https://phabricator.wikimedia.org/T90612)
[08:41:32] (CR) Legoktm: "Needed by I53aa995d385b09bae41b210664b45143d7789861" [mediawiki-config] - https://gerrit.wikimedia.org/r/228618 (https://phabricator.wikimedia.org/T90612) (owner: Legoktm)
[10:10:27] operations, Commons, MediaWiki-File-management, MediaWiki-Tarball-Backports, and 7 others: InstantCommons broken by switch to HTTPS - https://phabricator.wikimedia.org/T102566#1501358 (Tau) I tried several maintenance scripts (purgeList, checkImages, rebuildImages etc.) but none of them helped. It'...
[10:16:15] operations, ops-ulsfo: RIPE Atlas Anchor @ ulsfo is down - https://phabricator.wikimedia.org/T107691#1501361 (faidon) NEW
[10:40:18] PROBLEM - puppet last run on cp4011 is CRITICAL puppet fail
[11:06:57] RECOVERY - puppet last run on cp4011 is OK Puppet is currently enabled, last run 23 seconds ago with 0 failures
[11:26:35] (PS1) Merlijn van Deen: [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646)
[11:27:36] (CR) jenkins-bot: [V: -1] [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646) (owner: Merlijn van Deen)
[11:31:07] (PS2) Merlijn van Deen: [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646)
[11:31:55] (CR) jenkins-bot: [V: -1] [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646) (owner: Merlijn van Deen)
[11:33:28] (PS3) Merlijn van Deen: [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646)
[11:34:09] (CR) jenkins-bot: [V: -1] [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646) (owner: Merlijn van Deen)
[11:39:02] (PS4) Merlijn van Deen: [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646)
[11:39:48] (CR) jenkins-bot: [V: -1] [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646) (owner: Merlijn van Deen)
[11:43:08] (PS5) Merlijn van Deen: [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646)
[11:43:49] (CR) jenkins-bot: [V: -1] [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646) (owner: Merlijn van Deen)
[11:54:22] bblack u there O_O?
[11:55:28] PROBLEM - Host text-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::1
[11:55:30] yeah somewhat
[11:55:35] why?
[11:55:50] Things are a little bit slow, I think...
[11:56:02] what do you mean?
[11:56:27] RECOVERY - Host text-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 88.52 ms
[11:57:11] yesterday i renamed, together with hoo, an account with 60000 edits (nothing happened); now i again have an account with 60000 edits to rename. Is it okay with you if i rename it now, or should i wait for him? <--bblack
[11:57:14] yeah something happened in the graphs, not sure what yet
[11:57:50] Steinsplitter: I have no idea what that really means in technical terms, but my advice would be if you have to ask, don't do it over the weekend.
[11:58:01] ok
[11:58:22] what's going on?
[11:58:29] <_joe_> bblack: hey
[11:58:36] looks kinda like the synflood the other day, but at esams?
[11:58:37] <_joe_> what paravoid said
[11:58:49] still staring at graphs, already mostly over I think
[11:59:12] http://ganglia.wikimedia.org/latest/graph.php?r=hour&z=xlarge&c=LVS+loadbalancers+esams&m=cpu_report&s=by+name&mc=2&g=network_report
[12:00:58] ( https://upload.wikimedia.org/wikipedia/commons/0/03/Server-kitty.jpg )
[12:05:27] !log started pybal on lvs3001
[12:05:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[12:06:03] (PS6) Merlijn van Deen: [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646)
[12:06:46] (CR) jenkins-bot: [V: -1] [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646) (owner: Merlijn van Deen)
[12:28:47] PROBLEM - Hadoop DataNode on analytics1043 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:29:17] PROBLEM - salt-minion processes on analytics1043 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:30:38] PROBLEM - dhclient process on analytics1043 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:30:38] PROBLEM - Hadoop NodeManager on analytics1043 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:37:05] (CR) Glaisher: "https://github.com/wikimedia/operations-mediawiki-config/blob/d2813e1b8ae7e9e35414a30b1cb68a56e4033f71/wmf-config/CommonSettings.php#L889 " [mediawiki-config] - https://gerrit.wikimedia.org/r/228618 (https://phabricator.wikimedia.org/T90612) (owner: Legoktm)
[12:38:36] (CR) Glaisher: "Also does wgConf->get() stuff not work for settings in CommonSettings?" [mediawiki-config] - https://gerrit.wikimedia.org/r/228618 (https://phabricator.wikimedia.org/T90612) (owner: Legoktm)
[12:55:45] is anyone doing something on the hadoop/analytics hosts? I will restart a couple of them otherwise
[13:00:50] operations, Ipv6: Fix IPv6 autoconf issues once and for all, across the fleet. - https://phabricator.wikimedia.org/T102099#1501465 (BBlack)
[13:06:28] ^dmesg is full of kernel bugs, shutdown/ps/etc does not work, will powercycle
[13:10:50] !log powercycling analytics1043: kernel issues
[13:10:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[13:12:57] PROBLEM - Host analytics1043 is DOWN: PING CRITICAL - Packet loss = 100%
[13:13:56] RECOVERY - Hadoop DataNode on analytics1043 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.hdfs.server.datanode.DataNode
[13:13:57] RECOVERY - Host analytics1043 is UP: PING OK - Packet loss = 0%, RTA = 0.32 ms
[13:14:27] RECOVERY - salt-minion processes on analytics1043 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[13:14:46] maybe got fried due to the recent logs, but when ps gets locked, not a good sign
[13:15:36] RECOVERY - dhclient process on analytics1043 is OK: PROCS OK: 0 processes with command name dhclient
[13:15:36] RECOVERY - Hadoop NodeManager on analytics1043 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[13:26:25] !log powercycling analytics1044: same kernel fatal issues as 1043
[13:26:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[13:28:56] PROBLEM - Host mw2027 is DOWN: PING CRITICAL - Packet loss = 100%
[13:29:36] PROBLEM - Host analytics1044 is DOWN: PING CRITICAL - Packet loss = 100%
[13:29:57] RECOVERY - Host mw2027 is UP: PING OK - Packet loss = 0%, RTA = 44.12 ms
[13:31:17] RECOVERY - Hadoop NodeManager on analytics1044 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[13:31:17] RECOVERY - Hadoop DataNode on analytics1044 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.hdfs.server.datanode.DataNode
[13:31:26] RECOVERY - Host analytics1044 is UP: PING OK - Packet loss = 0%, RTA = 1.30 ms
[13:32:27] RECOVERY - dhclient process on analytics1044 is OK: PROCS OK: 0 processes with command name dhclient
[13:32:27] RECOVERY - salt-minion processes on analytics1044 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[13:32:56] RECOVERY - puppet last run on analytics1044 is OK Puppet is currently enabled, last run 33 seconds ago with 0 failures
[13:39:06] PROBLEM - Outgoing network saturation on labstore1003 is CRITICAL 10.71% of data above the critical threshold [100000000.0]
[13:41:58] operations: kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756 for analytics1044 and analytics1043 - https://phabricator.wikimedia.org/T107698#1501534 (jcrespo) NEW
[13:43:30] ^reported it but it does not require immediate action
[14:24:46] RECOVERY - Outgoing network saturation on labstore1003 is OK Less than 10.00% above the threshold [75000000.0]
[14:48:09] (PS1) Faidon Liambotis: mail: bump MX's spamassassin max_children to 32 [puppet] - https://gerrit.wikimedia.org/r/228656
[14:48:32] (CR) Faidon Liambotis: [C: 2] mail: bump MX's spamassassin max_children to 32 [puppet] - https://gerrit.wikimedia.org/r/228656 (owner: Faidon Liambotis)
[15:05:11] (CR) Nemo bis: "Thanks for taking care of spam. :)" [puppet] - https://gerrit.wikimedia.org/r/228656 (owner: Faidon Liambotis)
[15:46:52] https://zu.wikipedia.org/static/images/project-logos/default.png https://zu.wikipedia.org/w/static/images/project-logos/default.png
[15:47:05] can the cache be cleared from that?
[15:47:19] looks like it's causing some users to be served the foundation logo on wikipedias
[15:47:24] bblack: ^
[15:49:27] also, can you explain why some users are being served from /w/static while others from /static
[15:53:01] operations, Varnish: Figure out purging of static logos for updates - https://phabricator.wikimedia.org/T106620#1501604 (Glaisher) https://zu.wikipedia.org/static/images/project-logos/default.png https://zu.wikipedia.org/w/static/images/project-logos/default.png can the cache be cleare...
[15:54:33] operations, Varnish: Figure out purging of static logos for updates - https://phabricator.wikimedia.org/T106620#1501606 (Glaisher) p:Triage>High Changing to high because users shouldn't be seeing foundation logo on Wikipedias. Also why is it set to expire on 2016? That seems a bit lengthy.
[15:56:29] PROBLEM - puppet last run on mw2202 is CRITICAL Puppet has 1 failures
[16:09:15] enwiki job queue is all enqueue jobs... yay :/
[16:22:47] RECOVERY - puppet last run on mw2202 is OK Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:13:45] Glaisher: I don't have good explanations for anything off the top of my head
[17:14:01] was there a change to the logo image and/or path?
[17:14:17] (is the cache actually different from what MW serves?)
[17:14:57] (is this an all-wikis problem, or just "zu"?)
[17:16:15] operations: Configure librenms to use LDAP for authentication - https://phabricator.wikimedia.org/T107702#1501635 (ori) NEW
[17:17:05] bblack: Looks like there was a change to the logo recently.
[17:17:31] do you know what the change was? like a gerrit link or something?
[17:17:58] https://github.com/wikimedia/operations-mediawiki-config/commit/05d2bd0a6edd4a224ce72305a14c0683232c1d7f
[17:19:02] hmm.. so the wikipedia logo (the correct one) is actually the old one
[17:19:24] which wiki was this?
[17:19:36] zu.wikipedia.org
[17:19:41] See #wikipedia's scrollback
[17:20:18] wtf, why are they pointing to default.png?
[17:20:48] Krenair: they don't define anything more specific, I think
[17:21:18] I've gotta run, I have a lunch appt to make
[17:21:20] ahh
[17:21:22] I'll check back in later, though :)
[17:21:30] bblack: we can sort it out :) take care
[17:21:39] one moment
[17:21:42] Krenair: or possibly they do, but not the HD variant
[17:23:32] my fault Glaisher, fixing
[17:24:21] (PS1) Alex Monk: Default wikipedias to enwiki.png [mediawiki-config] - https://gerrit.wikimedia.org/r/228676
[17:24:23] but why are /w/static and /static different?
[17:24:31] one is cached and one isn't
[17:24:40] the cached one is out of date and we don't know how to fix it
[17:25:11] (CR) Glaisher: [C: 1] Default wikipedias to enwiki.png [mediawiki-config] - https://gerrit.wikimedia.org/r/228676 (owner: Alex Monk)
[17:25:12] wait, we're using uncached URLs for the project logo now?
[17:25:32] (CR) Ori.livneh: [C: 1] "Go ahead and deploy." [mediawiki-config] - https://gerrit.wikimedia.org/r/228676 (owner: Alex Monk)
[17:25:54] no
[17:26:07] Krenair: actually, do you mind if I sync that?
[17:26:11] we're still using /static
[17:26:12] I want to test a change to my deployment helper scripts
[17:26:31] sure
[17:26:39] ok. Glaisher, gimme 5 mins.
[17:27:31] (CR) Ori.livneh: [C: 2] Default wikipedias to enwiki.png [mediawiki-config] - https://gerrit.wikimedia.org/r/228676 (owner: Alex Monk)
[17:27:37] (Merged) jenkins-bot: Default wikipedias to enwiki.png [mediawiki-config] - https://gerrit.wikimedia.org/r/228676 (owner: Alex Monk)
[17:28:52] (CR) Alex Monk: "I think this was affecting about 23 wikis. Mentioned by Glaisher on T106620" [mediawiki-config] - https://gerrit.wikimedia.org/r/228676 (owner: Alex Monk)
[17:29:44] operations, Varnish: Figure out purging of static logos for updates - https://phabricator.wikimedia.org/T106620#1501647 (Krenair) That issue with people seeing the foundation logo was separate to the purpose of this bug, see https://gerrit.wikimedia.org/r/#/c/228676/
[17:31:24] Krenair: we actually have a specific logo for zuwiki
[17:32:19] we do?
[17:32:21] Can we have a list of wikis with the specific $lang$site.png but not in InitialiseSettings.php?
[17:32:38] oh, yeah
[17:32:39] https://github.com/wikimedia/operations-mediawiki-config/blob/master/w/static/images/project-logos/zuwiki.png
[17:32:48] but english :p
[17:32:51] Yes
[17:32:57] It's a copy of enwiki.png
[17:33:18] Would've been a copy of default.png when it was added. I don't know why we ended up with such a mess in that directory
[17:34:23] wikis without specifically configured logos just got the one they had inherited from their project (or the default, enwiki's one) downloaded
[17:35:01] even though it was unused
[17:35:37] mhm..
[17:36:08] why were some users being served the outdated cached one while others the new one?
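(A minimal sketch of the list asked for at 17:32:21, not something that was run in the channel. It assumes the config refers to each logo by a path ending in "/<file name>"; the function names are hypothetical, and the directory and config paths in the usage note are only taken from the URLs mentioned above.)

```python
from pathlib import Path


def unreferenced_logos(logo_names, settings_text):
    """Return, sorted, the logo file names that settings_text never references.

    Assumption: the config mentions a logo as ".../<file name>", so we search
    for "/<file name>" to avoid matching one name inside a longer one.
    """
    return sorted(name for name in logo_names if f"/{name}" not in settings_text)


def logo_names_in(directory):
    """List the *.png file names directly inside directory."""
    return [path.name for path in Path(directory).glob("*.png")]
```

(Usage would be roughly: read wmf-config/InitialiseSettings.php as text, call logo_names_in() on the w/static/images/project-logos directory of a mediawiki-config checkout, and pass both to unreferenced_logos(). A plain text search like this can miss logos referenced indirectly, so the result is a starting point for review rather than a definitive list.)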
[17:47:16] (CR) Alex Monk: "Returns null for me" [mediawiki-config] - https://gerrit.wikimedia.org/r/228618 (https://phabricator.wikimedia.org/T90612) (owner: Legoktm)
[17:52:17] !log ori Synchronized wmf-config/InitialiseSettings.php: If7fcb6e6: Default wikipedias to enwiki.png (duration: 00m 12s)
[17:52:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[17:52:41] woooooo
[17:53:13] https://dpaste.de/LkKp/raw
[17:54:51] the script queried the API for recently-merged changes in wmf/* branches in mediawiki/core and in master of operations/mediawiki-config, checked which ones had not been merged locally, computed the most concise invocation of either sync-dir/sync-file, and generated the log message
[17:56:19] https://gist.github.com/atdt/f1befbf448100a339ee0
[17:57:28] That sounds handy... and also very dangerous (given it doesn't ask for confirmation)
[17:57:33] heh
[17:57:37] yes, I wouldn't run that
[17:57:41] i need to add some safety features (prompting for confirmation, handling failures) and handling of submodules
[17:58:08] it's not done yet :P
[17:59:00] ori, how can we clear the varnish cache of static images which have changed?
[18:01:07] ideally, we don't; we wait for the change to propagate
[18:01:29] yeah
[18:01:34] and if not... poke Brandon
[18:01:46] the "holiday logo" use-case can be satisfied with MediaWiki:Common.css
[18:02:18] permanent updates to logos are sufficiently infrequent that they can require either waiting for cache propagation or asking someone in ops to ban the url
[18:02:19] but it IMO shouldn't
[18:02:52] (That's because I have general concerns with that kind of CSS)
[18:03:35] Krenair: in this case, if the wrong logo is cached (or possibly cached), ping b.black with the URL
[18:04:36] */static/images/project-logos/default.png ?
[18:04:48] (CR) Hoo man: "@Lokal Profil: Can you please bring your github repo up to date or do you want to do the primary development here now?" [puppet] - https://gerrit.wikimedia.org/r/219800 (https://phabricator.wikimedia.org/T103087) (owner: Lokal Profil)
[18:05:55] Krenair: how many wikis does '*' represent in this case?
[18:06:33] I don't think I actually should have needed default.png purged
[18:07:05] although clearly it did cause an issue on a few wikis due to a mistake
[18:07:24] but we should be able to purge a logo file everywhere
[18:34:37] PROBLEM - HTTP error ratio anomaly detection on graphite1001 is CRITICAL Anomaly detected: 10 data above and 0 below the confidence bounds
[19:00:37] PROBLEM - puppet last run on cp3048 is CRITICAL puppet fail
[19:26:47] RECOVERY - puppet last run on cp3048 is OK Puppet is currently enabled, last run 30 seconds ago with 0 failures
[19:27:45] (PS1) Tim Landscheidt: Ignore warnings about URLs without modules for volatile directory [puppet] - https://gerrit.wikimedia.org/r/228682 (https://phabricator.wikimedia.org/T87132)
[19:28:50] (CR) Tim Landscheidt: "If my description of the volatile directory is wrong, please amend." [puppet] - https://gerrit.wikimedia.org/r/228682 (https://phabricator.wikimedia.org/T87132) (owner: Tim Landscheidt)
[19:41:03] operations, Varnish: Figure out purging of static logos for updates - https://phabricator.wikimedia.org/T106620#1501729 (BBlack) >>! In T106620#1501606, @Glaisher wrote: > Changing to high because users shouldn't be seeing foundation logo on Wikipedias. Also why is it set to expire on 2016? That seems a bi...
[19:43:04] some crazy things going on with reqerr rates the past 24h, two little lumps, one ongoing
[19:43:10] they're not very big, just... odd
[19:43:11] https://gdash.wikimedia.org/dashboards/reqerror/
[19:52:29] (PS1) Tim Landscheidt: haproxy: Move check_haproxy to module itself [puppet] - https://gerrit.wikimedia.org/r/228712 (https://phabricator.wikimedia.org/T87132)
[20:15:56] RECOVERY - HTTP error ratio anomaly detection on graphite1001 is OK No anomaly detected
[20:21:35] (PS7) Merlijn van Deen: [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646)
[20:22:18] (CR) jenkins-bot: [V: -1] [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646) (owner: Merlijn van Deen)
[20:24:36] (PS8) Merlijn van Deen: [toollabs] add script to generate python package listings [puppet] - https://gerrit.wikimedia.org/r/228635 (https://phabricator.wikimedia.org/T101646)
[20:31:57] PROBLEM - puppet last run on cp3036 is CRITICAL puppet fail
[20:58:17] RECOVERY - puppet last run on cp3036 is OK Puppet is currently enabled, last run 31 seconds ago with 0 failures
[20:59:37] (CR) Legoktm: "> Also does wgConf->get() stuff not work for settings in CommonSettings?" [mediawiki-config] - https://gerrit.wikimedia.org/r/228618 (https://phabricator.wikimedia.org/T90612) (owner: Legoktm)
[21:26:24] (CR) Alex Monk: "https://phabricator.wikimedia.org/P1812" [mediawiki-config] - https://gerrit.wikimedia.org/r/228618 (https://phabricator.wikimedia.org/T90612) (owner: Legoktm)
[21:31:14] (PS2) Alex Monk: Fix part of the VE NS config issue [mediawiki-config] - https://gerrit.wikimedia.org/r/228198 (https://phabricator.wikimedia.org/T104898)
[22:31:36] PROBLEM - check if wikidata.org dispatch lag is higher than 2 minutes on wikidata is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1426 bytes in 0.194 second response time
[22:43:27] hoo: you were working on the patch to rsync wikidata json dumps to labs, right?
[22:44:01] Addshore was primarily, but I have been doing CR so yes
[22:44:42] hoo: link to the patch?
[22:44:45] (just curious)
[22:45:11] https://gerrit.wikimedia.org/r/215585
[22:45:24] I'm working on the above alert, btw
[23:02:38] RECOVERY - check if wikidata.org dispatch lag is higher than 2 minutes on wikidata is OK: HTTP OK: HTTP/1.1 200 OK - 1418 bytes in 0.184 second response time
[23:04:51] hoo: thanks
[23:06:10] operations, Commons, MediaWiki-File-management, MediaWiki-Tarball-Backports, and 7 others: InstantCommons broken by switch to HTTPS - https://phabricator.wikimedia.org/T102566#1502005 (Tgr) You could try to cherry-pick https://gerrit.wikimedia.org/r/#/c/223518/ and set `$wgDebugLogGroups['http'] =...
[23:52:26] PROBLEM - puppet last run on mw1047 is CRITICAL Puppet has 1 failures
[23:59:14] (CR) Alex Monk: [C: 2] Fix part of the VE NS config issue [mediawiki-config] - https://gerrit.wikimedia.org/r/228198 (https://phabricator.wikimedia.org/T104898) (owner: Alex Monk)
[23:59:38] (Merged) jenkins-bot: Fix part of the VE NS config issue [mediawiki-config] - https://gerrit.wikimedia.org/r/228198 (https://phabricator.wikimedia.org/T104898) (owner: Alex Monk)