[02:49:25] enwiki seems to be running extremely slowly; others have reported it too
[02:49:56] even simple things like opening Special:Delete (not actually doing the delete, but just going to that page)
[02:52:50] Shirik: are pages loading slowly or bits.wm.o?
[02:52:54] ori-l: ^
[02:56:10] it seems to be claiming it's the pages but I'm double-checking now
[02:57:36] here's the list of biggest offenders: http://gyazo.com/f04d1939254a57b0d5ca3aa7783eaa5a
[03:09:53] urgh
[03:09:56] so fucking slow
[03:11:29] Shirik: where are you connecting from?
[03:11:41] just outside of Portland, OR
[03:12:05] weird, doesn't seem to be location-specific
[03:18:31] Have there been any recent changes to the API or to the proxies in front of the API?
[03:25:43] Hi Cobi. What's your real question? :-)
[05:09:59] yeah, something is fucked.
[05:10:59] Shirik: can you traceroute bits.wikimedia.org and pastebin it somewhere?
[05:20:05] closedmouth: could you?
[05:21:07] ori-l: Wizardman is having troubles too
[05:21:20] Wizardman: can you traceroute bits.wikimedia.org?
[05:22:49] ok; doing so now
[05:23:20] thank you. you can use paste.debian.org
[05:27:15] tracert works for me over IPv6
[05:27:44] also over IPv43
[05:27:48] what seems to be broken? i.e. enwiki works for me
[05:27:54] Jasper_Deng: 43??
[05:27:56] :P
[05:27:57] ran it twice, only took a few hops and no timeouts, so it checks out for me.
[05:28:00] IPv4*
[05:28:31] meanwhile the page i clicked is still trying to load, yet the trace is fine
[05:29:04] Wizardman: can you use your browser's web tools to get the response headers from one of the slow requests, if you know how?
[05:29:17] and also just to see which are slow/blocked
[05:29:37] i could try. on this kind of stuff i'm illiterate though
[05:30:00] browser?
[05:30:01] OS?
[05:30:48] Wizardman?
[05:31:30] using firefox, just updated to the latest version so maybe there's a problem there
[05:31:39] i'll try chrome/ie
[05:32:21] sure, would be good to know if it's different in a different browser
[05:32:30] windows?
[05:32:37] in firefox you can do ctrl-shift-k
[05:32:41] windows7, yeah
[05:32:44] to open up the "web console"
[05:33:10] and chrome already loaded the page, as did IE
[05:33:14] and there you will see status codes and total time per request
[05:33:20] so, back to firefox
[05:34:52] i haven't upgraded firefox yet
[05:35:03] loading another page now to check. chrome loaded it already, ie/fox way behind but starting to. the ctrl+shift+k shows, well, a LOT of stuff
[05:35:42] ori-l: Yeah sure
[05:35:46] sorry I just saw your message now
[05:35:51] Wizardman: right. you could just copy/paste the whole thing to paste.debian.org
[05:36:37] https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Horrible_slowdown btw
[05:36:50] ori-l: http://pastebin.com/ppb89XDG
[05:37:05] (running ping statistics now)
[05:37:25] yeah it's so slow that i can't even get to that thread..
[05:38:31] ori-l: http://p.defau.lt/?EL46Ur_WvvezllAoToE3yg
[05:38:39] both ulsfo
[05:39:23] apparently the localization is the issue? [00:37:37.258] GET https://en.wikipedia.org/w/api.php?action=ulslocalization&language=en [HTTP/1.1 200 OK 18736ms]
[05:40:09] hah
[05:40:22] that link loaded instantly for me, but lol ULS
[05:40:46] bits-lb.ulsfo and text-lb.ulsfo for me..
[05:40:58] legoktm: that was a 300 sec cache according to the bug iirc
[05:41:07] >.>
[05:41:09] that's a rather clunky API request...
[05:41:16] yes, it should die in a fire
[05:41:20] yeah, age 43
[05:41:44] max-age and s-maxage are 300
[05:41:51] * legoktm facepalms
[05:41:53] ori-l: i thought we were changing to a month?
[05:42:10] couldn't deploy because tests were failing
[05:42:17] Age: 151 on bits for me..
[05:42:19] what tests?
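The cache numbers quoted above (max-age and s-maxage of 300, an Age of 43) can be turned into a remaining-freshness figure. The following is only a rough sketch: the header string is an assumed example rather than a captured response, and `parse_cache_control` and `remaining_freshness` are hypothetical helper names, not part of MediaWiki or Varnish.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. "public") map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def remaining_freshness(cache_control, age, shared=True):
    """Seconds a cached response stays fresh, or 0 if it is already stale.

    A shared cache prefers s-maxage over max-age when both are present.
    """
    directives = parse_cache_control(cache_control)
    lifetime = None
    if shared and directives.get("s-maxage") is not None:
        lifetime = int(directives["s-maxage"])
    elif directives.get("max-age") is not None:
        lifetime = int(directives["max-age"])
    if lifetime is None:
        return 0
    return max(0, lifetime - int(age))


# Assumed header value, built from the figures mentioned in the log.
header = "public, s-maxage=300, max-age=300"
print(remaining_freshness(header, 43))   # 257
print(remaining_freshness(header, 400))  # 0
```

With a 300-second lifetime, a response that is fast once will be re-fetched within five minutes, so a single good page load says little about whether the underlying problem is gone.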
[05:42:23] ULS
[05:42:29] i don't know the details, they just asked me to hold off
[05:42:39] ori-l: :-(
[05:42:59] ori-l: anyway, what now...
[05:43:26] do we really think/know that ULS is the culprit for the latest issue?
[05:43:35] no
[05:43:36] maybe other affected users can confirm/refute?
[05:43:44] so far all the reports point at ULSFO
[05:43:48] trying to recreate the ULS thing on my end and waiting for the random article to push through
[05:43:53] why is this... ahh, ulsfo
[05:43:56] the ULS / ULSFO distinction is wonderfully confusing
[05:44:14] ori-l: we could rename the vendor
[05:44:21] for those keeping score: ULS = UniversalLanguageSelector, a MediaWiki extension; ULSFO = a data center in San Francisco
[05:44:32] all my bits are over 100 years old..
[05:44:46] cometstyles: huh?
[05:45:30] and all of a sudden i can load pages again. will wait a few to make sure it's not temporary
[05:46:25] Wizardman: well it should be cached for you locally for <5 mins. and then maybe break again
[05:46:36] it has not just the max-age but also expires
[05:46:47] to be doubly sure of a short life
[05:46:56] maybe. trying to load watchlist now and it's getting hung up. false hope
[05:47:22] i guess i have to set ulsfo in my hosts file
[05:48:27] i got a very slow bits request as well
[05:48:59] https://dpaste.de/zdh0/raw/
[05:50:58] the time-to-first byte graphs on ganglia don't indicate anything unusual, and that is consistent with what i saw: 451ms waiting for resp., then 17.32s to receive it
[05:52:43] If we're talking about slowness, I noticed major slowness on enwiki and Commons all day today in the office.
[05:52:50] I just assumed it was our shitty WiFi...
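The 05:50:58 observation separates two phases of a request: time spent waiting for the first byte, and time spent receiving the body. A minimal sketch of measuring the two separately with the Python standard library might look like the following; `time_request` and `dominant_phase` are hypothetical names, and the wait figure here also includes DNS lookup and connection setup, which a browser console would report separately.

```python
import time
import urllib.request


def time_request(url, chunk_size=64 * 1024):
    """Return (seconds until the first body byte, seconds receiving the rest)."""
    start = time.monotonic()
    with urllib.request.urlopen(url) as response:
        response.read(1)  # blocks until the first body byte arrives
        first_byte = time.monotonic()
        while response.read(chunk_size):
            pass  # drain the remainder of the body
        done = time.monotonic()
    return first_byte - start, done - first_byte


def dominant_phase(wait_seconds, receive_seconds):
    """Name the phase that accounts for most of the request time."""
    return "waiting" if wait_seconds >= receive_seconds else "receiving"


# Figures quoted in the log: 451 ms waiting, then 17.32 s receiving.
print(dominant_phase(0.451, 17.32))  # receiving
```

A request dominated by the receive phase, as here, is consistent with normal time-to-first-byte graphs on the server side, since the servers answer promptly and the delay accrues while the body is in transit.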
[05:53:26] jeremyb: about 50% of the time, https://dpaste.de/wA9n/raw/ chokes for me
[05:53:42] i start getting the response quickly in all cases but then it comes in staggered chunks on about half of my requests
[05:53:43] http://pastebin.com/CZisPhQT
[05:57:07] traceroute using WinSuperKit (a bit jumbled up) http://pastebin.com/mfXTyiUi
[06:01:25] huh, found a bug and i can patch it myself :)
[06:01:54] (not a fix)
[06:02:00] for these reports
[06:02:26] go on
[06:02:30] what is it?
[06:02:40] rdns is missing for some of ulsfo
[06:02:49] but forward is there
[06:11:20] Elsie: That was my real question. If you want more context, https://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard#User:ClueBot_NG is a report of a bot of mine randomly injecting the TFA into user talk pages. The content of the relevant (local-scoped) variable is populated (fairly directly) from the response of the bot attempting to use the api to grab the latest contents of the user talk page. It is using curl to h
[06:11:40] Cobi: you've been truncated
[06:11:47] At what point?
[06:11:51] of the user talk page. It is using curl to h
[06:12:16] It is using curl to hit the API. But it is using keep-alives and making multiple requests potentially in parallel to curl. And it does periodically ask for the contents of the TFA, but not in the context that it is getting the result.
[06:12:21] It was really a hunch that perhaps something changed on the -tech side since there were no bot code changes that brought on this malfunctioning.
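Whatever the cause of the symptom Cobi describes, a bot can defend against acting on a response that belongs to a different request by checking that the returned page is the one it asked for. The sketch below assumes the default JSON layout of an `action=query` response, where `query.pages` is an object keyed by page ID and `query.normalized` lists title normalizations; `expected_title` and `response_matches_request` are hypothetical helper names, not ClueBot NG code.

```python
def expected_title(requested, response):
    """Apply the API's own 'normalized' mapping to the requested title, if any."""
    for entry in response.get("query", {}).get("normalized", []):
        if entry.get("from") == requested:
            return entry.get("to")
    return requested


def response_matches_request(requested, response):
    """True if the response contains a page whose title is the one we asked for."""
    wanted = expected_title(requested, response)
    pages = response.get("query", {}).get("pages", {})
    return any(page.get("title") == wanted for page in pages.values())


# Illustrative responses only; the titles and page IDs are made up.
good = {
    "query": {
        "normalized": [{"from": "User_talk:Example", "to": "User talk:Example"}],
        "pages": {"101": {"title": "User talk:Example"}},
    }
}
crossed = {"query": {"pages": {"202": {"title": "Some featured article"}}}}

print(response_matches_request("User_talk:Example", good))     # True
print(response_matches_request("User_talk:Example", crossed))  # False
```

A check like this does not explain why a response might be crossed, but it would stop the bot from writing the wrong page's content to a talk page while the cause is investigated.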
[06:15:29] Furthermore there is a similar issue that appears to be related, and perhaps caused by the same root cause -- it is selecting the wrong username, but not wrong revision, when it issues rollbacks -- https://en.wikipedia.org/w/index.php?title=Walter_Tirel&diff=580474198&oldid=580474186
[06:15:39] Note that it says: Reverting possible vandalism by 86.134.209.177 to version by Bencherlite
[06:15:59] But the revision it was reverting to was made by Nortonius.
[06:16:32] This, too, is populated directly from an API call.
[06:17:00] And has been working correctly for years.
[06:17:05] Cobi: AFAIK, nothing API side has been changed, and all other bots are working fine.
[06:17:20] if the API was returning bad data, my bot would be fucked up too
[06:18:42] legoktm: When I hit the api manually in a browser, it returns correctly. This seems intermittent, and could be due to mis-caching or mis-handling of concurrent requests on a single keep-alive session.
[06:19:20] https://github.com/wikimedia/mediawiki-core/commits/master/includes/api <-- I don't see anything recent that could affect it.
[06:19:47] I was actually more concerned about the proxies in front of the app servers.
[06:20:09] Oh.
[06:20:21] API isn't cached I thought.
[06:20:29] Well, it shouldn't be.
[06:20:39] But, it still is routed through their proxy network.
[06:20:50] But they should just be passing it straight through.
[06:21:48] Right. The relevant people are in #wikimedia-operations, but i think they're hunting down the ulsfo-slowness issue right now
[06:21:57] we're here too
[06:22:09] * legoktm huggles ori-l
[06:22:27] Gay.
[06:22:37] Can I still blame brion? :P
[06:22:52] legoktm: :)
[06:26:44] Bits having issues and the API having issues?
[06:27:14] i'm not convinced it's the API
[06:27:16] And both are intermittent?
[06:27:24] Hmm.
[06:27:28] bits is 100% reproducible for me now
[06:27:38] Interesting.
[06:28:36] is bits the only thing that's hosted at ulsfo?
[06:28:53] yes
[06:29:28] is that b/c moving the content sites to that place led to disaster a week or two ago?
[06:31:03] FYI: 14:33 mark: dist-upgrade && reboot on amssq47..62
[06:31:16] That happened around the same time as I started getting reports of oddness.
[06:31:26] ori-l: erm? you sure?
[06:32:00] no it's not
[06:32:30] What else is hosted at ulsfo?
[06:32:32] Cobi: wrong DC
[06:33:03] everything
[06:33:06] as in, all varnish clusters
[06:33:10] text, bits, upload, mobile
[06:33:33] anyway, back to debugging
[06:34:10] paravoid: it'd be very instructive if you mention here what things you're looking at and ruling out
[06:34:38] i'm pretty dumbfounded at the moment :/
[06:34:57] I'm looking at varnishhist and it looks fine
[06:35:26] network-wise, i don't see issues so far (no packet loss in investigative pings, nothing on routers' logs)
[06:35:32] the site is more or less completely unusable for me atm
[06:35:38] that bad?
[06:36:12] same
[06:36:21] just won't load anything
[06:36:28] just bits?
[06:36:42] yep
[06:36:55] and not all requests
[06:37:07] actually, that's suspicious
[06:37:55] no, never mind. https://bits.wikimedia.org/static-1.23wmf1/skins/common/images/poweredby_mediawiki_88x31.png was fast on the last page load, but curling it is slow, so just a fluke
[06:42:09] ok, definitely something's wrong with the network
[06:43:58] hey guys, any updates on when the current lag issue may go away?
[06:45:49] paravoid: what are you seeing?
[06:46:23] Piotrus: no, just doing some digging for now
[06:46:53] ok, good luck
[06:48:10] ori-l: packet loss, 15-25%
[06:48:55] on the transit, not on the link to eqiad
[06:51:24] I switched DNS back to eqiad for all of NA a few minutes ago
[06:51:37] about 5' to be exact
[06:51:47] i'm getting ~5 % on some HE hops and no loss for the last few hops
[06:51:48] TTL is 10'
[06:54:04] looks like the loss is mostly or entirely v6.
but i go through different carriers for the different IP versions
[06:54:19] nope
[06:55:06] well for me...
[06:55:10] with mtr
[07:02:57] Which DC is wmflabs running in? And which DC would it be hitting for enwp?
[07:03:29] labs is in pmtpa
[07:03:40] it should hit eqiad i think.
[07:03:43] wmflabs is in pmtpa, for enwp you'd get a different DC based on some criteria (usually: nearest to your location)
[07:03:43] how are things looking now?
[07:04:12] paravoid: wmflabs -> enwp.
[07:04:25] ah, sorry
[07:04:30] pmtpa -> eqiad
[11:06:53] [[Tech]]; Billinghurst; /* Fixing MediaWiki:Gadget-SBHandler.js to load earlier */ new section; https://meta.wikimedia.org/w/index.php?diff=6288657&oldid=6253890&rcid=4655398
[12:40:04] hi guys, I cannot edit Meta, there is this error: "Your edit has been rejected because your client mangled the punctuation characters in the edit token. The edit has been rejected to prevent corruption of the page text. This sometimes happens when you are using a buggy web-based anonymous proxy service."
[12:40:13] but I am not using a proxy at all, maybe my provider is but I don't have access to it
[14:15:17] KuboF: what page? what are you trying to change?
[14:15:50] any page - main namespace, user talk, grant talk...
[14:16:04] ok, give me a specific example
[14:16:07] what browser?
[14:16:08] what OS?
[14:16:18] https://meta.wikimedia.org/wiki/Grants_talk:Wikimedia_Slovakia/Start-up/Report/Past_expenses
[14:16:42] Firefox 24, Kubuntu 13.04
[14:17:09] no problem on other Wikimedia wikis
[14:19:14] jeremyb: any idea?
[14:19:35] maybe ip-block-protection? I don't know
[14:21:32] KuboF: https://translatewiki.net/wiki/MediaWiki:Token_suffix_mismatch/qqq
[14:21:39] KuboF: https://translatewiki.net/wiki/MediaWiki:Token_suffix_mismatch/en
[14:22:55] jeremyb: yes, this is the message, I am not using a proxy on my PC, but I have a shared internet connection with limited access to the router
[14:23:14] KuboF: this is with HTTPS or HTTP?
[14:23:59] https, as default on WMF wikis
[14:42:17] hi! it's been almost two months since [Special:WantedCategories] was updated on svwiktionary - wouldn't it be good to update it more frequently?
[14:48:58] this is getting to be a regular thing :P
[14:49:37] skalman12: https://bugzilla.wikimedia.org/53227
[14:50:21] !stalemaintpages
[14:50:30] !stalemaintpages is https://bugzilla.wikimedia.org/53227
[14:50:30] Key was added
[14:50:35] !stalemaintpages
[14:50:35] https://bugzilla.wikimedia.org/53227
[14:50:39] :)
[14:51:54] It should be fixed when it next runs according to schedule..
[14:52:11] jeremyb: oh.. ok. I'm not alone then :) thanks for pointing me in the right direction
[14:52:35] Reedy: what's the schedule?
[14:52:47] Look in the puppet repo for the cron entry
[14:53:15] hrmmm, manifests/misc/maintenance.pp
[14:57:54] Reedy: looks like every 3rd day
[14:58:41] skalman12: ^
[14:59:43] jeremyb: I have no idea what I should do with this information.. i cc:d myself to the bug to get to know when it's fixed..
[15:00:05] skalman12: come back in 2 or 3 days if it's still not updated :)
[15:00:31] although based on mutante's comment's timestamp i would expect it to be going already?
[15:00:39] 2013-11-06 01:47:19 UTC
[15:00:50] vs. 00 05 */3
[15:01:19] KuboF: try another browser?
[15:01:27] KuboF: try a really simple change like mine?
[15:01:46] jeremyb: ok - great
[15:02:00] skalman12: unless maybe it's still broken :)
[15:02:28] jeremyb: in which case I wish the person doing the debugging the best of luck
[15:02:43] jeremyb: Google chrome, minor edit, still the same problem
[15:02:53] KuboF: did you see my change?
[15:03:24] KuboF: where are you editing from? slovakia?
[15:03:57] jeremyb: yes, I see, I wanted to do a change like yours
[15:04:11] yes, from Slovakia
[15:07:37] jeremyb: but i can edit Meta as IP...
[15:08:18] KuboF: IP editing doesn't use edit tokens
[15:08:24] aha
[15:09:02] ori-l: around?
what would happen in the browser if we attempted to save to localStorage something on the order of 500,000 characters as a result of a call to RL and the save were to fail?
[16:53:58] Ooh, fun. The "Unlock further protect options" checkbox on the protection form is now broken. Krinkle, didn't you check for core callers of your deprecated JS functions before you went and nooped them?
[16:57:36] Krinkle: Also, the logging of the fact that the deprecated function is used doesn't seem to be working. At least not in FireBug with Firefox 25.
[16:58:02] anomie: debug=true?
[16:58:21] I did check for core callers
[16:58:25] chrismcmahon: on which wiki were you experiencing the problem you reported to wikitech?
[16:59:02] Krinkle: Oh, it doesn't show up in red like I was expecting.
[16:59:25] anomie: well, it uses whatever styling your console uses for console.warn
[16:59:51] I didn't find any core callers of wikibits (at least not the ones I removed)
[17:00:07] I'll fix it asap. Got a link / bug ?
[17:00:39] Link, https://test.wikipedia.org/wiki/Page393?action=protect&debug=true shows it. No bug yet, I can file one if you want.
[17:00:53] that'd be great
[17:02:12] anomie: hm.. strange, looks like the stack trace is useless
[17:03:42] Krinkle: https://bugzilla.wikimedia.org/show_bug.cgi?id=56726
[17:12:30] anomie: thx
[17:12:41] nvm about the trace, it is complete
[17:16:00] ori-l: you mean maxing out localStorage in Chromium? beta labs enwiki
[17:17:40] can you file a bug with as much detail as you can muster?
[17:17:44] i'm not able to repro
[17:18:54] ori-l: sure
[17:19:17] ori-l I have a Chromium instance with right now 4.4MB of localStorage filled
[17:20:03] ori-l: I am interested in what happens when that fills up completely
[17:20:44] reactor breach
[17:21:55] ori-l: 4.7...
[17:22:37] run!
[17:22:57] joking aside, it fails gracefully
[17:27:01] hi, ryasmeen - I'm Sumana, your colleague (currently on sabbatical) who has a name slightly similar to yours.
:-) just wanted to say hi and welcome.
[17:30:38] Hi Sumana :) Thanks!
[17:34:26] :)
[17:35:39] hey brainwane! hackathon this weekend?
[17:35:49] jeremyb: Sorry, no
[17:35:58] Hope you have a productive time, though
[17:36:15] * jeremyb too!
[17:45:39] ori-l: heisenbug. now that I'm actually trying, I can't seem to run localStorage past 4.8. when I commented on the mail list I ran it to 5.3 in just a couple minutes on beta enwiki.
[17:47:51] ori-l: what is the bugzilla category for localStorage?
[17:51:25] Is US/CA back on eqiad now, or still split between eqiad/ulsfo?
[17:58:19] chrismcmahon: Mediawiki -> JavaScript
[17:58:33] thanks MatmaRex
[17:58:39] chrismcmahon: (ori is not in default CC, so add him :) )
[17:58:52] MatmaRex: yep, will do
[18:01:35] https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#User_talk:2620:0:862:1:91:198:174:70
[18:03:32] maybe related to the similar problem yesterday - https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Wikimedia_Foundation_IP_addresses_causing_autoblocks
[18:04:27] same issue on dewiki: https://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:2620:0:862:1:91:198:174:70&action=history
[18:04:42] mark: ^
[18:04:44] mark LeslieCarr ^
[18:05:02] yup, thanks, we're on it
[18:05:35] anyone I can contact to check an issue with a global account?
[18:10:03] should be fixed now...
[18:10:50] Cobi: back on eqiad until we sort this out
[18:20:47] mark: related? https://bugzilla.wikimedia.org/show_bug.cgi?id=56727
[18:20:52] paravoid: ^
[18:21:23] yeah, i'm sure that anything related to ips 2620:0:862:1:: are related
[18:21:28] +bugs
[18:33:33] mark: so is the wikimedia ip issue fixed (for) now?
[18:34:30] pajz: yes
[18:34:33] should be
[18:49:13] Good, thanks.
[20:34:22] Jasper_Deng_away / JD|cloud: do you think https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Edit_window:_automated_inserting_bug could be related to our {{cot}} bug on Wikidata?
[20:57:01] Database error
[20:57:01] Jump to: navigation, search
[20:57:01] A database query error has occurred. This may indicate a bug in the software.
[20:57:01] Function: BetaFeaturesHooks::getUserCountsFromDb
[20:57:01] Error: 1146 Table 'metawiki.betafeatures_user_counts' doesn't exist (10.64.16.30)
[20:58:44] i see no point in adding a link to topbar which doesn't work
[20:58:50] so remove it please :)
[20:59:02] DAMN IT
[20:59:17] Reedy: You should probably update the database
[20:59:23] Hence my damn it
[20:59:24] ;)
[22:00:25] PinkAmpersand: sry I was having a review session
[22:00:48] 'sfine :)
[22:00:49] possibly related
[22:01:01] it is. we need to remove the addhandler thingy
[22:01:08] working on it on test.wp right now
[22:03:16] Hey all. Wikipedia just went down for me. Error message is:
[22:03:18] "Request: GET http://en.wikipedia.org/wiki/File:Santa_Claus_1863_Harpers.png, from 10.64.0.134 via cp1015.eqiad.wmnet (squid/2.7.STABLE9) to ()
[22:03:19] Error: ERR_CANNOT_FORWARD, errno (11) Resource temporarily unavailable at Thu, 07 Nov 2013 22:02:19 GMT"
[22:03:40] Sven_Manguard: i just had a ridiculous pageload time for a sec too, on testwiki
[22:03:42] It's slow for us too
[22:04:29] how often does the error count chart update?
[22:07:33] JD|cloud: i've now updated the CollapseButtons part of https://test.wikipedia.org/wiki/MediaWiki:Common.js to perfectly match en.wp (where {{cot}} works), but it's not working
[22:07:55] oh, wait, 1 sec
[22:09:38] so it works?
[22:09:46] did the cache get cleared?
[22:10:05] no, just thought of something else is all. hold on
[22:12:50] Hey guys. I just noticed the CSS/JS editor on English Wikipedia and I just wanted to say thanks to whoever is responsible.
[22:13:51] you mean how it's all colorful?
[22:14:07] because that's awesome
[22:14:29] except when it steals the tab key and ruins your whitespace
[22:14:34] *and* tabs work. And it highlights while you type.
[22:14:41] lol
[22:14:47] JD|cloud: success! https://test.wikipedia.org/wiki/Template:Collapse_top
[22:15:02] you lied to me! you said we had the same collapse function as en.wp
[22:15:42] PinkAmpersand: Different flavors of Jenga
[22:16:14] Sven_Manguard: any objection to my swapping it out on Wikidata now? think that should fix our problems with {{cot}}
[22:16:31] Is there support for the change?
[22:16:41] is there any drawback?
[22:16:57] I mean, personally I don't care. I don't think I've ever used collapse on WD
[22:17:10] we use it on RFD a lot, though
[22:17:19] and as you can see on my enWiki user page, I like the type of collapse you have on testwiki
[22:17:25] oh, shit
[22:17:42] if it's used on RfD check with the person who runs the archive bot before you make a change
[22:17:53] it won't affect that at all
[22:18:08] because if you screw up RfD archiving, that page will be too large to load in like, three days
[22:18:16] just a change to the JS functions that the templates use, not to the page's markup
[22:18:43] hey, you're not in Category:Wikidata administrators on en
[22:19:29] whoa, whoa, whoa. Take it a huge step back. You lost me at "just a change to the JS functions". I only started learning Python *yesterday* and I haven't gotten to JS yet
[22:19:48] if it won't break anything, go ahead though
[22:20:13] I won't be on Wikidata for a while, because I've got... erm... "Portal madness" again
[22:20:40] so I'm on a crusade to fix a neglected namespace with support from only one other person
[22:21:10] I'm pretty clueless with JS too. But all I'm doing is copying code that I know works, and removing code that I know doesn't
[22:21:36] When in doubt, I always bother legoktm :D
[22:21:57] who I just pinged accidentally, didn't I...
[22:22:01] Sorry
[22:25:11] I wonder how much of my loading issues has to do with the fact that Steam just decided to update one of my games?
[22:26:47] PinkAmpersand: yay!
[22:26:55] JD|cloud: it's working for you now?
[22:27:33] It is predicting that it will take me two and a half hours to download a 604 mb patch
[22:28:11] at 42.1 kb/s
[22:28:17] god I hate my internet connection
[22:29:36] JD|cloud: at least on my end, the bug is still happening, even though the code is identical to the code on testwiki, where it works fine
[22:31:52] nvm, working for me now! XD
[22:33:55] cache needs updating
[22:34:58] and just in time for some massive bulk deletion requests. *groan*
[23:16:25] ori-l: do you have 10 minutes to talk to me about kafka? I have some free time and I was going to start writing that log stream parser we talked about a couple of weeks ago
[23:17:32] argh, sorry, not at the moment :/
[23:17:34] tomorrow?
[23:17:36] sure
[23:17:46] * mwalker has lots of things that need doing :)