[00:10:40] Wikidata is doing some weird redirect things right now
[00:10:49] A minute ago it was in a redirect loop (according to ff)
[00:10:55] Now its just redirecting to enwp
[00:11:39] legoktm: give me a few seconds..fixing
[00:11:39] ty :)
[00:40:47] mutante: any luck?
[00:45:29] legoktm: ok, long story. i tried to fix the loop then it was cached, then i purged cache, then i was told we cant drop the www in the first place, then i reverted it all
[00:45:39] but now it may have broken the CSS
[00:45:50] because of the way bits works we can't NOT have www
[00:46:07] it would break geolocation and send all traffic to one data center
[00:46:14] well we can at least get to wikidata now, so thanks for that
[00:46:35] what changed with regards to css?
[00:47:08] does it look ok to you?
[00:47:16] yes
[00:47:17] but
[00:47:28] Special:RecentChanges on wikidata goes to enwp's RC for me
[00:47:29] it should be like before we changed anything
[00:47:43] that is caching ..sigh ..let me try to purge that too
[00:48:15] try again, also cleaning the browser cache
[00:48:45] works :D
[00:49:11] hm
[00:49:12] ugh, i'm sorry for that
[00:49:17] but we should not have even tried
[00:49:33] when i hit show bots (now on index.php) i'm getting sent to enwp again
[01:21:08] I'm not seeing CSS files when I load https://wikidata.org. Either that or for some reason the files are giving "myskin" as "vector"
[01:21:35] kibble: you should know better! shhhh ;)
[01:21:47] :o !
[01:22:21] Ah, I see, someone else already brought it up before I came in.
[01:22:28] Chrome is blocking Wikidata's attempts to load insecure content >_>
[01:22:29] Reedy: I guess I just wanted an excuse to see your beautiful face. <3
[02:40:49] Anyone know any reason that could possibly cause non-prettyness and loss of purge link and nav popups?
[02:41:17] gwickwire: bits.wikimedia is having issues, theyre working on it
[02:41:18] I'm pretty sure they're working on it.
[02:41:31] Figured ya'll were :)
[02:50:14] oh god, my skin D:
[03:00:45] Deja vu - why're we back to nostalgia?
[03:02:26] hello?
[03:03:41] The_Aviatrix: bits.wikimedia is having issues, theyre working on it
[03:03:47] thnx
[03:04:02] this related to the shift to VA?
[03:04:14] Don't think so
[03:04:51] so everyone's aware that CSS requests to load.php are throwing 503s?
[03:05:03] yeah
[03:05:04] Hello71: yes :P
[03:05:31] and nobody knows why yet?
[03:06:16] I'll take that as a yes.
[03:15:13] Is anyone else seeing unstyled pages?
[03:15:20] superm401: yes
[03:15:29] theyre working on fixing it
[03:15:43] Got you. I'm glad you changed the status. It's borderline up. :)
[03:16:03] np, you're not the only person who's come in to ask.
[03:16:06] I was one of them. :-)
[03:17:18] "Static assets (CSS/JS) Service is operating normally"
[03:17:24] We're gonna have to fix the dashboard too. :)
[03:19:42] status.wm.o?
[03:20:37] Reedy, yep.
[03:20:50] that isn't manual
[03:20:51] Maybe it's only based on raw uptime, not trying an authenticated-user style request.
[03:20:53] Yes, that's what superm401 is talking about. It's kinda crappy, but it's not really for this.
[03:21:18] I don't really know where it gets its information, but it never gives me what I want. ;-) Ganglia is better.
[03:21:19] it does all sorts of weird things
[03:21:39] It should be making numerous various requests to all those services from all over the world
[03:21:48] And this: http://nagios.wikimedia.org/
[03:28:23] ouch, kibble. Lots of red there.
[03:28:35] Indeed. :(
[03:28:35] Seems to be back up
[03:29:01] Is anyone else h...oh wait
[03:29:03] Tim did something to get the site back up, I believe. I'm not sure if it's all fixed yet though.
[03:35:06] all up now?
[03:35:52] I seem to be fine now, thanks Tim
[03:36:29] wikidata loks erm interesting
[03:42:16] I'm still seeing intermittent issues on https://wikidata.org/ Is that probably just caching?
[03:43:21] wikidata has been a bit strange since trying to remove the www
[03:44:32] it's not intermittent for me, no css/skin at all
[03:44:35] Okay, so it's unrelated and a known issue by the awesome Sam, kk.
[04:21:42] ori-l you around?
[04:22:01] or spagewmf?
[05:00:22] awjr: hey
[05:00:51] yo!
[05:01:03] i just got on, so catching up with the back-log
[05:01:14] but if you want to give me the capsule summary, go ahead :)
[05:01:29] ori-l: cool, i dont think there's much for ya but had a q for you but i think it's already been answered
[05:01:52] basically it looks like bits went into a deaht spiral likely because of a huge spike in cache misses, about 45 minutes after MF deployment today
[05:01:58] s/deaht/death
[05:02:16] what caused it?
[05:02:21] ops put in a quick hack to block reuqests to bits with a referer of *.m.wikipedia.org
[05:02:22] tbd
[05:02:36] and i rolled back MF to pre-deploy state
[05:02:48] we've not been able to figure out what has caused it yet
[05:03:06] but i was curious what eventlogging request URLs to bits might look like
[05:03:28] they're not requests for load.php; they all start with /event.gif?..
[05:03:33] i have them plotted in ganglia, let me dig up the URL.
[05:04:52] tomorrow i'll
[05:05:37] awjr: http://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&title=EventLogging&vl=Events+%2F+Second&x=&n=&hreg[]=vanadium&mreg[]=client-generated-%28raw%7Cvalid%29&gtype=stack&glegend=show&aggregate=1
[05:05:48] so as expected: very low volume
[05:06:24] laughably small load, peaking at about 2 hits / second
[05:07:32] (raw/valid are swapped btw.)
[05:08:17] yeah ok
[05:08:32] my hunch is that hacking around ResourceLoader is short-circuiting some of the cache expiry behavior
[05:08:38] and certainly no real difference between yesterday and today
[05:09:07] well, we're not really doing much different today than we were yesterday
[05:09:56] hrm.
[05:10:05] it looks like the emergency is over now
[05:10:06] let me look at the logs a bit more before i speculate
[05:10:15] TimStarling: did you discover anything?
[05:10:19] we'll have to fix the monitoring issues before we deploy it again
[05:10:24] ori-l: not really
[05:11:13] do the logs on vanadium or analytics1001 go anywhere?
[05:11:20] * Jasper_Deng doesn't know whether to file a Bugzilla for wikidata.org's lack of AAAA or to bother sysadmins right now
[05:11:32] the answer may be in the logs, if any were stored
[05:12:06] yes, i'm checking the sizes
[05:12:55] btw there's a bash process on vanadium in a tight loop using 25% CPU, not sure what it's doing
[05:12:59] they're logged to files locally and inserted into a db s1-analytics-slave.eqiad.wmnet
[05:13:04] it's got your name on it ori-l
[05:14:04] client-side-events.log is at 17M, pretty standard
[05:14:31] what on earth is that bash thing, heh
[05:15:10] oh, i remember. nothing important; killed it.
[05:15:53] it was writing the logs to a separate file; had that up while i was getting logrotate/rsync of logs puppetized
[05:20:57] TimStarling: EventLogging generates some module code dynamically (as opposed to loading it from a file on disk, which is the common case), and it relies on ResourceLoader calling ResourceLoaderSchemaModule->getModifiedTime() and acting on it to function correctly
[05:21:51] if EventLogging is implicated in this at all, this is where things seem more likely to go wrong, because MobileFrontend works around RL somewhat.
[05:22:13] awjr/ori-l: http://ganglia.wikimedia.org/latest/?r=day&cs=&ce=&m=cache_miss&s=by+name&c=Bits+caches+eqiad&h=&host_regex=&max_graphs=0&tab=m&vn=&sh=1&z=small&hc=3
[05:22:37] yeah, i've been staring at those
[05:22:40] see, there was a slow decline in miss rate while I had my patch in place
[05:23:02] so probably the reason it didn't break when I removed the patch was because the miss rate had declined to an acceptable level
[05:23:44] then the miss rate dropped to almost 0 around when you toko the patch away
[05:24:03] sorry, what patch?
[05:24:15] ops put in a quick hack to block reuqests to bits with a referer of *.m.wikipedia.org
[05:24:17] ^ that?
[05:24:21] yah
[05:25:14] the sudden spike in misses seems to correlate with the increase in hits - but i presume the peak of the cache hits is normal traffic patterns?
[05:25:26] umm
[05:25:32] image urls have a timestamp attached to them
[05:25:39] that seems dynamically generated
[05:25:52] i'm seeing things like http://bits.wikimedia.org/static-1.21wmf7/extensions/MobileFrontend/stylesheets/modules/images/zoom.png?2013-01-18T01:38:20Z
[05:25:56] in my network pane
[05:26:03] @_@
[05:26:28] actually that might not be weird if the timestamp was when it was las refreshed
[05:26:34] that's not EventLogging, definitely. looking at MF..
[05:26:38] does that timestamp change for every subsequent request?
[05:28:16] no, it looks like it's 2013-01-18T01:38:20Z
[05:28:19] so not totally crazy
[05:28:21] i don think that's actually scary, i think that's how RL works
[05:28:22] yeah
[05:28:28] but it did mean that all static assets had to be refetched
[05:28:31] im seeing the same
[05:28:32] once you deployed it
[05:29:23] sure but it's not a lot; nearly all requests to bits would have been refreshed when we deployed (like usual)
[05:30:46] the number of unique URLs generated would presumably be easily cachable; plus the spike in misses happened ~45 minutes post deploy
[05:32:54] i'm curling some of the images and i get cache hit headers, so it isn't like the query string thwarts caching
[05:33:48] at this point when debugging some sort of problem my working assumption is that tim already figured everything relevant out and is waiting to see how long it takes us to come around
[05:34:03] hehehehe likely
[05:34:41] sure but it's not a lot; nearly all requests to bits would have been refreshed when we deployed (like usual)
[05:34:46] ^ why is that?
[05:34:59] are you taking into account browser caching?
[05:35:51] because we typically brute force a cache refresh of js/css assets since we don't load them the Right Way with RL
[05:35:56] when we do a deployment
[05:36:23] basically, touch a bunch of files (sometimes all js/css files)
[05:36:48] and then refresh the mobile varnish cache (which has the URLs for those requests cached in the HTML)
[05:38:45] well, right, but you set far future expires
[05:38:48] e.g.
[05:38:54] curl -I "http://bits.wikimedia.org/static-1.21wmf7/extensions/MobileFrontend/stylesheets/modules/images/show.png?2013-01-18T01:38:20Z"
[05:39:02] Cache-Control: max-age=2592000
[05:39:03] Expires: Sun, 17 Feb 2013 01:49:01 GMT
[05:39:22] so you do leverage browser caching
[06:00:26] awjr, what up?
[06:00:50] spagewmf: nada, got it mostly sorted for now, thanks :)
[09:59:03] hey all
[10:00:05] Ugh, the Germans are starting wokr
[10:00:10] This is a sign I should not be awake
[10:00:23] hehe
[10:00:24] :D
[10:00:35] what is your timezone?
[10:00:36] RoanKattouw: maybe you can help us before going to sleep...
[10:00:50] on wikidata.org, the vector skin is broken. other skins seem to work
[10:00:56] i see no failed requests
[10:01:10] Denny_WMDE: PST
[10:01:17] apparently it was triggered when we tried to drop the www subdomain - but that got reverted, and the problem is still there
[10:01:19] any idea?
[10:01:20] go to bed!
[10:01:28] no, help us fix this!
[10:01:30] :P
[10:01:35] Hmm, that's not very pretty
[10:01:46] ...but Ctrl+Shift+R fixed it for me
[10:01:49] the www drop got reverted?
[10:01:54] * RoanKattouw attempts to reproduce in Chrome in incognito mode
[10:01:56] what is actually the status?
[10:02:10] WFM incognito
[10:02:11] Denny_WMDE: according to the comments on bug 41847, yes
[10:02:24] Maybe my Ctrl+Shift+R fixed it
[10:02:39] If it was something stale cached in Squid, that would make sense
[10:02:49] RoanKattouw: hm.... no fixy here...
[10:03:00] Ctrl+Shift+R sends Pragma: no-cache headers in the request, and Squid obeys those while Varnish doesn't
[10:03:03] somethign stuck in the european cache?
[10:03:06] Probably
[10:03:10] gah#
[10:03:15] I'm hitting the US cache, of course
[10:03:28] so, how to purge these?
[10:03:40] can we bump the skin version?
[10:03:50] The skin version doesn't exist anymore
[10:03:52] It hasn't in a long time
[10:04:02] I'm gonna manually switch myself to esams so I can look at the breakage
[10:04:43] thanks
[10:04:47] Bingo, I have a broken page. Let's see what's wrong with it
[10:04:57] doesn't work with chrome/incognito either
[10:05:13] i would have expected to see failing requests to css resources, but i don't.
[10:05:18] filing a bug
[10:06:05] OMG
[10:06:21] This is because of the www-drop being reverted
[10:06:28] >_<
[10:06:36] bits.wm.o/www.wikidata.org/load.php?.... is returning a 301 to wikidata.org
[10:06:47] bits 301ing away from itself is BAD for caching reasons
[10:06:55] But in this case probably also resulting in incorrect responses
[10:09:08] Some but not all bits/www.wikidata URLs are exhibiting this behavior
[10:09:14] The URL delivering Vector CSS is one of them
[10:09:56] And.... WTF?!
[10:09:56] yes, that seems consistent with what is happening. most pages work, but vector seems not to load
[10:09:58] Empty response
[10:10:00] Probably a PHP fatal error
[10:10:12] woot? where?
[10:10:22] So, here's what I have so far:
[10:10:47] 1) after the www-drop, URLs like http://bits.wikimedia.org/www.wikidata.org/load.php seem to have been redirecting to http://wikidata.org/w/load.php
[10:11:00] This makes intuitive sense to me given how bits patches through load.php requests
[10:11:27] RoanKattouw: please document your findings here: https://bugzilla.wikimedia.org/show_bug.cgi?id=44094
[10:11:29] 2) these redirect responses are 301s, which are indefinitely cacheable, so some of them are still stuck in the bits (Varnish) cache
[10:11:32] will do
[10:11:35] thank you
[10:12:04] 3) the target URL of these redirects returns what seems to be an empty response, presumably some PHP fatal error triggered by attempting to access wikidata.org without www
[10:13:09] RoanKattouw: why does it only happen for vector?
[10:13:18] because the bits for other skins arn't cached?
[10:13:33] As I said, some but not all of these 301s seem to have gotten stuck
[10:13:58] It's possible the www-less code was live briefly, and some URLs simply weren't hit during that time
[10:14:11] right
[10:14:26] It's also possible that the URLs that got 301s cached were simply those whose cached versions expired during the www-less window
[10:14:59] Most resources are cached for 30 days, and for those there would ordinarily be no reason for the cache to refetch, so the caches probably never noticed those URLs changed at all
[10:15:19] But the borked URLs seem to be biased towards things that ordinarily have a short (5 mins) cache timeout
[10:20:26] OOOOH
[10:20:28] I SEE
[10:20:37] The fact that the URL is pointing to wikidata.org isn't a huge problem
[10:21:04] It's the wrong domain and it hits Squid instead of Varnish, but the REAL reason it's failing is double-encoding
[10:28:17] RoanKattouw: having the morning meeting, back in a few
[10:31:13] oh... double-encoding sucks :)
[10:31:31] !log Purged all URLs containing 'wikidata' from Varnish
[10:31:40] Logged the message, Mr. Obvious
[10:32:29] DanielK_WMDE_: Looks fixed to me now
[10:33:09] RoanKattouw: confirmed, works. thanks a ton!
[10:33:37] so we can't get rid of www?
[10:33:51] Denny_WMDE: We can
[10:34:09] We just can't switch and then switch back without doing something about cached redirects :)
[10:34:29] (Well, when I say "we can" I really mean "if we can't, it won't be because of this")
[10:34:39] I'm assuming there was a good reason to revert the www change
[10:34:57] RoanKattouw: https://bugzilla.wikimedia.org/show_bug.cgi?id=41847#c12
[10:35:11] mutante should have the details
[10:35:18] Oh right
[10:35:28] Yeah I see what he's saying
[10:35:33] And I remember the wikimediafoundation.org situation now
[10:36:27] i guess i'll file a bug about resolving the cname issue, and make it block 41847.
[10:37:11] RoanKattouw: can you give some details about the double escaping issue at bug 44094?
[10:37:29] Didn't I just do that?
[10:41:21] RoanKattouw: https://gerrit.wikimedia.org/r/#/c/44575/
[10:41:36] the most ultimate coolest feature there is
[10:41:43] i'm sure wikidata people would love to use it too :)
[10:41:57] unless csteipp kills it :(
[10:43:41] RoanKattouw: yea, thanks - and sorry for being impatient.
[10:44:54] yurik: Ha, yeah I'm not touching that with a 10ft pole, that one's for Chris :)
[10:45:04] Especially at.... dammit it's almost 3am already
[10:45:53] hehe
[10:46:10] but would you say its a useful feauter?
[10:46:39] imagine how much easier it is to debug all the token-related modules
[10:46:49] thank you very much RoanKattouw
[10:47:06] in other words, its up to devs to convince chris ;)
[10:47:33] yurik: we have settings like that for wikibase. chris didn't complain.
[10:47:40] i think it's a good idea, actually
[10:47:51] yurik: i've already found it useful
[10:48:02] legoktm: excellente!
[10:48:15] please tell anomie about it - he was highly sceptical
[10:48:30] what did you use it for?
[10:48:37] * yurik is building up his case
[10:50:25] i've been working on a action=global(un)block api module for the GlobalBlocking extension which i was testing with mustbeposted disabled / tokens so this just made it much easier
[10:50:35] it still doesnt work properly though >.>
[10:51:37] the flag?
[10:51:43] or the global thingy?
[10:51:56] oh your part works fine. its my module that doesnt
[10:55:28] legoktm: excellent study case - please let the other reviewers know how useful this is
[10:59:52] also, legoktm, if you have any thoughts on the new API, please comment - the earlier we comment, the less work it is for me to rewrite it later. http://www.mediawiki.org/wiki/Requests_for_comment/API_Future
[11:01:39] yeah sure, i think i read it after your email but never commented on it
[11:01:50] there has been tons of changes
[11:14:46] RoanKattouw: http://wikidata.org/wiki/Wikidata:Project_chat still redirects to en:wp for me even after reloading
[11:15:08] or anyone else ^
[11:16:50] That's really strange
[11:17:01] Probably cached remnants of an interwiki prefix misconfiguration?
[11:18:42] yep, redirected
[11:18:52] Whoa
[11:18:56] It's a redirect to http://en.wikipedia.org/wiki/Wikidata:Project_chat
[11:19:05] yes
[11:19:20] Which redirects to http://wikidata.org/wiki/Project_chat as expected
[11:19:21] Oo
[11:19:22] Which in turn redirects to http://en.wikipedia.org/wiki/Project_chat
[11:19:25] W ... T ... F ....
[11:19:29] WTF indeed
[11:19:43] oohh dear
[11:19:45] demons!!!
[11:20:01] So for some reason wikidata.org URLs are sometimes (but not always) redirecting to enwiki
[11:20:10] www.wikidata.org seems to work though
[11:20:32] sometimes but not always sounds like a line from HHGTTG
[11:21:07] time to sleep. its past 6am
[11:25:57] RoanKattouw: also, accessing wikidata without the www *does* work fine, mostly. it just seems that mediawiki doesn't know about it
[11:46:42] and more issues: http://www.wikidata.org/w/index.php?title=Wikidata:Contact_the_development_team&diff=4565960&oldid=4562330 :(
[11:47:08] (redirect look on user page)
[11:58:22] out of curiousity, did you folks try a redirect rule on the apache config?
[11:58:50] hi neverendingo
[11:58:57] hi Nikerabbit
[12:01:23] fake subdomains are tricky
[12:24:25] especially when you don't test your changes
[13:04:08] hi there, does anyone know the link to see wikimedia's localsettings.php?
[13:05:23] it's off of noc
[13:05:48] https://noc.wikimedia.org/conf/
[13:08:42] would i be right in saying that commonsettings.php includes localsettings?
[13:09:08] the other way
[13:18:44] oh. I might be blind but I can't see localsettings there/
[13:33:21] ah because we don't use 'localsettings.php' as such
[13:33:24] wmf projects use
[13:33:33] CommonSettings.php for settings across all projects
[13:33:42] and InitialiseSettings.php for per-wiki settings
[13:33:48] ChiyoMihama:
[13:33:51] sorry I didn't say that
[13:34:07] ahh ok
[13:34:42] its because im trying to semi-replicate the footer for how it appears on monobook, but it hasnt worked quite as i expected it too.
[14:39:08] apergos, initialisesettings is used for centralauth settings, right? Like userrights and such
[14:39:13] Or am I wrong thre?
[14:39:15] there*
[14:39:56] lemme look
[14:41:10] yes the centralauth settings are in there
[14:41:17] yeah that's what I remember
[14:41:26] I don't use them lol
[14:41:36] for localhost projects using ca
[14:41:41] <^demon> I can never remember which file has which settings. I generally have to open both and ctrl+f.
[14:41:47] lol
[14:42:37] that is what I just did :-D
[14:43:37] We should just make InitialiseCommonSettings.php
[14:43:49] ooowww
[14:43:55] it would stab my eyeballs out
[14:45:52] I keep thinking yesterday was Friday
[14:47:42] hah
[14:51:38] <^demon> Bah, 10am and I still haven't had breakfast.
[14:51:43] <^demon> Time to fix that.
[15:17:43] andre__: only 27 formerly "Security" bugs fixed. Possible? http://ur1.ca/ck5wu
[15:19:10] Nemo_bis: I get 36 when querying for "Product | changed from | Security" && "Resolution == FIXED"
[15:19:57] andre__: maybe 9 were sent back to security and are not visible for me.
[15:20:17] Oh, you did non-MediaWiki too?
[15:20:41] yes.
[15:20:48] but 1 is still under Security, right
[15:22:48] andre__: I'm looking for some compelling argument to add to https://www.mediawiki.org/wiki/Manual:Upgrading#Why_upgrade.3F
[15:23:22] "35 explosive security bugs with summaries you won't understand a letter of fixed since bugzilla exists" doesn't seem that effective, hmm.
[15:23:36] Last minor upgrade fixed 6 (?) security issues IIRC
[15:23:54] replace explosive with "might let people kill your kittens"?
[15:25:57] The problem with those numbers is that if one is at 1.13.y all those bugs might not affect him at all and be only later releases.
[15:30:11] Nemo_bis, I'd keep it intentionally vague. Saying "our software is extremely vulnerable so we already had to fix X issues in the last years" isn't awesome, but not mentioning security fixes isn't very honest either.
[15:31:18] andre__: we already have vague phrasing, "Many upgrades solve security issues which help to keep your wiki and possibly even your host system safe from vandals"
[15:31:26] <^demon> "All software has security flaws. However, we take security very seriously, and when we find security issues, we make every effort to quickly fix the issue and get a new release out for our users."
[16:34:01] Hi, how should I put 5GB TIFF files on Commons for a GLAM? May someone help me?
[17:29:23] Kelson: You need to upload it somewhere that we can download it from
[17:29:25] dropbox for example
[17:30:51] Reedy: If I give you a way to download them, would you do that?
[17:30:58] I can try, yeah
[17:31:06] Sometimes uploading large files can have issues
[17:31:23] But whent they're somewhere I can get them from, I can attempt to do it
[17:32:11] Reedy: ok, do you confirm the 25Mpixel thumbnail rendering limitation?
[17:32:28] Err
[17:32:38] Is it 25MP per page in the tiff?
[17:32:40] Reedy: so this does not make sense to upload picutres with more than 25 Mpixels
[17:32:54] People can still download them and use them
[17:33:11] Most people also upload a smaller more manageable version
[17:35:00] Reedy: ok, so this would be an acceptable strategy to upload a big TIFF file, and a reduce version?
[17:35:14] yup
[17:35:50] Reedy: This is great! I have now a strategy to propose to our GLAM .I count with you and will be back with more informaiont. Thank you very much for your help
[17:36:05] Easiest place to file these requests is on bugzilla
[17:36:24] Reedy: or RT?
[17:36:45] Not really, it doesn't need Ops to do it
[17:37:14] and RT will be much harder for you to track
[17:37:37] Reedy: oh? but you need a shell access to WMF servers to do that isn't it?
[17:37:44] To do the upload? Yes
[17:38:05] Reedy: ok, and there are people with this shell access who are not ops?
[17:38:12] Yup
[17:38:36] A number of WMF engineering employees have shell access
[17:38:56] And a few volunteers (with root, even)
[17:43:53] Reedy: ok, I will be back in a few days with more information about how to get the pictures.
[17:44:07] Cool
[17:44:19] Reedy: we want to start the import in around two weeks
[18:02:14] RoanKattouw_away: thanks for that purging earlier. we had already restarted bits varnish and purged several URLs from squid thinking that was resolved
[18:02:35] and sorry for the breakage
[21:18:14] So https://commons.wikimedia.org/wiki/File:2012_State_Of_The_Union_Address_%28720p%29.ogv is "503 Service Unavailable", AaronSchulz?
[21:21:12] Works fine for me
[21:22:20] Are you streaming or or downloading it?
[21:23:19] Cuz' the transcode status for webm.480p (Jan 3) says "File:2012 State Of The Union Address (720p).ogv: Source not found /tmp/localcopy_2c1f6aa41d8c-1.ogv"
[21:26:06] Streaming
[21:26:11] You didn't really say what did or didn't work
[21:40:03] Reedy, maybe you can help get this stack trace: https://bugzilla.wikimedia.org/show_bug.cgi?id=43786
[21:40:59] Looking
[21:41:05] It gives me [1f0bb48b] 2013-01-18 21:40:47: Fatal exception of type MWException
[21:41:15] Shell users can grep the logs for 1f0bb48b to get the full message
[21:41:53] Yes that's the bit which needs WMF to help with
[21:43:01] which I just did
[21:43:03] Pasting into the bug now
[22:34:19] Ouch.
[22:34:20] >
[22:34:22] PHP fatal error in /usr/local/apache/common-local/wmf-config/CommonSettings.php line 233:
[22:34:25] require() [function.require]: Failed opening required '/usr/local/apache/common-local/php-1.21wmf7/../wmf-config/mc.php' (include_path='/usr/local/apache/common-local/php-1.21wmf7:/usr/local/lib/php:/usr/share/php')
[22:34:29] >
[22:34:31] https://wikimediafoundation.org/w/index.php?title=User_talk:Philippe_(WMF)&action=edit&section=new
[22:34:40] It doesn't seem to be going anywhere on refresh. Hmm.
[22:34:45] can we please redesign the green screen of death please?
[22:35:29] problems
[22:35:34] Known.
[22:35:45] ok
[22:35:51] Gable its already copy-n-pasted before i joined in, but:
[22:35:52] PHP fatal error in /usr/local/apache/common-local/wmf-config/CommonSettings.php line 233:
[22:35:54] require() [function.require]: Failed opening required '/usr/local/apache/common-local/php-1.21wmf7/../wmf-config/mc.php' (include_path='/usr/local/apache/common-local/php-1.21wmf7:/usr/local/lib/php:/usr/share/php')
[22:35:56] Guessed so :)
[22:36:15] Hi Superfreak, Jyothis.
[22:36:33] Damned overflow.
[22:36:36] Hello Susan.
[22:37:01] Superfreak: I didn't know you were still active. Good to see you. :-)
[22:37:17] back
[22:37:22] Since when do we expose the actual raw PHP errors to website visitors?
[22:37:22] Yes I am. I've had a recent surge of activity after serious slacking for a while.
[22:37:43] Hi Susan - for a moment I thought you called me Superfreak :)
[22:37:46] Serious slacking? If you keep that up, they're gonna halve your pay.
[22:37:51] Jyothis: Heh. :-)
[22:38:10] Susan: I'm okay with that. When they start paying me I'll stop slacking. ;)
[22:38:21] (o;
[22:38:37] Saying that, I get paid to do research, and I'm kind of slacking at that.
[22:38:38] Oh well.
[22:38:45] like, something that could become iconic, like the fail whale.
[22:38:47] (my supervisor isn't in here is he?)
[22:38:59] Lirodon: I'd like for the error message to not become iconic.
[22:39:07] (specifically, I was thinking of mocking up something to play off the [citation needed] thing)
[22:39:09] That suggests much more serious problems.
[22:39:27] Preferably users never see the error screen. :-)
[22:39:41] The green one is pretty awful, though. If you can come up something better, please file a bug.
[22:39:53] It's kind of a pain in the ass to update it, but it's done every few years.
[22:39:55] I was thinking of something making Vector-inspired
[22:50:25] Lirodon: Yeah, killing the puke green color in favor of white or light grey would be nice. :-)
[22:50:59] You should also consider i18n in any design. With smart graphics, the amount of text needed should be minimal.
[23:04:14] Susan, http://i.imgur.com/aacj50C.png
[23:05:39] Lirodon: Looks good to me (besides the mangled Unicode). I'm not sure what the history of the green background is. It's documented on Meta-Wiki somewhere.
[23:06:12] Lirodon: https://meta.wikimedia.org/wiki/Multilingual_error_messages
[23:06:19] I saw
[23:06:43] Oh, cool
[23:06:44] .
[23:07:16] as you can see, its a bit more unified with the Vector CSS
[23:07:26] Heh, Mark Ryan has done work on a number of Wikimedia's error messages. A byproduct of living in Australia, I imagine.
[23:07:40] Yeah, I like it. I might recommend a visual.
[23:07:48] Not necessarily a whale, but something.
[23:07:49] trying to keep it light
[23:08:06] Hmm, I thought I the current version had the Wikimedia logo.
[23:08:08] Guess not: .
[23:08:34] It may make sense to put the debug info in an HTML comment.
[23:08:45] The users who know how to report shit on IRC are the same who can view HTML source.
[23:08:55] I think.
[23:09:17] I'm also not sure we still need to advertise #wikipedia (this channel is probably fine nowadays).
[23:09:28] maybe wikimedia-tech?
[23:09:33] Yeah.
[23:09:50] That was proposed many years ago, but it was a different time and there were concerns about floods of people.
[23:10:06] Those concerns are mostly gone now.
[23:12:14] Lirodon: Yeah, we've really gotta update some of that text as well. The donation language is simply wrong. There have been complaints about that as well.
[23:12:25] how-so?
[23:12:43] Wikipedia is not in constant need of new hardware.
[23:13:13] And donations don't generally go to hardware in any case.
[23:13:15] But would still plugging donations be a good idea?
[23:13:25] Probably not a terrible idea, it just has to be more honest.
[23:13:52] but would the donation page be down if something is triggering that error too? :P
[23:14:11] Heh, hopefully not.
[23:14:26] "Help support free and open knowledge by donating today."
[23:14:38] Lirodon: https://meta.wikimedia.org/wiki/Wikimedia_Forum/Archives/2012-11#Donations.2C_ethical_issue
[23:14:51] Lirodon: Something like that, yeah. Maybe "You can help...".
[23:15:18] It needs to not have the appearance of a ransom note. ;-)
[23:15:35] maybe mention "Some of our proceeds go towards maintaining our servers"?
[23:16:29] Reedy, could you link the gerrit change for https://bugzilla.wikimedia.org/show_bug.cgi?id=43863 please? the reporter is telling me that it's not working - and I can't see anything from https://gerrit.wikimedia.org/r/#/q/project:operations/mediawiki-config+branch:master+topic:bug/43863,n,z
[23:16:50] Lirodon: Maybe, but I really think it's an insignificant amount.
[23:17:14] WMF has an operating budget of about $30M, I think. It costs about $2M to keep the sites up and running.
[23:17:20] I think you might have got it confused with https://gerrit.wikimedia.org/r/#/c/42774/ maybe?
[23:17:26] And that $2M is bandwidth, hosting, etc., mostly not new hardware.
[23:17:34] AIUI.
[23:17:58] Susan: There is some capex component as well, but I forget how much it is [23:18:06] It's also not entirely constant year-to-year [23:18:17] What's capex? [23:18:18] IIRC we spent something like $3.3m on setting up eqiad [23:18:19] Thehelpfulone: I suspect I closed the wrong bug (hence there being no comment) [23:18:22] capital expenditure [23:18:22] Capital expenditure [23:18:26] ty [23:18:36] Roan has been infected with Wikimedia's lingo disease. :-/ [23:18:47] It's not WMF lingo [23:18:49] The purchase of expensive and relatively long-lasting things. In our case, mostly servers and other tech equipment [23:18:50] Capex your white papers, we've gotta save the Global South! [23:18:54] It's a very universal term [23:19:26] It's not in my dictionary. [23:19:41] http://www.google.co.uk/search?q=define%3Acapex&oq=define%3Acapex&sourceid=chrome&ie=UTF-8 [23:19:43] Google knows [23:19:54] RoanKattouw: And sure, one-time investments are a different matter. [23:19:56] http://en.wikipedia.org/wiki/Capital_expenditure [23:19:58] So does Wikipedia [23:20:07] would like to talk to a person involved in next Wikimania [23:20:17] I'm not saying you made the term up, I'm saying that it's not in the New Oxford American Dictionary. [23:20:27] especially about setting up the website for it [23:20:28] mutante: #wikimania ? [23:20:31] Are abbreviations usually? [23:20:32] Susan: Yup, although we probably had another smaller one this year with ulsfo, and in the future we'll probably have a few more of them [23:20:36] Or e-mail. [23:20:38] Reedy: Of course. [23:20:45] ulsfo? [23:20:52] sf caching centre [23:20:54] Also, for the established datacenters there's a more or less constant trickle of hardware being replaced and added [23:20:59] I don't spend much time looking at dictionarys [23:21:07] I hadn't even heard of that. [23:21:08] Reedy: Loser. [23:21:18] Susan: United Layer San Francisco [23:21:40] RoanKattouw: Right, I'm not disagreeing with Web sites needing hardware. 
I just don't think begging for money for hardware in an error message is appropriate. ;-) [23:21:42] and that's booked in as capex rather than hosting costs because purchasing a server is a purchase of an object that retains a value, rather than the purchase of a service for a limited amount of time [23:21:50] Susan: Oh yes, I agree with your main point [23:21:55] I understand capital expenses. ;-) [23:22:04] I just wanted to point out that there are some hidden costs that push up that $2m figure a bit [23:22:14] and figuring out "the number" from the books may not be trivial [23:22:17] Though "capex" really is a new term to me. I guess it's paired with "opex." [23:22:22] Yes [23:22:32] opex is running things like monthly payments for bandwidth and hosting [23:22:38] I'll revise it to $2.5M, for inflation and such. (o; [23:22:43] heh [23:22:46] the ul in ulsfo is for UnitedLayer [23:22:56] I don't know how much it is exactly but it's probably not far off [23:23:00] and we always use the IATA airport codes for the last 3 letters [23:23:21] mutante: Except for yaseo back in the day :) Look up where SEO is and you'll be amused [23:23:39] So yeah ulsfo is a location we acquired a few months ago and have yet to build out [23:23:50] I think that's planned for some time after next week's eqiad switchover [23:23:58] RoanKattouw: https://meta.wikimedia.org/w/index.php?title=MediaWiki:Centralnotice-template-money_or_die&diff=5077110&oldid=2169595 [23:24:16] RoanKattouw: it's not a full location, it's a caching center [23:24:21] haha [23:24:24] https://meta.wikimedia.org/w/index.php?title=Special:NoticeTemplate/view&template=money_or_die [23:24:35] RoanKattouw: also there were a lot more issues than just "haven't built it out yet" like server drama, etc [23:24:42] Oh, right [23:24:49] That shipping saga [23:25:46] (I call it a location because, if caching centers don't count as locations, then we currently only have 1 location :D ) [23:25:59] LeslieCarr: Just out of 
curiosity, what's the distinction between a location and a center? [23:26:20] Is the SF place going to be read-only? [23:26:35] only 2 racks, definitely read only [23:26:43] a bunch of varnish servers basically [23:26:55] Got it. [23:27:06] LeslieCarr: Are we gonna have Squids in ulsfo, or are we planning to move away from Squid to Varnish before building out ulsfo? [23:27:09] This is global load-balancing, then? [23:27:09] and built to only be able to serve a portion of wiki traffic [23:27:14] that's the plan ;) [23:27:28] unsure RoanKattouw - probably just varnish but i do not know for certain [23:27:44] Squids clinge. [23:29:28] LeslieCarr: Random question: do DNS additional sections sent by the authoritative server typically propagate down to individual resolvers? [23:30:20] don't know, i'd have to google it [23:30:24] Heh. [23:30:31] Cause I was thinking it would be nice that when our clients send a DNS request for en.wikipedia.org, they'd get the records for bits in the additional section [23:30:57] But that's only beneficial to end users if their DNS resolvers somehow know to do that too [23:31:25] Lirodon: So when you have some put together HTML, you can file a bug at . Including a screenshot (that imgur one, e.g.) may also help. [23:31:37] Including --> as an attachment in the bug tracker. [23:32:19] hrm, i always doubt that anything interesting is ever allowed downstream and then i am never disappointed [23:32:42] tbh, as far as dns latency goes, it's a teeny part of the site latency [23:33:05] since for like 99% of the world's ISPs, there's enough wikipedia traffic that the lookup is only done as long as their caching server's TTL [23:33:13] Yeah [23:33:36] I was mostly worried about the client having to hit their resolver a second time for bits. Anecdotally that is actually very slow for some people sometimes [23:33:45] geoiplookup.wm.o especially exhibited that behavior [23:35:34] bits hasn't been having a great week.
[23:35:42] Though I agree with the anecdote. [23:36:12] But yeah I also doubt anything interesting will make it downstream