[00:01:39] *robla wonders if Farsi has a sane unicode ordering
[00:02:35] *maplebed idly wonders if japanese or chinese sort order is by radical + stroke count
[00:24:25] hey look...both binasher and ^demon are back :)
[00:24:39] yo
[00:25:30] binasher: ^demon is trying to debug bug 30086 http://bugzilla.wikimedia.org/30086 ...and hit a snag that you might be able to help with
[00:26:02] he's setting up profiling, but there's apparently not a collection group for commons
[00:26:35] as he pointed out: (04:40:28 PM) ^demon: RoanKattouw: StartProfiler I can do, but report.py is owned by root:root :(
[00:27:46] binasher: I realize that you're not the only one with root, but I'm guessing you'll get enlisted in the ultimate problem solving here sooner rather than later
[00:27:56] <^demon> I'm not entirely sure what changes need doing to report.py yet.
[00:28:05] i'm reading over the ticket
[00:28:45] <^demon> I can easily add a new profiling group to StartProfiler.php. What I don't know how to do is make that show up in noc's report.py.
[00:32:57] regarding the ticket, is their speculation that uploads via the api are slow but via some other method (what else is there?) aren't? it's a bit difficult to parse due to speculation re: different issues
[00:33:38] <^demon> binasher: You can upload via the old-fashioned Special:Upload page.
[00:33:49] <^demon> Or via the API (which the new-fangled UploadWizard uses)
[00:34:52] gotcha. i saw one report in there that both were equally slow. is that generally the case, or is the api method much slower?
[00:35:05] <^demon> Anecdotally, it's slower via the API
[00:35:13] <^demon> Getting profiling data will help determine if that's the case.
[00:35:45] <^demon> People saying "It took 4 minutes for me to upload ABC file" and another person saying "That same file took me 10 minutes!!" isn't a very useful profiling metric :)
[00:36:26] definitely not
[00:38:00] <^demon> So where Roan and I got stuck is that he and I both can do the MW-side of profiling config for commonswiki.
[00:38:01] and it's entirely possible that there's an issue that isn't directly visible from php
[00:38:08] <^demon> That's also true.
[00:38:20] <^demon> But getting commons profiling data will help us start ruling out causes.
[00:38:27] and report.py?db=commonswiki doesn't let you get to it?
[00:38:38] <^demon> I haven't done the StartProfiler config.
[00:38:47] <^demon> I assumed you had to do some sort of other config to report.py
[00:39:04] <^demon> Worst case, we end up profiling for a few minutes and it goes to a black hole.
[00:39:25] GONE FOREVUR
[00:40:43] *binasher is looking at collector.c
[00:41:48] Which version?
[00:41:58] trunk
[00:41:59] IIRC svn version isn't the same as what's in us
[00:41:59] e
[00:42:02] greeeeat
[00:43:22] it's always fun when a relatively new person comes and still isn't numb to all of the "oh yeah, we should fix that"s we have
[00:45:14] <^demon> Whee, it works.
[00:45:18] <^demon> http://noc.wikimedia.org/cgi-bin/ng/report.py?db=commonswiki
[00:45:24] we could have svn $id$ expansion assigned to a marginally used string in all of our c code that doesn't have --help output, so that the strings command will show you what it was built from
[00:45:35] ^demon: it looked like it would
[00:45:37] :)
[00:45:44] *^demon thought it wouldn't.
[00:45:53] <^demon> db=cli doesn't work, but it's supposedly a profiling group.
[00:46:19] <^demon> robla: We now have profiling data for commonswiki.
[00:46:27] xlnt!
[00:47:09] odd, cli key value pairs are in the collector bdb
[00:47:19] <^demon> 2 calls to "Special:Upload" already avg 1726.9 s/c
[00:47:27] <^demon> Ouch
[00:48:30] what does that cover exactly? the start of an upload (put or push?) until??? ?
[00:49:33] <^demon> I would imagine that's on the POST that does the upload.
[00:50:00] <^demon> Probably should let stats come in for a few hours or so so outliers don't skew it so badly.
[00:50:11] <^demon> It could've just been 2 really bad calls ended up first.
[00:51:16] ^demon: do you have anything more granular than that rigged up?
[00:51:29] <^demon> Nope.
[00:51:55] <^demon> Right now it's doing !( $rand % 50 ) as the sampling frame.
[00:52:06] <^demon> Then if $host == 'commons.wikimedia.org' it will profile.
[00:54:49] *robla gets ready to go in a couple min
[00:55:21] <^demon> It's 9pm here, I'm so done for today :)
[00:57:32] i'm curious if US uploads are generally as slow as EU uploads, but I can figure that out by looking at squid logs
[00:58:11] we haven't figured that out either, but anecdotally, geography isn't a big driver
[00:59:23] binasher, same issues with both
[00:59:38] Originally we thought it was only an EU issue, but tests in the office confirmed it wasn't
[01:00:38] i'd still like to get data from the squid logs
[01:00:43] sure
[01:00:47] they might be at fault :P
[01:01:35] the office connectivity is a bit weird - it's a very fast pipe to somewhere with horrible peering in places
[01:08:12] can I help any of you guys with this uploading issue? Reedy, binasher?
[01:09:50] the sharp drop in performance feels like some sort of resource exhaustion thing, but I don't know what you've already looked into at this point.
[01:12:46] neilk_: i'd like to capture tcp traffic from an upload if that my be relevant, but first understand all the various hops the bits take from client to disk. if you know uploads well, perhaps you could go over it with me in person tomorrow. i have a feeling there's a dizzying number of hops along the way. any one of which could be a bottle neck, and many that will require me to call upon reserves of stoicism.
[01:13:39] <^demon|away> We should have some decent profiling data to look at by tomorrow too.
[01:23:33] binasher: okay, I'll try to help. I don't think it's as complex as you think it does, but I'm much more familiar with MediaWiki as a single node.
[01:25:14] probably not, it's more that it might go through a few different squids, getting spooled to different disks along the way
[01:25:48] <^demon|away> I'm actually more inclined to blame MediaWiki, but I'll wait until more stats come in
[01:27:12] With it seemingly being ok on test (all but idle apache), but not everywhere else...
[01:27:53] is mw doing all that much more than basic php upload handling?
[01:28:19] <^demon|away> It does a lot of stuff
[01:28:23] <^demon|away> Extract metadata.
[01:28:27] <^demon|away> Store a bunch of crap in the DB
[01:28:50] <^demon|away> Checks files for viruses, injection exploits.
[01:28:53] <^demon|away> Etc etc.
[01:29:32] oh my
[01:30:06] will we get profiling data for all of those separate steps?
[01:30:22] <^demon|away> Most of it.
[01:30:27] <^demon|away> If we don't, we can easily add more
[01:30:33] *mattwj2002 sits and listens
[01:30:35] kick ass
[12:19:53] ^demon|away: About that profiling, I cleared it a few hours ago because an unclosed profiling section was polluting the results
[12:20:18] <^demon|away> I did notice that before I went off for the night.
[12:20:22] <^demon|away> Thanks for fixing that.
[13:01:43] @replag
[13:02:23] @replag
[13:02:24] Krinkle: [s1] db12: 4s, db26: 4s
[20:37:15] <^demon> hexmode: How's that .deb going?
[20:37:55] I just realized I need to put a lot more in it
[20:47:57] <^demon> :(
[20:58:36] ^demon: "a lot more" ... I missed the files it needs to compile stuff. That and "make install" doesn't put binaries in place.
[20:58:44] or rather
[20:58:49] it puts binaries
[20:58:58] but doesn't set up /usr/share
[21:00:06] *hexmode peeks at "highest" priority bugs
[21:00:13] need to add some more
[21:00:25] robla: now?
[21:01:14] <^demon> hexmode: Yeah, make doesn't move them. Rather than actually moving them, I'd recommend adding symlinks in /usr/bin
[21:01:45] symlinks for hphp?
[21:01:59] <^demon> Yeah
[21:02:06] <^demon> And hphpi
[21:02:17] <^demon> Our mw build script expects them in $PATH
[21:02:39] hrm... don't think the Debian people would take too kindly to that
[21:02:57] <^demon> Oh true
[21:02:58] "symlinks in /usr/bin !!!!"
[21:03:06] anyway
[21:03:06] *^demon is just thinking how he'd do it cuz he's lazy
[21:03:11] mw build script?
[21:03:36] is that in svn and I missed it?
[21:03:45] <^demon> ./maintenance/hiphop/
[21:03:46] hi hexmode
[22:34:15] Question: When should I use the XML methods for building HTML components and when should I use the HTML methods? I've looked through the documentation but have never been able to figure this out.
[22:36:04] I didn't really care which class I was using until I notices other people changing my "XML::" to "HTML::", but without any explaination.
[22:36:14] notices=noticed
[22:36:39] and in the process, breaking my code in some cases
[22:37:34] should I just revert these changes, or should I migrate my code to use one or the other?
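
Note on the sampling frame mentioned at 00:51:55 and 00:52:06: the log only gives the two conditions, `!( $rand % 50 )` and `$host == 'commons.wikimedia.org'`. The actual StartProfiler.php is not shown, so the following is a rough Python sketch of that logic under stated assumptions, not the real config. The names `should_profile`, `pick_rand`, `SAMPLE_RATE` and `PROFILED_HOST` are hypothetical, and how `$rand` is actually drawn is an assumption.

```python
import random

SAMPLE_RATE = 50                         # from "!( $rand % 50 )": roughly 1 request in 50
PROFILED_HOST = "commons.wikimedia.org"  # host named in the log


def should_profile(host, rand_value, sample_rate=SAMPLE_RATE):
    """Return True when the request is in the sampling frame and hits the profiled host."""
    # "!( $rand % 50 )" is true exactly when the remainder is zero.
    in_sample = rand_value % sample_rate == 0
    return in_sample and host == PROFILED_HOST


def pick_rand():
    # Hypothetical stand-in: the log does not say how $rand is generated.
    return random.randrange(0, 2**31)


if __name__ == "__main__":
    total = 100_000
    hits = sum(should_profile(PROFILED_HOST, pick_rand()) for _ in range(total))
    print(f"profiled {hits} of {total} commons requests")
```

With a 1-in-50 sample, only a handful of uploads are recorded in the first few minutes, which fits the remark at 00:50:00 that the stats should be left to accumulate for a few hours before reading much into an average.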