[00:03:33] marktraceur: Aha !, sorry about that, promise to keep it really short (< 5m) to minimize the pain, ;-).
[00:04:36] 0's all right
[00:04:40] 's all right*
[00:15:19] (PS7) Aarcos: Fix resize listener leak problem. [extensions/MultimediaViewer] - https://gerrit.wikimedia.org/r/96819
[00:15:59] (CR) jenkins-bot: [V: -1] Fix resize listener leak problem. [extensions/MultimediaViewer] - https://gerrit.wikimedia.org/r/96819 (owner: Aarcos)
[00:18:46] (PS8) Aarcos: Fix resize listener leak problem. [extensions/MultimediaViewer] - https://gerrit.wikimedia.org/r/96819
[16:27:56] marktraceur: https://bugzilla.wikimedia.org/show_bug.cgi?id=58100
[16:28:10] we should probably deploy this
[17:09:52] Aha
[17:09:58] tgr: Good point
[17:13:07] W00t! https://bugzilla.wikimedia.org/show_bug.cgi?id=56178 is closed (security review of GWToolset)
[17:16:40] Noice
[17:47:18] * marktraceur going to get groceries *super* quickly
[17:47:34] Will be on hangout on time
[17:57:39] Woot
[18:00:00] Stand-up!
[18:00:00] Order: fabriceflorin -> bd808 -> tgr -> marktraceur
[18:11:33] fab
[18:11:49] aarcos: https://meta.wikimedia.org/wiki/Schema:MediaViewerPerf and I can link it to Fabrice if he gets on IRC
[18:13:52] (PS9) Aarcos: Fix resize listener leak problem. [extensions/MultimediaViewer] - https://gerrit.wikimedia.org/r/96819
[18:40:34] tgr: Any chance you'd want to review https://gerrit.wikimedia.org/r/99696 - it should be easy, and it fixes some really broken behaviour
[18:40:58] bawolff: will check in a moment
[18:44:55] marktraceur: Will take a look, tx !
[18:52:28] marktraceur: Schema looks good, any reason not to keep imageWidth and imageHeight? This may give us a clue on the common image dimensions, if we just keep the area we lose this info.
[18:52:41] Maybe so
[18:52:43] I'll change it
[18:54:29] marktraceur: That's it for the moment, but I'll let you know if anything else comes to mind.
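Editorial aside on the width/height point raised at 18:52:28: a minimal sketch of why logging only the area loses information. The class and field names below are hypothetical and are not taken from the actual MediaViewerPerf schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImagePerfEvent:
    """Hypothetical perf event; names are illustrative, not the real schema."""

    action: str        # e.g. "image-load"
    duration_ms: int   # how long the action took
    image_width: int   # pixels
    image_height: int  # pixels

    @property
    def area(self) -> int:
        # Area can always be derived from width and height,
        # but width and height cannot be recovered from area alone.
        return self.image_width * self.image_height


# Two very different shapes (made-up numbers) that share the same area.
wide = ImagePerfEvent("image-load", 420, 4000, 1000)
tall = ImagePerfEvent("image-load", 450, 1000, 4000)
```

With only `area` stored, `wide` and `tall` would be indistinguishable, which is the information loss described in the message above.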
[18:57:25] aarcos: I'm also curious about Fabrice, could you poke him about not being on IRC?
[19:00:05] marktraceur: He left like 20min ago, probably to a meeting, leaving a post it on his desk, ;-).
[19:00:38] Ah, sure
[19:07:18] (CR) MarkTraceur: [C: -1] "1. We get "View License" if there is no parseable license. Make sure that the message exists before adding it to the title." [extensions/MultimediaViewer] - https://gerrit.wikimedia.org/r/99533 (owner: Yamelnychuk)
[19:20:34] marktraceur: One more thing, these metrics look "browser centric", which is fine for client-only operations like load, but in my experience it is also good to have the server-side story and put instrumentation there too. This gives you a clearer picture of where time is being spent and allows you to infer the network latency component of any operation. Makes sense?
[19:21:23] It does, but that's a muuuuch larger issue
[19:21:38] bd808|LUNCH: Are there plans in motion for monitoring the performance of image scalers?
[19:22:16] Maybe not in this iteration but something to keep in mind.
[19:22:50] Yeah
[19:22:54] It's not really something MMV could do
[19:28:34] Oh, I guess I should maybe think about fixing the bucketing patch before perf logging
[19:28:40] Unless we want before and after for that
[19:30:55] marktraceur: As discussed, can you share the URL to the new schema for collecting image load metrics?
[19:31:12] marktraceur: Sorry I couldn't get on earlier, was tied up in another call.
[19:31:37] I can!
[19:31:40] fabriceflorin: https://meta.wikimedia.org/wiki/Schema:MediaViewerPerf
[19:32:08] marktraceur: Thanks for adding the URL to the Mingle card as well, where I just found it:
[19:32:09] https://mingle.corp.wikimedia.org/projects/multimedia/cards/69
[19:32:33] fabriceflorin: aarcos already mentioned that we should use height and width, not just area, and that we should get stats for the image scalers (but that's separate)
[19:32:54] Oh wait, the URL on Mingle is the old one.
I'll add this new URL now, so we're all on the same page: https://meta.wikimedia.org/wiki/Schema:MediaViewerPerf
[19:33:02] Yeah
[19:35:27] Does 'image-load' correspond to the time that the user clicked on the thumbnail? or the time that the server received the load request from Media Viewer?
[19:35:28] marktraceur: what do you think about logging page load time as well?
[19:35:45] tgr: I feel like that might be another coretype thing, but we could do that
[19:36:07] lightbox load time divided by page load time seems like a good metric of user annoyance
[19:36:12] fabriceflorin: We have no way of knowing when the server gets the request, but I'll probably start the image-load timing when we send the first request.
[19:36:14] I agree with aarcos's recommendation to keep track of height and width separately, this will be useful data.
[19:38:39] marktraceur: Sounds good. I think we want to use metrics that are as close as possible to what the user experiences, so we can reduce the pain point from a user's perspective (so we can reduce the amount of time he/she has to wait to see the large image).
[19:40:21] Ah, 'kay
[19:40:47] fabriceflorin: We could reasonably do that, but I feel like figuring out granularly what takes what amount of time will be one of the more helpful things
[19:41:17] fabriceflorin: Interface creation is basically going to be 1% of the time, if that, because the network load times are so much longer than DOM traversal and manipulation times
[19:41:26] I also think that it would be helpful to track some of the other file data which I proposed on Mingle: File Size (bytes), File Format (e.g. JPEG), File Resolution (dpis), File URL -- as well as user data (Connection Speed, Computer Type, Browser Version, Screen Size, User ID). Is that info being stored somewhere so we can retrieve it to analyze things further?
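Editorial aside on the two timing ideas above (starting the image-load timer when the first request is sent, and tgr's "lightbox load time divided by page load time"): a rough sketch under stated assumptions. `Stopwatch` and `annoyance_ratio` are hypothetical names, and nothing here reflects how MultimediaViewer actually records timings.

```python
import time


class Stopwatch:
    """Client-side timer: started when the first request is sent."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the timer can be tested deterministically.
        self._clock = clock
        self._start = None

    def start(self):
        self._start = self._clock()

    def stop_ms(self):
        if self._start is None:
            raise RuntimeError("stop_ms() called before start()")
        return (self._clock() - self._start) * 1000.0


def annoyance_ratio(lightbox_load_ms, page_load_ms):
    """Lightbox load time relative to page load time (higher is worse)."""
    if page_load_ms <= 0:
        raise ValueError("page_load_ms must be positive")
    return lightbox_load_ms / page_load_ms
```

Because the timer only sees the client side, the measured duration includes network latency and server time together, which is why the server-side instrumentation mentioned earlier would be needed to separate the two.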
[19:41:36] https://mingle.corp.wikimedia.org/projects/multimedia/cards/69
[19:41:45] Hm
[19:41:57] Connection speed and computer type...aren't really anything we can do
[19:42:13] I don't think we can manage screen size, just browser window size, but it won't tell us anything useful
[19:42:20] User ID is also probably not useful here
[19:42:25] Because it's just perf metrics
[19:42:33] Isn't that in violation of privacy policy (associating user id -> browser version, OS, etc)
[19:42:42] Why are we fetching the gender of the user? ('gender-fetch') That raises all sorts of troubling questions for me …
[19:44:18] OK. Too bad we can't get things like connection speed, which are so key to the load time. Let's at least get browser version and OS, as that could be relevant.
[19:44:54] UserID could be useful if we wanted to figure out why things are taking so long for that particular user.
[19:45:00] fabriceflorin: We're fetching the gender of the uploader from the API
[19:45:10] It's available
[19:45:29] Why would the gender matter in this case? This could be perceived as sexist.
[19:45:35] fabriceflorin: user-agent will "have" the browser and OS, but they'll be potentially spoofed
[19:45:41] fabriceflorin: It's for i18n purposes
[19:46:12] Does the i18n use of gender impact image load times?
[19:46:43] It's an API request that takes some amount of time, but it doesn't block the rest of the interface loading IIRC, I think we set it to "unknown" and update when the response comes back
[19:46:50] We can remove that one if you'd like
[19:47:11] What about tracking some of the file data recommended above? (e.g. file size in bytes, resolution in dpis, etc.)
[19:47:12] But it won't hurt to know the impact it has
[19:47:40] marktraceur: Don't mean to nag, but some of the other data I'm proposing seems more relevant than gender.
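Editorial aside on the 'gender-fetch' metric described at 19:46:43: a small sketch of a non-blocking fetch that starts from a placeholder value and records only how long the request took. The function names and the returned value are made up for illustration; the real code is JavaScript inside MultimediaViewer and may be structured differently.

```python
import asyncio


async def fetch_uploader_gender():
    """Hypothetical stand-in for the API request; the value is made up."""
    await asyncio.sleep(0.01)
    return "female"


async def open_lightbox(fetch=fetch_uploader_gender):
    state = {"gender": "unknown", "interface_ready": False}
    loop = asyncio.get_running_loop()
    started = loop.time()

    # Kick off the request without waiting for it.
    task = asyncio.create_task(fetch())

    # The interface is built immediately, using the placeholder value.
    state["interface_ready"] = True
    snapshot = dict(state)

    # Later, the response arrives and the state is updated.
    state["gender"] = await task
    # The perf metric is the fetch duration, not the gender itself.
    state["gender_fetch_ms"] = (loop.time() - started) * 1000.0
    return snapshot, state
```

Calling `asyncio.run(open_lightbox())` returns the state as it looked when the interface became ready and the state after the response arrived.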
[19:48:08] File size in bytes is likely helpful, but I think resolution in dpis won't be appreciably different data from the resolution in pixels, unless I have no idea what I'm talking about
[19:48:18] fabriceflorin: Not _gender_, gender _fetch time_
[19:48:31] I think you're not understanding what I'm storing with that
[19:48:51] marktraceur: It can be different, I can't imagine that matters much though
[19:49:01] bawolff: I didn't figure so
[19:49:06] Ooh, file type will be important though.
[19:49:07] OK on the dpi front, glad we can capture the bytes. Are we tracking the file URL as well, so we can investigate and find out why a particular file is causing issues?
[19:49:09] * marktraceur does that
[19:49:29] Yes, knowing if it's a JPEG or PNG is relevant, methinks.
[19:49:35] fabriceflorin: I doubt that will be relevant...file type and size should basically cover us there
[19:49:54] If there are *serious* issues with a particular file the scaler logs will probably tell the story
[19:50:04] By file type, do you mean JPEG vs. PNG?
[19:50:28] Well certain files will take longer to render than others. A 100 MP file takes longer
[19:50:35] than a sane sized file
[19:50:53] fabriceflorin: Yeah, we have the file extension, so it's useful
[19:51:10] Cool, thanks for the clarifications.
[19:51:11] OK, gotta prepare for my lunch with Erik now. Can offer more input afterwards, but at least you now have my main comments.
[19:51:30] marktraceur: I think there is OS-level monitoring for the image scalers, but I'm not aware of any deep instrumentation. That would be a good enhancement bug to file and flag as performance and/or platformeng.
[19:51:58] And then pester Ori about how important it really is :)
[19:52:54] 'kay
[19:52:58] * marktraceur does *that* next
[19:52:59] I think that if the thumb pipeline RFC gets the green light performance instrumentation could probably ride along
[19:53:14] bd808: Should I talk about it on the RFC page then?
[19:54:15] Nah.
Just file a perf bug but I'll keep it in mind when and if we start talking schedules
[19:54:27] 'kay!
[19:56:40] there is also https://noc.wikimedia.org/cgi-bin/report.py?db=thumb-1.23wmf5 for the image scalers if it's useful (not sure what type of data you're after)
[19:58:04] bawolff: That
[19:58:48] That's probably the sort of info, but it would be good to get the important numbers into graphite so we can see trends
[19:59:49] I think I'd like to see total response time by media type and maybe time spent shelling out to convert
[20:01:06] Wait wait, is that CommonSettings taking up over 20% of the real time?
[20:01:22] That seems supremely suboptimal
[20:01:57] Something like thumb.(fetch|gen).(gif|bpm|jpg|…).elapsed.p_90
[20:02:24] Yrrrs
[20:03:22] Probably that includes requests that are cached responses or something, if the request is a no-op, then CommonSettings.php might take a large part of the time
[20:03:33] or maybe those numbers are total crap :)
[20:03:37] * bawolff doesn't really know
[20:03:50] no-op wouldn't make it to the scaler I don't think
[20:04:07] but yeah I don't know how to interpret the numbers in those reports
[20:04:37] CommonSettings is a pig though because that's where all of MW is set up
[20:04:41] I think that report was used more before graphite existed, so maybe nobody is maintaining it, and it no longer has accurate info or something
[20:05:23] * bawolff not really familiar with how wmf does profiling
[20:05:50] The real% numbers there don't add up either. Or more specifically they add to >100%
[20:07:15] Oh but the calls would nest duh
[20:07:26] I did a ?forceprofile=true request recently that said 30000% of the request time was being used to generate image links...
[20:08:40] Cool.
You figured out how to make PHP multi-threaded :)
[20:15:26] (PS1) MarkTraceur: Move clearInterface things to mw.LightboxInterface [extensions/MultimediaViewer] - https://gerrit.wikimedia.org/r/99722
[20:16:11] (PS2) MarkTraceur: Move clearInterface things to mw.LightboxInterface [extensions/MultimediaViewer] - https://gerrit.wikimedia.org/r/99722
[20:17:12] I guess I should add some tests there too
[20:35:36] (PS3) MarkTraceur: Move clearInterface things to mw.LightboxInterface [extensions/MultimediaViewer] - https://gerrit.wikimedia.org/r/99722
[20:35:38] THERE
[20:35:42] Take your blood money
[20:35:43] Er
[20:35:46] Unit tests
[20:38:01] marktraceur: we have two logging schemas, one of them logs user ids, the other logs data which is enough to identify an image
[20:38:06] isn't that bad?
[20:39:05] tgr: I guess "yes", but I would bow to the legal team
[20:40:01] shouldn't we disable userid logging then?
[20:40:08] not like we make any use of it
[20:40:37] True
[20:40:49] fabriceflorin: Thoughts?
[20:41:00] I think our conclusion before was that it was a non-issue
[20:42:22] i think the conclusion was that logging only one of userid and image name is fine
[20:42:34] Ahhh.
[20:42:42] but logging image size + dimension is pretty much the same as logging the name
[20:43:15] we have two different schemas, so the question is, can those be correlated?
[20:43:34] if timestamps are also logged, then it would be trivial
[20:44:04] They are, but that data is private, IIRC
[20:47:19] re network speed, there is navigator.mozConnection.bandwidth, but it only works on firefox for android
[20:47:28] which is a pretty small market share
[20:47:37] even so, it might be interesting
[20:57:43] I've torn out the user ID field and the edit count field from the media viewer schema
[20:58:13] I don't think we need to worry about the user agent field
[21:03:15] /ih
[21:03:17] ...
[21:32:55] Hey! GWToolset config pushed for beta.
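Editorial aside on the correlation concern raised at 20:43:15: a minimal sketch of how two separately logged streams could be joined on timestamps. The record layouts, field names, and one-second tolerance are assumptions made for illustration, and the sample data is fabricated; none of it comes from the real schemas.

```python
def correlate(user_events, image_events, tolerance_s=1.0):
    """Pair records from two logs whose timestamps fall within a tolerance."""
    pairs = []
    for user_event in user_events:
        for image_event in image_events:
            if abs(user_event["ts"] - image_event["ts"]) <= tolerance_s:
                pairs.append(
                    (
                        user_event["user_id"],
                        image_event["width"],
                        image_event["height"],
                        image_event["size_bytes"],
                    )
                )
    return pairs


# Fabricated sample data: one schema logs user ids, the other logs
# image dimensions and size (enough to identify a specific image).
user_log = [{"ts": 100.0, "user_id": 42}, {"ts": 250.0, "user_id": 7}]
image_log = [
    {"ts": 100.4, "width": 4000, "height": 1000, "size_bytes": 812345},
    {"ts": 900.0, "width": 800, "height": 600, "size_bytes": 54321},
]
linked = correlate(user_log, image_log)
```

Here `linked` ties user 42 to one specific image even though neither log contains both facts on its own, which is why removing the user ID field from one schema closes off this join.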
[21:33:13] Wooo
[21:33:43] * bd808 has a George Jetson style sore finger from all the +2's
[21:34:24] Hahaha
[21:35:32] bd808: That's wonderful news. Congratulations to everyone who made this possible!
[21:36:46] dan-nl: You must be feeling pretty good about this :) Thanks so much for your perseverance and patience!
[21:37:46] fabriceflorin: yes indeed!! i'm so happy we were all able to put this thing together with so many people so busy with so many other projects
[21:39:03] dan-nl: http://commons.wikimedia.beta.wmflabs.org/wiki/Special:Version shows "GWToolset (Version 0.0.1-dev) (de0970e) 00:06, 5 December 2013"
[21:39:41] I'm not sure why 888d2d0 didn't make it over there yet but it should soon
[21:39:45] bd808: excellent! just need to be added to the gwtoolset group then. do you happen to know of an admin that can do that?
[21:40:38] ah, okay, we'll need to wait for 888d2d0 to make it over there before it will work
[21:41:05] http://commons.wikimedia.beta.wmflabs.org/w/index.php?title=Special:ListUsers&group=sysop
[21:42:44] thanks
[21:45:21] beta is slooooooow right now
[21:46:24] Related?
[21:46:48] I don't think it could be. Nobody has rights to use the new extension yet
[21:47:08] Unless just adding it to the config caused issues
[21:47:38] It doesn't run *any* code on its own?
[21:48:09] no, there are only jobs that are run in the background if they exist
[21:48:32] but since no one has access to the extension atm there should be no jobs waiting to be run
[21:49:01] response from beta is fine over here atm
[21:58:19] dan-nl: Glad we could all swarm on this project to push it through the first gate together. I think I can see the light at the end of the tunnel now. Please continue to ping us as you have -- and good luck on the next steps! :)
[21:59:02] thanks fabriceflorin
[22:01:04] dan-nl: 888d2d0 is on beta now so things should be ready to test
[22:01:18] perfect.
[22:01:46] just need to get access to the tool so i can run a small test
[22:05:08] dan-nl: hashar seems to be online in #wikimedia-labs. He could give you the group access for sure
[22:05:22] There may be other beta commons sysops in the as well
[22:05:28] *in there
[22:06:19] thanks, i'll ping him there … tried earlier in #wikimedia-dev
[22:51:28] bd808, update. the extension is working, but not copying the media files from the external domain … looking into what config value might need to be changed
[22:52:16] dan-nl: Ok. I'm in a heated discussion in another room. I didn't mean to wander away on you.
[22:52:29] oh no worries, i understand
[22:54:22] dan-nl: I think there's something funky about external http access on wmf cluster - it might have to go through some proxy or something
[22:54:38] i think i found the config for it
[22:54:38] * bawolff doesn't really know. I don't even know if beta is considered on cluster...
[22:54:55] It's on labs, so no
[22:55:03] InitialiseSettings.php line 10297
[22:55:26] it looks like it only allows flickr as an external copy from url
[22:55:50] this might be tricky … the potential for the whitelist is great
[22:56:19] * bawolff has never understood why we don't allow url uploads from all domains...
[22:56:26] do we need to whitelist every domain that a glam might use to download content from?
[22:57:17] I wonder if you could convince the powers that be to drop the whole whitelist business - it would make a bunch of people on commons happy to be able to do arbitrary upload by url
[22:58:54] for the beta cluster, could we temporarily set the array to array(). it looks like if it has nothing in it UploadFromUrl will allow any url
[22:59:38] bawolff: i don't know if i can do that, but i can try ...
[23:00:01] dan-nl: yeah, that might be difficult
[23:02:39] dan-nl: For reference, see https://bugzilla.wikimedia.org/show_bug.cgi?id=45735 for info about that debate
[23:04:24] bd808: if i want to change the config just for the beta server, can i just make that change in InitialiseSettings-labs.php?
[23:04:55] bawolff: thanks for that link
[23:05:51] dan-nl: I think so yes, but … I don't think that making a similar change in prod will be so easy
[23:06:16] i don't think so either
[23:06:20] So what's the issue exactly?
[23:07:02] InitialiseSettings.php line 10297 'wgCopyUploadsDomains'
[23:07:27] the array has a whitelist with a default array that i believe the beta cluster uses
[23:07:43] if we set it for commonswiki => array()
[23:07:56] then UploadFromUrl.php will whitelist everything
[23:08:22] can i add what i think i need to add to that beta cluster patch mark created or do i need to start another?
[23:10:10] You need to start another. You can't amend merged patches
[23:10:26] k, i was wondering about that … will start another
[23:20:05] bd808: https://gerrit.wikimedia.org/r/99775
[23:28:43] dan-nl: Got it. I'm going to get some other folks to look at it before I push.
[23:28:53] k
[23:37:27] hey bawolff i see you can approve the concept ;) https://gerrit.wikimedia.org/r/99775
[23:38:26] I don't generally +1 ops/config related changes, as I don't really know enough about that area to know what I'm talking about
[23:38:41] I don't even know why we have restricted domains for that setting in the first place
[23:39:15] k, np :)
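
Editorial aside on the whitelist behaviour dan-nl describes for 'wgCopyUploadsDomains' (an empty array allowing any URL): a simplified sketch of that rule as stated in the chat. This is not MediaWiki's implementation; `is_allowed_domain` is a hypothetical helper, it uses exact host matching only, and the example domains are placeholders.

```python
from urllib.parse import urlsplit


def is_allowed_domain(url, allowed_domains):
    """Return True if the URL's host passes the whitelist.

    Assumption taken from the chat: an empty whitelist allows every domain.
    """
    host = urlsplit(url).hostname
    if host is None:
        # Not a usable URL at all, so reject it regardless of the whitelist.
        return False
    if not allowed_domains:
        return True
    host = host.lower()
    return any(host == domain.lower() for domain in allowed_domains)


# Placeholder whitelists: one restricted, one empty.
restricted = ["images.example.org"]
unrestricted = []
```

With `restricted`, a copy from `https://glam.example.net/a.jpg` is refused, while with `unrestricted` it is accepted, which matches the temporary beta-cluster change proposed above and also shows why the same change is a harder sell in production.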