[00:00:48] * RoanKattouw blames Krinkle , partially
[00:03:32] Krinkle: https://gerrit.wikimedia.org/r/#/c/94708/
[00:09:00] ori-l: Timo knows what's going on and is fixing
[00:09:16] cool, thanks
[00:12:22] MaxSem: Are you doing your LD now?
[00:12:46] RoanKattouw: fix https://gerrit.wikimedia.org/r/#/c/74400/ I command you
[00:13:05] Oh right that one
[00:13:07] sigh
[00:13:14] thou cluttereth thine dashboard
[00:13:15] poor Psi IM wiki http://psi-im.org/wiki/
[00:13:47] (PS1) Chad: Fix test2wiki to only have Cirrus as primary, not both [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99007
[00:13:54] RoanKattouw, awwww - totally forgot:(
[00:13:58] (CR) Chad: [C: 2 V: 2] Fix test2wiki to only have Cirrus as primary, not both [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99007 (owner: Chad)
[00:14:00] feel free to deploy
[00:14:10] I'll be ready after that
[00:14:14] sorry
[00:14:32] OK
[00:14:34] Going now
[00:14:50] !log demon synchronized cirrus.dblist 'Fix test2wiki to have Cirrus as primary'
[00:14:53] mutante: we heard you like exceptions so we put an exception inside your exception, etc.
[00:15:01] ^d: Dude?
[00:15:06] Logged the message, Master
[00:15:08] Scheduled deploy window?
[00:15:27] <^d> LD.
[00:15:37] Yes
[00:15:47] ori-l: haha, yea, exceptions in the exception handler are the best exceptions?
[00:15:53] Which you didn't list yourself for
[00:15:59] And you didn't communicate with the people that had listed themselves
[00:16:35] <^d> Sorry the loop didn't get closed. It was discussed and approved.
[00:16:39] OK
[00:16:41] Are you done?
[00:16:41] <^d> It should've gotten on-wiki.
[00:16:43] <^d> Yes.
[00:16:47] Cause then I'll go next, and Max after me
[00:16:51] Sweet
[00:17:06] (CR) Catrope: [C: 2] Disable VisualEditor in content namespaces on svwiktionary [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/98871 (owner: Catrope)
[00:17:35] mutante: :P
[00:18:23] (Merged) jenkins-bot: Disable VisualEditor in content namespaces on svwiktionary [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/98871 (owner: Catrope)
[00:19:58] !log catrope synchronized wmf-config/InitialiseSettings.php 'Add wmgVisualEditorInContentNamespaces and set it to false on svwiktionary'
[00:20:14] Logged the message, Master
[00:21:00] !log catrope synchronized wmf-config/CommonSettings.php 'Add plumbing for wmgVisualEditorInContentNamespaces'
[00:21:14] Logged the message, Master
[00:30:38] !log catrope synchronized php-1.23wmf4/extensions/VisualEditor/VisualEditor.php 'Blacklist IE11'
[00:30:52] Logged the message, Master
[00:30:53] !log catrope synchronized php-1.23wmf5/extensions/VisualEditor/VisualEditor.php 'Blacklist IE11'
[00:30:57] yurik: https://gerrit.wikimedia.org/r/#/c/97130/ is still unmerged
[00:31:10] (Abandoned) Chad: Make Cirrus the default on test2wiki again [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/98901 (owner: Manybubbles)
[00:31:59] paravoid, darn, totally missed it - if you can +2 it, will push it out right now (unless anyone is deploying?)
[00:32:05] greg-g, lightning?
[00:32:06] yurik: also, X-Subdomain is wrong there, there's not going to be an X-Subdomain
[00:32:19] paravoid, why not?
[00:32:21] so fix that too
[00:32:40] paravoid, are you saying it won't go through the varnish zero detection?
[00:32:46] because then it won't work
[00:33:17] we need it to go through zero netmapper to figure out who it is
[00:33:27] hm, I guess it will have it set
[00:33:53] but why do you need it anyway? Host is sufficient for that, isn't it?
[00:34:14] Logged the message, Master
[00:34:25] i.e.
it will never be Host: m.wikipedia.org X-Subdomain: ZERO or Host: zero.wikipedia.org X-Subdomain: M
[00:34:30] paravoid, absolutely not - some carriers allow one without the other
[00:34:37] oh, that
[00:35:04] paravoid, https://gerrit.wikimedia.org/r/#/c/98998/
[00:35:17] that patch looks at the HOST if X-SUBDOMAIN is not set
[00:35:51] so if you want, X-SUBDOMAIN doesn't have to be really set for the redirector after that patch goes out.
[00:36:02] but for now, i am guessing it will be set anyway?
[00:36:32] it will be, but the Vary is redundant
[00:37:09] paravoid, wouldn't varnish remove m. and zero. in front?
[00:37:13] Varnish will set X-Subdomain, you don't need the Vary (but it won't hurt)
[00:37:14] if it won't, then yes
[00:37:32] it won't
[00:37:43] it currently does, but i will need to check the regex it uses
[00:37:43] if it did, the virtualhost wouldn't work anyway
[00:37:52] good point
[00:37:58] ok, i can remove that
[00:38:09] I mean, it won't change anything in practice
[00:38:23] but since we haven't merged that "fix Vary" patchset... :-)
[00:39:48] I'm okay if you want to leave it there
[00:40:04] RoanKattouw, are you done?
[00:40:04] (PS2) Yurik: Mobile redirect - changed cache Vary header [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97130
[00:40:13] paravoid, ^
[00:40:15] MaxSem: Sorry
[00:40:17] yes
[00:40:19] :)
[00:40:24] I was done then immediately went to the restroom, sorry :S
[00:40:26] (CR) MaxSem: [C: 2] Fix removal of title coordinates from extracts [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/98848 (owner: MaxSem)
[00:40:35] (Merged) jenkins-bot: Fix removal of title coordinates from extracts [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/98848 (owner: MaxSem)
[00:40:44] MaxSem, are you deploying?
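A minimal sketch of the fallback described at 00:35:17 (use X-Subdomain when it is set, otherwise derive the same answer from Host). This is not the code from the Gerrit change; the function name `resolve_subdomain` and the rule of scanning only the labels in front of the last two host labels are assumptions made for illustration.

```python
def resolve_subdomain(headers):
    """Return "M", "ZERO", or None for a dict of request headers."""
    # Header names are case-insensitive, so normalise the keys first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Prefer the explicit header when it carries a known value.
    explicit = normalized.get("x-subdomain", "").strip().upper()
    if explicit in ("M", "ZERO"):
        return explicit

    # Otherwise fall back to Host: drop any port, then inspect the labels
    # in front of the last two (assumed here to be "<project>.org").
    host = normalized.get("host", "").lower().split(":")[0]
    for label in host.split(".")[:-2]:
        if label == "m":
            return "M"
        if label == "zero":
            return "ZERO"
    return None
```

With this shape a request that reaches the redirector without X-Subdomain is still classified from its Host, which is the behaviour the chat says the patch is after.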
[00:40:51] yep
[00:40:54] (CR) Faidon Liambotis: [C: 2] Mobile redirect - changed cache Vary header [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97130 (owner: Yurik)
[00:41:01] MaxSem, could you grab https://gerrit.wikimedia.org/r/#/c/97130/ as well?
[00:41:02] (Merged) jenkins-bot: Mobile redirect - changed cache Vary header [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97130 (owner: Yurik)
[00:41:11] it's a tiny vary header change
[00:42:09] sure
[00:43:20] !log maxsem synchronized wmf-config/InitialiseSettings.php
[00:43:25] yurik: funny how it redirects on everything you throw at it as a request URL
[00:43:34] Logged the message, Master
[00:43:40] GET /favicon.ico HTTP/1.1
[00:43:44] HTTP/1.1 302 Found
[00:43:46] Location: https://en.m.wikipedia.org/wiki/Main_Page
[00:44:33] anyway
[00:44:35] off for the night
[00:44:38] paravoid, heh, it's a redirector :) But yes, favicon should probably not be a redirect..?
[00:44:47] !log maxsem synchronized mobilelanding.php
[00:44:51] in reality no one should ever hit favicon
[00:44:59] thx MaxSem !
[00:45:03] Logged the message, Master
[00:46:13] yurik: is the mobile team aware of the m.wp.org change?
[00:46:31] the "mobile web" team I guess we call it now ;-)
[00:47:12] i haven't told them explicitly, but we are not changing the behaviour
[00:47:23] because that's the same as hardcoded in varnish
[00:47:32] hm, I guess that's right
[00:48:06] right, nevermind
[00:48:39] we're actually fixing the https case too :)
[00:48:54] if you go to https://m.wp.org now it redirects to http
[00:49:04] not that anyone would do that I guess
[00:49:27] Krinkle: do you have a time-frame for a fix to the ResourceLoaderLanguageDataModule mtime issue?
[00:49:38] ori-l: I do
[00:49:41] ori-l: negative 5 minutes
[00:49:48] lol
[00:49:53] he's good
[00:50:36] In style of Bill and Ted: Don't forget to actually go back in time once you get the time machine.
but considering it is in gerrit, I guess that's proof we'll eventually have time travel :)
[00:50:37] i prefer to use "i"
[00:51:06] (the car keys behind the bushes in the Excellent Adventure, for those who know the movie)
[00:51:07] (noncapitalized)
[00:51:31] paravoid, the vary header has been pushed out btw
[00:51:34] ori-l: https://gerrit.wikimedia.org/r/#/c/99010/
[00:51:36] thx to max
[00:51:41] yeah, looking it over
[00:51:47] yurik: cool
[00:52:10] yurik: do you see any reason for us to not make the rewriterule ^/$ ?
[00:52:19] not really
[00:52:25] it's a bouncer, no one should ever stay on it
[00:53:06] i mean what else will we return - 404?
[00:53:10] yes
[00:53:19] RoanKattouw: Krinkle's change looks good to me; do you want to have a look before I merge?
[00:53:26] Looking
[00:53:45] My commit message to diff ratio will rise if this gets merged :)
[00:53:49] yurik: also note (and review) https://gerrit.wikimedia.org/r/#/c/98058/ so that we can get rid of the other varnish redirect too
[00:53:53] and now I'm off
[00:54:07] * ori-l waves
[00:54:56] s/langauge/language/
[00:54:58] on that commit msg
[00:54:59] paravoid, goodnight, and that would be a separate submit
[00:55:10] we would have to get biz ppl involved
[00:55:13] yurik: yes it will be
[00:55:21] why? it doesn't change anything
[00:55:34] no idea who uses or why we have that, or if we have contracts for that
[00:55:46] go sleep already
[00:56:34] that change doesn't change any business logic
[00:56:51] the only functional change is breaking the old ruby gateway URLs but I don't think we should care anymore
[00:57:56] Krinkle: cherry-picks?
[01:02:03] I'm not deploying right now
[01:02:31] Unless we can get a window, in which I'll be glad to take it all the way
[01:02:37] case*
[01:02:43] i asked greg-g in -dev
[01:02:50] ..and he just said yes
[01:02:53] and I said yes
[01:02:58] ok
[01:03:07] ori-l: go ahead then :)
[01:03:07] want me to do it?
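The proposal at 00:52:10 (match only `^/$`, return 404 for everything else) can be illustrated with a small sketch. It models the policy only, not the Apache configuration; the function name `bounce` and the default target URL (taken from the Location header quoted earlier in the log) are illustrative assumptions.

```python
import re

# Same pattern as the rewrite rule suggested in the chat: the bare root only.
ROOT_ONLY = re.compile(r"^/$")


def bounce(path, target="https://en.m.wikipedia.org/wiki/Main_Page"):
    """Return (status, headers) for a request path under the root-only policy."""
    if ROOT_ONLY.match(path):
        return 302, {"Location": target}
    # Anything else (for example /favicon.ico) is no longer redirected.
    return 404, {}
```

Under this policy `bounce("/favicon.ico")` yields a 404 instead of the 302 shown at 00:43:44.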
[01:03:09] k
[01:03:12] sure
[01:07:28] (PS2) Tim Starling: Add "touch.php" for $wgAppleTouchIcon... [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/90886 (owner: Reedy)
[01:07:48] (CR) Tim Starling: [C: 2] Add "touch.php" for $wgAppleTouchIcon... [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/90886 (owner: Reedy)
[01:08:38] !log ori synchronized php-1.23wmf4/includes/resourceloader/ResourceLoaderModule.php 'Ifa9088c11: resourceloader: Make sure hashmtime cache key is different by language'
[01:09:38] Logged the message, Master
[01:10:43] !log ori synchronized php-1.23wmf5/includes/resourceloader/ResourceLoaderModule.php 'Ifa9088c11: resourceloader: Make sure hashmtime cache key is different by language'
[01:10:58] Logged the message, Master
[01:13:50] (Merged) jenkins-bot: Add "touch.php" for $wgAppleTouchIcon... [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/90886 (owner: Reedy)
[01:15:43] Krinkle: module version looks stable now
[01:18:09] (CR) Tim Starling: [C: -1] "The apache config looks OK already, but it definitely needs the relevant DNS entries to be added before it can be merged."
[operations/mediawiki-config] - https://gerrit.wikimedia.org/r/53885 (owner: Reedy)
[01:21:47] (CR) Tim Starling: "Well, the apache config is OK in that the wikis will be served correctly from the new domains, but redirects will need to be added for b/c" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/53885 (owner: Reedy)
[01:24:35] Krinkle: https://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&c=Bits+caches+eqiad&m=cpu_report&s=by+name&mc=2&g=network_report
[01:25:26] https://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&c=Bits+caches+eqiad&m=cpu_report&s=by+name&mc=2&g=network_report
[01:25:33] Krinkle: https://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&c=Bits+application+servers+eqiad&m=cpu_report&s=by+name&mc=2&g=network_report
[01:25:47] https://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&c=Bits+application+servers+eqiad&m=cpu_report&s=by+name&mc=2&g=network_report
[01:26:26] (PS1) Tim Starling: Additional domains for Icdbfb2e74a [operations/dns] - https://gerrit.wikimedia.org/r/99017
[01:31:47] (CR) Tim Starling: [C: 2] Additional domains for Icdbfb2e74a [operations/dns] - https://gerrit.wikimedia.org/r/99017 (owner: Tim Starling)
[01:33:42] the bits app servers are going to disappear out of existence at this rate
[01:34:15] hehe
[01:34:41] ori-l: Somewhere in my Twitter stream in like early 2011 there's a Ganglia graph of the bits app servers CPU being cut in half
[01:35:21] By changing ResourceLoaderUserModule::getModifiedTime() from return max( 0, max( map( 'mtime', $userJSPages ) ) ); to return max( 1, ....
);
[01:35:24] (since 0 means "now")
[01:36:26] heh
[01:36:31] we've had a few overloads of the bits appservers, causing CSS delivery to be pretty much down
[01:37:02] I'm not sure what to do about it
[01:37:36] Mark was saying something about just throwing them into the general pool
[01:37:38] well, load in general should be much reduced now, but it's definitely still fragile
[01:38:00] it's quite possible that if the two pools were merged, the previous issues would have overloaded the whole appserver pool
[01:38:21] then the site would have been even more down
[01:38:44] maybe the bits varnishes just need to limit their backend connection count
[01:39:41] I don't know, something clever needs to be done, anyway
[01:39:43] should should also have monitoring for continuously increasing module mtimes
[01:39:48] *we should
[01:41:36] well, the issue on November 14 was not related to module mtime
[01:43:07] well, for the ones I caused, I really don't get why 403s need a one-minute TTL
[01:45:17] ori-l: Yeah we should monitor mtimes tracking within 1 minute of NOW() for like more than 5 minutes straight I suppose? That requires a fair bit of state-keeping probably though
[01:45:52] there are lots of ways to overload the bits appservers, not just those two
[01:46:11] you could have private responses, or you could have many URLs
[01:48:40] (CR) Tim Starling: "I added the DNS entries." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/53885 (owner: Reedy)
[01:49:33] if (req.url ~ "modules=startup") beresp.ttl += rand() % 10;
[01:50:13] 10 * 60, rather
[01:51:04] most bits app server request URLs are generated in JS by ResourceLoader based on the contents of the startup manifest
[01:51:25] is it possible to have varnish return a 503 if the number of pending backend requests for a given URL exceeds some value?
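The policy asked about at 01:51:25 (refuse a URL once too many backend requests for it are already pending, while other URLs keep working) can be sketched outside Varnish as a per-URL counter. Nothing here is a Varnish API, and the chat does not say Varnish can do this; the class and method names are hypothetical and only illustrate the bookkeeping involved.

```python
class PendingLimiter:
    """Track in-flight backend requests per URL and cap them."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self._pending = {}  # url -> number of requests currently in flight

    def try_acquire(self, url):
        """Return True if a backend request may start, False to answer 503."""
        current = self._pending.get(url, 0)
        if current >= self.max_pending:
            return False
        self._pending[url] = current + 1
        return True

    def release(self, url):
        """Call when the backend request for url has finished."""
        current = self._pending.get(url, 0)
        if current <= 1:
            # Drop idle URLs so the table does not grow without bound.
            self._pending.pop(url, None)
        else:
            self._pending[url] = current - 1
```

A caller would answer 503 for the one hot URL whenever `try_acquire` returns False, which leaves backend capacity free for every other URL.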
[01:51:58] 503s from bits wouldn't be very nice
[01:52:03] It'd be better to return stale content if present
[01:52:28] sure, if there is stale content
[01:52:58] but if there is none, you have the choice of issuing 503s for the problematic URLs, or letting the whole backend pool overload and issuing 503s for everything, after a timeout
[01:52:58] https://www.varnish-software.com/static/book/Saving_a_request.html
[01:53:24] "When Varnish is in grace mode, it uses an object that has already expired as far as the TTL is concerned. There are several reasons this might happen, one of them being if a backend is marked as bad by a health probe."
[01:54:31] that sounds like the right solution
[01:54:48] well, it is the default, I think
[01:54:55] but like I said, it only works if there is stale cached content
[01:55:02] which is not guaranteed
[01:55:21] for example, the responses may be private, or may be treated as private
[01:56:00] 500s from apache will presumably not be cached
[01:56:06] e.g. PHP timeouts
[01:56:18] so the backend declaration specifies some URL that varnish should poll; if varnish's request fails, it marks the backend as sick and serves cached objects past their normal expiry
[01:58:31] (CR) Tim Starling: [C: -1] "$wmgWikimediaDatabaseLists is only populated on a conf cache miss, so normally, it will be empty." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/57173 (owner: Reedy)
[01:58:38] hmm
[01:59:49] is it possible to have varnish return a 503 if the number of pending backend requests for a given URL exceeds some value?
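The grace-mode behaviour quoted at 01:53:24, and the caveat that it only helps when stale content exists, can be sketched as a toy cache. This is a simplification and not how Varnish is implemented: here a failed fetch stands in for a backend marked sick by a health probe, and `GraceCache`, `BackendDown`, and the `fetch` signature are all hypothetical names.

```python
class BackendDown(Exception):
    """Raised by the (hypothetical) backend callable when it cannot answer."""


class GraceCache:
    def __init__(self, ttl, grace):
        self.ttl = ttl        # seconds an object counts as fresh
        self.grace = grace    # extra seconds it may be served while the backend is down
        self._store = {}      # url -> (body, time stored)

    def fetch(self, url, now, backend):
        """Return (status, body, how) where how is fresh, miss, stale or error."""
        entry = self._store.get(url)
        if entry is not None and now - entry[1] <= self.ttl:
            return 200, entry[0], "fresh"
        try:
            body = backend(url)
        except BackendDown:
            # Serve an expired object only if it is still inside the grace window.
            if entry is not None and now - entry[1] <= self.ttl + self.grace:
                return 200, entry[0], "stale"
            return 503, None, "error"
        self._store[url] = (body, now)
        return 200, body, "miss"
```

A URL that was never cached, or whose response could not be stored, still ends in a 503 when the backend is down, which is the gap pointed out at 01:54:55.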
[02:00:53] you could have the health check URL point to a URL that the bits app servers use to report pending requests
[02:00:53] or even some separate web host that monitors the app servers
[02:01:33] sounds complicated
[02:10:58] just looking through the varnish source
[02:13:44] there's apparently not a hashtable of pending requests keyed by URL or anything convenient like that
[02:15:31] !log LocalisationUpdate completed (1.23wmf5) at Wed Dec 4 02:15:31 UTC 2013
[02:15:41] you would have to traverse a list of all sessions
[02:15:47] Logged the message, Master
[02:16:59] the backends have a list of connections, but I can't see a way to work out the URL for a given connection
[02:20:29] (PS10) Tim Starling: Add a sqldump script wrapper around mysqldump [operations/puppet] - https://gerrit.wikimedia.org/r/43844 (owner: Reedy)
[02:20:52] (CR) Tim Starling: [C: 2] Add a sqldump script wrapper around mysqldump [operations/puppet] - https://gerrit.wikimedia.org/r/43844 (owner: Reedy)
[02:21:41] !log LocalisationUpdate completed (1.23wmf4) at Wed Dec 4 02:21:41 UTC 2013
[02:21:57] Logged the message, Master
[02:23:19] TimStarling: this is what I had in mind: http://paste.tstarling.com/p/YsQEEX.html
[02:23:36] it doesn't seem too complicated to me
[02:27:04] probe.url is just the path part, you can't send a probe to some other host or port
[02:27:37] it just uses the host/port of the backend
[02:28:31] well, if the goal is to mark a backend as unhealthy before it is actually maxed and unable to respond
[02:28:47] then we're targeting a point in time at which apache is still responsive
[02:28:55] so it could just as well be a simple CGI script
[02:29:18] and it doesn't have to be loadavg, it can be based on wc -l /proc/net/tcp or whatever
[02:30:11] well, apache already has MaxClients and configurable listen backlog
[02:30:58] but taking the whole of bits down is not really the feature I was looking for
[02:31:14] letting it reach MaxClients will do
that well enough
[02:31:46] I would like it so that problematic (high request rate, slow) URLs are served 503s, while the rest of the site is still up
[02:36:23] dunno, i have some hazy ideas but nothing definite. maybe strip out the RL timestamp from rl URLs in vcl_hash so that there is always a grace response
[02:37:00] well, not if the set of modules that are requested changes, but dunno
[02:40:05] you could also have a canonical and ideally constant fallback URL for some basic JS / CSS that client-side ResourceLoader knows to default to if requests to bits are failing
[02:40:29] wait wait wait, I know
[02:40:36] use squid as varnish backend!
[02:41:00] ;)
[02:41:07] well, it's not completely absurd, i thought about adding another layer earlier
[02:41:19] mark will kill you tho
[02:41:21] you know the delay pool feature?
[02:42:31] huh. yeah, that looks useful
[02:44:57] gotta run, i'm afraid you'll have to do without my technical expertise
[02:54:47] !log LocalisationUpdate ResourceLoader cache refresh completed at Wed Dec 4 02:54:47 UTC 2013
[02:55:03] Logged the message, Master
[03:48:57] (PS1) Tim Starling: Re-add the docroot/secure directory [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99024
[03:50:37] Why is $wgCanonicalServer set to http:// for the private wikis?
[03:50:45] That seems like a bug.
[04:02:55] (PS1) Tim Starling: secure.wikimedia.org ErrorDocument [operations/apache-config] - https://gerrit.wikimedia.org/r/99026
[04:04:01] what private wikis?
[04:04:19] I see several with https:
[04:05:34] TimStarling: I was looking at https://gerrit.wikimedia.org/r/#/c/53885/4/wmf-config/InitialiseSettings.php,unified
[04:05:55] Hmm, perhaps out of date.
[04:06:19] Seems so.
[04:17:03] (CR) Tim Starling: [C: 2] "Looks good. But needs manual rebase."
[operations/apache-config] - https://gerrit.wikimedia.org/r/90703 (owner: Reedy)
[04:17:04] (CR) jenkins-bot: [V: -1] Move a lot of the miscellaneous wikis out of their own specific docroots [operations/apache-config] - https://gerrit.wikimedia.org/r/90703 (owner: Reedy)
[04:25:22] ori-l, let me know when you want to talk about JsonConfig - i want to figure out a few key points before implementing them
[04:26:45] that's not agile, you know
[04:26:58] ori-l, you?
[04:27:00] or the config?
[04:27:10] heh
[04:27:13] anyways, what's up?
[04:27:31] well, so i started porting zero config to a separate extension
[04:28:03] the goal being Config:Proxies:Opera pages
[04:28:18] or Proxy:
[04:28:35] and JsonConfig ext would allow plugins that define this subspace
[04:29:21] plugins would basically define a hook to handle validation
[04:30:11] something like $wgConfigNs[NS_CONFIG]['Proxy'] = { validatorFunc }
[04:30:35] well, see https://bugzilla.wikimedia.org/show_bug.cgi?id=44057
[04:30:44] some jerk committed to doing it and then ran off
[04:31:20] ori-l, not a jerk, a busy individual :)
[04:31:27] besides, not exactly what i need
[04:31:34] i need validation + ability to change values
[04:31:41] very specific
[04:32:14] validator func would call my function with each field name, providing a callback that knows how to validate just that specific value
[04:33:03] example: validator() { $config->check( 'ips', myIpValidatorFunc )
[04:33:58] well, I wouldn't write off ContentHandler just yet
[04:34:13] the interface includes preSaveTransform
[04:34:15] contenthandler is used everywhere around there :)
[04:34:24] i'm using that too
[04:34:34] it also has a notion of article parts that it calls 'section'
[04:34:55] hmm, that might be useful
[04:35:03] i'm kinda scared of touching it though :)
[04:35:28] i should proly get wikidata ppl onboard
[04:35:30] also from my experience with eventlogging, the set of constraints that JSON schema let you specify is rich enough
to express the set of valid values that a property can take
[04:35:56] i think it'll be overkill for most use-cases to have to write a validator function for every single field
[04:36:25] JSON schema even has the concept of a regexp validator for strings, but I don't think Rob's library currently implements it
[04:36:29] agree - if we can get a general case schema done, it would solve most usecases
[04:37:14] one thing though - we should be able to specify custom manipulators - for example remove duplicates, sort, etc
[04:38:00] right, but a pre-save transform for the configuration object as a whole is probably adequate, right? I mean, you could write a fancy transformer that inspects & changes specific properties
[04:38:06] I wonder if that worked...
[04:38:06] (PS5) MZMcBride: Update wgServer, wgCanonicalServer for sub.subdomain wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/53885 (owner: Reedy)
[04:38:08] (CR) jenkins-bot: [V: -1] Update wgServer, wgCanonicalServer for sub.subdomain wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/53885 (owner: Reedy)
[04:38:46] Closer, maybe.
[04:38:48] ori-l, that's true, we could in theory separate the two
[04:39:05] actually there are exceptions
[04:39:18] the ability to edit JSON schema themselves online seems like a use-case that is specific to EventLogging; JSON schema doesn't even have to be JSON, really, since the PHP validator just decodes the object to an associative array
[04:39:19] a frequent use case would be a new version
[04:39:56] so you could treat the JSON schema spec as a notation for specifying fields & constraints via an associative array
[04:40:11] what do you mean, new version?
[04:40:57] example - your config has { "field" : "value" } and later you decide to support { "field": { "subfld1" : "value1", ...
[04:41:09] so in validator you could rewrite one into another
[04:41:28] dynamically, but that would happen every time config is parsed, not when its saved
[04:41:37] when saving, you would save it in the new version
[04:42:07] and this way all the logic that uses that field would not have to check which one it is
[04:42:16] but treat it as a dict of dicts
[04:42:59] hmm
[04:43:04] (PS3) Tim Starling: Switch www portals to using one docroot [operations/apache-config] - https://gerrit.wikimedia.org/r/91209 (owner: Reedy)
[04:43:08] schema migrations are tricky
[04:43:21] (CR) Tim Starling: [C: 2] Switch www portals to using one docroot [operations/apache-config] - https://gerrit.wikimedia.org/r/91209 (owner: Reedy)
[04:43:24] even though the older config on wiki will still stay unchanged. This might be highly useful for a huge number of config instances
[04:44:09] true, they are tricky - but if you have all the transformation logic in the validation block (that gets executed every time config page is loaded), it will be ok
[04:45:49] with EventLogging each event specifies the schema it validated against by revision ID, which is nice, because revisions are (mostly) immutable
[04:46:24] maybe you could do something similar, like take the notion of an attached namespace (the way talk pages accompany article pages)
[04:47:11] and somehow have the schema that describes the object be connected to it
[04:47:33] well, i guess the relationship of objects to schemas is many-to-one, so that wouldn't work really
[04:47:50] but the broader point that i'm trying to make is that you should have some way of versioning schema
[04:47:57] or maybe not support migrations at all, or curb them somehow
[04:48:02] otherwise it could easily get out of hand
[04:49:27] it might make sense to just decline to provide any facility for schema migrations in the JsonConfig extension itself
[04:50:22] i think schema migrations should be avoided if at all possible; if you can't
practically avoid them then you can do it yourself by having your pre-save and pre-load methods do fancy things, but I don't think the JsonConfig extension itself should provide more structure than that
[04:50:40] (PS6) MZMcBride: Update wgServer, wgCanonicalServer for sub.subdomain wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/53885 (owner: Reedy)
[04:50:50] ori-l, not sure versioning will work in zero case - each user is identified by IP->ID mapping - we process it, and forget about it right thereafter. Nothing is stored long term except configs themselves - and the page name is the ID
[04:50:52] * Elsie waits for jenkins' scorn.
[04:51:21] i guess we should facilitate it (since its already done by zero anyway)
[04:51:34] but in the basic settings use a simple json schema validation
[04:51:52] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 303 seconds
[04:51:55] so basically you map the user's IP address to a config object that tells you various things about how you should handle the request?
[04:52:04] yep
[04:52:20] and the handling is done by the same extension that stored it in the first place
[04:52:22] PROBLEM - MySQL Slave Delay on db1046 is CRITICAL: CRIT replication delay 313 seconds
[04:52:34] ori-l, maybe we should get off this channel :)
[04:52:39] servers don't like us
[04:55:31] !log running sync-apache to deploy I730ec0bf8c388cada17b46497d87146f1e4ded1a
[04:55:47] Logged the message, Master
[04:57:43] (PS7) MZMcBride: Update wgServer, wgCanonicalServer for sub.subdomain wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/53885 (owner: Reedy)
[04:57:47] (PS1) Yurik: Corrected comment for FlaggedRevs deployment [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99028
[04:59:10] (CR) Tim Starling: "Deployed." [operations/apache-config] - https://gerrit.wikimedia.org/r/91209 (owner: Reedy)
[05:00:09] Is nomcom dead?
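The JsonConfig idea discussed from 04:40:57 onward (rewrite an old-format value into the new format every time the config is loaded, so the code that consumes it sees only one shape) might look like the following sketch. It is not JsonConfig code; the function name `upgrade_config` is hypothetical, and placing the old scalar under `subfld1` is an assumption, since the chat elides the full new format.

```python
import copy


def upgrade_config(config):
    """Return a copy of config with "field" in the newer dict form."""
    upgraded = copy.deepcopy(config)  # leave the stored object untouched
    value = upgraded.get("field")
    if value is not None and not isinstance(value, dict):
        # Old format: {"field": "value"}. Wrap it so callers always get a dict.
        # Assumption: the old scalar maps to "subfld1".
        upgraded["field"] = {"subfld1": value}
    return upgraded
```

Because the function is idempotent, it can run on every load, and a config that was already saved in the new format passes through unchanged.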
[05:01:56] !log add index on db1048 otrs.ticket_history.article_id to stop daily slave lag
[05:02:10] Logged the message, Master
[05:02:43] (PS8) MZMcBride: Update wgServer, wgCanonicalServer for sub.subdomain wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/53885 (owner: Reedy)
[05:03:19] Okay, I think that's finished.
[05:03:29] I'll submit a separate changeset for nomcom.
[05:04:38] noncom was killed on October 24.
[05:05:15] nomcom
[05:05:32] DNS was killed, it looks like the wiki itself has been dead for much longer.
[05:07:58] (PS1) MZMcBride: nomcom is dead. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99030
[05:45:15] (PS3) MZMcBride: Create "Draft" namespace on the English Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97675
[05:54:13] (CR) TTO: Create "Draft" namespace on the English Wikipedia (1 comment) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97675 (owner: MZMcBride)
[05:56:01] (CR) Legoktm: Create "Draft" namespace on the English Wikipedia (2 comments) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97675 (owner: MZMcBride)
[05:59:13] (PS4) MZMcBride: Create "Draft" namespace on the English Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97675
[05:59:59] (CR) MZMcBride: Create "Draft" namespace on the English Wikipedia (2 comments) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97675 (owner: MZMcBride)
[06:28:14] PROBLEM - udp2log log age for lucene on oxygen is CRITICAL: CRITICAL: log files /a/log/lucene/lucene.log, have not been written in a critical amount of time. For most logs, this is 4 hours. For slow logs, this is 4 days.
[06:30:04] RECOVERY - udp2log log age for lucene on oxygen is OK: OK: all log files active
[06:34:57] (CR) Ori.livneh: [C: 1] [WIP] Add configuration for Wikimania Scholarships [operations/puppet] - https://gerrit.wikimedia.org/r/98740 (owner: BryanDavis)
[06:45:04] PROBLEM - Disk space on tungsten is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=96%):
[06:48:04] RECOVERY - Disk space on tungsten is OK: DISK OK
[06:55:10] (PS1) Ori.livneh: Graphite: use package's default storage dir [operations/puppet] - https://gerrit.wikimedia.org/r/99032
[06:56:29] (CR) Ori.livneh: [C: 2] Graphite: use package's default storage dir [operations/puppet] - https://gerrit.wikimedia.org/r/99032 (owner: Ori.livneh)
[08:27:47] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[09:12:01] Elsie: oh trust me you don't want to even go near vixie's cron
[09:12:38] Elsie: I fixed a vulnerability 6-7 years ago and my eyes were bleeding
[09:12:39] apergos: on beta, Parsoid now self updates from the master branch :-D
[09:12:54] apergos: had a crazy bug which caused it to show the articles from production, got it fixed yesterday \O/
[09:13:45] woah that is crazy
[09:13:49] but yay for having it working
[09:14:27] some path issues, the daemon was launched using a code base that missed a configuration file, so it was failing back to prod :/
[09:14:45] ouch!
[09:19:56] paravoid: where is gerrit-wm?
[09:20:13] no idea
[09:20:37] * matanya looks at the floor of the channel
[09:26:58] (PS1) Hashar: beta: update Parsoid dependencies only on changes [operations/puppet] - https://gerrit.wikimedia.org/r/99052
[09:26:59] (PS1) Hashar: beta: missing docstring in autoupdater [operations/puppet] - https://gerrit.wikimedia.org/r/99053
[09:27:17] matanya: gerrit-wm got replaced with grrrit-wm which runs on labs
[09:27:26] gerrit-wm used to be some python hooks in Gerrit itself
[09:28:09] hashar: so why grrrit-wm doesn't report my new push?
[09:28:20] was/is broken maybe?
[09:30:28] PROBLEM - LVS HTTPS IPv4 on wikimedia-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:30:58] PROBLEM - LVS HTTPS IPv6 on wikiquote-lb.esams.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:31:14] hi ops
[09:31:41] we're seeing errors coming from the search API that Wikipedia via Text in Kenya is using.
[09:31:49] PROBLEM - LVS HTTPS IPv6 on wikisource-lb.esams.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:31:58] PROBLEM - LVS HTTPS IPv6 on wiktionary-lb.esams.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:31:58] PROBLEM - LVS HTTPS IPv4 on wikipedia-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:32:08] Is this a known issue?
[09:32:18] RECOVERY - LVS HTTPS IPv4 on wikimedia-lb.esams.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 85484 bytes in 1.366 second response time
[09:32:21] not incredibly regular but enough to show up on my graphs
[09:32:40] RECOVERY - LVS HTTPS IPv6 on wikisource-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 64982 bytes in 0.794 second response time
[09:32:48] RECOVERY - LVS HTTPS IPv6 on wikiquote-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 65006 bytes in 0.799 second response time
[09:32:48] RECOVERY - LVS HTTPS IPv6 on wiktionary-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 64870 bytes in 0.803 second response time
[09:32:49] RECOVERY - LVS HTTPS IPv4 on wikipedia-lb.esams.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 85498 bytes in 0.828 second response time
[09:33:09] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[09:33:28] sdehaan: text as in SMS?
[09:33:37] paravoid, who would be the one to ask about those?
seems like a backend search error -- {u'servedby': u'mw1124', u'error': {u'info': u'HTTP request timed out.', u'code': u'srsearch-error'}} [09:33:44] Nemo_bis: yeah, SMS & USSD [09:34:10] from the message above it seems you have someone working on it already :) [09:34:49] Nemo_bis: yurik told me to hop in this channel :) [09:35:04] (03PS2) 10Matanya: salt: lint cleanup [operations/puppet] - 10https://gerrit.wikimedia.org/r/99051 [09:35:32] sdehaan is part of the group in South Africa that has implemented our Wiki over sms/ussd :) [09:35:35] !log short esams outage between 09:30-09:33 UTC due to crappy network vendor issues [09:35:49] yurik: bugzilla [09:35:50] Logged the message, Master [09:35:56] sec, dealing with something else now [09:36:13] sdehaan, http://bugzilla.wikimedia.org/ [09:36:25] would probably be best to track it [09:37:36] well it seems to have gone away now [09:37:56] well, the current search backend is very flaky [09:38:02] but I'll keep an eye out [09:38:04] ah :) [09:38:15] they're working on replacing it [09:38:18] sdehaan: how often do you see these errors? [09:38:30] and for which wikis? [09:39:01] not all that often, just happened to see a bunch fly by for a few minutes resulting in some searching being broken. [09:39:07] paravoid: ehm, the english wiki? [09:39:12] not sure which wikis there are. [09:39:28] english wikipedia would be one [09:41:50] (03CR) 10Akosiaris: "Two nitpicks, otherwise LGTM" (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/99051 (owner: 10Matanya) [09:43:57] (03PS3) 10Matanya: salt: lint cleanup [operations/puppet] - 10https://gerrit.wikimedia.org/r/99051 [09:46:22] paravoid, sdehaan, i just grepped some logs, seems like search timeout is very frequent - ...Search timeout requesting http://10.2.2.11:8123/search/enwiki/... [09:46:48] but they seem to come in high concentration batches [09:47:43] Hoi, yesterday a configuration change went in production for the Polish language ... 
the OpenDyslexic font was added ... the effect of the change is noticeable on wikidata but not on pl.wikipedia. Is there an issue ? [09:48:19] there is some urgency because we are planning press attention for dyslexia and its support in MediaWiki [09:48:42] a blog post for WMF blog in Polish and English is waiting in the wings [09:49:11] akosiaris can you find out for me ? [09:49:53] GerardM-: find out what ? [09:50:23] yes, the mwsearch.log is pretty full [09:50:48] GerardM-: you mean that you say the OpenDyslexic font should be enabled in pl.w but it is not ? [09:50:56] indeed [09:51:03] (03CR) 10Matanya: [C: 04-1] "Please see inline comments." (038 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/96552 (owner: 10Addshore) [09:51:12] it is available in Wikidata [09:51:24] and it was not before ... I thought it was a global change [09:51:57] ie available in any and all wikis [09:52:06] sdehaan, did you create a bug? [09:52:21] I doubt that... changes are usually gradual [09:52:21] i wanted to add some stuff to it [09:52:40] if it is something that changed recently that is [09:53:23] yurik: nope, not sure what to file other than "occasionally the search api gives us errors" which doesn't seem all that useful for a ticket [09:53:35] that string you gave me [09:53:49] i actually saw a lot of them in the error logs [09:53:57] akosiaris the change was yesterday [09:54:05] in your error logs [09:54:08] sdehaan: file what you, what you get, and what you expect to get. this could be a good starting point for debugging [09:54:31] ok, I'll try & collect some data. [09:54:47] *what you did, what you get, and what you expect ... [09:55:52] akosiaris is that period long enough to consider this an issue .. if not when should it be live ? 
[09:55:59] worst estimate [09:56:06] GerardM-: so I see the patch https://bugzilla.wikimedia.org/show_bug.cgi?id=57136 was closed on Nov 22 and MLEB released in Nov 29 [09:56:20] GerardM-: deployments happen on tuesdays and thursdays [09:56:30] that means yesterday [09:56:56] (wikidata is live for pl opendyslexic support) [09:58:37] akosiaris if plwp goes live on Thursday that is fine as well [09:58:45] I just want to know how to plan [09:58:50] GerardM-: https://wikitech.wikimedia.org/wiki/Deployment [09:58:59] where it says for yesterday [09:59:02] right, sdehaan so just create it reporting that you saw it, and i will add some data from our logs, and you will find some log data and add it as well - this way we can already start prioritizing this issue and it shows up for others [09:59:07] MediaWiki deploy window, currently following the 1.23 schedule group1 to 1.23wmf5: All non-Wikipedia sites (Wiktionary, Wikisource, Wikinews, Wikibooks, Wikiquote, Wikiversity, and a few other sites) [09:59:16] and thx for letting me know [09:59:38] yurik: cool will ping you with the issue when filed [09:59:49] GerardM-: so wikidata has 1.23wmf5 and wikipedias will have it tomorrow :-) [10:00:00] sdehaan, np. Off to bed. 5 am [10:00:15] yurik: you're in the wrong timezone [10:00:16] GerardM-: not much I can do to change it, but I hope that answers your question [10:00:18] akosiaris thank you :) you are a gentleman [10:00:19] and gnight [10:00:25] :-) [10:00:35] sdehaan, i'm never in the right TZ [10:01:36] (03CR) 10TTO: "Is there a bug/link to community discussion?" 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/98613 (owner: 10Jforrester) [10:03:46] (03CR) 10Akosiaris: [C: 032] role.pp: minor lint clean [operations/puppet] - 10https://gerrit.wikimedia.org/r/98456 (owner: 10Matanya) [10:04:07] (03PS4) 10TTO: Make missing.php aware of interwiki prefixes [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/94716 [10:30:40] !log running Wikibase/repo/maintenance/dumpJson.php against wikidata in screen as ariel n terbium; if it causes problems feel free to shoot it [10:30:53] Logged the message, Master [10:33:11] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000 [10:48:10] !log brewster:8080 seems overloading, apt-get update from labs giving out connect timeout [10:48:37] morebots: comeon [10:48:39] morebots: ping [10:49:13] I am a logbot running on tools-exec-04. [10:49:14] Messages are logged to wikitech.wikimedia.org/wiki/Server_Admin_Log. [10:49:14] To log a message, type !log . [10:49:14] I am a logbot running on tools-exec-04. [10:49:14] Messages are logged to wikitech.wikimedia.org/wiki/Server_Admin_Log. [10:49:14] To log a message, type !log . [10:50:39] lol, is morebots THAT overloaded?:) [10:53:01] https://twitter.com/wikimediatech still broken [10:53:17] I wonder why that last message was posted in Twitter but NOT on wikitech wiki [10:54:16] MaxSem: is there an open RT ticket for that? [10:54:27] no idea [10:57:46] filing [11:03:39] (03PS1) 10Matanya: rsync: 2 spaces to 4 spaces. [operations/puppet] - 10https://gerrit.wikimedia.org/r/99067 [11:03:39] https://bugzilla.wikimedia.org/show_bug.cgi?id=57969 [11:13:20] (03CR) 10Faidon Liambotis: [C: 04-2] "This is a (modified) puppetlabs module. No reason to diverge in whitespace." [operations/puppet] - 10https://gerrit.wikimedia.org/r/99067 (owner: 10Matanya) [11:13:35] aude: https://bugzilla.wikimedia.org/show_bug.cgi?id=57829#c6 [11:13:55] (03Abandoned) 10Matanya: rsync: 2 spaces to 4 spaces. 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/99067 (owner: 10Matanya) [11:14:18] \o/ [11:21:19] (03PS1) 10Siebrand: Allow exporting files as gettext in Translate [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99070 [11:21:40] Anyone deploying anytime soon? [11:22:52] (03CR) 10Nikerabbit: [C: 031] Allow exporting files as gettext in Translate [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99070 (owner: 10Siebrand) [11:33:04] PROBLEM - RAID on virt9 is CRITICAL: Timeout while attempting connection [11:33:55] RECOVERY - RAID on virt9 is OK: OK: Active: 16, Working: 16, Failed: 0, Spare: 0 [13:00:07] siebrand: Nikerabbit : I can deploy the .po export change if you want [13:01:02] siebrand: Nikerabbit though there is apparently some security concern mentioned in the bug report [13:01:05] ( https://bugzilla.wikimedia.org/show_bug.cgi?id=40341 ) [13:10:50] hashar: Chris has lifted that [13:12:53] hashar: That security concern has been removed, and it was about enabling the import part, not the export part. [13:13:09] hashar: Not having the export part was an oversight that Tilman made us aware of today. [13:14:01] ooo [13:14:19] meanwhile, blacklisting export is not nice, would be much better to whitelist the allowed exports [13:14:31] but that is unrelated [13:14:36] siebrand: wanna deploy right now ? [13:14:48] hashar: yeah, that would be great. [13:15:20] hashar: as a side note, you mentioned tests are broken again in Gerrit. Nikerabbit and Amir are in https://plus.google.com/hangouts/_/event/c77rmpmpsvf40biq3rel7dk441o and I invited you if you want to discuss. They are working on tests... [13:15:25] (03CR) 10Hashar: [C: 032] "Per discussion, the bug mention some security issue but that is on the import side. 
Exporting po doesn't seem like a security issue :-]" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99070 (owner: 10Siebrand) [13:15:35] (03Merged) 10jenkins-bot: Allow exporting files as gettext in Translate [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99070 (owner: 10Siebrand) [13:15:39] hold on, breaking cluster [13:15:48] hashar: heh :) [13:16:07] siebrand: I have emailed amir and niklas a minute ago about ULS browser tests [13:16:22] will join in their daily pairing sessions 10am-noon [13:16:37] hashar: oh, anyway, I thought you might want to talk about it in a higher bandwidth mode :) [13:16:39] pfff [13:17:23] hashar: Nikerabbit is using his famous "I'm in the hangout from two computers, so you can see me, and I can share my screen" mode. [13:17:41] !log hashar synchronized wmf-config/CommonSettings.php 'Allow exporting files as gettext in Translate {{gerrit|99070}} {{bug|40341}}' [13:17:57] hashar: tx. testing... [13:17:58] Logged the message, Master [13:18:00] deployed [13:18:05] well [13:18:06] !log hashar synchronized wmf-config/InitialiseSettings.php 'touch' [13:18:11] now it is enabled. [13:18:15] deployed on beta as well [13:18:33] Logged the message, Master [13:23:03] hashar: works well. Thanks. [13:23:10] hashar: testing round trip, too... ;) [13:29:59] !! [13:44:33] hashar: All is well. [13:58:57] !log nas1001-b cf giveback [13:59:32] Logged the message, Master [14:13:59] (03CR) 10Hashar: "refreshed the package on Jenkins slave integration-debian-builder.pmtpa.wmflabs" [operations/debs/jenkins-debian-glue] - 10https://gerrit.wikimedia.org/r/95424 (owner: 10Hashar) [14:15:24] (03PS1) 10Matanya: redis: lint clean [operations/puppet] - 10https://gerrit.wikimedia.org/r/99087 [14:15:28] PROBLEM - puppet disabled on virt9 is CRITICAL: Timeout while attempting connection [14:16:18] RECOVERY - puppet disabled on virt9 is OK: OK [14:16:40] akosiaris: does /usr/lib/ganglia/python_modules/redis.py exist somewhere? 
[14:17:10] [17042542.967469] nf_conntrack: table full, dropping packet. (from virt9) [14:18:01] but that's been going on for awhile [14:19:30] Dec 2 13:54:01 first occurrence [14:21:39] probablt apergos can answer my question too :) [14:21:41] *y [14:21:55] just a sec [14:22:22] evening [14:23:08] why is it maxed out on virt2 and virt9 and nowhere else? [14:28:15] matanya: probably not [14:28:47] i suppose you are asking about that cleanup line with the absent [14:28:48] matanya: puppet says redis_monitoring.py used to be called redis.py and that is why it is now gone [14:29:24] yes, apergos and akosiaris. can i remove this? [14:29:36] i 'd say yes [14:29:44] pushing a patch [14:29:52] git show 301504fa [14:30:13] and yes by now you should be able to toss it [14:31:05] that is one of the things I hate about puppet is that your manifests get cluttered with ensure absent and no nice way to clean them up [14:34:01] (03PS1) 10Matanya: redis: redis.py is absent everywhere [operations/puppet] - 10https://gerrit.wikimedia.org/r/99089 [14:36:48] akosiaris: if you plan on merging this one ^ please merge https://gerrit.wikimedia.org/r/99087 before if possible, to prevent a conflict. [14:39:25] matanya: that won't avoid the conflict [14:39:54] !log Nuking HerculeBot's watchlist on frwiki at operator's request, too large to modify from web [14:40:02] akosiaris: it won't it would just be easier to solve :) [14:40:21] Logged the message, Master [14:43:02] what's wrong with ganglia? 
http://ganglia.wikimedia.org/latest/?c=MySQL%20eqiad&h=db1015.eqiad.wmnet&m=cpu_report&r=hour&s=descending&hc=4&mc=2 [14:48:40] not ganglia, db1015 [14:48:48] http://ganglia.wikimedia.org/latest/?c=MySQL%20eqiad&m=cpu_report&r=hour&s=descending&hc=4&mc=2 [14:54:24] (03CR) 10Akosiaris: [C: 04-1] "Nitpicks" (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/99087 (owner: 10Matanya) [14:55:23] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000 [14:59:00] (03PS2) 10Matanya: redis: lint clean [operations/puppet] - 10https://gerrit.wikimedia.org/r/99087 [15:01:47] (03CR) 10Akosiaris: [C: 032] redis: lint clean [operations/puppet] - 10https://gerrit.wikimedia.org/r/99087 (owner: 10Matanya) [15:05:26] !log intermittent ganglia reporting from db1015, restarted gmond there, then on db1021 (aggregator) [15:05:34] not sure that will do it, we'll see [15:05:50] Logged the message, Master [15:06:52] (03PS2) 10Matanya: redis: redis.py is absent everywhere [operations/puppet] - 10https://gerrit.wikimedia.org/r/99089 [15:12:09] ok, akosiaris i also fixed the conflict [15:12:14] well that did not improve the situation [15:14:49] (03CR) 10Akosiaris: [C: 032] redis: redis.py is absent everywhere [operations/puppet] - 10https://gerrit.wikimedia.org/r/99089 (owner: 10Matanya) [15:16:16] thank you very much akosiaris. i'll depart now. 
if you are bored, i have some other patches i pushed this week :) [15:17:00] not really :-) [15:25:16] (03PS1) 10Ottomata: debian/gbp.conf - now building against tags instead of branch [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/99094 [15:25:25] (03CR) 10Ottomata: [C: 032 V: 032] debian/gbp.conf - now building against tags instead of branch [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/99094 (owner: 10Ottomata) [15:26:29] PROBLEM - Disk space on virt9 is CRITICAL: Timeout while attempting connection [15:27:20] RECOVERY - Disk space on virt9 is OK: DISK OK [15:32:26] (03PS1) 10Ottomata: Unsetting build-area in gbp.conf [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/99095 [15:32:38] (03CR) 10Ottomata: [C: 032 V: 032] Unsetting build-area in gbp.conf [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/99095 (owner: 10Ottomata) [15:45:30] (03PS1) 10Ottomata: Debianize Kafka [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99101 [15:45:31] (03PS1) 10Ottomata: No need to have a special 'BROKER_JMX_PORT' variable if kafka.default is only read by kafka.init [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99102 [15:45:32] (03PS1) 10Ottomata: Updating/dowgrading libraries [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99103 [15:45:33] (03PS1) 10Ottomata: Updating debian/bin/kafka with new bin scripts and removed obsolete ones. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99104 [15:45:34] (03PS1) 10Ottomata: kafka.init - Using $DEFAULT in error message. 
[operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99105 [15:45:35] (03PS1) 10Ottomata: Adding mirror-maker and consumer-offset-checker to kafka bin script [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99106 [15:45:36] (03PS1) 10Ottomata: Fix kafka_data_dirs_fixes.patch [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99107 [15:45:37] (03PS1) 10Ottomata: Updating logging patch [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99108 [15:45:38] ah! [15:45:38] (03PS1) 10Ottomata: remove kafka-console-consumer-log4j.properties [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99109 [15:45:38] no! [15:45:39] (03PS1) 10Ottomata: Bumping version to git's latest commit on 20130827 [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99110 [15:45:40] (03PS1) 10Ottomata: Installing kafka-mirror init.d and default scripts. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99111 [15:45:41] (03PS1) 10Ottomata: Syncing debian/bin/kafka script with recent 0.8 branch bin/*.sh scripts. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99112 [15:45:42] (03PS1) 10Ottomata: Adapting debian/bin/kafka's server-stop command to change from bin/kafka-server-stop.sh introduced in kafka-1031. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99113 [15:45:43] (03PS1) 10Ottomata: Updating kafka scripts with recent changes from 0.8 branch. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99114 [15:45:44] (03PS1) 10Ottomata: Adding environment var ZOOKEEPER_URL. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99115 [15:45:45] (03PS1) 10Ottomata: Installing tools-log4j.properties in Makefile. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99116 [15:45:46] that's not what is supposed to happen! 
[15:45:46] (03PS1) 10Ottomata: Adding ganglia/graphite backends to kafka [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99117 [15:45:47] (03PS1) 10Ottomata: Adding kafka-ganglia jar. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99118 [15:45:48] (03PS1) 10Ottomata: Installing kafka-ganglia-1.0.0.jar and using it in CLASSPATH. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99119 [15:45:49] (03PS1) 10Ottomata: Update debian/changelog [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99120 [15:45:50] (03PS1) 10Ottomata: Remove scala 2.8 annotations [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99121 [15:45:51] (03PS1) 10Ottomata: Not including consumer.properties and producer.properties in /etc/kafka. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99122 [15:45:52] (03PS1) 10Ottomata: Typo fix MIRROR_CONFFILES_EXAMPLES [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99123 [15:45:53] (03PS1) 10Ottomata: gbp: do not set export-dir [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99124 [15:45:53] sigh [15:45:54] (03PS1) 10Ottomata: debian/gbp.conf - now building against tags instead of branch [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99125 [15:45:55] (03PS1) 10Ottomata: Unsetting build-area in gbp.conf [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99126 [15:45:56] (03PS1) 10Ottomata: Debianization release of 0.8.0 tag [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99127 [15:47:36] sigghhhh [15:47:58] (03Abandoned) 10Ottomata: kafka.init - Using $DEFAULT in error message. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99105 (owner: 10Ottomata) [15:48:02] (03Abandoned) 10Ottomata: Updating debian/bin/kafka with new bin scripts and removed obsolete ones. 
[operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99104 (owner: 10Ottomata) [15:48:11] (03Abandoned) 10Ottomata: Updating/dowgrading libraries [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99103 (owner: 10Ottomata) [15:48:14] (03Abandoned) 10Ottomata: No need to have a special 'BROKER_JMX_PORT' variable if kafka.default is only read by kafka.init [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99102 (owner: 10Ottomata) [15:48:18] (03Abandoned) 10Ottomata: Debianization release of 0.8.0 tag [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99127 (owner: 10Ottomata) [15:48:21] (03Abandoned) 10Ottomata: Unsetting build-area in gbp.conf [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99126 (owner: 10Ottomata) [15:48:24] (03Abandoned) 10Ottomata: debian/gbp.conf - now building against tags instead of branch [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99125 (owner: 10Ottomata) [15:48:29] (03Abandoned) 10Ottomata: gbp: do not set export-dir [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99124 (owner: 10Ottomata) [15:48:31] (03Abandoned) 10Ottomata: Typo fix MIRROR_CONFFILES_EXAMPLES [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99123 (owner: 10Ottomata) [15:48:36] (03Abandoned) 10Ottomata: Not including consumer.properties and producer.properties in /etc/kafka. 
[operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99122 (owner: 10Ottomata) [15:48:39] (03Abandoned) 10Ottomata: Remove scala 2.8 annotations [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99121 (owner: 10Ottomata) [15:48:42] (03Abandoned) 10Ottomata: Update debian/changelog [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99120 (owner: 10Ottomata) [15:48:45] (03Abandoned) 10Ottomata: Installing kafka-ganglia-1.0.0.jar and using it in CLASSPATH. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99119 (owner: 10Ottomata) [15:48:48] (03Abandoned) 10Ottomata: Adding kafka-ganglia jar. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99118 (owner: 10Ottomata) [15:48:51] (03Abandoned) 10Ottomata: Adding ganglia/graphite backends to kafka [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99117 (owner: 10Ottomata) [15:48:55] (03Abandoned) 10Ottomata: Installing tools-log4j.properties in Makefile. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99116 (owner: 10Ottomata) [15:48:58] (03Abandoned) 10Ottomata: Adding environment var ZOOKEEPER_URL. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99115 (owner: 10Ottomata) [15:49:01] (03Abandoned) 10Ottomata: Updating kafka scripts with recent changes from 0.8 branch. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99114 (owner: 10Ottomata) [15:49:04] (03Abandoned) 10Ottomata: Adapting debian/bin/kafka's server-stop command to change from bin/kafka-server-stop.sh introduced in kafka-1031. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99113 (owner: 10Ottomata) [15:49:07] (03Abandoned) 10Ottomata: Syncing debian/bin/kafka script with recent 0.8 branch bin/*.sh scripts. 
[operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99112 (owner: 10Ottomata) [15:49:10] (03Abandoned) 10Ottomata: Installing kafka-mirror init.d and default scripts. [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99111 (owner: 10Ottomata) [15:49:15] (03Abandoned) 10Ottomata: Bumping version to git's latest commit on 20130827 [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99110 (owner: 10Ottomata) [15:49:19] (03Abandoned) 10Ottomata: remove kafka-console-consumer-log4j.properties [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99109 (owner: 10Ottomata) [15:49:22] (03Abandoned) 10Ottomata: Updating logging patch [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99108 (owner: 10Ottomata) [15:49:25] (03Abandoned) 10Ottomata: Fix kafka_data_dirs_fixes.patch [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99107 (owner: 10Ottomata) [15:49:33] yikes, hm [15:49:35] hey akosiaris [15:49:55] how did you intend for me to build kafka once 0.8.0 is released [15:50:04] i'm trying to create a debian-0.8.0 branch [15:50:09] and i've tried it two different ways now [15:50:16] and haven't yet been able to successfully push to gerrit for review [15:50:22] i thought that I should: [15:50:33] branch debian-0.8.0 from the upstream 0.8.0 tag [15:50:34] then [15:50:36] git merge debian [15:50:45] then edit changelog and gbp.conf [15:50:49] and commit that [15:50:51] then use that to build [15:51:04] i think the trouble i'm having with that is the merge step [15:51:09] it creates a merge commit without a change id [15:51:09] hmmm [15:52:07] (03PS2) 10Ottomata: Adding mirror-maker and consumer-offset-checker to kafka bin script [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99106 [15:52:08] (03PS2) 10Ottomata: Debianize Kafka [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99101 [15:53:34] 
(03PS1) 10Ottomata: 0.8.0-1 debianization release [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99128 [15:53:42] hmm, i think i got it, [15:54:02] i created a change-id for the merge commit and pushed that before I made the commit I want to review [15:56:35] PROBLEM - puppet disabled on virt9 is CRITICAL: Timeout while attempting connection [15:57:25] RECOVERY - puppet disabled on virt9 is OK: OK [16:04:14] (03CR) 10Ottomata: [C: 032 V: 032] 0.8.0-1 debianization release [operations/debs/kafka] (debian-0.8.0) - 10https://gerrit.wikimedia.org/r/99128 (owner: 10Ottomata) [16:07:30] !log apt now contains kafka-0.8.0-1 .deb [16:08:00] Logged the message, Master [16:12:43] PROBLEM - DPKG on analytics1022 is CRITICAL: DPKG CRITICAL dpkg reports broken packages [16:13:23] PROBLEM - DPKG on analytics1021 is CRITICAL: DPKG CRITICAL dpkg reports broken packages [16:18:43] RECOVERY - DPKG on analytics1022 is OK: All packages OK [16:21:42] hmm, hey paravoid, another .deb versioning q for you [16:21:55] yes? [16:22:04] so, i want to build a new librdkafka with recent changes from magnus [16:22:18] since the source has changed, we need a new version number, not just a new debian rev number, right? [16:22:32] source of? [16:22:39] librdkafka [16:23:02] what changed? [16:23:33] (03CR) 10Jforrester: "https://es.wikipedia.org/wiki/Wikipedia:Caf%C3%A9/Archivo/Noticias/Actual#Resultados_de_la_encuesta_sobre_el_Editor_Visual" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/98613 (owner: 10Jforrester) [16:23:47] buncha stuff [16:23:53] most recent one I want is [16:23:53] Added socket.send.buffer.bytes and socket.receive.buffer.bytes configuration properties [16:24:00] oh and also better stats support [16:24:09] I assume Snaps is going to release 0.8.1? 
[16:24:12] he changed it so there were no arrays, everything was a keyed json object [16:24:20] well, i think he wants to keep the versions matching with kafka [16:24:25] 0.8.0 was just officially released [16:24:41] and the changes he is making don't necessarily have to do with kafka's 0.8.1 version [16:24:46] 0.8.0.1? [16:24:50] 0.8.0-1-1? [16:25:36] that's Snaps' call :) [16:25:59] this has to do with the upstream versioning scheme, not Debian [16:26:26] riighhhhht [16:26:26] hm [16:26:53] he'll tag a version at some point [16:26:56] I know as much [16:27:09] http://packages.debian.org/search?keywords=librdkafka btw [16:27:23] RECOVERY - DPKG on analytics1021 is OK: All packages OK [16:27:23] PROBLEM - Disk space on virt9 is CRITICAL: Timeout while attempting connection [16:27:25] heh, awesooome [16:27:37] what's the deal with analytics1023? [16:28:02] 1023? [16:28:20] those dpkg notices? [16:28:35] sorry, 1012 [16:28:43] oh [16:28:52] psh, no idea, firmware needs upgraded? [16:28:54] waiting on that? [16:29:09] https://rt.wikimedia.org/Ticket/Display.html?id=6238 [16:29:13] RECOVERY - Disk space on virt9 is OK: DISK OK [17:01:00] (03PS1) 10Ottomata: Updating varnishkafka.conf.example with kafka.socket.send.buffer.bytes default [operations/software/varnish/varnishkafka] - 10https://gerrit.wikimedia.org/r/99154 [17:05:49] (03CR) 10Edenhill: Updating varnishkafka.conf.example with kafka.socket.send.buffer.bytes default (031 comment) [operations/software/varnish/varnishkafka] - 10https://gerrit.wikimedia.org/r/99154 (owner: 10Ottomata) [17:06:00] (03PS23) 10Ottomata: (WIP) Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [17:06:40] paravoid, ottomata: The changes to librdkafka are ABI safe with 0.8.0. If that matters. [17:07:20] ahiiiii cool! 
[17:07:29] yeah, i'm just wondering about versioning, really [17:07:45] deb stuff doesn't like when the source changes but the version doesn't [17:07:55] can we make a tag at 0.8.0.1? [17:08:22] the debian revision isn't included in the orig tar.gz file [17:08:33] so when adding to apt, reprepro will complain that the source has been changed [17:09:39] (03CR) 10Edenhill: (WIP) Initial Debian version (032 comments) [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [17:10:52] (03PS24) 10Ottomata: (WIP) Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [17:11:37] (03PS2) 10Ottomata: Updating varnishkafka.conf.example with kafka.socket.send.buffer.bytes default [operations/software/varnish/varnishkafka] - 10https://gerrit.wikimedia.org/r/99154 [17:11:44] (03CR) 10Ottomata: [C: 032 V: 032] Updating varnishkafka.conf.example with kafka.socket.send.buffer.bytes default [operations/software/varnish/varnishkafka] - 10https://gerrit.wikimedia.org/r/99154 (owner: 10Ottomata) [17:11:58] (03CR) 10Edenhill: [C: 031] (WIP) Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [17:12:11] (03PS25) 10Ottomata: (WIP) Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [17:13:09] (03PS1) 10Ottomata: Updating with kafka.socket.send.buffer.bytes configuration parameter [operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/99160 [17:15:05] (03CR) 10Edenhill: [C: 031] (WIP) Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [17:15:17] (03CR) 10Ottomata: [C: 032 V: 032] Updating with kafka.socket.send.buffer.bytes configuration parameter 
[operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/99160 (owner: 10Ottomata) [17:15:49] (03PS10) 10Ottomata: Setting up varnishkafka on mobile varnish caches [operations/puppet] - 10https://gerrit.wikimedia.org/r/94169 [17:15:54] so many things to commit for 1 change right now! [17:15:55] Making 0.8.0.1 will introduce a new versioning scheme. I was thinking that librdkafka keeps pace with apache kafka on Major.Minor versions, but not necessarily M.m.revision [17:16:21] varnishkafka conf.example, debian/.conf, varnishkafka puppet module, operations/puppet varnishkafka application [17:16:23] 4 places! [17:16:59] hmm, paravoid, ^ [17:17:11] i'm not sure what to do then, because if we want to use your latest librdkafka [17:17:29] i think we need to differentiate the actual version number somehow, not just debian revision [17:17:30] hm [17:17:32] PROBLEM - puppet disabled on virt9 is CRITICAL: Timeout while attempting connection [17:17:35] hmm [17:17:55] Im fine with tagging 0.8.1, that was my plan [17:18:01] and I dont think that affects the debian package much [17:18:02] is 0.8.1 ready? [17:18:04] probably not though, right? [17:18:22] apache could change things that would make you want to change stuff too? [17:19:10] yeah, but I didnt plan to follow their versionings on the revision level. [17:19:23] RECOVERY - puppet disabled on virt9 is OK: OK [17:19:34] right, the revision level could be your versioning though, rather than theirs? [17:19:42] yep [17:19:46] 0.8.0.$date? 0.8.0.$rev? [17:20:10] i don't think we really *need* to tag to do this [17:20:26] tagging is always good! :) [17:20:29] paravoid? can we just build from master with a version something like that? [17:20:52] we could go with 0.8.0.1, but I need to fix some stuff in the lib for that [17:21:17] oh? 
[17:21:30] wow: https://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&c=Bits+application+servers+eqiad&m=cpu_report&s=by+name&mc=2&g=network_report [17:23:05] greg-g: lol [17:23:28] ? [17:23:52] (03CR) 10Edenhill: [C: 031] Setting up varnishkafka on mobile varnish caches [operations/puppet] - 10https://gerrit.wikimedia.org/r/94169 (owner: 10Ottomata) [17:23:53] ottomata: but I'd prefer not to have four version fields, just three.. [17:24:03] ottomata: so my preference at this point would be to tag 0.8.1 [17:24:23] greg-g: the graph looks funny! the recent drop is quite nice, though. [17:24:37] twkozlowski: yeah, it's awesome :) [17:25:31] but taaag whwyyyyy, are you ready to tag? [17:25:39] is kafka done making changes that you might want to make to 0.8.1? [17:25:49] what if you make changes next week and we want to deploy those? [17:25:56] Snaps: ^ [17:26:47] librdkafka version != apache kafka version. [17:27:26] but i will try to stick to the same major.minor (0.8). But not the revision. What if I follow their revision and they release 0.8.2 with no new protocol things. Do I need to make a new dummy release just to indicate compliance? [17:27:44] so I think its better to let rdk revision live its own life. [17:27:51] And on that sentiment I can tag 0.8.1 today [17:31:31] ohh hm [17:31:32] ok cool [17:31:37] let's do it then! [17:32:59] if you tag, I will build :), Snaps [17:56:34] PROBLEM - puppet disabled on virt9 is CRITICAL: Timeout while attempting connection [17:58:33] RECOVERY - puppet disabled on virt9 is OK: OK [18:00:44] paravoid: I think I want to add a new function to the API. that should be okay, right? [18:08:16] w0 deployment starting shortly. cc yurik greg-g paravoid MaxSem jdlrobson [18:08:23] <^d> greg-gggggggggg: Can I haz LD window for today? Aaron and I wanna break^H^H^H^H^H deploy some fixes to het-deploy. [18:10:11] greg-g, going going gone? [18:10:16] Snaps: new function before we tag? 
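A minimal sketch of the packaging point raised earlier in the versioning discussion (the Debian revision is not part of the orig tarball name, so changed source needs a new upstream version). It assumes the standard Debian version layout `[epoch:]upstream[-revision]`; the function names are hypothetical and this is not how reprepro itself is implemented.

```python
from typing import Tuple


def split_debian_version(version: str) -> Tuple[str, str, str]:
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        # No epoch present: the whole string is upstream[-revision].
        epoch, rest = "", version
    # The Debian revision is whatever follows the LAST hyphen, if any.
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return epoch, upstream, revision


def orig_tarball_name(source: str, version: str) -> str:
    """Name of the upstream tarball; it depends only on the upstream version."""
    _, upstream, _ = split_debian_version(version)
    return f"{source}_{upstream}.orig.tar.gz"


# Bumping only the Debian revision keeps the same tarball name, so the
# archive expects identical upstream source for both packages.
print(orig_tarball_name("librdkafka", "0.8.0-1"))
print(orig_tarball_name("librdkafka", "0.8.0-2"))
# A new upstream tag (as agreed here, 0.8.1) gets a fresh tarball name.
print(orig_tarball_name("librdkafka", "0.8.1-1"))
```

Under that assumption, rebuilding from a newer master while keeping 0.8.0 would reuse the old tarball name with different contents, which is the conflict being described.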
[18:10:26] i'm pushing this because I have time to get varnishkafka out on mobiles today or tomorrow [18:10:30] and I don't want to do it on Friday [18:10:53] dr0ptp4kt: cool [18:10:55] yurik: ? [18:11:04] greg-g, deploying zero :) [18:11:06] ^d: heh, what's the stuff? [18:11:08] yurik: yeah [18:11:28] just fyi, search is going at 11 (right after your window's over) [18:12:05] <^d> greg-g: Changes to CDB handling so we can support hhvm. It's either going to break immediately and we roll back, or it works fine. [18:12:14] ^d: https://gerrit.wikimedia.org/r/#/c/98959/ attempt 2 :) [18:13:48] <^d> Ewww, math. [18:13:51] <^d> I just saw the rv. [18:14:15] and a few other exts, so I just tossed in some b/c [18:14:22] * AaronSchulz was reading http://misko.hevery.com/code-reviewers-guide/flaw-constructor-does-real-work/ [18:15:13] ^d: how about a 3pm 1-hour window, just in case you need to futz with stuff? [18:15:14] * AaronSchulz recommends reading that [18:15:23] !log yurik synchronized php-1.23wmf4/extensions/ZeroRatedMobileAccess/ [18:15:38] Logged the message, Master [18:15:44] granted you could override the stuff in this case, but still [18:16:20] you couldn't override if something used the class itself if there is no IoC [18:17:19] <^d> greg-g: That's fine. Like I said, it's either going to break hard or not at all :) [18:17:50] ottomata: okay, so no hurry for us then [18:18:08] ottomata: the version is a define today, which means it's compile time, not link time, i.e. useless. need to make it into a function [18:18:55] ok, sooooo [18:19:02] wazzat mean? [18:19:13] need to fix that, or we can just bump the version in the define? [18:20:18] #define RD_KAFKA_VERSION 0x00080100 [18:20:18] :D? 
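For readers following the version-define exchange: a small sketch of how a packed value such as 0x00080100 could be read, assuming one byte per component (major, minor, revision, and a fourth field). That layout is an assumption inferred from the value pasted above, and the function names are hypothetical. The complaint in the chat is a separate C-level point: a `#define` is baked into the caller at compile time, while a function is answered by whichever shared library is loaded at run time.

```python
from typing import Tuple


def decode_version(packed: int) -> Tuple[int, int, int, int]:
    """Unpack a 32-bit version into four one-byte components (assumed layout)."""
    return (
        (packed >> 24) & 0xFF,  # major
        (packed >> 16) & 0xFF,  # minor
        (packed >> 8) & 0xFF,   # revision
        packed & 0xFF,          # fourth field, unused in the value above
    )


def encode_version(major: int, minor: int, revision: int, extra: int = 0) -> int:
    """Inverse of decode_version under the same assumed layout."""
    return (major << 24) | (minor << 16) | (revision << 8) | extra


print(decode_version(0x00080100))
print(hex(encode_version(0, 8, 1)))
```

Read this way, the pasted define corresponds to the 0.8.1 tag discussed earlier.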
[18:20:40] ^d: yeah, I just don't want it to break hard during an LD and block others (no one yet, but there might be more) [18:20:57] <^d> Fair 'nuff :) [18:21:06] Snaps: ^^ [18:22:30] ^d: AaronSchulz you're on for 3 today :) [18:22:58] !log yurik synchronized php-1.23wmf5/extensions/ZeroRatedMobileAccess/ [18:23:13] Logged the message, Master [18:24:14] B1041537 [18:25:08] (03PS2) 10Yurik: Corrected comment for FlaggedRevs deployment [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99028 [18:25:55] (03CR) 10Yurik: [C: 032 V: 032] Corrected comment for FlaggedRevs deployment [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99028 (owner: 10Yurik) [18:27:01] (03PS1) 10Odder: Add Maria Pacana to the English Planet Wikimedia [operations/puppet] - 10https://gerrit.wikimedia.org/r/99177 [18:32:37] paravoid: what sort of traffic is hitting the pmtpa bits varnishes? [18:32:47] !log yurik synchronized wmf-config/InitialiseSettings.php [18:33:03] Logged the message, Master [18:35:47] ori-l: esams misses [18:36:30] half of the esams misses, to be exact [18:37:30] ah, that makes sense then [18:37:35] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000 [18:37:49] i'm trying to figure out why it shows a clear drop whereas esams and eqiad are much subtler [18:37:51] ottomata: need to change that define into a function [18:37:58] pmtpa: http://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&c=Bits+caches+pmtpa&m=cpu_report&s=by+name&mc=2&g=network_report [18:38:13] paravoid: Is it okay to _add_ new functions for an solib with the same so version? 
[18:38:28] to see eqiad you kind of need to look at the whole week: http://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&c=Bits+caches+eqiad&m=cpu_report&s=by+name&mc=2&g=network_report [18:38:33] Snaps: it is [18:38:57] ori-l: ganglia has a timeshift feature [18:39:17] oh yeah, i always forget that [18:39:38] that's because it's hard to find these buttons [18:39:45] I don't see it now, for example [18:39:48] where are they, again? [18:39:58] sometimes they appear next to the CSV & json buttons [18:40:06] AaronSchulz, running mwscript sql.php metawiki extensions/FlaggedRevs/backend/schema/mysql/FlaggedRevs.sql ... [18:42:00] ori-l: http://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&c=Bits+caches+eqiad&h=cp1056.eqiad.wmnet&jr=&js=&v=59578&m=varnish.n_objecthead&vl=N&ti=N+struct+objecthead [18:42:08] that's cached objects, I guess that makes sense since they're now split? [18:43:04] http://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Bits%20caches%20eqiad&h=cp1056.eqiad.wmnet&r=week&z=default&jr=&js=&st=1386182434&v=3163405822&m=varnish.s_bodybytes&vl=N%2Fs&ti=Total%20body%20bytes&z=large [18:43:08] hm, very weird graph [18:43:19] cool, ok thanks Snaps [18:43:53] paravoid: this is kind of interesting: http://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&c=Bits+caches+eqiad&h=cp1056.eqiad.wmnet&jr=&js=&v=2207371512&m=varnish.SMA.s0.c_fail&vl=N%2Fs&ti=Allocator+failures [18:44:00] yup, I saw it [18:48:28] ori-l: so, no way in knowing if the module storage change had a large effect or not right now, right? [18:48:44] that math bug messed your data :) [18:49:10] the language module mtime, you mean? [18:49:22] oh, language, right [18:49:27] (03PS7) 10Yurik: Apply FlaggedRevs to metawiki for W0. 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/95662 (owner: 10Dr0ptp4kt) [18:50:07] yes, it's a confound and it throws a wrench in everything [18:50:16] frustrating as fuck [18:50:27] you can always disable module storage as an experiment, can't you [18:51:17] yeah, that's a good point [18:52:34] mind if i try? [18:52:41] (03PS8) 10Yurik: Apply FlaggedRevs to metawiki for W0. [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/95662 (owner: 10Dr0ptp4kt) [18:53:08] ori-l: The deployment calendar is pretty dense today [18:53:09] what about that cache stampede issue? [18:53:26] I recommend you wait until 2pm at least [18:53:37] yeah, sorry, i didn't mean *right now* [18:53:46] and no, I don't mind if you try it [18:53:56] ori-l: No worries, just checking :) [18:54:25] PROBLEM - DPKG on virt9 is CRITICAL: Timeout while attempting connection [18:55:23] RoanKattouw: heh, you weren't kidding about the calendar being dense [18:56:01] yurik, going to the conference room. brb online [18:56:13] gwicke: do you have any news from node 0.10 testing? [18:56:38] ori-l: Hah it's even more packed than gcal says [18:56:57] GETTING THERE! 
[18:57:01] * greg-g adds things to gcal [18:57:07] !log yurik@tin:/a/common/php-1.23wmf5$ mwscript sql.php --wiki=metawiki extensions/FlaggedRevs/backend/schema/mysql/FlaggedRevs.sql [18:57:08] been moving things around this morning :) [18:57:13] (03PS3) 10Ottomata: Graph components of Elasticsearch health [operations/puppet] - 10https://gerrit.wikimedia.org/r/96796 (owner: 10Manybubbles) [18:57:14] greg-g: (the het deploy thing is what I was missing) [18:57:15] RECOVERY - DPKG on virt9 is OK: All packages OK [18:57:19] (03CR) 10Ottomata: [C: 032 V: 032] Graph components of Elasticsearch health [operations/puppet] - 10https://gerrit.wikimedia.org/r/96796 (owner: 10Manybubbles) [18:57:35] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000 [18:57:42] Logged the message, Master [18:57:43] RoanKattouw: yeah [18:57:47] {{done}} [18:58:12] sweet [18:58:38] (03CR) 10Yurik: [C: 032 V: 032] Apply FlaggedRevs to metawiki for W0. [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/95662 (owner: 10Dr0ptp4kt) [18:58:59] also: two calendards suck :) [18:59:27] yurik, i'm back. i'm about to go on the call, but hit me on irc or gchat if you need me to run anything [19:01:40] greg-g, running a few min behind [19:02:12] greg-g, use the google calendar for deployments :) [19:02:15] !log yurik synchronized wmf-config [19:02:41] Logged the message, Master [19:05:37] !log yurik synchronized flaggedrevs.dblist [19:05:54] yurik: I only use the gcal for ease for others, I'd prefer we not depend on any google hosted anything [19:05:59] Logged the message, Master [19:06:06] :) [19:06:12] I'm one of those weirdos :) [19:06:21] i'm totally ok with that :) [19:07:52] !log yurik synchronized wmf-config [19:08:55] greg-g, and i think i am done! flagrevs are now live on meta Zero namespace [19:09:06] coolio! [19:09:15] manybubbles: ^d you're all clear for search stuffz [19:09:33] greg-g: cool, we thought we were late! 
[19:09:47] PROBLEM - RAID on virt9 is CRITICAL: Timeout while attempting connection [19:09:54] (03PS1) 10Manybubbles: Make Cirrus secondary for wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99184 [19:10:37] RECOVERY - RAID on virt9 is OK: OK: Active: 16, Working: 16, Failed: 0, Spare: 0 [19:11:17] paravoid: not yet, sorry [19:11:31] didn't get around to install it yet in rt testing [19:11:50] currently prepping for a deploy, have it on my list for this afternoon [19:12:08] cool, thanks! [19:12:11] don't expect issues either [19:15:49] ^d: changes proposed! [19:16:02] paravoid, Snaps tagged librdkafka, now to build... [19:16:07] <^d> manybubbles: Reviewing. [19:16:10] ^d: thanks! [19:16:24] do I need to merge that tag into debian branch? [19:16:39] do I need to create my own debian/0.8.1-1 tag from 0.8.1 tag and then merge in debian branch? [19:16:47] PROBLEM - RAID on virt9 is CRITICAL: Timeout while attempting connection [19:16:51] do I let git-buildpackage do that automatically with —git-tag flag? [19:16:57] ^d: I see untracked files on tin [19:17:20] ottomata: I'll do those, there's a few other changes to release 0.8.1-1 too [19:17:24] oh, ok [19:17:24] cool [19:17:29] <^d> manybubbles: Where? [19:17:34] well, an extension with unsaved commits and a .save.1 file [19:17:37] PROBLEM - puppet disabled on virt9 is CRITICAL: Timeout while attempting connection [19:17:43] hm, do you have time to do real soon? I'm hoping to get some stuff out there today [19:17:44] /a/common/php1.23wmf5 [19:17:56] /a/common/php-1.23wmf5 [19:17:58] sorry [19:18:02] if not, tomorrow would be fine, but it would be nice to prep as much as possible today, paravoid [19:18:14] I assume I can just rebase my deployment on top of them [19:18:37] RECOVERY - RAID on virt9 is OK: OK: Active: 16, Working: 16, Failed: 0, Spare: 0 [19:18:58] <^d> Should be able to, untracked is no big deal. [19:19:02] <^d> I think. 
[19:19:26] AaronSchulz, https://www.mediawiki.org/wiki/Extension:FlaggedRevs#Wikimedia_Server_Installation [19:19:37] RECOVERY - puppet disabled on virt9 is OK: OK [19:22:31] (03CR) 10Chad: [C: 032] Make Cirrus secondary for wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99184 (owner: 10Manybubbles) [19:22:43] <^d> manybubbles: All merged. And I removed that .save file from wmf5 [19:22:56] ^d: cool [19:23:01] doing wmf5 now then [19:24:24] (03Merged) 10jenkins-bot: Make Cirrus secondary for wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/99184 (owner: 10Manybubbles) [19:24:37] PROBLEM - Disk space on virt9 is CRITICAL: Timeout while attempting connection [19:25:20] !log manybubbles synchronized php-1.23wmf5/extensions/CirrusSearch/ 'Updating Cirrus to master' [19:25:58] hmm, ^d, should opssoftware gerrit group have Push Annotated Tags rights? [19:26:12] on operations/software repositories? [19:26:22] everything looks good on test2wiki - syncing to others [19:26:24] <^d> All owners should have Push Annotated Tags. [19:26:27] RECOVERY - Disk space on virt9 is OK: DISK OK [19:26:29] rather, syncing to wmf4:) [19:26:31] <^d> I think I set that on All-Projects. [19:26:47] PROBLEM - RAID on virt9 is CRITICAL: Timeout while attempting connection [19:27:10] hmm, yeah i'm probably not an owner [19:27:11] hmm [19:27:16] well [19:27:22] owner says opssoftware [19:27:34] and i'm in the ldap/ops group [19:27:37] RECOVERY - RAID on virt9 is OK: OK: Active: 16, Working: 16, Failed: 0, Spare: 0 [19:28:10] !log manybubbles synchronized php-1.23wmf4/extensions/CirrusSearch/ 'Updating Cirrus to master' [19:28:33] ^d so this [19:28:33] https://gerrit.wikimedia.org/r/#/admin/projects/operations/software,access [19:28:43] maybe needs Push Annotated Tag added to it? [19:28:45] or is that wrong? [19:29:15] <^d> Ahhh, I only set that on mediawiki/* repos. [19:29:17] <^d> So yeah, that's fine. 
[19:29:21] <^d> Can't hurt, anyway :D [19:29:26] ok [19:29:26] cool [19:29:57] paravoid, want to approve that one real quick? [19:29:58] https://gerrit.wikimedia.org/r/#/c/99188/ [19:30:01] !log manybubbles synchronized wmf-config/InitialiseSettings.php 'Make Cirrus secondary for wikidata' [19:30:13] <^d> You could've just pressed submit instead of for review :D [19:30:31] i know, but i thought that was naughty [19:30:42] i could self review too [19:30:43] <^d> For acl changes? Nah. It's still logged :) [19:30:45] <^d> I already did. [19:30:53] ah, ok danke [19:31:00] nm para void [19:31:17] PROBLEM - DPKG on virt9 is CRITICAL: Timeout while attempting connection [19:31:23] yay that works now, danke [19:31:29] <^d> yw [19:31:45] ^d and greg-g: I'm done syncing files [19:32:17] RECOVERY - DPKG on virt9 is OK: All packages OK [19:32:37] PROBLEM - Disk space on virt9 is CRITICAL: Timeout while attempting connection [19:33:26] !log building wikidata's Cirrus index now [19:34:28] RECOVERY - Disk space on virt9 is OK: DISK OK [19:34:45] ok, paravoid, i still have the same question though, about varnishkafka [19:34:53] since i'm using tags there, and the setup is basically the same [19:40:43] PROBLEM - RAID on virt9 is CRITICAL: Timeout while attempting connection [19:41:33] RECOVERY - RAID on virt9 is OK: OK: Active: 16, Working: 16, Failed: 0, Spare: 0 [19:42:52] ottomata: where does %version come from? [19:43:08] changelog [19:43:13] pretty sure [19:44:04] ottomata: okay, I dont know much about debian packaging so I dont know what good I Will do reviewing that thing [19:44:16] ottomata: I can +1 it because I like you [19:44:23] PROBLEM - DPKG on virt9 is CRITICAL: Timeout while attempting connection [19:45:14] RECOVERY - DPKG on virt9 is OK: All packages OK [19:46:20] haha [19:46:21] ok [19:46:26] i'm trying to figure it out myself [19:47:12] is there anyone in who can look at running queries for me? 
I have a job going crazy super slow and I want to be sure it isn't doing something funky with mysql. [19:47:46] it has worked fine in the past but it is doing something funky on wikidata [19:50:53] maybe? where do I look? [19:51:00] manybubbles: ^ [19:51:18] ottomata: the database running wikidata [19:51:20] whichever one that is [19:51:24] ha hm [19:51:25] good q [19:51:37] not really sure how to find that out... [19:52:20] s5, I think [19:52:49] labs seems to be in trouble currently [19:52:55] I just need to know if any cirrus queries are sitting around taking forever [19:53:34] gwicke: yeah, I think labs has other problems lately [19:53:36] gwicke@bastion1:~$ host parsoid-spof.wmflabs [19:53:36] ;; connection timed out; no servers could be reached [19:53:43] PROBLEM - SSH on virt9 is CRITICAL: Connection timed out [19:54:26] uhh there are so many s5s! [19:54:28] hmm [19:54:34] RECOVERY - SSH on virt9 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [19:54:50] i dunno manybubbles... [19:54:59] don't have much experiences with our dbs, dunno where things are [19:58:23] PROBLEM - Host virt9 is DOWN: PING CRITICAL - Packet loss = 100% [19:58:43] RECOVERY - Host virt9 is UP: PING WARNING - Packet loss = 37%, RTA = 35.41 ms [19:58:59] (03PS27) 10Ottomata: Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [19:59:44] akosiaris: you still around? 
[20:00:03] PROBLEM - puppet disabled on virt9 is CRITICAL: Timeout while attempting connection [20:02:53] RECOVERY - puppet disabled on virt9 is OK: OK [20:04:03] (03PS1) 10Andrew Bogott: Point the puppet freshness check to nagios.wmflabs.org [operations/puppet] - 10https://gerrit.wikimedia.org/r/99192 [20:05:29] PROBLEM - Host virt9 is DOWN: PING CRITICAL - Packet loss = 100% [20:05:39] RECOVERY - Host virt9 is UP: PING WARNING - Packet loss = 28%, RTA = 35.46 ms [20:05:52] (03CR) 10Andrew Bogott: [C: 032] Point the puppet freshness check to nagios.wmflabs.org [operations/puppet] - 10https://gerrit.wikimedia.org/r/99192 (owner: 10Andrew Bogott) [20:07:53] (03CR) 10Ottomata: Initial Debian version (031 comment) [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [20:20:39] PROBLEM - RAID on virt9 is CRITICAL: Timeout while attempting connection [20:22:33] (03PS1) 10Hashar: contint: djvulibre-bin for mw djvu unit tests [operations/puppet] - 10https://gerrit.wikimedia.org/r/99196 [20:22:39] RECOVERY - RAID on virt9 is OK: OK: Active: 16, Working: 16, Failed: 0, Spare: 0 [20:31:39] PROBLEM - Disk space on virt9 is CRITICAL: Timeout while attempting connection [20:33:30] RECOVERY - Disk space on virt9 is OK: DISK OK [20:40:19] greg-g: is my window still open? [20:41:07] in other words, since I found something that will speed up indexing wikidata, tested it, and merged it, can I deploy it because my window ends in 20 minutes or should I wait? [20:41:55] on second thought, let me play with it a bit more [20:42:01] maybe the lightning deploy? [20:48:53] manybubbles: you have 10 minutes now :) [20:48:58] manybubbles: sorry was making lunch [20:49:04] manybubbles: or at 2pm [20:55:19] my 5 pm? [20:55:23] I can wait for that [20:55:38] I'm not likely to get it done properly in 5 minutes [20:56:36] :) [21:07:23] manybubbles: ^d is there a good reason for $wgCirrusSearchEnablePref = false; (default)? 
[21:07:41] or when might we be able to enable it? [21:09:20] aude: no good reason. Actually we were planning to enable it soon [21:09:32] ok [21:09:32] (03PS1) 10Ottomata: Hosting public-datasets from /a partition [operations/puppet] - 10https://gerrit.wikimedia.org/r/99249 [21:09:36] ^d: you gonna have time to work on it or should I? [21:09:50] i suppose first see how the load / indexing / updates goes [21:09:52] <^d> It should only exist when we're in secondary mode tbh. It's pretty redundant otherwise. Hence not making it default. [21:10:03] ^d: right [21:10:05] (03CR) 10jenkins-bot: [V: 04-1] Hosting public-datasets from /a partition [operations/puppet] - 10https://gerrit.wikimedia.org/r/99249 (owner: 10Ottomata) [21:10:11] i'm not sure the best way to configure that. [21:10:15] aude: normally we wouldn't enable it for a wiki that isn't finished indexing, but I suppose we can make an exception for wikidata [21:10:17] is there a list of wikis (dblist) [21:10:20] (03PS2) 10Ottomata: Hosting public-datasets from /a partition [operations/puppet] - 10https://gerrit.wikimedia.org/r/99249 [21:10:21] if it is expected [21:10:24] manybubbles: we can wait [21:10:27] <^d> Everyone's indexed now :) [21:10:28] cool [21:10:32] not wikidata! [21:10:33] <^d> Other than WD [21:10:40] <^d> But WD knows since aude is here. [21:10:41] it's just easier than my hacky javascript that i made [21:10:42] <^d> I think we're fine. [21:10:50] (03PS3) 10Ottomata: Hosting public-datasets from /a partition [operations/puppet] - 10https://gerrit.wikimedia.org/r/99249 [21:11:05] aude: yeah, I'd love to have it on. [21:11:10] <^d> manybubbles: We'll merge and roll it into your window. 
[21:11:16] sweet [21:11:31] we have a dblist for everything that used to have cirrus as primary [21:11:41] but we were keeping the secondaries in InitializeSettings [21:12:00] i'll let you figure out the setting [21:12:06] probably a dblist is what i'd do [21:12:29] (03CR) 10Ottomata: [C: 032 V: 032] Hosting public-datasets from /a partition [operations/puppet] - 10https://gerrit.wikimedia.org/r/99249 (owner: 10Ottomata) [21:14:03] <^d> Eventually it'll be everyone and we can remove the settings :) [21:14:14] <^d> But that's still a bit of time from now. [21:14:24] yep [21:14:30] ^d: maybe a dblist for all the wikis on which cirrus is a secondary AND it you can set this setting. When we start indexing we add the wiki to InitializeSettings, then we move it to this list when it is done, then to the main one when we're primary [21:15:04] <^d> Do we need to add a ton of non-group wikis to secondary? [21:16:16] <^d> manybubbles: Also, I don't mind doing https://gerrit.wikimedia.org/r/#/c/99159/ [21:16:19] not sure yet. we might if some languages have trouble and other don't [21:16:25] <^d> I know exactly what I'm wanting & what Aaron's thinking :) [21:16:39] ^d: cool. you can have it [21:16:50] if you do that, though, remove the pool counter from the link counting as well [21:16:53] that is done on the queue [21:17:42] we still would need to disable the search counter during maintenance because the freshness check uses it by default. [21:17:50] because it shares code with morelikethis: [21:18:05] !log deployed Parsoid 0ac82a2 [21:18:06] sorry, morelike: [21:18:18] Logged the message, Master [21:19:47] (03PS11) 10BryanDavis: Add configuration for Wikimania Scholarships [operations/puppet] - 10https://gerrit.wikimedia.org/r/98740 [21:22:32] apergos: ping [21:22:44] gwicke: ponnnggg [21:22:50] although I'm about outa juice [21:22:52] what's up? 
[21:23:50] apergos: we just deployed a new parsoid version [21:23:59] I think logging can be re-enabled [21:24:25] do you have a link to the changeset that disabled it? [21:24:30] sec [21:24:41] I won't be around to babysit but if someone is then worksforme [21:25:06] apergos: it is not urgent, but would be nice to sanity-check the deploy [21:25:09] also somewhere there was the beginning of an upstart job someone was workoing on [21:25:17] for parsoid, which would mean log rotation [21:25:19] yes, +100 on that [21:25:34] and systemd, in case Debian makes up its mind soon [21:25:53] I don't remember who over there it was now but if it comes across your radar, feel free to a) nag them b) add me to review, comment, push it along, etc [21:26:04] we have configs for both already, just don't use them yet in prod [21:26:20] https://gerrit.wikimedia.org/r/#/c/98082/ [21:26:31] ok, well I'm happy to help get that done [21:26:32] apergos: thanks! [21:26:35] yw [21:26:44] thanks to you guys for being so fast on the fix [21:26:55] PHP Fatal error: Call to a member function getLevel() on a non-object in /usr/local/apache/common-local/php-1.23wmf5/extensions/ProofreadPage/ProofreadPage.body.php on line 704 [21:27:24] was the last bug I got to last night, but was an easy fix [21:31:18] (03PS1) 10GWicke: Revert "turn off logging for parsoid for now, was filling /" [operations/puppet] - 10https://gerrit.wikimedia.org/r/99251 [21:33:29] (03CR) 10BryanDavis: "I think this is pretty much ready to go. 
Some of the config depends on patches to Scholarships that aren't approved yet, but they should b" [operations/puppet] - 10https://gerrit.wikimedia.org/r/98740 (owner: 10BryanDavis) [21:35:41] (03PS1) 10Edenhill: Fixed librdkafka configuration reference URL [operations/software/varnish/varnishkafka] - 10https://gerrit.wikimedia.org/r/99252 [21:37:38] (03CR) 10Ottomata: [C: 032 V: 032] Fixed librdkafka configuration reference URL [operations/software/varnish/varnishkafka] - 10https://gerrit.wikimedia.org/r/99252 (owner: 10Edenhill) [21:37:44] doh, and there we go, won't be in the tag now :p [21:37:48] i'll fix in debian branch too [21:39:03] is jenkins on holiday? [21:39:31] (03PS28) 10Ottomata: Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [21:43:43] <^d> manybubbles: https://gerrit.wikimedia.org/r/#/c/99159/ is a bigger patch now :p [21:44:46] ottomata: oh, didnt think of that :| sorry! [21:44:59] hehe, no worries [21:45:04] don't mind if that one misses the tag :p [21:45:13] It'll make it into .deb and puppet [21:45:14] :) [21:45:54] (03PS1) 10Ottomata: Updating configuration url in .conf comment [operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/99253 [21:46:03] (03CR) 10Ottomata: [C: 032 V: 032] Updating configuration url in .conf comment [operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/99253 (owner: 10Ottomata) [21:46:29] (03CR) 10Edenhill: [C: 031] Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [21:47:09] (03PS11) 10Ottomata: Setting up varnishkafka on 3 mobile varnish hosts. [operations/puppet] - 10https://gerrit.wikimedia.org/r/94169 [21:47:19] ottomata: I guess the config file in puppet could be bare-bones without comments, just the properties used. 
To avoid this duplication [21:48:09] Snaps: getting this error on an esams host right now [21:48:09] PRODUCE: Failed to produce Kafka message (seq 1189876371): No buffer space available (1000000 messages in outq) [21:48:11] (03CR) 10Edenhill: [C: 031] Setting up varnishkafka on 3 mobile varnish hosts. [operations/puppet] - 10https://gerrit.wikimedia.org/r/94169 (owner: 10Ottomata) [21:48:15] i was about to install the latest stuff there [21:48:19] i think i'll go ahead and do that [21:48:27] and, if this is latency/jitter problems [21:48:30] hey [21:48:31] wait [21:48:33] it should keep going [21:48:35] ja? [21:48:54] could you kill it with -6 so we get a corefile? [21:49:00] sure [21:49:13] kill -6 $(cat /var/run/varnishkafka/varnishkafka.pid) ? [21:49:28] yep. If ulimit -c unlimited [21:49:31] otherwise use gcore [21:49:44] gcore -o /tmp/vk.core [21:49:50] core file size (blocks, -c) 0 [21:49:56] that's no good [21:50:05] k gcore... [21:50:22] gdb package [21:50:25] if you are up for that [21:50:26] it's installed [21:51:06] guess that core file will be quite large with that many msgs in the queue [21:51:18] hmm, ok here's a potential problem, i was in the middle of installing the new version, so this may have been running with a different librdkafka .so file than was in mem [21:51:21] not sure what that would do... [21:51:34] i had already installed newer librdkafka [21:51:37] hadn't shut down vk yet [21:51:39] was about to [21:52:01] nah, it would still use the old rdk [21:52:03] oof 1.3G [21:52:16] it just complained when i ran gcore [21:52:17] warning: .dynamic section for "/usr/lib/librdkafka.so.1" is not at the expected address (wrong library or version mismatch?) [21:52:32] few of these too [21:52:32] warning: Memory read failed for corefile section, 1048576 bytes at 0x7f5d8853b000. [21:52:49] he [21:52:52] ori-l: I take it I'm not bothering with the labs case for the role? [21:53:05] ottomata: scrap it. [21:53:11] you don't want? 
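A rough sketch of the decision made in the core-file exchange above: signal 6 (SIGABRT) only leaves a core file if the core size limit is non-zero, otherwise a tool like gcore has to attach to the live process. The helper names are hypothetical, and the check below reads the limit of the current Python process, not of the varnishkafka process discussed in the chat.

```python
import resource
import signal


def current_core_limit() -> int:
    """Soft core-file size limit of this process (what `ulimit -c` reports)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft


def core_dump_possible(soft_limit: int) -> bool:
    """A limit of 0 suppresses core files; any other value allows them."""
    return soft_limit != 0


def suggest_dump_method(soft_limit: int) -> str:
    """Mirror the advice in the chat: abort signal if cores are allowed, else gcore."""
    if core_dump_possible(soft_limit):
        return "kill -%d <pid>" % int(signal.SIGABRT)
    return "gcore -o <output-prefix> <pid>"


print(suggest_dump_method(0))                        # the situation reported above
print(suggest_dump_method(resource.RLIM_INFINITY))   # "ulimit -c unlimited"
print(current_core_limit())
```

With a limit of 0, as reported in the chat, the sketch falls through to gcore, matching what was done.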
[21:53:42] ottomata: if the problem persists we can get a new core file later. In this state we don't really know what's what and how gdb thinks about replaced libraries, etc [21:53:43] Snaps: ? [21:53:47] ok cool [21:53:47] yeah [21:53:59] ok, installing new vk version [22:01:47] ^d: so our window comes and master isn't good. how about we sync master~1? [22:02:01] <^d> Yes. [22:02:10] <^d> Whatever was master when you said you wanted master :) [22:02:29] I'll build the update [22:02:49] I'm fixing master and running the regression tests against it but I don't think we want to wait for them for this little fix [22:03:00] what'd you do to master?! [22:03:07] ignore me [22:03:44] <^d> cirruz mastah is teh broke [22:08:38] greg-g: I merged ^d's broke changeset [22:08:42] broke all my stuff [22:08:52] :) [22:09:32] but source control is magic [22:09:37] ^d: I've added you on both updates. [22:09:45] I'm only sucking up my change and a translatewiki change [22:10:02] and labs is so borked it's breaking my test run.... [22:11:32] what about labs? [22:13:15] labs is being hammered today and I do most of my development there [22:15:47] !log manybubbles synchronized php-1.23wmf5/extensions/CirrusSearch/ 'Update CirrusSearch to speed up indexing' [22:16:02] Logged the message, Master [22:17:21] ^d: that went well, doing wmf4 [22:19:14] !log manybubbles synchronized php-1.23wmf4/extensions/CirrusSearch/ 'Update CirrusSearch to speed up indexing' [22:19:34] Logged the message, Master [22:19:43] all synced [22:20:25] greg-g: out of the way now [22:20:46] ^d: queue rate jumped from ~36 pages/second to ~150 pages/second [22:21:53] ^d: wikidata is also eating a big refreshlinks snake as well [22:21:53] sweet [22:21:59] so jobs have stalled [22:22:05] so my rate limiting kicked in fast [22:22:31] greg-g: just had to remove a WHERE clause and MySQL got much much faster [22:22:54] neato [22:23:03] yeah! 
[22:23:04] <^d> :) [22:24:05] baby just fell down stairs:( [22:24:33] she's ok [22:24:38] just pissed [22:25:12] <^d> :( [22:25:28] :( [22:25:39] <^d> "I fell down some stairs" [22:58:18] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [22:58:27] ^d: stopped due to new bug have solution but not time right now [22:59:27] <^d> manybubbles: The one you filed about wikidata? [23:02:08] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [23:03:03] (03Abandoned) 10Chad: Don't explode when trying to use hhvm + caches [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/95287 (owner: 10Chad) [23:04:05] (03CR) 10Chad: [C: 032] Fix up multiversion to not require dba_* functions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93622 (owner: 10Chad) [23:04:08] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000 [23:04:19] * AaronSchulz gets the popcorn [23:04:20] (03Merged) 10jenkins-bot: Fix up multiversion to not require dba_* functions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93622 (owner: 10Chad) [23:05:35] yeah [23:05:49] not frequent so i guess i can restart and watch [23:05:51] <^d> manybubbles: Working on a patch for that, should be pretty easy. [23:06:00] yeah [23:06:12] hLG of one on laptop now [23:06:26] !log demon synchronized multiversion/ 'cdb changes for hhvm support (I73195536)' [23:06:31] but you can do it:) [23:06:42] Logged the message, Master [23:07:04] still have to fix master for that so needs to wait 'till im not holding a child [23:07:12] https://en.wikipedia.org/wiki/Cold_boot_attack still loads [23:09:10] !log demon rebuilt wikiversions.cdb and synchronized wikiversions files: no actual version changes, just rebuilding to test code changes [23:09:21] <^d> Well that worked. I can't imagine anything else that's broken. [23:09:31] Logged the message, Master [23:10:43] <^d> greg-g: We're done. 
Like I said, either it'd work or it'd break everything :) [23:12:52] sweet [23:42:19] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:45:19] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [23:52:42] * Elsie looks around for Christian.