[00:00:37] !log mwalker Finished scap: SWAT for {{gerrit|129813}}, {{gerrit|129640}}, {{gerrit|129708}}, {{gerrit|129707}}, and {{gerrit|130246}} (duration: 11m 37s)
[00:00:44] Logged the message, Master
[00:01:20] aude, hoo, duh -- if you want to check your languages
[00:01:24] they're deployed
[00:01:31] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1% data above the threshold [250.0]
[00:01:51] mwalker: on it
[00:02:11] mwalker: just tested it on mw.o and it's working, thanks!
[00:02:59] wikidata also looks good
[00:03:29] greg-g, seems like I'm done
[00:04:31] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 3.38983050847% of data exceeded the critical threshold [500.0]
[00:08:52] I think I'm done... good night
[00:14:01] !log aaron synchronized php-1.24wmf2/includes/WikiPage.php '119fd9fc17b3c309b9065b54f4c83ede7d20498b'
[00:14:06] Logged the message, Master
[00:17:30] ori: looks good, but I'm going to wait to merge it until later tonight after dinner, because I'll have to babysit the restart process for the cc_command stuff for libgeoip
[00:22:06] !log aaron synchronized php-1.24wmf1/includes/WikiPage.php '3505cf933d874ea44bd5a3f3ffe210598ef7eec2'
[00:22:13] Logged the message, Master
[00:24:48] bblack: *nod* thanks very much
[00:26:27] i got a 503 for https://en.wikipedia.org/wiki/Pipeline_(Unix), went away on refresh
[00:26:34] sumana reported the same for a different page
[00:28:37] http://en.wikipedia.org/wiki/Main_Page 503 for me
[00:28:38] Request: GET http://meta.wikimedia.org/wiki/Wiki_Education_Foundation, from 208.80.154.51 via cp1067 frontend ([10.2.2.25]:80), Varnish XID 1511002857
[00:28:38] Forwarded for: 192.195.83.38, 208.80.154.51
[00:28:59] yeah, getting sporadic 503s.
[00:30:29] AaronSchulz: it lines up with your sync
[00:30:34] see https://graphite.wikimedia.org/render/?title=HTTP%205xx%20Responses%20-8hours&from=-1hours&width=1024&height=500&until=now&areaMode=none&hideLegend=false&lineWidth=2&lineMode=connected&target=color(cactiStyle(alias(reqstats.500,%22500%20resp/min%22)),%22red%22)&target=color(cactiStyle(alias(reqstats.5xx,%225xx%20resp/min%22)),%22blue%22)
[00:31:01] or for that matter
[00:31:46] (03PS1) 10MaxSem: Add my new key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130265
[00:32:25] not seeing any connection
[00:32:47] (03PS1) 10JGonera: Enable Compact personal bar beta feature [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130266
[00:33:18] bd808: there's a spike in bytes in to logstash which suggests that whatever may be the issue may be getting logged there, could you look?
[00:33:27] i'm poking around ganglia still
[00:33:53] (03CR) 10JGonera: [C: 04-2] "-2 until all the patches to Compact personal bar are merged:" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130266 (owner: 10JGonera)
[00:34:06] * bd808 looks
[00:35:39] bblack: load on cp1055 is high
[00:36:51] chasemp, jgage, TimStarling - around?
[00:37:13] yes
[00:37:30] ori: I'm not finding anything obviously weird in logstash
[00:37:46] TimStarling: http://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&c=Text+caches+eqiad&h=cp1055.eqiad.wmnet&jr=&js=&v=6.2&m=cpu_system&vl=%25&ti=CPU+System
[00:38:05] TimStarling, we're getting sporadic 503s across the cluster, ori is looking for any possible root cause
[00:38:29] varnishd on that host is saturating cpu
[00:39:09] (03CR) 10Gage: [C: 032] Add jkrauska to rhenium using an admin group [operations/puppet] - 10https://gerrit.wikimedia.org/r/130241 (owner: 10Jkrauska)
[00:42:32] are you sure?
[00:44:45] gerrit js is taking so long my browser is halting execution :(
[00:45:16] hm i will help look for the 503s
[00:45:17] is it meant to be constantly forking?
[00:46:37] yeah, it's constantly panicking
[00:46:54] Apr 29 00:46:27 cp1055 varnishd[22690]: Child (16819) Panic message: Missing errorhandling code in HSH_Purge(), cache_hash.c line 593:#012
[00:47:30] child dies, new child starts
[00:49:01] this is only happening on cp1055?
[00:49:09] (03PS2) 10MaxSem: Add my new key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130265
[00:49:30] my 503 above was from cp1067
[00:49:32] !log on cp1055: backend varnish is continually panicking and restarting its child, will try to stop/start service
[00:49:38] Logged the message, Master
[00:49:41] (03CR) 10Gage: [C: 032] Add my new key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130265 (owner: 10MaxSem)
[00:49:41] has the version changed recently?
[00:50:11] (03CR) 10Gage: [V: 032] Add my new key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130265 (owner: 10MaxSem)
[00:50:41] PROBLEM - Varnish HTTP text-backend on cp1055 is CRITICAL: Connection refused
[00:51:14] same
[00:51:32] TimStarling, jgage - want me to page bblack? he usually does a lot of the varnish poking in ops - I think he's at dinner right now
[00:51:54] yes
[00:51:57] The 500 graph that ori linked just dropped a lot but not back to normal
[00:52:04] most likely it is a configuration or version change that needs to be reverted
[00:52:10] but it will take me a while to hunt it down myself
[00:52:31] I have been seeing occasional 503s on an off for a while now
[00:52:37] how long is a while?
[00:52:41] RECOVERY - Varnish HTTP text-backend on cp1055 is OK: HTTP OK: HTTP/1.1 200 OK - 188 bytes in 0.003 second response time
[00:53:03] definitely saw it last night
[00:53:10] hmm
[00:53:23] gwicke, http://gdash.wikimedia.org/dashboards/reqerror/
[00:53:45] yow
[00:53:49] Eloquence: can Bsadowski1 please be unbanned here?
[00:54:21] And the graph jumped back up again -- https://graphite.wikimedia.org/render/?title=HTTP%205xx%20Responses%20-1hours&from=-1hours&width=1024&height=500&until=now&areaMode=none&hideLegend=false&lineWidth=2&lineMode=connected&target=color(cactiStyle(alias(reqstats.500,%22500%20resp/min%22)),%22red%22)&target=color(cactiStyle(alias(reqstats.5xx,%225xx%20resp/min%22)),%22blue%22)
[00:55:19] bd808: probably dropped to zero while varnish was stopped
[00:55:26] then jumped back up again when I started it
[00:55:39] the first start didn't work, I had to start it again
[00:55:46] so it was down for a minute
[00:55:48] (paged bblack)
[00:56:53] should we just depool it in pybal?
[00:57:01] Jasper_Deng, don't have the backstory on why he's banned in the first place. talk to active chan op?
[00:57:04] looking
[00:57:13] it's more than one, right?
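
The "child dies, new child starts" behaviour noted above is varnishd's management process restarting its cache child after each panic, which is why the host kept flapping rather than staying down. The following is only an illustrative sketch of that supervision pattern in plain C, not varnishd's actual code; do_cache_work() is a hypothetical stand-in for the real child workload:

    /* Minimal parent/child supervision loop: restart the child whenever it
     * dies abnormally, stop supervising on a clean exit. Illustrative only. */
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    static void do_cache_work(void) {
        /* hypothetical placeholder for the real workload; abort() simulates a panic */
        abort();
    }

    int main(void) {
        for (;;) {
            pid_t pid = fork();
            if (pid < 0) {
                perror("fork");
                return 1;
            }
            if (pid == 0) {            /* child: run until it exits or crashes */
                do_cache_work();
                _exit(0);
            }
            int status = 0;            /* parent: wait and decide whether to restart */
            if (waitpid(pid, &status, 0) < 0) {
                perror("waitpid");
                return 1;
            }
            if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
                break;                 /* clean shutdown: stop supervising */
            fprintf(stderr, "child %d died (status %d), restarting\n",
                    (int)pid, status);
            sleep(1);                  /* small backoff before the next child */
        }
        return 0;
    }
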
[00:57:19] yeah
[00:57:29] actually only one ban
[00:57:33] paravoid!~paravoid@scrooge.tty.gr banned Bsadowski1!*@* from #wikimedia-operations on Monday, April 28, 2014 9:40:34 AM.
[00:57:35] timing maybe one of the recent changes, like the keeprefreshingorg?
[00:57:51] Jasper_Deng, we're debugging a site issue - please put it on hold for now :)
[00:59:23] bblack: keeprefreshingorg was mobile frontend, this is text backend
[01:01:13] it's not affecting all of them
[01:01:35] bblack: the geo_get_top_cookie_domain change rolled out to that host at 15:22
[01:01:46] yeah
[01:02:09] I've stopped the backend on 1055 for now, I think it's the only one. looks like a local issue with a corrupt disk cache or something
[01:02:12] Panic message: Missing errorhandling code in HSH_Purge(), cache_hash.c line 593:#012 Condition(spc >= sizeof *ocp) not true.thread = (cache-worker)#012ident = Linux,3.2.0-49-generic,x86_64,-spersistent,-spersistent,-spersistent,-spersistent,-smalloc,-hcritbit,epoll#012Backtrace:#012 0x433e75: /usr/sbin/varnishd() [0x433e75]#012 0x42d8bc: /usr/sbin/varnishd(HSH_Purge+0x42c) [0x42d8bc]#012 0x7ffa6482a945: ./vcl.JSv0TInH.so(VGC_function_
[01:02:41] yeah, I pasted that already
[01:03:03] bblack, I got a 503 "via cp1067 frontend" earlier through sporadic browsing
[01:03:05] looks like it started at 20:23
[01:03:12] so may affect other hosts as well
[01:03:17] Eloquence: frontend -> backend mapping is orthognal
[01:03:19] ok
[01:03:24] *orthogonal! :)
[01:03:39] "grep HSH_Purge /var/log/syslog | head -n1" says 20:23
[01:03:41] PROBLEM - Varnish HTTP text-backend on cp1055 is CRITICAL: Connection refused
[01:04:05] frontends are mapped basically randomly for load spreading, and then each frontend hashes into the backends array based on the actual request data
[01:04:11] https://www.varnish-cache.org/trac/ticket/551 is similar ('Missing errorhandling code in HSH_Prepare()'), and the cause was session workspace exhaustion
[01:04:22] so 1055's backend failures could come out via whatever frontend
[01:04:49] got it, thanks. yeah, 503s seem to be disappearing again per http://gdash.wikimedia.org/dashboards/reqerror/
[01:04:53] in any case, as long as we keep the backend shut off (it is now, with puppet disabled), the 5xx spike should stay gone as well
[01:04:56] no such log entries in previous syslogs either, so definitely started 20:23
[01:05:17] the frontends healthcheck backends, so they won't send requests to a dead one
[01:05:42] puppet will probably start it in half an hour, right?
[01:05:53] bblack disabled it
[01:06:02] puppet, that is
[01:06:29] how?
[01:06:35] puppetd -disable
[01:07:25] ok
[01:08:02] if it really is just a lack of session workspace (as opposed to some kind of other corruption that looks like that), we have to wonder about the timing of recent commits again as well, maybe one of them is bumping up session workspace usage in some cases, and this could hit other backends
[01:08:18] i'll ack the alert
[01:08:27] oh yeah, thanks
[01:08:32] ACKNOWLEDGEMENT - Varnish HTTP text-backend on cp1055 is CRITICAL: Connection refused ori.livneh varnishd was repeatedly panicking, bblack stopped service and disabled puppet
[01:09:53] could be. TimStarling, anything awful-looking in https://gerrit.wikimedia.org/r/#/c/127131/4/templates/varnish/geoip.inc.vcl.erb ?
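
The "session workspace exhaustion" theory above refers to Varnish's fixed-size per-session scratch arena: each request gets a bounded buffer, and header copies, cookie rewrites and similar operations consume it. The sketch below is a simplified illustration of such a bump allocator, not Varnish's actual WS_* code; the point is that once the arena runs out, further allocations fail, and a caller with no error handling for that case can only assert and panic, much like the HSH_Purge message above.

    /* Simplified bump allocator standing in for a per-session workspace.
     * Illustrative only; sizes and names are hypothetical. */
    #include <stddef.h>
    #include <stdio.h>

    struct workspace {
        char   buf[2048];   /* fixed arena, sized per session at startup */
        size_t used;
    };

    static void *ws_alloc(struct workspace *ws, size_t n) {
        if (ws->used + n > sizeof ws->buf)
            return NULL;                    /* exhausted: caller must handle this */
        void *p = ws->buf + ws->used;
        ws->used += n;
        return p;
    }

    int main(void) {
        struct workspace ws = { .used = 0 };
        size_t total = 0;
        /* each header copy, cookie rewrite, etc. nibbles away at the arena */
        while (ws_alloc(&ws, 128) != NULL)
            total += 128;
        printf("allocated %zu bytes before exhaustion\n", total);
        return 0;
    }
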
[01:11:20] assuming VRT_GetHdr doesn't allocate memory of some kind, which I don't think it does, I don't think that patch could've created a new leak
[01:12:10] it also isn't likely to affect one backend more than others
[01:13:14] use of strtok_r
[01:14:35] bblack: it returns a pointer to workspace memory, IIRC
[01:14:51] running strtok_r() on something in the workspace seems very wrong to me
[01:15:13] that code's been in for a while though (the strtok_r)
[01:15:36] maybe something else had to use that header
[01:15:43] ?
[01:15:56] strtok_r doesn't modify
[01:16:41] oh, it does :P
[01:16:41] strtok_r has been there since at least 2011 (commit 3f673f65cc9)
[01:16:49] it overwrites the separators with nulls
[01:16:53] yeah that's not awesome
[01:17:03] geo_get_top_cookie_domain() does it too
[01:17:04] should've caught that on the initial review, was probably me
[01:17:48] strrchr() is probably what we wanted to use there
[01:18:08] never mind, no it doesn't
[01:18:31] yeah, i wasn't sure what you were talking about there
[01:19:36] Vmod_Func_header.append() is supposed to copy?
[01:20:03] where is the implementation?
[01:20:09] copies cookie-out to a new header, yes
[01:20:17] https://github.com/varnish/libvmod-header
[01:21:46] in any case, only the bits in https://gerrit.wikimedia.org/r/#/c/127131 are new, the rest has been stable for quite a while. We should probably address at least the strtok_r() issue, but I doubt it's what triggered cp1055
[01:23:26] the fourth argument to Vmod_Func_header.append() is a format, not a string
[01:23:44] so if it had percent characters in it, bad things would happen
[01:24:18] maybe slightly more likely when you are tokenizing an unvalidated host header and putting that into the cookie domain?
[01:24:36] probably not likely enough to cause a crash every 2 seconds though
[01:25:44] how do I test this?
[01:26:45] not easily
[01:27:36] deployment-cache-text02.eqiad.wmflabs is the text varnish for the beta cluster; it runs the same code
[01:28:37] you could puppetd -disable to prevent puppet from clobbering local modifications and then edit the files in /etc/varnish and restart varnish as needed
[01:28:39] Set-Cookie: GeoIP=AU:Parramatta:-33.8167:151.0000:v4; Path=/; Domain=.wikipedia.org%d%d%d
[01:28:45] maybe I am wrong
[01:28:46] the crash is on a PURGE request (from vhtcpd) on /wiki/Main_Page
[01:28:56] vhtcpd wouldn't be doing things like that
[01:29:22] although still, any corruption can lead to anything if something's getting corrupted
[01:29:49] I'm inclined to wonder how often we purge en.wiki's Main_Page, and if there's some race there because that page gets hit so hard
[01:32:06] ah right, it says fmt but actually the arguments are just concatenated
[01:32:31] of course, that misunderstanding would never happen if I was reading the actual varnish code
[01:32:46] since all variable names are single letters and thus contain no distracting semantic content ;)
[01:32:51] :)
[01:33:03] there's mutex that is set up in in init_function in https://github.com/varnish/libvmod-header/blob/3.0/src/vmod_header.c#L234
[01:33:05] I guess we must purge Main_Page at least daily, right?
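
The strtok_r() concern discussed above is that strtok_r() writes NUL bytes over the separator characters in the buffer it is given, so tokenizing a header that lives in shared workspace memory mutates it in place; strrchr() only reads. A small self-contained demonstration in plain C (not the VCL/vmod code under review):

    /* Shows strtok_r() destroying its input buffer vs. strrchr() leaving it intact. */
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        char host[] = "en.wikipedia.org";   /* imagine this sitting in workspace memory */
        char copy[sizeof host];
        memcpy(copy, host, sizeof host);

        char *save = NULL;
        for (char *tok = strtok_r(copy, ".", &save); tok != NULL;
             tok = strtok_r(NULL, ".", &save))
            printf("token: %s\n", tok);
        /* copy is now "en\0wikipedia\0org": the '.' separators were overwritten */
        printf("after strtok_r, buffer reads as: \"%s\"\n", copy);

        /* strrchr() just returns a pointer into the unmodified buffer */
        const char *last_dot = strrchr(host, '.');
        printf("host is intact: \"%s\", last label: \"%s\"\n",
               host, last_dot ? last_dot + 1 : host);
        return 0;
    }
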
[01:33:19] and i wonder if using the vmod outside of vcl skips that
[01:33:56] bblack: there is a "purge the main page" link on Talk:Main_Page
[01:33:58] ori: also, that line you highlighted is horrible, because the mutex would never be initialized with -DNDEBUG
[01:35:02] but init_function should still get run
[01:35:05] people probably click it pretty often
[01:35:22] also actual edits to the included templates will result in a purge
[01:36:25] plus if it were just a race, one of those many crash-restarts probably would've won and continued on
[01:36:44] I'm inclined to go back to the idea that this is the result of a corrupted disk cache
[01:47:00] I'm going to try starting up the backend varnish manually, but listening on the wrong port (so frontends and vhtcpd purges can't reach it yet), and then see if manually banning /wiki/Main_Page does anything for us
[01:51:26] trying the real thing now
[01:51:31] (03PS1) 10Gerrit Patch Uploader: Show AbuseFilter log hits on IRC for wikis where logs public [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130274 (https://bugzilla.wikimedia.org/64255)
[01:51:37] (03CR) 10Gerrit Patch Uploader: "This commit was uploaded using the Gerrit Patch Uploader [1]." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130274 (https://bugzilla.wikimedia.org/64255) (owner: 10Gerrit Patch Uploader)
[01:51:41] RECOVERY - Varnish HTTP text-backend on cp1055 is OK: HTTP OK: HTTP/1.1 200 OK - 188 bytes in 0.002 second response time
[01:51:51] nope, still crashed
[01:52:35] I was hopeful there for a few seconds, but that was just vhtcpd's backoff delay expiring, then it tried to purge Main_Page and the same crap happened
[01:53:05] !log Running deleteEqualMessages.php on cswiki (bug 43917)
[01:53:14] Logged the message, Master
[01:53:30] (03CR) 10PiRSquared17: "I would prefer:" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130274 (https://bugzilla.wikimedia.org/64255) (owner: 10Gerrit Patch Uploader)
[01:54:41] PROBLEM - Varnish HTTP text-backend on cp1055 is CRITICAL: Connection refused
[01:54:46] At this point I think I'll just go ahead and wipe the persistent cache on cp1055, assuming nobody has a better idea or objection. It still seems likely to me that something's wrong in the disk cache
[01:56:55] I triggered a normal purge of Main_Page just now, too, just to see if that will trigger an issue on whichever backend Main_Page is currently remapped to
[01:58:07] doesn't seem to have (and it's cp1054)
[02:06:43] ACKNOWLEDGEMENT - Varnish HTTP text-backend on cp1055 is CRITICAL: Connection refused Brandon Black Still working on this issue, its not in production service at the moment.
[02:09:33] removing the old disk cache files is taking a while, probably an indication that they were fragmented from lack of pre-alloc anyways
[02:22:41] RECOVERY - Varnish HTTP text-backend on cp1055 is OK: HTTP OK: HTTP/1.1 200 OK - 189 bytes in 0.001 second response time
[02:24:04] TimStarling: Not that it matters, but there's a purge link on the main page itself as well (not just the talk page).
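
The -DNDEBUG remark above is about performing required work inside an assert()-style check: per the discussion, libvmod-header's init_function apparently sets up its mutex inside an assert-based expression, so with NDEBUG defined the whole expression, including the pthread_mutex_init() call, is compiled out. An illustrative sketch of the pitfall and the usual fix (not the vmod's actual code):

    /* assert() with side effects: the call vanishes under -DNDEBUG. */
    #include <assert.h>
    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t mtx;

    static void bad_init(void) {
        /* With NDEBUG defined, assert() expands to ((void)0) and the
         * pthread_mutex_init() call disappears along with it. */
        assert(pthread_mutex_init(&mtx, NULL) == 0);
    }

    static void good_init(void) {
        /* Do the work unconditionally; only the *check* is debug-only. */
        int rc = pthread_mutex_init(&mtx, NULL);
        assert(rc == 0);
        (void)rc;                /* silence unused warning when NDEBUG is set */
    }

    int main(void) {
        good_init();
        pthread_mutex_lock(&mtx);
        puts("mutex initialized and locked");
        pthread_mutex_unlock(&mtx);
        (void)bad_init;          /* referenced only to avoid unused-function warnings */
        return 0;
    }
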
[02:24:17] !log wiped disk cache (via mkfs) on cp1055 to (hopefully) clear crash-restart cycle, backend back in service now [02:24:24] Logged the message, Master [02:30:07] (03CR) 10Ladsgroup: [C: 031] "As crat in fa.wp seems okay to me and you can see the consensuses has reached" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130079 (https://bugzilla.wikimedia.org/64532) (owner: 10Odder) [02:33:58] vhtcpd cleared its purge backlog, things seem stable [02:37:24] !log LocalisationUpdate completed (1.24wmf1) at 2014-04-29 02:37:21+00:00 [02:37:31] Logged the message, Master [02:52:31] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1% data above the threshold [250.0] [03:02:58] !log LocalisationUpdate completed (1.24wmf2) at 2014-04-29 03:02:55+00:00 [03:03:03] Logged the message, Master [03:18:31] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 1.69491525424% of data exceeded the critical threshold [500.0] [03:21:31] PROBLEM - Puppet freshness on mw1031 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:21:03 2014 [03:21:31] PROBLEM - Puppet freshness on mw1134 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:20:58 2014 [03:21:31] PROBLEM - Puppet freshness on mw1178 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:20:58 2014 [03:22:31] PROBLEM - Puppet freshness on mw1145 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:03 2014 [03:22:31] PROBLEM - Puppet freshness on mw1022 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:21:48 2014 [03:23:31] PROBLEM - Puppet freshness on mw1001 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:39 2014 [03:23:31] PROBLEM - Puppet freshness on mw1026 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:49 2014 [03:23:31] PROBLEM - Puppet freshness on mw1045 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:38 2014 [03:23:31] PROBLEM - Puppet freshness on mw1082 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:23:19 2014 [03:23:31] PROBLEM - Puppet freshness on mw1059 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:28 2014 [03:23:32] PROBLEM - Puppet freshness on mw1093 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:23:19 2014 [03:23:32] PROBLEM - Puppet freshness on mw1139 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:59 2014 [03:23:33] PROBLEM - Puppet freshness on mw1141 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:28 2014 [03:23:33] PROBLEM - Puppet freshness on mw1185 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:28 2014 [03:23:34] PROBLEM - Puppet freshness on mw1200 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:23 2014 [03:23:34] PROBLEM - Puppet freshness on mw1219 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:59 2014 [03:23:35] PROBLEM - Puppet freshness on search1006 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:22:49 2014 [03:23:35] PROBLEM - Puppet freshness on search1011 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:23:04 2014 [03:24:31] PROBLEM - Puppet freshness on mw1008 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:24:20 2014 [03:24:31] PROBLEM - Puppet freshness on mw1060 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:23:49 2014 [03:24:31] PROBLEM - Puppet freshness on mw1112 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:24:20 2014 [03:24:31] PROBLEM - Puppet freshness on mw1174 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:23:24 2014 
[03:24:31] PROBLEM - Puppet freshness on mw1203 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:24:05 2014 [03:24:32] PROBLEM - Puppet freshness on mw1215 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:23:44 2014 [03:24:32] PROBLEM - Puppet freshness on mw1220 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:23:54 2014 [03:24:33] PROBLEM - Puppet freshness on search1004 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:23:54 2014 [03:24:33] PROBLEM - Puppet freshness on search1016 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:23:39 2014 [03:25:31] PROBLEM - Puppet freshness on mw1009 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:24:30 2014 [03:25:31] PROBLEM - Puppet freshness on mw1061 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:25:01 2014 [03:25:31] PROBLEM - Puppet freshness on mw1086 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:24:41 2014 [03:25:31] PROBLEM - Puppet freshness on mw1090 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:24:41 2014 [03:25:31] PROBLEM - Puppet freshness on mw1119 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:25:06 2014 [03:25:32] PROBLEM - Puppet freshness on mw1120 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:24:30 2014 [03:25:32] PROBLEM - Puppet freshness on mw1135 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:25:06 2014 [03:25:33] PROBLEM - Puppet freshness on mw1158 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:25:07 2014 [03:25:33] PROBLEM - Puppet freshness on terbium is CRITICAL: Last successful Puppet run was Tue Apr 29 00:24:35 2014 [03:26:31] PROBLEM - Puppet freshness on mw1039 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:25:47 2014 [03:26:31] PROBLEM - Puppet freshness on mw1110 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:25:22 2014 [03:26:31] PROBLEM - Puppet freshness on mw1170 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:25:42 2014 [03:26:31] PROBLEM - Puppet freshness on mw1199 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:25:47 2014 [03:27:31] PROBLEM - Puppet freshness on mw1055 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:14 2014 [03:27:31] PROBLEM - Puppet freshness on mw1070 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:09 2014 [03:27:31] PROBLEM - Puppet freshness on mw1073 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:26:39 2014 [03:27:31] PROBLEM - Puppet freshness on mw1076 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:26:49 2014 [03:27:31] PROBLEM - Puppet freshness on mw1157 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:19 2014 [03:27:32] PROBLEM - Puppet freshness on mw1172 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:26:23 2014 [03:27:32] PROBLEM - Puppet freshness on mw1177 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:26:23 2014 [03:28:31] PROBLEM - Puppet freshness on mw1019 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:29 2014 [03:28:31] PROBLEM - Puppet freshness on mw1051 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:50 2014 [03:28:31] PROBLEM - Puppet freshness on mw1058 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:39 2014 [03:28:31] PROBLEM - Puppet freshness on mw1085 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:29 2014 [03:28:31] PROBLEM - Puppet freshness on mw1098 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:10 2014 [03:28:32] PROBLEM - Puppet 
freshness on mw1102 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:44 2014 [03:28:32] PROBLEM - Puppet freshness on mw1133 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:15 2014 [03:28:33] PROBLEM - Puppet freshness on mw1149 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:44 2014 [03:28:33] PROBLEM - Puppet freshness on mw1168 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:05 2014 [03:28:34] PROBLEM - Puppet freshness on mw1191 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:00 2014 [03:28:34] PROBLEM - Puppet freshness on mw1184 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:55 2014 [03:28:35] PROBLEM - Puppet freshness on mw1195 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:24 2014 [03:28:35] PROBLEM - Puppet freshness on search1002 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:55 2014 [03:28:36] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:27:55 2014 [03:29:31] PROBLEM - Puppet freshness on hooft is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:06 2014 [03:29:31] PROBLEM - Puppet freshness on mw1014 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:41 2014 [03:29:31] PROBLEM - Puppet freshness on mw1079 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:46 2014 [03:29:31] PROBLEM - Puppet freshness on mw1183 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:06 2014 [03:29:31] PROBLEM - Puppet freshness on mw1151 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:31 2014 [03:29:32] PROBLEM - Puppet freshness on mw1202 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:31 2014 [03:29:32] PROBLEM - Puppet freshness on rhenium is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:36 2014 [03:29:33] PROBLEM - Puppet freshness on tmh1002 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:01 2014 [03:30:31] PROBLEM - Puppet freshness on mw1013 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:48 2014 [03:30:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:48 2014 [03:30:31] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:48 2014 [03:30:31] PROBLEM - Puppet freshness on mw1036 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:42 2014 [03:30:31] PROBLEM - Puppet freshness on mw1050 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:22 2014 [03:30:32] PROBLEM - Puppet freshness on mw1097 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:30:19 2014 [03:30:32] PROBLEM - Puppet freshness on mw1096 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:43 2014 [03:30:33] PROBLEM - Puppet freshness on mw1109 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:30:14 2014 [03:30:33] PROBLEM - Puppet freshness on mw1192 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:30:08 2014 [03:30:34] PROBLEM - Puppet freshness on mw1156 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:29:53 2014 [03:30:34] PROBLEM - Puppet freshness on mw1212 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:30:08 2014 [03:31:31] PROBLEM - Puppet freshness on mw1005 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:31:10 2014 [03:31:31] PROBLEM - Puppet freshness on mw1023 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:30:29 2014 [03:31:31] PROBLEM - Puppet freshness on mw1028 is CRITICAL: 
Last successful Puppet run was Tue Apr 29 00:30:54 2014 [03:31:31] PROBLEM - Puppet freshness on mw1029 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:30:24 2014 [03:31:31] PROBLEM - Puppet freshness on mw1067 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:31:15 2014 [03:31:32] PROBLEM - Puppet freshness on mw1148 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:31:15 2014 [03:31:32] PROBLEM - Puppet freshness on search1015 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:31:20 2014 [03:31:33] PROBLEM - Puppet freshness on mw1216 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:30:44 2014 [03:32:31] PROBLEM - Puppet freshness on mw1012 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:12 2014 [03:32:31] PROBLEM - Puppet freshness on mw1105 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:12 2014 [03:32:31] PROBLEM - Puppet freshness on mw1167 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:01 2014 [03:32:31] PROBLEM - Puppet freshness on search1020 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:31:30 2014 [03:32:31] PROBLEM - Puppet freshness on mw1108 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:31:51 2014 [03:32:32] PROBLEM - Puppet freshness on mw1186 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:01 2014 [03:33:31] PROBLEM - Puppet freshness on mw1006 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:22 2014 [03:33:31] PROBLEM - Puppet freshness on mw1043 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:22 2014 [03:33:31] PROBLEM - Puppet freshness on mw1077 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:22 2014 [03:33:31] PROBLEM - Puppet freshness on mw1121 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:42 2014 [03:33:31] PROBLEM - Puppet freshness on mw1152 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:33:07 2014 [03:33:32] PROBLEM - Puppet freshness on mw1160 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:33:12 2014 [03:33:32] PROBLEM - Puppet freshness on mw1209 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:42 2014 [03:33:33] PROBLEM - Puppet freshness on search1013 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:32:22 2014 [03:34:31] PROBLEM - Puppet freshness on mw1016 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:33:43 2014 [03:34:31] PROBLEM - Puppet freshness on mw1064 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:03 2014 [03:34:32] PROBLEM - Puppet freshness on mw1088 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:13 2014 [03:34:32] PROBLEM - Puppet freshness on mw1099 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:33:58 2014 [03:34:32] PROBLEM - Puppet freshness on mw1100 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:13 2014 [03:34:32] PROBLEM - Puppet freshness on mw1164 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:03 2014 [03:34:32] PROBLEM - Puppet freshness on mw1187 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:33:43 2014 [03:34:33] PROBLEM - Puppet freshness on mw1193 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:33:58 2014 [03:34:33] PROBLEM - Puppet freshness on mw1217 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:03 2014 [03:34:34] PROBLEM - Puppet freshness on search1010 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:08 2014 [03:35:31] PROBLEM - Puppet freshness on mw1037 is CRITICAL: Last successful Puppet run 
was Tue Apr 29 00:35:09 2014 [03:35:31] PROBLEM - Puppet freshness on fenari is CRITICAL: Last successful Puppet run was Tue Apr 29 00:35:19 2014 [03:35:31] PROBLEM - Puppet freshness on mw1042 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:35:19 2014 [03:35:31] PROBLEM - Puppet freshness on mw1052 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:53 2014 [03:35:31] PROBLEM - Puppet freshness on mw1071 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:23 2014 [03:35:32] PROBLEM - Puppet freshness on mw1117 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:38 2014 [03:35:32] PROBLEM - Puppet freshness on mw1113 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:35:09 2014 [03:35:33] PROBLEM - Puppet freshness on mw1144 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:35:19 2014 [03:35:33] PROBLEM - Puppet freshness on mw1176 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:34:33 2014 [03:35:34] PROBLEM - Puppet freshness on mw1207 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:35:09 2014 [03:35:34] PROBLEM - Puppet freshness on search1018 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:35:03 2014 [03:36:31] PROBLEM - Puppet freshness on mw1002 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:36:00 2014 [03:36:31] PROBLEM - Puppet freshness on mw1065 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:35:25 2014 [03:36:31] PROBLEM - Puppet freshness on mw1103 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:36:20 2014 [03:36:31] PROBLEM - Puppet freshness on mw1114 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:36:00 2014 [03:36:32] PROBLEM - Puppet freshness on mw1123 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:35:30 2014 [03:36:32] PROBLEM - Puppet freshness on mw1126 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:35:55 2014 [03:37:31] PROBLEM - Puppet freshness on mw1017 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:06 2014 [03:37:31] PROBLEM - Puppet freshness on mw1018 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:36:40 2014 [03:37:31] PROBLEM - Puppet freshness on mw1095 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:06 2014 [03:37:31] PROBLEM - Puppet freshness on mw1101 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:06 2014 [03:37:31] PROBLEM - Puppet freshness on mw1162 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:01 2014 [03:37:32] PROBLEM - Puppet freshness on mw1175 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:36:25 2014 [03:37:32] PROBLEM - Puppet freshness on search1019 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:11 2014 [03:38:31] PROBLEM - Puppet freshness on mw1015 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:26 2014 [03:38:31] PROBLEM - Puppet freshness on mw1044 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:26 2014 [03:38:31] PROBLEM - Puppet freshness on mw1111 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:38:02 2014 [03:38:31] PROBLEM - Puppet freshness on mw1182 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:38:07 2014 [03:38:31] PROBLEM - Puppet freshness on search1008 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:41 2014 [03:38:32] PROBLEM - Puppet freshness on mw1190 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:46 2014 [03:38:32] PROBLEM - Puppet freshness on tin is CRITICAL: Last successful Puppet run was Tue Apr 29 00:37:31 2014 
[03:39:31] PROBLEM - Puppet freshness on mw1125 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:38:47 2014 [03:39:31] PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:38:32 2014 [03:39:32] PROBLEM - Puppet freshness on mw1214 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:38:37 2014 [03:40:31] PROBLEM - Puppet freshness on mw1004 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:22 2014 [03:40:31] PROBLEM - Puppet freshness on mw1056 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:52 2014 [03:40:31] PROBLEM - Puppet freshness on mw1057 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:47 2014 [03:40:31] PROBLEM - Puppet freshness on mw1081 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:32 2014 [03:40:31] PROBLEM - Puppet freshness on mw1130 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:40:07 2014 [03:40:32] PROBLEM - Puppet freshness on mw1146 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:42 2014 [03:40:32] PROBLEM - Puppet freshness on mw1147 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:40:07 2014 [03:40:33] PROBLEM - Puppet freshness on mw1159 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:27 2014 [03:40:33] PROBLEM - Puppet freshness on mw1196 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:22 2014 [03:40:34] PROBLEM - Puppet freshness on mw1188 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:42 2014 [03:40:34] PROBLEM - Puppet freshness on mw1210 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:57 2014 [03:40:35] PROBLEM - Puppet freshness on search1014 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:32 2014 [03:40:35] PROBLEM - Puppet freshness on search1017 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:42 2014 [03:40:36] PROBLEM - Puppet freshness on search1024 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:39:52 2014 [03:41:31] PROBLEM - Puppet freshness on mw1032 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:40:58 2014 [03:41:31] PROBLEM - Puppet freshness on mw1038 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:41:03 2014 [03:41:32] PROBLEM - Puppet freshness on mw1087 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:40:42 2014 [03:41:32] PROBLEM - Puppet freshness on mw1115 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:41:03 2014 [03:41:32] PROBLEM - Puppet freshness on mw1124 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:40:27 2014 [03:41:32] PROBLEM - Puppet freshness on mw1161 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:40:42 2014 [03:41:32] PROBLEM - Puppet freshness on mw1171 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:40:37 2014 [03:42:31] PROBLEM - Puppet freshness on mw1024 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:41:53 2014 [03:42:31] PROBLEM - Puppet freshness on mw1041 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:41:53 2014 [03:42:31] PROBLEM - Puppet freshness on mw1048 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:41:38 2014 [03:42:32] PROBLEM - Puppet freshness on mw1089 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:41:58 2014 [03:42:32] PROBLEM - Puppet freshness on mw1122 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:42:03 2014 [03:42:32] PROBLEM - Puppet freshness on mw1140 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:42:08 2014 [03:42:32] PROBLEM - Puppet 
freshness on mw1201 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:42:13 2014 [03:42:33] PROBLEM - Puppet freshness on searchidx1001 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:42:13 2014 [03:43:31] PROBLEM - Puppet freshness on fluorine is CRITICAL: Last successful Puppet run was Tue Apr 29 00:43:14 2014 [03:43:31] PROBLEM - Puppet freshness on mw1007 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:42:38 2014 [03:43:32] PROBLEM - Puppet freshness on mw1010 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:43:08 2014 [03:43:32] PROBLEM - Puppet freshness on mw1033 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:42:43 2014 [03:43:32] PROBLEM - Puppet freshness on mw1063 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:42:23 2014 [03:43:32] PROBLEM - Puppet freshness on mw1106 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:42:33 2014 [03:43:32] PROBLEM - Puppet freshness on mw1197 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:42:48 2014 [03:44:31] PROBLEM - Puppet freshness on mw1046 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:09 2014 [03:44:31] PROBLEM - Puppet freshness on mw1066 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:43:54 2014 [03:44:31] PROBLEM - Puppet freshness on mw1069 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:09 2014 [03:44:31] PROBLEM - Puppet freshness on mw1091 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:43:34 2014 [03:44:32] PROBLEM - Puppet freshness on mw1142 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:43:44 2014 [03:44:32] PROBLEM - Puppet freshness on mw1107 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:43:59 2014 [03:44:32] PROBLEM - Puppet freshness on mw1153 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:43:59 2014 [03:44:33] PROBLEM - Puppet freshness on mw1173 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:09 2014 [03:44:33] PROBLEM - Puppet freshness on mw1204 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:04 2014 [03:44:34] PROBLEM - Puppet freshness on search1022 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:43:39 2014 [03:45:31] PROBLEM - Puppet freshness on bast1001 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:59 2014 [03:45:31] PROBLEM - Puppet freshness on mw1003 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:34 2014 [03:45:31] PROBLEM - Puppet freshness on mw1021 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:59 2014 [03:45:31] PROBLEM - Puppet freshness on mw1025 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:45:11 2014 [03:45:31] PROBLEM - Puppet freshness on mw1027 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:34 2014 [03:45:32] PROBLEM - Puppet freshness on mw1131 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:45:06 2014 [03:45:32] PROBLEM - Puppet freshness on mw1068 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:54 2014 [03:45:33] PROBLEM - Puppet freshness on mw1104 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:45:06 2014 [03:45:33] PROBLEM - Puppet freshness on mw1143 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:49 2014 [03:45:34] PROBLEM - Puppet freshness on mw1150 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:39 2014 [03:45:34] PROBLEM - Puppet freshness on mw1166 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:49 2014 [03:45:35] PROBLEM - Puppet freshness on mw1154 is 
CRITICAL: Last successful Puppet run was Tue Apr 29 00:45:11 2014 [03:45:35] PROBLEM - Puppet freshness on mw1205 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:29 2014 [03:45:36] PROBLEM - Puppet freshness on search1001 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:45:11 2014 [03:45:36] PROBLEM - Puppet freshness on mw1189 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:29 2014 [03:45:37] PROBLEM - Puppet freshness on search1012 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:44:39 2014 [03:46:31] PROBLEM - Puppet freshness on mw1054 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:17 2014 [03:46:31] PROBLEM - Puppet freshness on mw1092 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:45:46 2014 [03:46:31] PROBLEM - Puppet freshness on mw1118 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:45:41 2014 [03:46:31] PROBLEM - Puppet freshness on mw1129 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:45:51 2014 [03:46:31] PROBLEM - Puppet freshness on mw1137 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:06 2014 [03:46:32] PROBLEM - Puppet freshness on mw1155 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:45:21 2014 [03:46:32] PROBLEM - Puppet freshness on mw1194 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:01 2014 [03:46:33] PROBLEM - Puppet freshness on mw1211 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:01 2014 [03:46:33] PROBLEM - Puppet freshness on mw1213 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:07 2014 [03:46:34] PROBLEM - Puppet freshness on search1003 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:01 2014 [03:46:34] PROBLEM - Puppet freshness on search1007 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:12 2014 [03:47:31] PROBLEM - Puppet freshness on mw1011 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:32 2014 [03:47:32] PROBLEM - Puppet freshness on mw1020 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:47:18 2014 [03:47:32] PROBLEM - Puppet freshness on mw1047 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:43 2014 [03:47:32] PROBLEM - Puppet freshness on mw1075 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:47:13 2014 [03:47:32] PROBLEM - Puppet freshness on mw1128 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:58 2014 [03:47:32] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:48 2014 [03:47:32] PROBLEM - Puppet freshness on mw1206 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:47:08 2014 [03:47:33] PROBLEM - Puppet freshness on tmh1001 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:46:27 2014 [03:48:31] PROBLEM - Puppet freshness on mw1049 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:48:13 2014 [03:48:31] PROBLEM - Puppet freshness on mw1136 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:48:03 2014 [03:48:31] PROBLEM - Puppet freshness on mw1179 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:47:48 2014 [03:48:31] PROBLEM - Puppet freshness on mw1180 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:47:58 2014 [03:48:31] PROBLEM - Puppet freshness on mw1208 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:47:38 2014 [03:48:32] PROBLEM - Puppet freshness on tantalum is CRITICAL: Last successful Puppet run was Tue Apr 29 00:48:03 2014 [03:48:34] !log LocalisationUpdate ResourceLoader cache refresh completed at 
Tue Apr 29 03:48:29 UTC 2014 (duration 48m 28s) [03:48:41] Logged the message, Master [03:49:31] PROBLEM - Puppet freshness on mw1083 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:48:28 2014 [03:49:31] PROBLEM - Puppet freshness on mw1084 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:48:48 2014 [03:49:31] PROBLEM - Puppet freshness on mw1094 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:48:38 2014 [03:49:31] PROBLEM - Puppet freshness on mw1169 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:48:23 2014 [03:49:31] PROBLEM - Puppet freshness on search1005 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:48:23 2014 [03:49:32] PROBLEM - Puppet freshness on search1021 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:48:23 2014 [03:50:31] PROBLEM - Puppet freshness on mw1030 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:49:33 2014 [03:50:31] PROBLEM - Puppet freshness on mw1040 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:50:19 2014 [03:50:31] PROBLEM - Puppet freshness on mw1132 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:49:59 2014 [03:50:31] PROBLEM - Puppet freshness on mw1116 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:50:14 2014 [03:50:31] PROBLEM - Puppet freshness on mw1138 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:49:23 2014 [03:50:32] PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:49:54 2014 [03:50:32] PROBLEM - Puppet freshness on mw1181 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:49:23 2014 [03:50:33] PROBLEM - Puppet freshness on search1023 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:49:59 2014 [03:50:33] PROBLEM - Puppet freshness on mw1165 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:49:33 2014 [03:51:31] PROBLEM - Puppet freshness on mw1062 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:50:44 2014 [03:51:31] PROBLEM - Puppet freshness on mw1072 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:50:49 2014 [03:51:31] PROBLEM - Puppet freshness on mw1074 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:50:34 2014 [03:51:31] PROBLEM - Puppet freshness on mw1080 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:50:44 2014 [03:51:31] PROBLEM - Puppet freshness on mw1198 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:50:24 2014 [03:51:32] PROBLEM - Puppet freshness on mw1218 is CRITICAL: Last successful Puppet run was Tue Apr 29 00:50:29 2014 [04:10:51] ok who broke puppet on all the app servers [04:10:58] 1b115611 I see [04:15:09] (03PS1) 10ArielGlenn: fix up maxsem's ssh key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130284 [04:15:31] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1% data above the threshold [250.0] [04:18:50] (03CR) 10ArielGlenn: [C: 032] fix up maxsem's ssh key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130284 (owner: 10ArielGlenn) [04:21:19] (03PS1) 10ArielGlenn: and fix up maxsem's key the rest of the way (still waking up...) [operations/puppet] - 10https://gerrit.wikimedia.org/r/130285 [04:21:42] yawn [04:23:37] (03CR) 10ArielGlenn: [C: 032] and fix up maxsem's key the rest of the way (still waking up...) 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/130285 (owner: 10ArielGlenn) [04:24:51] RECOVERY - Puppet freshness on mw1009 is OK: puppet ran at Tue Apr 29 04:24:44 UTC 2014 [04:24:51] RECOVERY - Puppet freshness on terbium is OK: puppet ran at Tue Apr 29 04:24:44 UTC 2014 [04:24:52] RECOVERY - Puppet freshness on mw1112 is OK: puppet ran at Tue Apr 29 04:24:50 UTC 2014 [04:25:11] RECOVERY - Puppet freshness on mw1158 is OK: puppet ran at Tue Apr 29 04:25:05 UTC 2014 [04:25:11] RECOVERY - Puppet freshness on mw1031 is OK: puppet ran at Tue Apr 29 04:25:05 UTC 2014 [04:25:11] RECOVERY - Puppet freshness on mw1119 is OK: puppet ran at Tue Apr 29 04:25:10 UTC 2014 [04:25:21] RECOVERY - Puppet freshness on mw1170 is OK: puppet ran at Tue Apr 29 04:25:15 UTC 2014 [04:25:21] RECOVERY - Puppet freshness on mw1135 is OK: puppet ran at Tue Apr 29 04:25:15 UTC 2014 [04:25:41] RECOVERY - Puppet freshness on mw1110 is OK: puppet ran at Tue Apr 29 04:25:40 UTC 2014 [04:25:51] RECOVERY - Puppet freshness on mw1061 is OK: puppet ran at Tue Apr 29 04:25:41 UTC 2014 [04:25:51] RECOVERY - Puppet freshness on mw1172 is OK: puppet ran at Tue Apr 29 04:25:46 UTC 2014 [04:26:01] RECOVERY - Puppet freshness on mw1177 is OK: puppet ran at Tue Apr 29 04:25:51 UTC 2014 [04:26:01] RECOVERY - Puppet freshness on mw1073 is OK: puppet ran at Tue Apr 29 04:25:56 UTC 2014 [04:26:21] RECOVERY - Puppet freshness on mw1199 is OK: puppet ran at Tue Apr 29 04:26:16 UTC 2014 [04:26:31] RECOVERY - Puppet freshness on mw1039 is OK: puppet ran at Tue Apr 29 04:26:21 UTC 2014 [04:27:01] RECOVERY - Puppet freshness on mw1085 is OK: puppet ran at Tue Apr 29 04:26:57 UTC 2014 [04:27:11] RECOVERY - Puppet freshness on mw1055 is OK: puppet ran at Tue Apr 29 04:27:07 UTC 2014 [04:27:21] RECOVERY - Puppet freshness on mw1058 is OK: puppet ran at Tue Apr 29 04:27:18 UTC 2014 [04:27:31] RECOVERY - Puppet freshness on mw1076 is OK: puppet ran at Tue Apr 29 04:27:23 UTC 2014 [04:27:31] RECOVERY - Puppet freshness on mw1149 is OK: puppet ran at Tue Apr 29 04:27:23 UTC 2014 [04:27:41] RECOVERY - Puppet freshness on mw1195 is OK: puppet ran at Tue Apr 29 04:27:38 UTC 2014 [04:27:51] RECOVERY - Puppet freshness on mw1019 is OK: puppet ran at Tue Apr 29 04:27:43 UTC 2014 [04:27:51] RECOVERY - Puppet freshness on mw1102 is OK: puppet ran at Tue Apr 29 04:27:43 UTC 2014 [04:27:51] RECOVERY - Puppet freshness on mw1070 is OK: puppet ran at Tue Apr 29 04:27:48 UTC 2014 [04:27:51] RECOVERY - Puppet freshness on mw1184 is OK: puppet ran at Tue Apr 29 04:27:48 UTC 2014 [04:28:01] RECOVERY - Puppet freshness on mw1157 is OK: puppet ran at Tue Apr 29 04:27:53 UTC 2014 [04:28:11] RECOVERY - Puppet freshness on mw1202 is OK: puppet ran at Tue Apr 29 04:28:04 UTC 2014 [04:28:11] RECOVERY - Puppet freshness on mw1079 is OK: puppet ran at Tue Apr 29 04:28:04 UTC 2014 [04:28:11] RECOVERY - Puppet freshness on search1009 is OK: puppet ran at Tue Apr 29 04:28:09 UTC 2014 [04:28:21] RECOVERY - Puppet freshness on mw1133 is OK: puppet ran at Tue Apr 29 04:28:14 UTC 2014 [04:28:31] RECOVERY - Puppet freshness on mw1051 is OK: puppet ran at Tue Apr 29 04:28:24 UTC 2014 [04:28:32] RECOVERY - Puppet freshness on mw1168 is OK: puppet ran at Tue Apr 29 04:28:29 UTC 2014 [04:28:32] RECOVERY - Puppet freshness on search1002 is OK: puppet ran at Tue Apr 29 04:28:29 UTC 2014 [04:28:32] RECOVERY - Puppet freshness on mw1151 is OK: puppet ran at Tue Apr 29 04:28:29 UTC 2014 [04:28:41] RECOVERY - Puppet freshness on mw1191 is OK: puppet ran at Tue Apr 29 04:28:40 UTC 
2014 [04:28:41] RECOVERY - Puppet freshness on tmh1002 is OK: puppet ran at Tue Apr 29 04:28:40 UTC 2014 [04:28:41] RECOVERY - Puppet freshness on mw1014 is OK: puppet ran at Tue Apr 29 04:28:40 UTC 2014 [04:28:51] RECOVERY - Puppet freshness on mw1098 is OK: puppet ran at Tue Apr 29 04:28:50 UTC 2014 [04:29:01] RECOVERY - Puppet freshness on mw1034 is OK: puppet ran at Tue Apr 29 04:28:55 UTC 2014 [04:29:01] RECOVERY - Puppet freshness on mw1036 is OK: puppet ran at Tue Apr 29 04:28:55 UTC 2014 [04:29:01] RECOVERY - Puppet freshness on mw1050 is OK: puppet ran at Tue Apr 29 04:28:55 UTC 2014 [04:29:11] RECOVERY - Puppet freshness on mw1156 is OK: puppet ran at Tue Apr 29 04:29:05 UTC 2014 [04:29:11] RECOVERY - Puppet freshness on mw1183 is OK: puppet ran at Tue Apr 29 04:29:10 UTC 2014 [04:29:21] RECOVERY - Puppet freshness on hooft is OK: puppet ran at Tue Apr 29 04:29:20 UTC 2014 [04:29:51] RECOVERY - Puppet freshness on mw1096 is OK: puppet ran at Tue Apr 29 04:29:41 UTC 2014 [04:29:51] RECOVERY - Puppet freshness on mw1035 is OK: puppet ran at Tue Apr 29 04:29:41 UTC 2014 [04:29:51] RECOVERY - Puppet freshness on mw1023 is OK: puppet ran at Tue Apr 29 04:29:46 UTC 2014 [04:29:51] RECOVERY - Puppet freshness on mw1013 is OK: puppet ran at Tue Apr 29 04:29:46 UTC 2014 [04:30:11] RECOVERY - Puppet freshness on mw1216 is OK: puppet ran at Tue Apr 29 04:30:06 UTC 2014 [04:30:21] RECOVERY - Puppet freshness on mw1192 is OK: puppet ran at Tue Apr 29 04:30:16 UTC 2014 [04:30:21] RECOVERY - Puppet freshness on mw1212 is OK: puppet ran at Tue Apr 29 04:30:16 UTC 2014 [04:30:31] RECOVERY - Puppet freshness on mw1029 is OK: puppet ran at Tue Apr 29 04:30:21 UTC 2014 [04:30:41] RECOVERY - Puppet freshness on mw1097 is OK: puppet ran at Tue Apr 29 04:30:36 UTC 2014 [04:30:51] RECOVERY - Puppet freshness on mw1109 is OK: puppet ran at Tue Apr 29 04:30:41 UTC 2014 [04:30:51] RECOVERY - Puppet freshness on search1020 is OK: puppet ran at Tue Apr 29 04:30:47 UTC 2014 [04:31:01] RECOVERY - Puppet freshness on mw1005 is OK: puppet ran at Tue Apr 29 04:30:57 UTC 2014 [04:31:21] RECOVERY - Puppet freshness on mw1148 is OK: puppet ran at Tue Apr 29 04:31:17 UTC 2014 [04:31:31] RECOVERY - Puppet freshness on mw1028 is OK: puppet ran at Tue Apr 29 04:31:22 UTC 2014 [04:31:41] RECOVERY - Puppet freshness on mw1067 is OK: puppet ran at Tue Apr 29 04:31:37 UTC 2014 [04:31:51] RECOVERY - Puppet freshness on search1015 is OK: puppet ran at Tue Apr 29 04:31:42 UTC 2014 [04:31:51] RECOVERY - Puppet freshness on search1013 is OK: puppet ran at Tue Apr 29 04:31:42 UTC 2014 [04:31:51] RECOVERY - Puppet freshness on mw1186 is OK: puppet ran at Tue Apr 29 04:31:47 UTC 2014 [04:32:11] RECOVERY - Puppet freshness on mw1167 is OK: puppet ran at Tue Apr 29 04:32:02 UTC 2014 [04:32:11] RECOVERY - Puppet freshness on mw1006 is OK: puppet ran at Tue Apr 29 04:32:02 UTC 2014 [04:32:11] RECOVERY - Puppet freshness on mw1105 is OK: puppet ran at Tue Apr 29 04:32:07 UTC 2014 [04:32:11] RECOVERY - Puppet freshness on mw1108 is OK: puppet ran at Tue Apr 29 04:32:08 UTC 2014 [04:32:31] RECOVERY - Puppet freshness on mw1043 is OK: puppet ran at Tue Apr 29 04:32:24 UTC 2014 [04:32:31] RECOVERY - Puppet freshness on mw1077 is OK: puppet ran at Tue Apr 29 04:32:24 UTC 2014 [04:32:32] RECOVERY - Puppet freshness on mw1012 is OK: puppet ran at Tue Apr 29 04:32:29 UTC 2014 [04:32:41] RECOVERY - Puppet freshness on mw1121 is OK: puppet ran at Tue Apr 29 04:32:39 UTC 2014 [04:32:51] RECOVERY - Puppet freshness on mw1209 is OK: puppet ran at 
Tue Apr 29 04:32:44 UTC 2014 [04:33:11] RECOVERY - Puppet freshness on mw1160 is OK: puppet ran at Tue Apr 29 04:33:04 UTC 2014 [04:33:11] RECOVERY - Puppet freshness on mw1187 is OK: puppet ran at Tue Apr 29 04:33:09 UTC 2014 [04:33:31] RECOVERY - Puppet freshness on mw1016 is OK: puppet ran at Tue Apr 29 04:33:24 UTC 2014 [04:33:51] RECOVERY - Puppet freshness on mw1152 is OK: puppet ran at Tue Apr 29 04:33:45 UTC 2014 [04:33:51] RECOVERY - Puppet freshness on mw1064 is OK: puppet ran at Tue Apr 29 04:33:50 UTC 2014 [04:34:01] RECOVERY - Puppet freshness on mw1164 is OK: puppet ran at Tue Apr 29 04:34:00 UTC 2014 [04:34:01] RECOVERY - Puppet freshness on mw1193 is OK: puppet ran at Tue Apr 29 04:34:00 UTC 2014 [04:34:21] RECOVERY - Puppet freshness on mw1088 is OK: puppet ran at Tue Apr 29 04:34:15 UTC 2014 [04:34:21] RECOVERY - Puppet freshness on mw1117 is OK: puppet ran at Tue Apr 29 04:34:20 UTC 2014 [04:34:21] RECOVERY - Puppet freshness on mw1100 is OK: puppet ran at Tue Apr 29 04:34:20 UTC 2014 [04:34:31] RECOVERY - Puppet freshness on mw1176 is OK: puppet ran at Tue Apr 29 04:34:25 UTC 2014 [04:34:41] RECOVERY - Puppet freshness on mw1071 is OK: puppet ran at Tue Apr 29 04:34:35 UTC 2014 [04:34:41] RECOVERY - Puppet freshness on mw1217 is OK: puppet ran at Tue Apr 29 04:34:36 UTC 2014 [04:34:51] RECOVERY - Puppet freshness on mw1113 is OK: puppet ran at Tue Apr 29 04:34:41 UTC 2014 [04:35:01] RECOVERY - Puppet freshness on mw1065 is OK: puppet ran at Tue Apr 29 04:34:51 UTC 2014 [04:35:01] RECOVERY - Puppet freshness on search1018 is OK: puppet ran at Tue Apr 29 04:34:51 UTC 2014 [04:35:01] RECOVERY - Puppet freshness on mw1099 is OK: puppet ran at Tue Apr 29 04:34:51 UTC 2014 [04:35:01] RECOVERY - Puppet freshness on search1010 is OK: puppet ran at Tue Apr 29 04:34:56 UTC 2014 [04:35:01] RECOVERY - Puppet freshness on mw1207 is OK: puppet ran at Tue Apr 29 04:34:56 UTC 2014 [04:35:11] RECOVERY - Puppet freshness on mw1042 is OK: puppet ran at Tue Apr 29 04:35:06 UTC 2014 [04:35:21] RECOVERY - Puppet freshness on mw1037 is OK: puppet ran at Tue Apr 29 04:35:11 UTC 2014 [04:35:21] RECOVERY - Puppet freshness on mw1052 is OK: puppet ran at Tue Apr 29 04:35:16 UTC 2014 [04:35:41] RECOVERY - Puppet freshness on mw1123 is OK: puppet ran at Tue Apr 29 04:35:37 UTC 2014 [04:35:41] RECOVERY - Puppet freshness on mw1144 is OK: puppet ran at Tue Apr 29 04:35:37 UTC 2014 [04:35:41] RECOVERY - Puppet freshness on fenari is OK: puppet ran at Tue Apr 29 04:35:37 UTC 2014 [04:36:01] RECOVERY - Puppet freshness on mw1114 is OK: puppet ran at Tue Apr 29 04:35:52 UTC 2014 [04:36:01] RECOVERY - Puppet freshness on mw1018 is OK: puppet ran at Tue Apr 29 04:35:57 UTC 2014 [04:36:16] jebus [04:36:21] RECOVERY - Puppet freshness on mw1103 is OK: puppet ran at Tue Apr 29 04:36:12 UTC 2014 [04:36:41] RECOVERY - Puppet freshness on mw1175 is OK: puppet ran at Tue Apr 29 04:36:32 UTC 2014 [04:36:41] RECOVERY - Puppet freshness on mw1002 is OK: puppet ran at Tue Apr 29 04:36:37 UTC 2014 [04:36:51] RECOVERY - Puppet freshness on mw1126 is OK: puppet ran at Tue Apr 29 04:36:47 UTC 2014 [04:36:51] RECOVERY - Puppet freshness on search1008 is OK: puppet ran at Tue Apr 29 04:36:47 UTC 2014 [04:37:01] RECOVERY - Puppet freshness on mw1095 is OK: puppet ran at Tue Apr 29 04:36:52 UTC 2014 [04:37:01] RECOVERY - Puppet freshness on mw1017 is OK: puppet ran at Tue Apr 29 04:36:57 UTC 2014 [04:37:10] well it's got to recover them all, give it another 20 mins or so [04:37:11] RECOVERY - Puppet freshness on mw1044 is 
OK: puppet ran at Tue Apr 29 04:37:03 UTC 2014 [04:37:21] RECOVERY - Puppet freshness on tin is OK: puppet ran at Tue Apr 29 04:37:13 UTC 2014 [04:37:21] RECOVERY - Puppet freshness on mw1162 is OK: puppet ran at Tue Apr 29 04:37:18 UTC 2014 [04:37:41] RECOVERY - Puppet freshness on search1019 is OK: puppet ran at Tue Apr 29 04:37:38 UTC 2014 [04:37:41] RECOVERY - Puppet freshness on mw1101 is OK: puppet ran at Tue Apr 29 04:37:38 UTC 2014 [04:37:51] RECOVERY - Puppet freshness on mw1015 is OK: puppet ran at Tue Apr 29 04:37:43 UTC 2014 [04:38:11] RECOVERY - Puppet freshness on mw1182 is OK: puppet ran at Tue Apr 29 04:38:04 UTC 2014 [04:38:11] RECOVERY - Puppet freshness on mw1127 is OK: puppet ran at Tue Apr 29 04:38:09 UTC 2014 [04:38:21] RECOVERY - Puppet freshness on mw1214 is OK: puppet ran at Tue Apr 29 04:38:14 UTC 2014 [04:38:41] RECOVERY - Puppet freshness on mw1125 is OK: puppet ran at Tue Apr 29 04:38:39 UTC 2014 [04:38:41] RECOVERY - Puppet freshness on mw1190 is OK: puppet ran at Tue Apr 29 04:38:39 UTC 2014 [04:38:51] RECOVERY - Puppet freshness on mw1111 is OK: puppet ran at Tue Apr 29 04:38:49 UTC 2014 [04:39:01] RECOVERY - Puppet freshness on mw1159 is OK: puppet ran at Tue Apr 29 04:38:59 UTC 2014 [04:39:01] RECOVERY - Puppet freshness on mw1196 is OK: puppet ran at Tue Apr 29 04:38:59 UTC 2014 [04:39:11] RECOVERY - Puppet freshness on mw1004 is OK: puppet ran at Tue Apr 29 04:39:09 UTC 2014 [04:39:21] RECOVERY - Puppet freshness on search1017 is OK: puppet ran at Tue Apr 29 04:39:19 UTC 2014 [04:39:31] RECOVERY - Puppet freshness on mw1081 is OK: puppet ran at Tue Apr 29 04:39:24 UTC 2014 [04:39:41] RECOVERY - Puppet freshness on mw1056 is OK: puppet ran at Tue Apr 29 04:39:35 UTC 2014 [04:39:41] RECOVERY - Puppet freshness on mw1057 is OK: puppet ran at Tue Apr 29 04:39:35 UTC 2014 [04:39:41] RECOVERY - Puppet freshness on search1014 is OK: puppet ran at Tue Apr 29 04:39:40 UTC 2014 [04:39:41] RECOVERY - Puppet freshness on mw1146 is OK: puppet ran at Tue Apr 29 04:39:40 UTC 2014 [04:39:51] RECOVERY - Puppet freshness on search1024 is OK: puppet ran at Tue Apr 29 04:39:50 UTC 2014 [04:39:52] RECOVERY - Puppet freshness on mw1130 is OK: puppet ran at Tue Apr 29 04:39:50 UTC 2014 [04:40:01] RECOVERY - Puppet freshness on mw1210 is OK: puppet ran at Tue Apr 29 04:40:00 UTC 2014 [04:40:11] RECOVERY - Puppet freshness on mw1188 is OK: puppet ran at Tue Apr 29 04:40:05 UTC 2014 [04:40:21] RECOVERY - Puppet freshness on mw1161 is OK: puppet ran at Tue Apr 29 04:40:20 UTC 2014 [04:40:31] RECOVERY - Puppet freshness on mw1087 is OK: puppet ran at Tue Apr 29 04:40:25 UTC 2014 [04:40:31] RECOVERY - Puppet freshness on mw1124 is OK: puppet ran at Tue Apr 29 04:40:25 UTC 2014 [04:40:41] RECOVERY - Puppet freshness on mw1171 is OK: puppet ran at Tue Apr 29 04:40:30 UTC 2014 [04:40:51] RECOVERY - Puppet freshness on mw1147 is OK: puppet ran at Tue Apr 29 04:40:46 UTC 2014 [04:41:01] RECOVERY - Puppet freshness on mw1089 is OK: puppet ran at Tue Apr 29 04:40:51 UTC 2014 [04:41:31] RECOVERY - Puppet freshness on mw1115 is OK: puppet ran at Tue Apr 29 04:41:21 UTC 2014 [04:41:31] RECOVERY - Puppet freshness on mw1048 is OK: puppet ran at Tue Apr 29 04:41:21 UTC 2014 [04:41:32] RECOVERY - Puppet freshness on mw1038 is OK: puppet ran at Tue Apr 29 04:41:26 UTC 2014 [04:41:41] RECOVERY - Puppet freshness on mw1032 is OK: puppet ran at Tue Apr 29 04:41:36 UTC 2014 [04:42:01] RECOVERY - Puppet freshness on mw1201 is OK: puppet ran at Tue Apr 29 04:41:51 UTC 2014 [04:42:01] RECOVERY - Puppet 
freshness on mw1024 is OK: puppet ran at Tue Apr 29 04:41:56 UTC 2014 [04:42:11] RECOVERY - Puppet freshness on mw1197 is OK: puppet ran at Tue Apr 29 04:42:07 UTC 2014 [04:42:21] RECOVERY - Puppet freshness on mw1063 is OK: puppet ran at Tue Apr 29 04:42:12 UTC 2014 [04:42:21] RECOVERY - Puppet freshness on mw1007 is OK: puppet ran at Tue Apr 29 04:42:12 UTC 2014 [04:42:41] RECOVERY - Puppet freshness on mw1106 is OK: puppet ran at Tue Apr 29 04:42:32 UTC 2014 [04:42:41] RECOVERY - Puppet freshness on mw1122 is OK: puppet ran at Tue Apr 29 04:42:32 UTC 2014 [04:42:51] RECOVERY - Puppet freshness on mw1033 is OK: puppet ran at Tue Apr 29 04:42:42 UTC 2014 [04:42:51] RECOVERY - Puppet freshness on mw1041 is OK: puppet ran at Tue Apr 29 04:42:42 UTC 2014 [04:43:01] RECOVERY - Puppet freshness on fluorine is OK: puppet ran at Tue Apr 29 04:42:52 UTC 2014 [04:43:01] RECOVERY - Puppet freshness on mw1091 is OK: puppet ran at Tue Apr 29 04:42:52 UTC 2014 [04:43:01] RECOVERY - Puppet freshness on mw1142 is OK: puppet ran at Tue Apr 29 04:42:52 UTC 2014 [04:43:01] RECOVERY - Puppet freshness on mw1140 is OK: puppet ran at Tue Apr 29 04:42:57 UTC 2014 [04:43:11] RECOVERY - Puppet freshness on mw1010 is OK: puppet ran at Tue Apr 29 04:43:02 UTC 2014 [04:43:11] RECOVERY - Puppet freshness on search1022 is OK: puppet ran at Tue Apr 29 04:43:07 UTC 2014 [04:43:51] RECOVERY - Puppet freshness on mw1153 is OK: puppet ran at Tue Apr 29 04:43:48 UTC 2014 [04:43:51] RECOVERY - Puppet freshness on mw1107 is OK: puppet ran at Tue Apr 29 04:43:48 UTC 2014 [04:43:51] RECOVERY - Puppet freshness on mw1205 is OK: puppet ran at Tue Apr 29 04:43:48 UTC 2014 [04:44:11] RECOVERY - Puppet freshness on search1012 is OK: puppet ran at Tue Apr 29 04:44:03 UTC 2014 [04:44:11] RECOVERY - Puppet freshness on searchidx1001 is OK: puppet ran at Tue Apr 29 04:44:08 UTC 2014 [04:44:21] RECOVERY - Puppet freshness on mw1003 is OK: puppet ran at Tue Apr 29 04:44:13 UTC 2014 [04:44:21] RECOVERY - Puppet freshness on mw1066 is OK: puppet ran at Tue Apr 29 04:44:13 UTC 2014 [04:44:31] RECOVERY - Puppet freshness on mw1143 is OK: puppet ran at Tue Apr 29 04:44:23 UTC 2014 [04:44:32] RECOVERY - Puppet freshness on mw1173 is OK: puppet ran at Tue Apr 29 04:44:28 UTC 2014 [04:44:41] RECOVERY - Puppet freshness on mw1204 is OK: puppet ran at Tue Apr 29 04:44:33 UTC 2014 [04:44:41] RECOVERY - Puppet freshness on mw1046 is OK: puppet ran at Tue Apr 29 04:44:38 UTC 2014 [04:44:41] RECOVERY - Puppet freshness on mw1027 is OK: puppet ran at Tue Apr 29 04:44:38 UTC 2014 [04:44:41] RECOVERY - Puppet freshness on mw1068 is OK: puppet ran at Tue Apr 29 04:44:38 UTC 2014 [04:44:51] RECOVERY - Puppet freshness on mw1069 is OK: puppet ran at Tue Apr 29 04:44:43 UTC 2014 [04:44:51] RECOVERY - Puppet freshness on bast1001 is OK: puppet ran at Tue Apr 29 04:44:48 UTC 2014 [04:44:51] RECOVERY - Puppet freshness on mw1189 is OK: puppet ran at Tue Apr 29 04:44:48 UTC 2014 [04:44:51] RECOVERY - Puppet freshness on mw1150 is OK: puppet ran at Tue Apr 29 04:44:48 UTC 2014 [04:45:11] RECOVERY - Puppet freshness on mw1154 is OK: puppet ran at Tue Apr 29 04:45:03 UTC 2014 [04:45:11] RECOVERY - Puppet freshness on search1001 is OK: puppet ran at Tue Apr 29 04:45:03 UTC 2014 [04:45:21] RECOVERY - Puppet freshness on mw1021 is OK: puppet ran at Tue Apr 29 04:45:13 UTC 2014 [04:45:21] RECOVERY - Puppet freshness on mw1166 is OK: puppet ran at Tue Apr 29 04:45:13 UTC 2014 [04:45:21] RECOVERY - Puppet freshness on mw1104 is OK: puppet ran at Tue Apr 29 04:45:18 UTC 
2014 [04:45:31] RECOVERY - Puppet freshness on mw1092 is OK: puppet ran at Tue Apr 29 04:45:28 UTC 2014 [04:45:41] RECOVERY - Puppet freshness on mw1025 is OK: puppet ran at Tue Apr 29 04:45:33 UTC 2014 [04:45:41] RECOVERY - Puppet freshness on mw1118 is OK: puppet ran at Tue Apr 29 04:45:38 UTC 2014 [04:45:51] RECOVERY - Puppet freshness on mw1155 is OK: puppet ran at Tue Apr 29 04:45:43 UTC 2014 [04:45:51] RECOVERY - Puppet freshness on mw1131 is OK: puppet ran at Tue Apr 29 04:45:48 UTC 2014 [04:46:01] RECOVERY - Puppet freshness on search1003 is OK: puppet ran at Tue Apr 29 04:45:53 UTC 2014 [04:46:11] RECOVERY - Puppet freshness on mw1213 is OK: puppet ran at Tue Apr 29 04:46:03 UTC 2014 [04:46:11] RECOVERY - Puppet freshness on mw1054 is OK: puppet ran at Tue Apr 29 04:46:08 UTC 2014 [04:46:21] RECOVERY - Puppet freshness on tmh1001 is OK: puppet ran at Tue Apr 29 04:46:13 UTC 2014 [04:46:22] RECOVERY - Puppet freshness on mw1137 is OK: puppet ran at Tue Apr 29 04:46:18 UTC 2014 [04:46:31] RECOVERY - Puppet freshness on mw1128 is OK: puppet ran at Tue Apr 29 04:46:24 UTC 2014 [04:46:32] RECOVERY - Puppet freshness on search1007 is OK: puppet ran at Tue Apr 29 04:46:29 UTC 2014 [04:46:41] RECOVERY - Puppet freshness on mw1194 is OK: puppet ran at Tue Apr 29 04:46:34 UTC 2014 [04:46:41] RECOVERY - Puppet freshness on mw1129 is OK: puppet ran at Tue Apr 29 04:46:39 UTC 2014 [04:46:41] RECOVERY - Puppet freshness on mw1047 is OK: puppet ran at Tue Apr 29 04:46:39 UTC 2014 [04:46:51] RECOVERY - Puppet freshness on mw1211 is OK: puppet ran at Tue Apr 29 04:46:44 UTC 2014 [04:46:51] RECOVERY - Puppet freshness on mw1011 is OK: puppet ran at Tue Apr 29 04:46:49 UTC 2014 [04:47:01] RECOVERY - Puppet freshness on mw1206 is OK: puppet ran at Tue Apr 29 04:46:54 UTC 2014 [04:47:11] RECOVERY - Puppet freshness on mw1020 is OK: puppet ran at Tue Apr 29 04:47:04 UTC 2014 [04:47:11] RECOVERY - Puppet freshness on mw1208 is OK: puppet ran at Tue Apr 29 04:47:09 UTC 2014 [04:47:31] RECOVERY - Puppet freshness on mw1179 is OK: puppet ran at Tue Apr 29 04:47:24 UTC 2014 [04:47:41] RECOVERY - Puppet freshness on mw1078 is OK: puppet ran at Tue Apr 29 04:47:39 UTC 2014 [04:47:51] RECOVERY - Puppet freshness on mw1075 is OK: puppet ran at Tue Apr 29 04:47:49 UTC 2014 [04:48:01] RECOVERY - Puppet freshness on tantalum is OK: puppet ran at Tue Apr 29 04:47:59 UTC 2014 [04:48:11] RECOVERY - Puppet freshness on mw1083 is OK: puppet ran at Tue Apr 29 04:48:10 UTC 2014 [04:48:21] RECOVERY - Puppet freshness on search1021 is OK: puppet ran at Tue Apr 29 04:48:15 UTC 2014 [04:48:21] RECOVERY - Puppet freshness on mw1084 is OK: puppet ran at Tue Apr 29 04:48:15 UTC 2014 [04:48:41] RECOVERY - Puppet freshness on mw1136 is OK: puppet ran at Tue Apr 29 04:48:31 UTC 2014 [04:48:41] RECOVERY - Puppet freshness on mw1094 is OK: puppet ran at Tue Apr 29 04:48:31 UTC 2014 [04:48:41] RECOVERY - Puppet freshness on mw1180 is OK: puppet ran at Tue Apr 29 04:48:31 UTC 2014 [04:48:41] RECOVERY - Puppet freshness on search1005 is OK: puppet ran at Tue Apr 29 04:48:36 UTC 2014 [04:48:41] RECOVERY - Puppet freshness on mw1049 is OK: puppet ran at Tue Apr 29 04:48:36 UTC 2014 [04:48:42] RECOVERY - Puppet freshness on mw1169 is OK: puppet ran at Tue Apr 29 04:48:36 UTC 2014 [04:49:31] RECOVERY - Puppet freshness on mw1030 is OK: puppet ran at Tue Apr 29 04:49:22 UTC 2014 [04:49:31] RECOVERY - Puppet freshness on mw1165 is OK: puppet ran at Tue Apr 29 04:49:27 UTC 2014 [04:49:41] RECOVERY - Puppet freshness on mw1138 is OK: puppet 
ran at Tue Apr 29 04:49:37 UTC 2014 [04:49:51] RECOVERY - Puppet freshness on mw1181 is OK: puppet ran at Tue Apr 29 04:49:42 UTC 2014 [04:49:51] RECOVERY - Puppet freshness on mw1218 is OK: puppet ran at Tue Apr 29 04:49:42 UTC 2014 [04:50:01] RECOVERY - Puppet freshness on mw1040 is OK: puppet ran at Tue Apr 29 04:49:58 UTC 2014 [04:50:11] RECOVERY - Puppet freshness on search1023 is OK: puppet ran at Tue Apr 29 04:50:03 UTC 2014 [04:50:11] RECOVERY - Puppet freshness on mw1116 is OK: puppet ran at Tue Apr 29 04:50:03 UTC 2014 [04:50:11] RECOVERY - Puppet freshness on mw1198 is OK: puppet ran at Tue Apr 29 04:50:03 UTC 2014 [04:50:11] RECOVERY - Puppet freshness on mw1053 is OK: puppet ran at Tue Apr 29 04:50:08 UTC 2014 [04:50:21] RECOVERY - Puppet freshness on mw1074 is OK: puppet ran at Tue Apr 29 04:50:18 UTC 2014 [04:50:32] RECOVERY - Puppet freshness on mw1132 is OK: puppet ran at Tue Apr 29 04:50:29 UTC 2014 [04:50:41] RECOVERY - Puppet freshness on mw1062 is OK: puppet ran at Tue Apr 29 04:50:34 UTC 2014 [04:50:51] RECOVERY - Puppet freshness on mw1178 is OK: puppet ran at Tue Apr 29 04:50:44 UTC 2014 [04:51:01] RECOVERY - Puppet freshness on mw1080 is OK: puppet ran at Tue Apr 29 04:50:54 UTC 2014 [04:51:11] RECOVERY - Puppet freshness on mw1072 is OK: puppet ran at Tue Apr 29 04:51:05 UTC 2014 [04:51:31] RECOVERY - Puppet freshness on mw1134 is OK: puppet ran at Tue Apr 29 04:51:21 UTC 2014 [04:51:31] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 1.69491525424% of data exceeded the critical threshold [500.0] [04:52:01] RECOVERY - Puppet freshness on mw1200 is OK: puppet ran at Tue Apr 29 04:51:56 UTC 2014 [04:52:11] RECOVERY - Puppet freshness on mw1141 is OK: puppet ran at Tue Apr 29 04:52:01 UTC 2014 [04:52:11] RECOVERY - Puppet freshness on mw1022 is OK: puppet ran at Tue Apr 29 04:52:06 UTC 2014 [04:52:21] RECOVERY - Puppet freshness on mw1059 is OK: puppet ran at Tue Apr 29 04:52:11 UTC 2014 [04:52:21] RECOVERY - Puppet freshness on mw1001 is OK: puppet ran at Tue Apr 29 04:52:11 UTC 2014 [04:52:21] RECOVERY - Puppet freshness on mw1045 is OK: puppet ran at Tue Apr 29 04:52:16 UTC 2014 [04:52:31] RECOVERY - Puppet freshness on mw1185 is OK: puppet ran at Tue Apr 29 04:52:21 UTC 2014 [04:53:01] RECOVERY - Puppet freshness on mw1145 is OK: puppet ran at Tue Apr 29 04:52:51 UTC 2014 [04:53:01] RECOVERY - Puppet freshness on mw1082 is OK: puppet ran at Tue Apr 29 04:52:56 UTC 2014 [04:53:01] RECOVERY - Puppet freshness on mw1219 is OK: puppet ran at Tue Apr 29 04:52:56 UTC 2014 [04:53:01] RECOVERY - Puppet freshness on search1016 is OK: puppet ran at Tue Apr 29 04:52:56 UTC 2014 [04:53:01] RECOVERY - Puppet freshness on search1011 is OK: puppet ran at Tue Apr 29 04:52:56 UTC 2014 [04:53:11] RECOVERY - Puppet freshness on mw1139 is OK: puppet ran at Tue Apr 29 04:53:02 UTC 2014 [04:53:11] RECOVERY - Puppet freshness on mw1093 is OK: puppet ran at Tue Apr 29 04:53:02 UTC 2014 [04:53:21] RECOVERY - Puppet freshness on mw1174 is OK: puppet ran at Tue Apr 29 04:53:12 UTC 2014 [04:53:51] RECOVERY - Puppet freshness on mw1026 is OK: puppet ran at Tue Apr 29 04:53:42 UTC 2014 [04:53:51] RECOVERY - Puppet freshness on mw1220 is OK: puppet ran at Tue Apr 29 04:53:43 UTC 2014 [04:53:51] RECOVERY - Puppet freshness on search1006 is OK: puppet ran at Tue Apr 29 04:53:43 UTC 2014 [04:54:01] RECOVERY - Puppet freshness on mw1086 is OK: puppet ran at Tue Apr 29 04:53:58 UTC 2014 [04:54:11] RECOVERY - Puppet freshness on mw1203 is OK: puppet ran at Tue Apr 29 04:54:08 UTC 2014 
[04:54:21] RECOVERY - Puppet freshness on mw1060 is OK: puppet ran at Tue Apr 29 04:54:13 UTC 2014 [04:54:21] RECOVERY - Puppet freshness on mw1090 is OK: puppet ran at Tue Apr 29 04:54:13 UTC 2014 [04:54:27] let's count all the ways that message is broken [04:54:31] RECOVERY - Puppet freshness on mw1215 is OK: puppet ran at Tue Apr 29 04:54:23 UTC 2014 [04:54:31] RECOVERY - Puppet freshness on mw1120 is OK: puppet ran at Tue Apr 29 04:54:28 UTC 2014 [04:54:32] RECOVERY - Puppet freshness on mw1008 is OK: puppet ran at Tue Apr 29 04:54:28 UTC 2014 [04:54:32] RECOVERY - Puppet freshness on search1004 is OK: puppet ran at Tue Apr 29 04:54:28 UTC 2014 [04:54:58] 1. it's not requests, it's responses [04:55:11] 2. it's not on tungsten; it's the set of varnishes [04:55:26] 3. 'CRITICAL' is needlessly repeated [04:55:43] 4. the meaning of '[500.0]' is obscure [04:56:27] :-D [04:56:28] 5. it's not the percent of data that exceeds threshold; it's the current rate. [04:56:50] * apergos scroll back through all the cruft to find the message [04:57:42] ah ha [04:57:43] 6. it's not clear what the number represents, but at least it's accurate to 11 decimal points [04:57:56] it is, this was noted yesterday evening [04:58:12] accurate, I mean [04:59:59] 'HTTP 5xx' is the only useful signal, and it's so out of place that one is tempted to remove it [05:00:33] i propose: 'PROBLEM - CRITICAL: CRITICAL: 1.69491525424% of data exceeded the critical threshold [500.0]' [05:01:34] maybe just: 'PROBLEM - CRITICAL: CRITICAL: 1.69491525424% [500.0]' [05:01:41] that has a certain poetry to it, no? [05:01:58] poetic quality, even [05:04:17] ori: i'm checking db1048 for duplicate uuid, then will do the schema change. has anything changed in the last 8h? [05:05:19] springle: i rolled out the change to make EventLogging declare those constraints for any new tables [05:05:35] ori: also, i've isolated log data over the migration period in the binary logs. if you'd prefer we restore from there, we probably can [05:06:02] because all uuid's should exist either in db1048 binlog or db1047 dump, presumably [05:07:02] how hard would it be? restoring from a file is not too hard; it's a matter of pointing the consumer at a file rather than a zeromq stream. [05:08:26] tho i've done that before, carefree in the knowledge that the unique constraint on uuids (which turned out not to actually exist) would ensure there are no dupes [05:08:43] so there may actually be some dupes [05:08:48] yeah consumer would be easier. but data is there just in case [05:09:12] for a month (binlog expiry period) [05:09:30] springle: how would you check for duplicates efficiently? [05:09:43] select id, uuid from log.$tbl where $range group by uuid having (count(*) > 1) [05:10:24] i figured the query planner would do something horrid with that [05:10:44] it's certainly slow [05:11:58] do you still have the dump file? [05:12:41] yes [05:14:30] the query isn't too bad. moving through the M* tables now [05:14:50] really? and no dupes so far? [05:15:05] not so far [05:15:08] :) [05:16:42] thanks again sean. this turned out to be a bigger job than i anticipated by an order of magnitude or two [05:17:50] no worries. i should have double and triple checked the unique index assumptions and saved us the effort [05:18:02] so i want to resolve it as much as you do :) [05:22:52] ori: out of interest, why VARCHAR(191) everywhere? something to do with utf8? [05:23:47] #: Maximum length for string and string-like types. 
Because InnoDB limits index [05:23:47] #: columns to 767 bytes, the maximum length for a utf8mb4 column (which [05:23:47] #: reserves up to four bytes per character) is 191 (191 * 4 = 764). [05:23:49] STRING_MAX_LEN = 191 [05:24:01] aha utf8mb4 [05:24:21] cool [05:25:57] !log Manually removed a few 10000s of duplicate Cyberbot job duplicates [05:26:04] Logged the message, Master [05:26:10] ori: just fyi, varchar(191) for the things that will always be shorter, like the uuid's, would mean mysql/mariadb allocate any static buffers handling those fields at 191 * charset bytes [05:27:39] uuid should be CHAR(32) [05:27:42] for large result or working sets, where dynamic rows become fixed-length arrays in memory, it can help to be economical [05:28:24] uuid varies. varchar(191). another varchar(255)... etc [05:28:36] older tables i guess [05:28:47] different versions of sqlalchemy, i think [05:31:08] the column definition has always been: sqlalchemy.Column('uuid', sqlalchemy.CHAR(32), index=True) [05:31:59] but sqlalchemy.CHAR(32) is cast to the appropriate SQL based on the underlying database engine and its configuration [05:33:04] that is a bit odd, though [05:33:27] wonder how they determine that. 191 makes sense except the tables aren't even using utf8mb4. and you've explicitly told it there will be 32 chars [05:38:46] hmmmmmmm! i think i know what might be going on [05:38:55] i need to do a quick test to verify [05:41:34] ori: schema changes have begun, walking tables in alpha order. i'm modifying uuid to char(32), dropping the old key, and adding the unique key [05:41:48] there are other fields that look like they could be smaller [05:41:52] clientIp [05:42:01] wiki [05:42:03] etc [05:42:12] not without making the implementation substantially more complex [05:42:18] these fields are defined in [05:43:14] ah, the joys of automating schema creation :) [05:43:14] and thus the table schema is generated by mapping the event schema and the capsule schema from JSON Schema to SQL [05:43:54] and in fact this is why uuid is varchar(191) [05:44:12] the CHAR(32) column definition is clobbered by the one defined in the event capsule [05:45:03] sqlalchemy is quietly handling the fact that there are two columns with name 'uuid' in the schema def by choosing the latter when it comes time to generate the create table statement [05:45:59] which successfully reduces development time at the expense of runtime [05:46:11] familar story :) [05:47:40] ORMs are awful [05:48:15] actually databases are awful [05:48:47] making humans think about field sizes, optimization, types [05:48:58] data is awful; 1.69491525424% of it exceeds critical threshold [05:48:58] springle: any ETA when those "Too many connections" errors will stop happening (e.g. more slaves)? :) [05:49:02] everything should be a blob! [05:49:36] * AaronSchulz should add some more job runners in for refreshLinks [05:49:38] AaronSchulz: just waiting on cmjohnson getting back to eqiad. we have 10 new 160G slaves waiting to rack [05:56:47] that was a fascinating backread [05:57:17] (about varchar(191) etc) [05:59:59] <_joe_> ori, springle lost in sqlalchemy hell? 
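A minimal sketch of what the uuid backread above boils down to: the 191-character ceiling that utf8mb4 forces on indexed string columns, and the CHAR(32) NOT NULL, uniquely-indexed uuid column that https://gerrit.wikimedia.org/r/#/c/130290/ is meant to restore. The table name and the extra 'wiki' column below are illustrative assumptions, not the real EventLogging schema code.

    # Hedged sketch, not the actual EventLogging code: the uuid column as it is
    # *meant* to come out once gerrit change 130290 lands.
    from sqlalchemy import CHAR, Column, MetaData, String, Table
    from sqlalchemy.schema import CreateTable

    # InnoDB caps index columns at 767 bytes; utf8mb4 reserves 4 bytes per
    # character, so 191 is the largest safe length (191 * 4 = 764 <= 767).
    STRING_MAX_LEN = 191

    metadata = MetaData()
    table = Table(
        'Test_1234', metadata,                   # hypothetical schema_revision table
        Column('uuid', CHAR(32), nullable=False, unique=True, index=True),
        Column('wiki', String(STRING_MAX_LEN)),  # capsule-style string column
    )

    print(CreateTable(table))                    # uuid renders as CHAR(32) NOT NULL

When the event-capsule mapping also contributes a 'uuid' column defined as String(STRING_MAX_LEN), SQLAlchemy keeps the later definition when the table object is assembled -- which is the quiet clobbering described above, and why the deployed tables showed VARCHAR(191) until the capsule definition was corrected.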
[06:01:02] hehe [06:01:18] nah, i think i have an elegant solution cooked up [06:01:36] <_joe_> sqlalchemy as any other ORM I ever tried to use is a real PITA when you gotta get fancy [06:04:08] i think django's ORM is a classic example of that [06:04:35] <_joe_> ori: regarding the message from check_graphite, I'd like to keep the text but I'll pretty-format the percentage [06:05:52] the packaging says "you don't have to worry about nasty queries, just write expressive python!", the reality is https://docs.djangoproject.com/en/dev/topics/db/queries/#complex-lookups-with-q-objects [06:06:50] _joe_: it's really obscure. what does it mean for a 1.7% of data to exceed the critical threshold? [06:08:30] WARNING: The Varnishes are serving more error responses than usual. [06:08:44] CRITICAL: The Varnishes are serving a *lot* more error responses than usual. [06:08:49] what more do you need? [06:11:44] <_joe_> ori: it's saying what the condition is - just that [06:12:03] <_joe_> ori: brb [06:20:13] 1.7% could be a *lot*, and the message does mention "critical" [06:20:35] drop the percentage and s/critical/CRITICAL/ and the perception will be entirely different [06:30:31] PROBLEM - Puppet freshness on rhenium is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:36 2014 [06:33:13] (03PS1) 10ArielGlenn: motd updates must be sh wrappers [operations/puppet] - 10https://gerrit.wikimedia.org/r/130289 [06:37:17] (03CR) 10ArielGlenn: [C: 032] motd updates must be sh wrappers [operations/puppet] - 10https://gerrit.wikimedia.org/r/130289 (owner: 10ArielGlenn) [06:42:27] (03PS1) 10Withoutaname: Include language-0 categories for betawikiversity [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130291 (https://bugzilla.wikimedia.org/64168) [06:43:35] https://gerrit.wikimedia.org/r/#/c/130290/ should make new tables have CHAR(32) NOT NULL UUID columns. i'd like to test it by deploying it and triggering an event for a new schema revision. ok with you? [06:47:41] RECOVERY - Check status of defined EventLogging jobs on vanadium is OK: OK: All defined EventLogging jobs are runnning. [06:48:44] that was @ springle [06:52:36] ori: go ahead [06:57:41] springle: try "show create table Test_8327132;" :) [06:58:06] nice :) [07:00:00] ori: so it seems there were duplicates in (at least) CentralAuth_5690875, but they look older than the migration date. am re-running that dupes query with older time range just out of interest [07:00:20] but they'll get mopped up regardless of date by the conversion to unique key [07:00:25] oh, yeah, you'll find dupes [07:00:34] i didn't realize earlier that you were only running it for the migration period [07:00:52] yes sorry. 
that was $range, but not clear [07:01:20] CentralAuth_5690875 has ~4M dupes [07:01:24] had, rather [07:01:24] as i said earlier, at some point a few months ago db1047 was out for a while and i imported from the file logs, naively expecting the uuid constraint i thought was there to take care of it [07:01:32] :) [07:01:38] (03PS3) 10Tim Landscheidt: Tools: Alias tools.wmflabs.org to internal webproxy [operations/puppet] - 10https://gerrit.wikimedia.org/r/123149 (https://bugzilla.wikimedia.org/54052) [07:01:51] ah ok, i must not have been listening [07:02:24] in that case, i've just proven you did exactly what you already told me you said you did :) [07:02:45] at some point in the past i transitioned imperceptibly from: "the uuid column could have a unique constraint, which would make data recovery and integrity checks really is" [07:02:53] to: "the uuid column has a unique constraint, which makes data recovery and integrity checks really is" [07:02:57] s/is/easy [07:03:36] Krinkle: do the makesysop etc. pages do some specific harm? [07:08:52] ori: found some NULL uuid rows in Mobile* tables. kill them? [07:09:04] not many. 10's per table [07:09:14] yes please [07:09:31] ok [07:14:17] (03CR) 10Tim Landscheidt: "At first I was baffled why testing on Toolsbeta failed until I realized that there was no toolsbeta-webproxy.eqiad.wmflabs :-). Good to g" [operations/puppet] - 10https://gerrit.wikimedia.org/r/123149 (https://bugzilla.wikimedia.org/54052) (owner: 10Tim Landscheidt) [07:24:10] <_joe_> matanya: ping [07:24:21] _joe_: pong [07:25:49] <_joe_> matanya: we should get back to work on the puppet3 migration; I was looking at http://etherpad.wikimedia.org/p/Puppet3 and I'm going to tackle the $cluster fiasco (which accounts for ~ 90% of the remaining differences) [07:26:05] thank you [07:26:11] <_joe_> matanya: we have like *a ton* of templates to fix, though [07:26:21] most of the stuff there needs second look [07:26:33] didn't work on it for some time [07:27:14] <_joe_> matanya: <%= an_array %> becomes "a b c d e" in puppet 2.7 and 'abcde' in puppet 3.x, we should use <%= @an_array %> [07:27:32] <_joe_> (and I'm still not sure that works exactly as expected) [07:27:53] yes, that was about 90% of my fixes so far [07:28:11] <_joe_> matanya: ok so, first I resolve this '$cluster' fiasco [07:28:36] <_joe_> matanya: you can just look at the compilation warnings on http://puppet-transition-helper.wmflabs.org/html/ [07:28:46] <_joe_> and start ironing out some of them :) [07:29:00] i can't _joe_ I get 500 [07:29:07] <_joe_> matanya: what is missing of the things in etherpad? [07:29:19] <_joe_> matanya: I told you, do not look at the ones with compilation errors [07:29:33] oh, right [07:29:42] <_joe_> that should be a 404, but I configured nginx badly, feel free to correct it :) [07:29:51] <_joe_> matanya: I'll add you to the labs project [07:29:59] thanks [07:30:08] <_joe_> what is your username there? [07:30:20] from what i see the etherpad is update (mostly) [07:30:22] matanya [07:30:40] <_joe_> project puppet3-diffs [07:31:19] <_joe_> there is just one instance, the interesting things are under /vagrant there (meh) [07:31:21] _joe_: that's pretty nifty [07:31:45] the puppet-transition-helper, i mean [07:31:50] <_joe_> ori: the CSS and the html part in general is pretty ugly [07:32:52] <_joe_> ori: I've embarked in the task of migrating a large, mission critical codebase I know little about between very different versions of a language... 
If we don't have some testing, it's foolish to do [07:33:23] <_joe_> ori: still, there are things that are not catched by our diff tool, like undeclared dependencies [07:34:56] (03CR) 10Gilles: [C: 031] Enable MediaViewer survey on Spanish Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130234 (owner: 10Gergő Tisza) [07:35:43] _joe_: i'll hop in, thank you. afk for a bit [07:36:25] _joe_: *nod* hosting it on a page on a labs vhost is just a nice intuition, makes it easier to check [07:37:19] <_joe_> matanya: no problem, any time you can help, it's a pleasure :) [07:38:10] <_joe_> ori: well, my original grand plan included a web API that could be invoked from jenkins, but given the number of errors I'll stop here probably [07:38:35] <_joe_> and I will do that work so that people can just look at changes in the catalogs due to their own commits [07:38:59] <_joe_> (which is what the differ that akosiaris wrote did) [07:49:40] akosiaris: thanks for the jsduck package update! [07:49:50] :-) [07:50:01] I am doing a final round on python-gear btw [07:50:03] i [07:50:14] I will probably upload it soonish [07:50:18] does it need any specific tweaks? [07:50:24] if so you might want to push then to the svn repo [07:50:35] (the debian svn shared repo used by the python module team) [07:51:34] when I build it on labs, I had to pass 'nocheck' in the debuild option and removed a bunch of build dependencies that do not exist in Precise / apt.wm.o [07:51:40] like python-testrepository [07:51:45] I have not noticed something yet, just rebuilding it to make sure [07:52:00] you will see two lintian errors iirc [07:52:06] one is a missing manpage (none upstream) [07:52:27] the other is a pedantic error to depends on xz compression package for support with some old Debian/Ubuntu version [07:52:38] <_joe_> hashar: ah! create the classical fake-debian-manpage-so-that-lintian-doesnt-complain [07:52:50] * hashar blames _joe_ :D [07:52:59] <_joe_> oldest trick of the bag :P [07:53:12] _joe_: upstream uses sphinx for documentation, so potentially we could generate the manpage out of sphinx [07:53:46] <_joe_> hashar: hm the manpage will suck then with a 99% certainty, but give it a try [07:54:17] html2man ? [07:54:36] <_joe_> uh never used it [07:54:39] btw the xz compression thing is new, not old [07:55:09] well, define new but anyway.. [07:55:18] _joe_: it works... badly... [07:56:17] sphinx has building support for man pages [07:56:52] <_joe_> hashar: yes I know and I tried with little success in the past [07:56:57] <_joe_> but YMMV [07:57:17] though the generated manpage is not that helpful :/ [07:57:23] that is the whole package doc [07:57:39] maybe I will fix it one day :] [07:58:29] anyway the most important thing right now is to upgrade that gear python module [07:58:57] it has a few flawed conditionals that are most probably the root cause of jobs stalling completely from time to time :) [08:06:08] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "The code needs some corrections; also I think we have an occasion to refactor poor choices we made precedingly." (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/118966 (owner: 10Matanya) [08:29:29] (03CR) 10Alexandros Kosiaris: [C: 032] Publish carbon's IPv6 address in DNS [operations/dns] - 10https://gerrit.wikimedia.org/r/130087 (owner: 10Alexandros Kosiaris) [08:30:39] !log Published carbon's IPv6 address in DNS. 
apt.wikimedia.org and ubuntu.wikimedia.org are now IPv6 enabled [08:30:46] Logged the message, Master [08:36:51] https://ganglia.wikimedia.org/latest/?r=month&cs=02%2F01%2F2014+00%3A00+&ce=04%2F27%2F2014+00%3A00+&c=Miscellaneous+eqiad&h=palladium.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=medium&metric_group=ALLGROUPS [08:37:08] I wonder if now is the best time to have puppet run every 20 minutes instead of 30 [08:41:13] <_joe_> akosiaris: did you ever use $caller_module_name in puppet? [08:41:29] I don't think so [08:43:02] what do you want to do with it ? [08:43:11] <_joe_> I was about to use it to resolve the whole $cluster fiasco. It's a way to determine the value of a parameter depending on the caller module [08:43:57] well not everything is a module unfortunately [08:44:33] so unless the variable name is misleading I ain't sure this is going to work in all cases [08:44:42] <_joe_> akosiaris: the problem is that in some cases we declare $cluster at node-level, while in other cases we declare it within a role; of course this does not work in puppet3 [08:44:59] aahhh the beauty.. [08:45:32] so, $cluster is a global variable for almost all intents and purposes [08:46:00] <_joe_> akosiaris: exactly. Which you can't use in puppet 3 [08:46:12] ? [08:46:36] <_joe_> akosiaris: global variables cannot be modified in inner scopes and be visible to outer scopes [08:47:15] they can not be modified at all you mean [08:47:31] <_joe_> so, if you set $cluster within role::applicationserver, it will be usable outside of it only as $::role::applicationserver::cluster [08:47:52] <_joe_> the top-leve $cluster will remain unchanged [08:48:01] yes obviously [08:48:10] but what if you set $::cluster ? [08:48:10] <_joe_> this is sane btw, but we exploited this abuse of globals :) [08:48:15] <_joe_> you can't [08:48:23] only if it is already defined [08:48:28] <_joe_> you can't set $::cluster within a class [08:48:42] <_joe_> akosiaris: hmmm not sure, let me check [08:48:46] I think you can [08:48:59] but only if something else has not already done so [08:49:25] <_joe_> let me test it [08:50:43] <_joe_> Error: Could not parse for environment production: Cannot assign to variables in other namespaces at /home/joe/prova.pp:7 [08:50:49] <_joe_> akosiaris: you can't :/ [08:52:54] :-( [08:53:49] thought not [08:54:03] otherwise everyone would be solving their puppet3 issues this way [08:54:24] also: "morning" [08:56:47] yeah, this is not allowed even in puppet 2, I don't know what I was thinking [08:57:10] probably hoping for the unachievable [08:57:50] <_joe_> OTOH, what I wanted to do won't work in our puppet setup, I guess [08:57:59] <_joe_> but lemme try [09:00:14] (03PS1) 10ArielGlenn: keep two weeks of apache logs instead of a year [operations/puppet] - 10https://gerrit.wikimedia.org/r/130296 [09:11:21] paravoid, fwiw, I was just able to push wiki rc feeds to xmpp. it /might/ be able to replace the irc stuff, with some work [09:12:21] Krinkle|detached: ^ [09:22:06] matanya: https://www.mediawiki.org/wiki/Requests_for_comment/Publishing_the_RecentChanges_feed [09:22:58] (03CR) 10Mark Bergsma: "You can remove the .ex example files - if you're not using them, no point in checking them in." 
(031 comment) [operations/debs/ircd-ratbox] (debian) - 10https://gerrit.wikimedia.org/r/130145 (owner: 10Rush) [09:23:32] thanks legoktm i'm talking about current situation [09:24:36] that documents the current situation pretty well I think :P [09:29:04] (03PS1) 10Alexandros Kosiaris: Allow wikimedia's IPv6 space through apt proxy [operations/puppet] - 10https://gerrit.wikimedia.org/r/130302 [09:29:44] legoktm: not if matanya is doing some secret work which that page doesn't mention :P [09:29:57] :D [09:30:45] (03CR) 10Alexandros Kosiaris: [C: 032] Allow wikimedia's IPv6 space through apt proxy [operations/puppet] - 10https://gerrit.wikimedia.org/r/130302 (owner: 10Alexandros Kosiaris) [09:30:46] Nemo_bis / legoktm do you have xmpp you can test what i have done? [09:31:31] PROBLEM - Puppet freshness on rhenium is CRITICAL: Last successful Puppet run was Tue Apr 29 00:28:36 2014 [09:31:32] matanya: how did you set up the xmpp feeds? is it embedding XML? [09:31:37] yes [09:31:43] sweet :D [09:31:58] it be more accurate: rss -- > xmpp gateway [09:32:02] *if [09:32:07] err, why rss? [09:32:25] because wiki provides the rc feed in rss [09:32:48] it can be changed easily [09:33:01] this was just fastest way to do a poc [09:33:45] so legoktm do you have any xmpp client i can show you the work? [09:34:02] sure, legoktm@gmail.com should work [09:34:42] so please add mouffette@foss.co.il as your friend [09:35:45] matanya: done [09:35:54] open a chat with him [09:36:21] :OOO [09:36:24] this is cool [09:36:34] you see the feed? [09:38:04] yeaaaah [09:38:24] cool stuff [09:38:38] but for recentchanges, RSS won't work [09:39:02] the backend is already implemented in mediawiki, we just need an ejabbered instance and public url/hostname for it [09:39:13] this is recentchanges [09:39:31] en.wiki RC [09:39:39] yes, but confusingly, IRC recentchanges is different from on-wiki recentchanges [09:40:13] different as in ? [09:40:41] some things like abusefilter hits will go to IRC feed, but they don't show up in Special:RecentChanges, so they won't show up in the RSS feed [09:40:51] ah [09:41:01] well, as a POC it is still nice :) [09:41:12] the status quo is that mediawiki emits a UDP packet, which contains a formatted IRC line. there's a python script which sends the udp packet to the IRC server [09:42:07] ah, so this is why RTRC doesn't show so,e stuff, i guess [09:42:10] we can change the formatter to emit XML instead of an IRC line, and then need whatever receives the UDP packet to send it to ejabbered instead [09:45:25] you didn't get meta, did you legoktm ? [09:45:49] huh? I turned the feed on, watched a few edits come in, then disabled it [09:46:38] oh, ok [09:46:48] i saw an error in the console [09:47:01] it didn't flush all the edits [09:47:12] hashar: you might be intersted too :) [09:48:39] matanya: not really :-] [09:48:41] it is a huge mess [09:49:01] yeah, well :) [09:49:02] someone might want to lead the effort, write down an RFC describing the current infrastructure and what is meant for [09:49:20] then list all the issues we have with it and list requirements/nice things to have etc [09:49:25] then propose out some new architecture [09:50:01] I would use some message bus like EventLogging, write a subscriber that then emits json events to some web socket or something like that [09:50:38] there is bunch that needs to be rethink. 
Having MediaWiki emit UDP and crafting formatted IRC messages is not very robust [09:50:50] last time it broke it caused me both headaches and nightmares [09:51:42] yeah, i totally see your point [09:59:55] !log update python-gear on apt.wikimedia.org to 0.5.4-1 [10:00:02] Logged the message, Master [10:08:50] (03PS1) 10Alexandros Kosiaris: Add the non SLAAC IPv6 addresses for bastions [operations/puppet] - 10https://gerrit.wikimedia.org/r/130304 [10:14:21] !log Jenkins / Zuul : upgrading python-gear from 0.4.0-1 to 0.5.4-1 . Should fix a bunch of jobs registrations issues in Zuul Gearman. {{bug|63758}} [10:14:27] Logged the message, Master [10:14:52] bah that is bug 63760 [10:20:15] !log restarting Zuul [10:20:22] Logged the message, Master [10:22:35] pfff [10:22:44] 10 seconds to do a git remote update of mediawiki/core :-( [10:23:29] akosiaris: I have upgraded Gearman and restarted Zuul. We will see how well it goes :] Thank you very much [10:23:40] :-) [10:23:51] akosiaris: my turn :) [10:23:55] (03CR) 10Dzahn: [C: 04-2] "already done in I49c36df0a and i liked the one that is merged better, +1 for having the admin group, that one has an issue though, lookin" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130231 (owner: 10Jkrauska) [10:24:07] hashar: btw, you have sent an email for safe restart of jenkins [10:24:18] zuul restart ? is there some docs that I have missed ? [10:24:19] can we proceed with ferm on antinomy ? [10:24:36] matanya: I think so [10:24:52] (03CR) 10Dzahn: "looked good, but: err: Failed to apply catalog: Could not find dependency Group[500] for User[jkrauska] at /etc/puppet/manifests/admins.pp" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130241 (owner: 10Jkrauska) [10:24:55] akosiaris: Zuul connects to Gerrit over ssh and received events from Gerrit. It then takes some decisions and starts functions in an embedded Gearman Server [10:25:13] akosiaris: Jenkins is a Gearman client which exposes to Zuul Gearman servers the functions it can runs (i.e. jobs). [10:25:15] this has a possibility of breaking gerrit so I am kind of hesitant, but it is a good time now [10:25:23] so they are tightly coupled but can be restarted independently. [10:26:04] so zuul basically triggers jenkins jobs [10:26:24] yup via Gearman [10:26:45] (so we could even write our own Gearman client that would executes something outside of Jenkins) [10:27:00] https://www.mediawiki.org/wiki/Continuous_integration/Zuul [10:27:05] yeah that is the page [10:27:08] ah ma bad [10:27:12] I should have checked more [10:27:19] I am missing a 30000 feet overview of the architecture though [10:27:56] niah, you do have quite a lot [10:27:57] ant git.wikimedia.org give 500 for days [10:28:29] ? gitblit ? [10:28:31] poked ^demon|away a few times, maybe time for a ticket :_ [10:28:33] yes [10:28:34] let me check [10:31:31] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Temporarily Unavailable - 516 bytes in 0.014 second response time [10:31:44] hmm [10:32:31] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 53363 bytes in 0.491 second response time [10:33:11] (03CR) 10Alexandros Kosiaris: [C: 032] Add the non SLAAC IPv6 addresses for bastions [operations/puppet] - 10https://gerrit.wikimedia.org/r/130304 (owner: 10Alexandros Kosiaris) [10:34:03] thanks akosiaris, and the next question on this, does gitblit need access from outside or only via varnish? 
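Picking the recent-changes feed thread back up from earlier in the backlog (MediaWiki emits a UDP datagram carrying a pre-formatted IRC line, and a small Python script relays it to the IRC server): below is a minimal, hedged sketch of the kind of consumer being proposed -- take the same datagrams but hand structured JSON to whatever replaces IRC (XMPP, a websocket, a message bus). The port, the assumed channel<TAB>message datagram layout and the JSON field names are illustrative guesses, not the production configuration.

    # Hedged sketch only -- not the production relay script. Listen for the UDP
    # datagrams MediaWiki already sends and republish each one as a JSON event.
    import json
    import socket

    LISTEN_ADDR = ('0.0.0.0', 9390)   # port is an assumption, not the real one

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(LISTEN_ADDR)

    while True:
        data, (src_ip, _src_port) = sock.recvfrom(65535)
        line = data.decode('utf-8', errors='replace').rstrip('\n')
        # Assume the datagram holds "channel<TAB>formatted line", as the current
        # IRC relay expects; adjust if the actual feed format differs.
        channel, _, message = line.partition('\t')
        event = {'channel': channel, 'message': message, 'source': src_ip}
        print(json.dumps(event))      # stand-in for pushing to XMPP/websocket/bus

The point of the sketch is only that nothing on the MediaWiki side has to change in order to prototype a new transport; teaching the formatter to emit JSON (or XML) natively, as discussed above, would then let the relay drop the tab-splitting entirely.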
[10:34:10] !log restarted gitblit on antimony [10:34:17] Logged the message, Master [10:34:31] this software seems to be even worse quality than most java software [10:35:16] so the idea is for gitlit to not be accessible to the public [10:35:58] ok, so pushing antimony ferm patch. seems all needed rules are in place. still waiting fir review for archive/titanuim [10:38:07] (03PS1) 10Matanya: antimony: add firewall [operations/puppet] - 10https://gerrit.wikimedia.org/r/130306 [10:40:59] apergos: releases.wikimedia.org is not served over 443 ? [10:41:53] it could be, the misc web cluster could handle that piece [10:42:00] well, should [10:42:24] seems like it doesn't [10:42:34] https://releases.wikimedia.org/ [10:43:50] _joe_: regarding https://gerrit.wikimedia.org/r/#/c/118966/ i fear i didn't get your question [10:46:04] <_joe_> matanya: 1 sec [10:47:24] <_joe_> matanya: the array you cycle on should be called with the @ in front [10:47:55] why, the var is defined a line above [10:48:06] it is not coming from a manifest [10:48:27] matanya: xml/jabber won't be an option for what we need however [10:48:30] <_joe_> proxy_addresses, the array is coming from the manifest [10:48:49] yeah, i guessed so Krinkle [10:49:03] <_joe_> matanya: proxy_addresses[site].each do <--- @proxy_addresses[site].each do [10:49:20] oh, om line 33, nit 34 [10:49:23] *not [10:49:36] <_joe_> matanya: yes sorry [10:49:51] <_joe_> (also, on line 34 there's the second issue) [10:50:16] got really confused. ok, i got it now. [10:50:20] fixing [10:53:40] (03PS1) 10Dzahn: add wikidev group to admins::pmacct [operations/puppet] - 10https://gerrit.wikimedia.org/r/130309 [10:53:50] _joe_: iirc, there is no need for @ at the begining of the second one, i call it with a fqdn [10:54:01] so role::protoproxy::enable_ipv6_proxy will be ok. [10:55:32] (03PS2) 10Dzahn: add wikidev group to admins::pmacct [operations/puppet] - 10https://gerrit.wikimedia.org/r/130309 [10:55:47] (03PS3) 10Dzahn: add wikidev group to admins::pmacct [operations/puppet] - 10https://gerrit.wikimedia.org/r/130309 [10:59:44] (03CR) 10Dzahn: [C: 032] "follow-up to I49c36df0a" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130309 (owner: 10Dzahn) [11:00:23] (03PS1) 10Alexandros Kosiaris: Introduce role::puppetmaster classes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130310 [11:01:01] RECOVERY - Puppet freshness on rhenium is OK: puppet ran at Tue Apr 29 11:00:51 UTC 2014 [11:01:15] (03CR) 10Dzahn: "needed Ibddb80200b" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130241 (owner: 10Jkrauska) [11:02:07] (03PS2) 10Matanya: appserver: no more hardy boxes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130042 [11:02:16] (03CR) 10Matanya: "same as : https://gerrit.wikimedia.org/r/#/c/118240/" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130042 (owner: 10Matanya) [11:06:53] (03PS6) 10Matanya: protoproxy: call enable_ipv6_proxy in a sane way [operations/puppet] - 10https://gerrit.wikimedia.org/r/118966 [11:06:55] (03Abandoned) 10Dzahn: Planet update [operations/puppet] - 10https://gerrit.wikimedia.org/r/80760 (owner: 10Dereckson) [11:07:50] mutante: what ports does the planet need open? 80? anything else? 
[11:08:06] (03CR) 10Dzahn: "duplicate of I8c43a78bc5de which was abandoned with "should be part of app server"" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130113 (owner: 10Dzahn) [11:08:23] (03PS2) 10Odder: Additional two Swiss domains to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130070 (https://bugzilla.wikimedia.org/64536) [11:08:32] (03CR) 10Dzahn: "tried again in Change-Id: If2bb0882408 , but abandon as well" [operations/puppet] - 10https://gerrit.wikimedia.org/r/95440 (owner: 10Matanya) [11:09:18] matanya: 443 [11:09:58] that's all, ok, that would be easy. what about rackatables and rt? same? [11:10:12] i think so, yea [11:10:23] neat [11:10:51] well, planet also needs to fetch data from remote webservers, but we are not talking about outgoing [11:12:38] (03Abandoned) 10Dzahn: WIP - turn imagescaler into a module [operations/puppet] - 10https://gerrit.wikimedia.org/r/130113 (owner: 10Dzahn) [11:13:44] (03PS1) 10Matanya: planet: add ferm rules [operations/puppet] - 10https://gerrit.wikimedia.org/r/130311 [11:17:23] (03PS1) 10Matanya: rt: add ferm rules [operations/puppet] - 10https://gerrit.wikimedia.org/r/130312 [11:19:01] (03PS1) 10Matanya: racktables: add ferm rules [operations/puppet] - 10https://gerrit.wikimedia.org/r/130313 [11:19:35] (03PS2) 10Matanya: rt: add ferm rules [operations/puppet] - 10https://gerrit.wikimedia.org/r/130312 [11:20:11] (03PS2) 10Matanya: planet: add ferm rules [operations/puppet] - 10https://gerrit.wikimedia.org/r/130311 [11:22:30] matanya: is it a problem if those things are duplicated on a node? [11:22:50] no [11:22:54] good [11:23:06] unless they have the same name [11:23:17] which is why you ask i guess :/ [11:23:50] i'm asking because f.e. zirconium has the planet and the bugzilla role and more [11:23:56] being the misc. webserver host [11:24:06] so if you add a ferm::service { 'http': into all of them ... [11:24:19] they might be duplicates [11:24:36] i seriously think of adding those two as a defined type to ferm [11:24:46] i wanted to hear input for akosiaris [11:24:57] you _could_ say, that the actual "role" should be "misc-webserver" [11:25:05] which has the ferm rules for 80/443 [11:25:23] yes, akosiaris :) [11:25:44] that is another level deeper :) [11:25:47] (03PS1) 10BBlack: Revert "Unset $wgUseXVO" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130315 [11:26:01] ? [11:26:09] i spoke with paravoid about this in the past, i think. 
[11:26:26] akosiaris: say we add a "ferm::service { 'http':" into role foo and also into role bar [11:26:30] (03PS2) 10BBlack: Revert "Unset $wgUseXVO" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130315 [11:26:35] duplicates [11:26:38] akosiaris: and then we applied foo and bar, both on a single node [11:26:57] so do ferm::service { '-http': [11:27:03] ack, so is the actual "role" more like "misc-webserver" [11:27:10] and that should have the ferm rules once [11:27:15] (03CR) 10BBlack: [C: 032 V: 032] Revert "Unset $wgUseXVO" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130315 (owner: 10BBlack) [11:27:19] and then include the services [11:27:25] or just give them different names [11:27:34] magnesium is going to show this issue [11:27:56] so as long as the names are different, we dont care if a ferm rule is applied twice [11:27:58] (03PS1) 10Mark Bergsma: Revert "Unset $wgUseXVO" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130316 [11:28:04] mutante: exactly [11:28:08] so magnesium example is [11:28:10] ok, good [11:28:22] ferm::service {'rt-http': blablah [11:28:24] matanya: then let's just fix the names, and don't call them just "http" [11:28:28] nods [11:28:35] ferm::service { 'racktables-http': and so on [11:28:44] yep, thx [11:28:48] :-) [11:28:55] (03Abandoned) 10Mark Bergsma: Revert "Unset $wgUseXVO" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130316 (owner: 10Mark Bergsma) [11:29:27] a good solution would be to define it on the level of the task the server is assigned to, be yes, changing [11:29:30] *but [11:29:50] that is what i meant by "misc-webserver" being the actual role of the node [11:30:25] oh [11:30:33] you are right as usual [11:30:59] !log bblack synchronized wmf-config/InitialiseSettings.php 'Revert "Unset $wgUseXVO"' [11:31:01] (03PS2) 10Matanya: racktables: add ferm rules [operations/puppet] - 10https://gerrit.wikimedia.org/r/130313 [11:31:02] so it could do the ferm rules and then include planet and bugzilla and ... [11:31:07] Logged the message, Master [11:31:08] but i'm fine keeping it simple for now [11:31:14] follows akosiaris :) [11:31:15] actully [11:31:23] i have a better solution now [11:31:40] i'll push it and let me know if you thing it is a better one [11:31:47] ok [11:35:16] (03PS1) 10Matanya: magnesium: misc-webserver-ferm rules and firewall [operations/puppet] - 10https://gerrit.wikimedia.org/r/130317 [11:35:35] akosiaris , mutante ^ how about this approach ? [11:36:13] (03CR) 10jenkins-bot: [V: 04-1] magnesium: misc-webserver-ferm rules and firewall [operations/puppet] - 10https://gerrit.wikimedia.org/r/130317 (owner: 10Matanya) [11:36:29] mutante do you have root to varnish servers? I need to get the password for the zero acct tha bblack created [11:38:04] yeah that's surely not raising any flags like that ;-) [11:38:24] matanya: well apart from the syntax error, this will work, but it is not clear why those rules are there. Does racktables need them? does rt need them ? does something else entirely that is unpuppetized ? [11:38:36] mutante, never mind, bblack is here [11:38:47] yurikR: yep, just pinged [11:39:06] akosiaris: i'll add comments to explain this [11:39:08] matanya: i think i don't like it directly on the node in site.pp [11:40:32] so which is better? [11:41:15] the role approach is clearer IMHO. 
it also keeps site.pp uncluttered and makes moving/adding roles to other machines easier [11:41:38] matanya: i vote for keeping site.pp uncluttered and just changing the resource names for now to avoid the duplicates [11:41:54] ok, back to design board, will push the fixes. thanks for the input! [11:42:32] (03Abandoned) 10Matanya: magnesium: misc-webserver-ferm rules and firewall [operations/puppet] - 10https://gerrit.wikimedia.org/r/130317 (owner: 10Matanya) [11:44:03] (03PS3) 10Matanya: rt: add ferm rules [operations/puppet] - 10https://gerrit.wikimedia.org/r/130312 [11:44:39] hmm i see changes in 118966, retesting [11:47:12] (03PS3) 10Matanya: planet: add ferm rules [operations/puppet] - 10https://gerrit.wikimedia.org/r/130311 [11:50:56] (03PS1) 10BBlack: Revert "Set domain to TLD on GeoIP cookie" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130321 [11:51:07] (03PS2) 10BBlack: Revert "Set domain to TLD on GeoIP cookie" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130321 [11:51:29] (03PS1) 10Matanya: magnesium: add firewall [operations/puppet] - 10https://gerrit.wikimedia.org/r/130322 [11:51:59] (03CR) 10BBlack: [C: 032 V: 032] Revert "Set domain to TLD on GeoIP cookie" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130321 (owner: 10BBlack) [11:52:08] (03CR) 10Matanya: [C: 04-1] "only after https://gerrit.wikimedia.org/r/130312 and https://gerrit.wikimedia.org/r/130313" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130322 (owner: 10Matanya) [11:54:12] (03CR) 10Alexandros Kosiaris: [C: 04-1] applicationserver: lint and tidy (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/122269 (owner: 10Matanya) [11:58:46] (03PS4) 10Matanya: applicationserver: lint and tidy [operations/puppet] - 10https://gerrit.wikimedia.org/r/122269 [12:04:01] (03CR) 10Dzahn: "ah, i had already added mariadb repos for labs back in the test branch, including the keys. unfortunately got lost in the test->prod merge" [operations/puppet] - 10https://gerrit.wikimedia.org/r/123882 (owner: 10Tim Landscheidt) [12:07:59] !log Running deleteEqualMessages.php on cswiktionary (bug 43917) [12:08:06] Logged the message, Master [12:13:21] are there any issues with running this on tin? mwscript changePassword.php zerowiki --user=JohnSmith --password=Secret12 [12:13:38] (need to change the password of a private wiki's acct) [12:26:28] well, other than the fact that you're storing a password in bash history (and I don't know where after that)? [12:35:35] !log Running deleteEqualMessages.php on cswikiversity (bug 43917) [12:35:42] Logged the message, Master [12:36:00] yurikR: bblack: Leading space in bash should exclude it from history right? [12:37:06] Krinkle, bblack, if someone can access shell, we have much bigger problems than some silly zerowiki account :) [12:42:19] Krinkle: yurikR: depends on the local setup of the bashrcs ... 
ignorespace needs to be set for that to work [13:10:08] !log Running deleteEqualMessages.php on nlwiki (bug 43917) [13:10:16] Logged the message, Master [13:12:29] (03PS1) 10Jgreen: add 'rebuild' mode to otrs mail exporter, for use when retraining Bayes database [operations/puppet] - 10https://gerrit.wikimedia.org/r/130329 [13:12:45] (03PS2) 10Alexandros Kosiaris: Introduce role::puppetmaster classes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130310 [13:12:47] (03PS1) 10Alexandros Kosiaris: Usage of a weighted loadfactor in puppetmaster [operations/puppet] - 10https://gerrit.wikimedia.org/r/130331 [13:20:50] (03CR) 10Jgreen: [C: 032 V: 031] add 'rebuild' mode to otrs mail exporter, for use when retraining Bayes database [operations/puppet] - 10https://gerrit.wikimedia.org/r/130329 (owner: 10Jgreen) [13:25:23] (03PS1) 10Jgreen: disable Bayes auto-learn for OTRS now that the db is working well [operations/puppet] - 10https://gerrit.wikimedia.org/r/130333 [13:29:51] (03PS1) 10Matanya: bugzilla: add ferm rules [operations/puppet] - 10https://gerrit.wikimedia.org/r/130334 [13:30:03] (03CR) 10Jgreen: [C: 032 V: 031] disable Bayes auto-learn for OTRS now that the db is working well [operations/puppet] - 10https://gerrit.wikimedia.org/r/130333 (owner: 10Jgreen) [13:31:44] (03PS1) 10BBlack: Work around cache explosion from gettingStarted feature [operations/puppet] - 10https://gerrit.wikimedia.org/r/130336 [13:33:26] (03PS2) 10BBlack: Work around cache explosion from gettingStarted feature [operations/puppet] - 10https://gerrit.wikimedia.org/r/130336 [13:34:19] (03CR) 10BBlack: [C: 032 V: 032] Work around cache explosion from gettingStarted feature [operations/puppet] - 10https://gerrit.wikimedia.org/r/130336 (owner: 10BBlack) [13:34:48] (03CR) 10Dzahn: "this opens a larger discussion whether or not people with access to research data should have to go to a special host or do it from the ba" [operations/puppet] - 10https://gerrit.wikimedia.org/r/126027 (owner: 10Hoo man) [13:37:36] (03PS1) 10BBlack: Revert "Revert "Unset $wgUseXVO"" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130341 [13:37:59] (03CR) 10BBlack: [C: 032 V: 032] Revert "Revert "Unset $wgUseXVO"" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130341 (owner: 10BBlack) [13:40:09] (03CR) 10Dzahn: "this is the oldest change that is open and does not have votes. added Coren and springle because "general-puropose mysql database for labs" [operations/puppet] - 10https://gerrit.wikimedia.org/r/74158 (owner: 10coren) [13:40:23] (03PS1) 10BBlack: Revert "Revert "Set domain to TLD on GeoIP cookie"" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130342 [13:40:39] (03PS2) 10BBlack: Revert "Revert "Set domain to TLD on GeoIP cookie"" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130342 [13:40:59] (03CR) 10BBlack: [C: 032 V: 032] Revert "Revert "Set domain to TLD on GeoIP cookie"" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130342 (owner: 10BBlack) [13:41:47] !log bblack synchronized wmf-config/InitialiseSettings.php 'Revert "Unset $wgUseXVO"' [13:45:54] !log Running deleteEqualMessages.php on lnwiki (bug 43917) [13:46:00] Logged the message, Master [13:48:31] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1% data above the threshold [250.0] [13:53:12] <_joe_> bblack: \o/ [14:02:06] Reedy: ping? 
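On the changePassword.php aside above -- the password landing in bash history, and the leading-space trick only helping when the shell is configured with ignorespace -- a general pattern that sidesteps both is to prompt for the secret at runtime instead of passing it as an argument. The sketch below is a generic Python illustration of that pattern, not the MediaWiki maintenance tooling; the wrapped command and its flags are purely hypothetical.

    # Generic sketch: keep a one-off secret out of shell history and `ps` output
    # by prompting for it rather than putting it on the command line.
    import getpass
    import subprocess

    user = input('account to update: ')
    password = getpass.getpass('new password (not echoed): ')

    # Hand the secret to the child process on stdin rather than argv, so it never
    # appears in the invoking shell's history file or in the process list.
    # 'some-password-tool' and '--stdin' are made-up placeholders.
    subprocess.run(
        ['some-password-tool', '--user', user, '--stdin'],
        input=password.encode('utf-8'),
        check=True,
    )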
[14:02:29] I'm going to go deploy the GettingStarted reverts [14:07:22] (03PS2) 10Dzahn: ferm rule for bacula director [operations/puppet] - 10https://gerrit.wikimedia.org/r/96226 (owner: 10Alexandros Kosiaris) [14:07:50] (03PS1) 10Ottomata: Using separate job for eventlogging camus imports [operations/puppet] - 10https://gerrit.wikimedia.org/r/130347 [14:09:50] (03CR) 10Dzahn: "just setting the IP manually then in another variable. the resolve@ Faidon suggested would be http://ferm.foo-projects.org/download/2.0/fe" [operations/puppet] - 10https://gerrit.wikimedia.org/r/96226 (owner: 10Alexandros Kosiaris) [14:10:37] away-afk [14:10:47] eh. [14:14:58] (03CR) 10Ottomata: [C: 032 V: 032] Using separate job for eventlogging camus imports [operations/puppet] - 10https://gerrit.wikimedia.org/r/130347 (owner: 10Ottomata) [14:15:00] when deploying PrivateSettings.php, do i need to simply sync-file it, or are there any other magic steps? [14:15:09] (won't deploy until greg-g gets here) [14:16:22] (03CR) 10Dzahn: "i like Chris' suggestion. and i didn't merge this yet because it also loads the headers module, which might enable the caching (see I8e616" [operations/puppet] - 10https://gerrit.wikimedia.org/r/127256 (owner: 10JanZerebecki) [14:21:55] (03PS1) 10BBlack: fix bug in ae30ae0b [operations/puppet] - 10https://gerrit.wikimedia.org/r/130352 [14:22:09] (03CR) 10BBlack: [C: 032 V: 032] fix bug in ae30ae0b [operations/puppet] - 10https://gerrit.wikimedia.org/r/130352 (owner: 10BBlack) [14:39:27] !log faidon synchronized php-1.24wmf1/extensions/GettingStarted/ 'Revert GettingStarted anon tokens' [14:39:33] apergos: something is wrong with snapshot hosts & mw deployments [14:39:35] Logged the message, Master [14:39:37] snapshot1004: Permission denied (publickey). [14:39:37] snapshot1002: Permission denied (publickey). [14:39:37] snapshot1003: Permission denied (publickey). [14:39:47] no user accounts there I guess, but they're still in the deployment host group [14:40:06] !log faidon synchronized php-1.24wmf2/extensions/GettingStarted/ 'Revert GettingStarted anon tokens' [14:40:11] Logged the message, Master [14:40:25] bblack: all done with reverts [14:40:37] that's odd but scaps to them always work [14:41:03] (03CR) 10Daniel Kinzler: [C: 031] "agree that we want this. no clue about the technical details of setting this up." [operations/puppet] - 10https://gerrit.wikimedia.org/r/120535 (owner: 10Hoo man) [14:43:14] (03CR) 10Alexandros Kosiaris: [C: 032] "Good idea Daniel. This will work!" [operations/puppet] - 10https://gerrit.wikimedia.org/r/96226 (owner: 10Alexandros Kosiaris) [14:44:48] your acct and authorized keys are there also [14:44:54] did they recently change your uid, paravoid? [14:45:21] because your home dir over there is owned by uid 2186 [14:45:39] I recently changed paravoid's UID, did I break something? [14:45:44] yes [14:45:55] what did I miss? [14:46:01] home dirs on snapshot100x [14:46:12] are snapshots hosts properly in salt? [14:46:15] indeed [14:46:18] hm, is there no salt on that box? [14:46:18] (03PS1) 10Springle: Add CNAMEs for analytics all-shards databases. [operations/dns] - 10https://gerrit.wikimedia.org/r/130356 [14:46:21] * andrewbogott looks [14:46:59] minion is running right now and I've gotten rsponses from it many times [14:47:10] any chance you did not run with a hight timeout (20 or 25)? [14:47:12] *high [14:47:22] The problem isn't salt, it's puppet. [14:47:25] I think. One moment... 
[14:47:34] Because 2186 is the /new/ UID [14:48:12] (03PS2) 10Springle: Add CNAMEs for analytics all-shards databases. [operations/dns] - 10https://gerrit.wikimedia.org/r/130356 [14:49:18] (03CR) 10Springle: [C: 032] Add CNAMEs for analytics all-shards databases. [operations/dns] - 10https://gerrit.wikimedia.org/r/130356 (owner: 10Springle) [14:51:00] apergos, paravoid: Most likely explanation is that the login accounts on that box were formerly puppetized but are now orphaned. Any idea if that's possible or likely? [14:51:08] I'm still chasing down the puppet code... [14:51:17] maybe, I can check it [14:52:30] weird [14:52:39] I'll see where they ought to be and add them [14:52:42] Looks like the accounts are defined in the role class. [14:52:59] But that /is/ defined for 1003 and it still has the wrong UID for faidon [14:53:31] well don't worry about it, since it's a puppet thing I can figure something out [14:53:42] apergos: there seems to be some confusion there were some boxes include the module directly and some the role... [14:53:46] But, sure, I'll leave you to it, thanks. [14:53:55] akosiaris: :) one old one gone [14:54:06] :-) [14:54:16] I'm /pretty sure/ that applying that new pupet class will resolve things. Although I've always done it in the reverse order (changing /etc/passwd before chowning) [14:54:21] aude: I see you have a swat deploy scheduled with "patch forthcoming" [14:54:48] (03PS1) 10John F. Lewis: Add Draft namespace on chapcomwiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130357 (https://bugzilla.wikimedia.org/64123) [14:55:20] manybubbles: coming [14:55:35] about to push the patch, then wait for jenkins [14:56:04] (03CR) 10Dzahn: "this needs an init script: /Stage[main]/Pmacct/Service[pmacctd]: Could not evaluate: Could not find init script for 'pmacctd'" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130060 (owner: 10Dzahn) [14:56:26] (03PS2) 10Rush: debian packaging directory with our details [operations/debs/ircd-ratbox] (debian) - 10https://gerrit.wikimedia.org/r/130145 [14:56:28] (03PS3) 10Rush: change .gitreview for debian branch [operations/debs/ircd-ratbox] (debian) - 10https://gerrit.wikimedia.org/r/130142 [14:57:30] mutante: this probably needs a *package* :) [14:58:16] oh [14:58:19] the package is there [14:58:26] it's just /etc/init.d/pmacct, not /etc/init.d/pmacctd [14:59:51] paravoid: lol, i just found this [14:59:54] # Package (have a fresh one built by Faidon) [14:59:56] ok:) [15:00:04] yeah :) [15:00:38] aude: k.. I've got others in the window today so it can come later [15:00:42] paravoid: ii pmacct 0.12.5-4 [15:00:49] 17:58 < paravoid> the package is there [15:00:49] 17:58 < paravoid> it's just /etc/init.d/pmacct, not /etc/init.d/pmacctd [15:00:49] twkozlowski: mind if I do you first? 
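(For reference, the manual "change /etc/passwd before chowning" sequence mentioned above looks roughly like the sketch below; UIDs and paths are placeholders, and the actual fix in this case was letting puppet reapply the account definitions via the role class:)

    # 1. rewrite the passwd entry; files on disk still carry the old numeric owner
    usermod -u 2186 paravoid
    # 2. re-own anything still owned by the old UID (OLD_UID is a placeholder)
    find /home/paravoid -xdev -uid OLD_UID -exec chown -h paravoid {} +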
[15:00:52] :) [15:01:01] there's an extra "d" in our puppet manifest [15:01:02] paravoid: ok, thanks [15:01:12] :) [15:01:12] let me fix that [15:02:09] (03PS1) 10ArielGlenn: make sure snapshots have common role class [operations/puppet] - 10https://gerrit.wikimedia.org/r/130360 [15:02:46] waiting for jenkins [15:03:06] awesome, thanks for debugging it apergos/andrewbogott [15:03:21] thanks for th heads up [15:03:32] and sorry for the brevity, I was just dealing with this outage :) [15:03:39] * manybubbles has the conch [15:03:40] deal away :-) [15:04:11] (03CR) 10ArielGlenn: [C: 032] make sure snapshots have common role class [operations/puppet] - 10https://gerrit.wikimedia.org/r/130360 (owner: 10ArielGlenn) [15:04:49] (03PS1) 10Dzahn: fix pmacct service name [operations/puppet] - 10https://gerrit.wikimedia.org/r/130361 [15:05:36] (03PS1) 10John F. Lewis: Add autopatrolled group for shwiktionary [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) [15:06:51] (03CR) 10Dzahn: [C: 032] "Service[pmacctd]: Could not evaluate: Could not find init script for 'pmacctd'"" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130361 (owner: 10Dzahn) [15:07:37] (03PS2) 10Manybubbles: Add more redundency for enwiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/127793 [15:07:41] apergos: I think you need to do something with 1003 as well. [15:07:43] (03CR) 10Manybubbles: [C: 032] Add more redundency for enwiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/127793 (owner: 10Manybubbles) [15:07:56] Maybe alter role::snapshot::cron::primary] [15:08:29] (03CR) 10Dzahn: "notice: /Stage[main]/Pmacct/Service[pmacct]/ensure: ensure changed 'stopped' to 'running'" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130361 (owner: 10Dzahn) [15:08:32] (03CR) 10Dzahn: "fixed in Change-Id: Ie6ed27db10e" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130060 (owner: 10Dzahn) [15:08:42] (03Merged) 10jenkins-bot: Add more redundency for enwiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/127793 (owner: 10Manybubbles) [15:09:02] (03PS2) 10Manybubbles: Enable experimental highlighter on testing wikis [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/129039 [15:09:05] (03CR) 10Manybubbles: [C: 032] Enable experimental highlighter on testing wikis [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/129039 (owner: 10Manybubbles) [15:09:32] (03Merged) 10jenkins-bot: Enable experimental highlighter on testing wikis [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/129039 (owner: 10Manybubbles) [15:10:46] outage? should I delay the SWAT deploy? [15:12:08] cajoel: Service[pmacct]/ensure: ensure changed 'stopped' to 'running [15:12:08] paravoid: I'm around now if you still need me.. [15:12:25] Reedy: no it was more of a heads-up of "I'm deploying" [15:12:35] manybubbles: outage is over, go ahead :) [15:12:43] paravoid: sweet [15:12:47] andrewbogott: yeah I was getting that fixed up next [15:13:04] it was an interesting one, bblack has a report to share with everyone [15:13:15] (03PS1) 10ArielGlenn: include the snapshot role common class in the other snapshot role classes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130364 [15:14:07] (03CR) 10PiRSquared17: [C: 031] Add Draft namespace on chapcomwiki (031 comment) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130357 (https://bugzilla.wikimedia.org/64123) (owner: 10John F. 
Lewis) [15:14:17] !log manybubbles synchronized php-1.24wmf2/extensions/CirrusSearch/ 'SWAT upgrade - improves as yet undeployed highlighter config' [15:14:24] Logged the message, Master [15:14:58] (03CR) 10ArielGlenn: [C: 032] include the snapshot role common class in the other snapshot role classes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130364 (owner: 10ArielGlenn) [15:15:59] bblack: I have an unpushed changes for you that I just fetched to tin - they looks like the combine to a noop [15:16:36] 557904bf6959acb248212ef73f49a1d21c274ba0 and c768faf18123de60f33b28a1e3be360c9828898d [15:17:16] (03CR) 10PiRSquared17: "The bug was not just requesting an autopatroller group, but also rollbacker and a few others." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) (owner: 10John F. Lewis) [15:17:39] mind if I sync them? it shouldn't make a difference, but I'd like to know that you know I'm doing it [15:17:53] (03CR) 10John F. Lewis: "See my comment at the end about just enabling this one." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) (owner: 10John F. Lewis) [15:17:59] (03CR) 10PiRSquared17: "Actually it seems the community wants a "patrollers" group with both those rights." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) (owner: 10John F. Lewis) [15:22:23] bblack: double ping. I'm stuck until I get an answer on my question above [15:22:50] manybubbles: whoops, sory for a late answer [15:23:07] twkozlowski: np - I just started on my own stuff. I'll get to you once I'm done. [15:23:31] kk [15:24:00] manybubbles: updating core submodule now [15:24:07] then will have patch [15:24:12] (03PS1) 10Dzahn: unquote Booleans in RT role class [operations/puppet] - 10https://gerrit.wikimedia.org/r/130365 [15:25:09] (03PS2) 10Dzahn: unquote Booleans in RT role class [operations/puppet] - 10https://gerrit.wikimedia.org/r/130365 [15:25:10] gah, still waiting for jenkins [15:25:31] aude: it takes some time [15:25:38] but, I'm not sure how much time I'll have [15:25:59] I'm still stuck on the first config update because there were unfetched patches [15:26:03] ok [15:26:21] I should be able to do your submodule update around it, I guess, once you are ready [15:27:50] (03PS3) 10Dzahn: unquote Booleans in RT role class [operations/puppet] - 10https://gerrit.wikimedia.org/r/130365 [15:29:44] manybubbles: I'd say those changes from bblack are safe to push. That was part of the great varnish backend debugging festival last night. [15:29:59] they are safe to push, yes [15:30:01] go ahead [15:30:07] bd808: probably - I believe they even amount to noop. k. 
will push [15:30:23] s/last night/until ~two hours ago/ [15:30:28] :) [15:30:50] * bd808 saw backscroll of m-ark explaining how to run scap in another channel [15:30:51] our jenkins is busy [15:31:01] !log manybubbles synchronized wmf-config/InitialiseSettings.php 'SWAT deploy - extra setting for cirrus' [15:31:08] Logged the message, Master [15:31:41] !log manybubbles synchronized wmf-config/CirrusSearch-common.php 'SWAT deploy - move group0 wikis to experimental highlighter and give enwiki its redundency back' [15:31:47] Logged the message, Master [15:32:54] !log cirrus deploys look good, moving on to twkozlowski's requests [15:33:00] zzzzz [15:33:01] Logged the message, Master [15:33:13] (03CR) 10Manybubbles: [C: 032] Add image-reviewer group to Persian Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130079 (https://bugzilla.wikimedia.org/64532) (owner: 10Odder) [15:33:20] (03CR) 10Manybubbles: [C: 032] Additional two Swiss domains to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130070 (https://bugzilla.wikimedia.org/64536) (owner: 10Odder) [15:33:25] (03Merged) 10jenkins-bot: Add image-reviewer group to Persian Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130079 (https://bugzilla.wikimedia.org/64532) (owner: 10Odder) [15:33:29] (03Merged) 10jenkins-bot: Additional two Swiss domains to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130070 (https://bugzilla.wikimedia.org/64536) (owner: 10Odder) [15:35:36] (03CR) 10Dzahn: [C: 032] ldaplist: Fix typo [operations/puppet] - 10https://gerrit.wikimedia.org/r/117314 (owner: 10Tim Landscheidt) [15:35:54] (03PS2) 10Odder: Add University of Neuchâtel to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130067 (https://bugzilla.wikimedia.org/64535) [15:36:14] manybubbles: ^^ [15:36:32] (03CR) 10Manybubbles: [C: 032] Add University of Neuchâtel to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130067 (https://bugzilla.wikimedia.org/64535) (owner: 10Odder) [15:36:45] twkozlowski: thanks, was rebasing but get my wires crossed for a moment [15:36:46] (03Merged) 10jenkins-bot: Add University of Neuchâtel to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130067 (https://bugzilla.wikimedia.org/64535) (owner: 10Odder) [15:37:55] !log manybubbles synchronized wmf-config/InitialiseSettings.php 'SWAT deploy - extra setting for cirrus and new groups and sources for gwtoolset' [15:38:02] Logged the message, Master [15:38:09] !log manybubbles synchronized wmf-config/CirrusSearch-common.php 'SWAT deploy - move group0 wikis to experimental highlighter and give enwiki its redundency back' [15:39:09] twkozlowski: your SWATed [15:39:24] and it looks like I forgot to rebase to get mine so it went with yours [15:39:28] no problem, though [15:39:46] (03CR) 10Faidon Liambotis: [C: 032] Add eventlogging service alias [operations/dns] - 10https://gerrit.wikimedia.org/r/129220 (owner: 10Ori.livneh) [15:39:50] Yeah, just checked fawiki and the sources -- all looks fine [15:39:51] gerrit fail [15:39:51] (03CR) 10Dzahn: [C: 031] ldaplist: Switch to new servicegroups structure [operations/puppet] - 10https://gerrit.wikimedia.org/r/117313 (owner: 10Tim Landscheidt) [15:40:18] JohnLewis: I see you have swat patches as well, want me to review and push those now? [15:40:22] aude: sorry! 
[15:40:29] manybubbles: Sure [15:40:44] (03CR) 10Manybubbles: [C: 032] Add Draft namespace on chapcomwiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130357 (https://bugzilla.wikimedia.org/64123) (owner: 10John F. Lewis) [15:40:53] (03Merged) 10jenkins-bot: Add Draft namespace on chapcomwiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130357 (https://bugzilla.wikimedia.org/64123) (owner: 10John F. Lewis) [15:40:55] some failure of fetching, but safely can be ignored because we have 2 jenkins doing our tests for our "build" [15:41:11] jenkins-bot approves [15:41:24] (03CR) 10Dzahn: [C: 031] Tools: Rename references to local-admin to tools.admin [operations/puppet] - 10https://gerrit.wikimedia.org/r/118036 (owner: 10Tim Landscheidt) [15:42:09] (03CR) 10Manybubbles: [C: 032] Add autopatrolled group for shwiktionary [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) (owner: 10John F. Lewis) [15:42:18] (03Merged) 10jenkins-bot: Add autopatrolled group for shwiktionary [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) (owner: 10John F. Lewis) [15:42:20] (03CR) 10Billinghurst: "@PiRSquared17. If you are not going to read the bugzilla commentary thoroughly, it might be better to not make a comment." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) (owner: 10John F. Lewis) [15:42:51] heya hashar [15:42:55] (03PS1) 10Ottomata: Adding diskstat monitoring to hadoop nodes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130366 [15:43:05] (03CR) 10Manybubbles: "Is there still active debate going on with this one? I'm getting confused, sorry." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) (owner: 10John F. Lewis) [15:43:16] JohnLewis: is https://gerrit.wikimedia.org/r/#/c/130362/ ready too? I see some debate or something [15:43:19] any reason we don't use ganglia::plugin::python { 'diskstat': } [15:43:19] everywhere yet? :) [15:43:50] manybubbles: No debate, Pi didn't read the whole bugzilla thread where we discussed it all :) [15:44:05] JohnLewis: so good to deploy? [15:44:10] It's good. [15:45:17] Reedy: are there 2 kinds of "purge_securepoll"? [15:45:20] !log manybubbles synchronized wmf-config/InitialiseSettings.php 'SWAT add autopatrolled group to shwiktionary and draft namespace to chapcomwiki' [15:45:22] JohnLewis: done [15:45:26] Logged the message, Master [15:45:29] manybubbles: After the bug was WONTFIX'd and reopened about 10 times by stewards disputing - we came to the conclusion that an 'autopatrolled' group is fine but the rollbacker and flood groups are not :) [15:45:31] PROBLEM - DPKG on tantalum is CRITICAL: DPKG CRITICAL dpkg reports broken packages [15:45:41] manybubbles: Thanks! [15:45:50] Reedy: '/usr/local/bin/foreachwiki extensions/AbuseFilter/maintenance/purgeOldLogIPData.php vs. /usr/local/bin/foreachwiki extensions/SecurePoll/cli/purgePrivateVoteData.php ? [15:46:18] (03CR) 10Manybubbles: "Talked to JohnLewis on IRC and he told me it was ready so I deployed it." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) (owner: 10John F. Lewis) [15:46:20] (03CR) 10Billinghurst: "@John F. Lewis Looks okay to me." 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130362 (https://bugzilla.wikimedia.org/61380) (owner: 10John F. Lewis) [15:46:31] (03PS2) 10Ottomata: Adding diskstat monitoring to hadoop nodes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130366 [15:46:54] aude: gerrit/jenkins still being upset? [15:47:02] just taking forever [15:47:19] (03CR) 10Dzahn: "is this different from the (meanwhile existing): cron { 'purge_securepoll' with this commandline? extensions/AbuseFilter/maintenance/purge" [operations/puppet] - 10https://gerrit.wikimedia.org/r/74592 (https://bugzilla.wikimedia.org/51574) (owner: 10Reedy) [15:47:29] i could just approve since jenkins-bot approves (and i think wikidata jenkins is ok / can be ignored because of gerrit issue) [15:47:55] !log rebuilding test2wiki's cirrus index after swat deploy [15:48:03] Logged the message, Master [15:48:16] (03PS3) 10Ottomata: Adding diskstat monitoring to hadoop nodes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130366 [15:48:43] (03PS1) 10Springle: Use dbstore1001 for analytics instead of 02. [operations/dns] - 10https://gerrit.wikimedia.org/r/130367 [15:48:45] aude: whichever you'd like. maybe have someone else in the office look at it to compensate for jenkins? [15:48:59] approved [15:49:26] we have enough layers of testing that i'm not worried [15:49:46] (03CR) 10Springle: [C: 032] Use dbstore1001 for analytics instead of 02. [operations/dns] - 10https://gerrit.wikimedia.org/r/130367 (owner: 10Springle) [15:50:06] (03PS4) 10Ottomata: Adding diskstat monitoring to hadoop nodes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130366 [15:50:11] (03CR) 10Ottomata: [C: 032 V: 032] Adding diskstat monitoring to hadoop nodes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130366 (owner: 10Ottomata) [15:50:29] then we have gate and submit [15:50:35] aude: cool [15:50:44] yuri said he won't deploy until I get here, should I be worried? :) [15:50:45] yeah, is this the gate and submit for the submodule update on wmf2? [15:50:55] greg-g: morning! [15:50:59] SWAT is busy today [15:51:08] no for wikidata.git [15:51:17] manybubbles: g'morning, all ok? [15:51:22] greg-g: ok [15:51:23] * aude is slow [15:52:08] greg-g and aude: we don't have anything scheduled for the next two hours so we can probably bleed out of this window if required [15:52:17] ok [15:53:10] * greg-g nods [15:53:14] we also might have a few things for swat tonight... things not merged in time for this swat and not absolutely critical for deploy [15:53:22] but not things we want to wait 2 weeks [15:54:06] (03CR) 10Reedy: "Sounds like that one is misnamed as it's pointing at AbuseFilter..." [operations/puppet] - 10https://gerrit.wikimedia.org/r/74592 (https://bugzilla.wikimedia.org/51574) (owner: 10Reedy) [15:54:31] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 1.69491525424% of data exceeded the critical threshold [500.0] [15:54:42] aude: well, if we want to hold it for tonight I have no objects. 
my blood preasure drops when I log out of tin [15:54:44] lovely, jenkins is experiencing technical problem [15:54:59] jenkins has merged [15:55:07] i just cant view https://integration.wikimedia.org/ci/view/Extensions/ [15:57:18] https://gerrit.wikimedia.org/r/#/c/130369/ [15:57:20] manybubbles: ^ [15:57:31] RECOVERY - DPKG on tantalum is OK: All packages OK [15:57:49] aude: +2ed [15:57:53] thanks [15:58:17] shall be better prepared next swat [15:59:13] (03PS1) 10Reedy: Rename misc::maintenance::purge_abusefilter cron job name [operations/puppet] - 10https://gerrit.wikimedia.org/r/130370 [15:59:14] mutante: ^^ [16:00:57] (03PS5) 10Reedy: Make puppet cronjob to run SecurePoll/cli/purgePrivateVoteData.php [operations/puppet] - 10https://gerrit.wikimedia.org/r/74592 (https://bugzilla.wikimedia.org/51574) [16:01:13] (03CR) 10jenkins-bot: [V: 04-1] Make puppet cronjob to run SecurePoll/cli/purgePrivateVoteData.php [operations/puppet] - 10https://gerrit.wikimedia.org/r/74592 (https://bugzilla.wikimedia.org/51574) (owner: 10Reedy) [16:01:49] aude: looks like it is merged. I'll go sync it [16:01:57] (03CR) 10Dzahn: [C: 032] "ah:) this should unblock Change-Id: Iab860b8a5d" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130370 (owner: 10Reedy) [16:01:59] yay [16:02:38] (03PS6) 10Reedy: Make puppet cronjob to run SecurePoll/cli/purgePrivateVoteData.php [operations/puppet] - 10https://gerrit.wikimedia.org/r/74592 (https://bugzilla.wikimedia.org/51574) [16:03:18] (03CR) 10Alexandros Kosiaris: [C: 04-1] "Moving along nicely. A few things and it will be ready for merging." (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/108314 (owner: 10Matanya) [16:04:55] mutante: git grep enable_mailman templates/exim/exim4.conf.SMTP_IMAP_MM.erb [16:04:57] manybubbles: Reedy then actual general deploy window is in an hour? [16:05:05] !log manybubbles synchronized php-1.24wmf2/extensions/Wikidata/ 'SWAT upgrade wikidata for date parsing fixes' [16:05:09] thanks [16:05:13] Logged the message, Master [16:05:18] 2 [16:05:18] I am inclined to say the unquoting of booleans will not work in this case at least [16:05:19] manybubbles: looks good! [16:05:22] Reedy: ok [16:05:26] (03PS4) 10Ottomata: archiva: add ferm rule [operations/puppet] - 10https://gerrit.wikimedia.org/r/130061 (owner: 10Matanya) [16:05:31] (03CR) 10Ottomata: [C: 032 V: 032] archiva: add ferm rule [operations/puppet] - 10https://gerrit.wikimedia.org/r/130061 (owner: 10Matanya) [16:05:31] * aude has enough time to eat and go home :) [16:05:43] sweet, I'm going to log off tin [16:05:51] * manybubbles puts down the conch [16:05:52] (03PS3) 10Ottomata: titanium: add firewall to the host [operations/puppet] - 10https://gerrit.wikimedia.org/r/130066 (owner: 10Matanya) [16:05:56] :) [16:06:05] (03CR) 10jenkins-bot: [V: 04-1] titanium: add firewall to the host [operations/puppet] - 10https://gerrit.wikimedia.org/r/130066 (owner: 10Matanya) [16:06:33] mutante: same goes for enable_mail_relay [16:07:05] there are actual comparations with "true" and "false" in both cases. 
As in strings, not booleans [16:07:26] (03CR) 10Ottomata: [C: 032 V: 032] titanium: add firewall to the host [operations/puppet] - 10https://gerrit.wikimedia.org/r/130066 (owner: 10Matanya) [16:07:40] hehe, seems like all of them [16:07:43] (03CR) 10Dzahn: "# Puppet Name: purge_abusefilteripdata" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130370 (owner: 10Reedy) [16:09:49] (03CR) 10Alexandros Kosiaris: [C: 04-2] "A git grep enable_mail_relay says" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130365 (owner: 10Dzahn) [16:09:53] akosiaris: ah, then i hit one of the reasons we can never merge https://gerrit.wikimedia.org/r/#/c/109073/ [16:10:12] mutante: ahh never is such a strong word [16:10:25] no jenkins? [16:10:45] let's just say in the - soon to be present- future [16:11:13] yea,ok [16:11:16] (03PS4) 10Ottomata: titanium: add firewall to the host [operations/puppet] - 10https://gerrit.wikimedia.org/r/130066 (owner: 10Matanya) [16:11:43] inclined to abandon that and wait for exim class refactor [16:11:59] I vote yes [16:12:17] paravoid is already fixing it as we speak [16:12:38] ah! sweet [16:12:58] (03CR) 10Ottomata: [C: 032 V: 032] titanium: add firewall to the host [operations/puppet] - 10https://gerrit.wikimedia.org/r/130066 (owner: 10Matanya) [16:13:40] (03Abandoned) 10Dzahn: unquote Booleans in RT role class [operations/puppet] - 10https://gerrit.wikimedia.org/r/130365 (owner: 10Dzahn) [16:13:43] another one hits the ground. thanks ottomata [16:13:59] next host [16:15:22] Jeff_Green: does civic require any external port other than 80/443? [16:17:02] (03PS1) 10Aaron Schulz: Increased htmlCacheUpdate throttle [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130371 [16:17:20] (03CR) 10Aaron Schulz: [C: 032] Increased htmlCacheUpdate throttle [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130371 (owner: 10Aaron Schulz) [16:17:27] (03Merged) 10jenkins-bot: Increased htmlCacheUpdate throttle [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130371 (owner: 10Aaron Schulz) [16:17:56] matanya: aluminium has stateful iptables rules already, and is slated for shutdown in the relatively near future [16:18:10] !log aaron synchronized wmf-config/CommonSettings.php 'Increased htmlCacheUpdate throttle' [16:18:17] Logged the message, Master [16:18:25] matanya: so for now, let's just leave it as is [16:18:45] Jeff_Green: i'm referring to zirconium [16:19:46] oh, that's not a fundraising host. in general civi is just a web interface, but it's concievable it uses incoming mail too [16:19:55] Jeff_Green: matanya , you mean contacts.wikimedia.org and the answer should be No [16:20:07] more than one vici [16:20:08] civi [16:20:44] vene vidi wiki [16:20:50] thanks both [16:20:58] will firewall [16:21:43] (03PS1) 10Andrew Bogott: Don't ensure=>absent dab's group [operations/puppet] - 10https://gerrit.wikimedia.org/r/130372 [16:22:34] (03CR) 10Andrew Bogott: [C: 032] Don't ensure=>absent dab's group [operations/puppet] - 10https://gerrit.wikimedia.org/r/130372 (owner: 10Andrew Bogott) [16:26:18] Reedy: mind if i give some love to https://gerrit.wikimedia.org/r/#/c/74592 ? 
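(A minimal sketch of the string-vs-boolean pitfall discussed above — class name illustrative, parameter name taken from the git grep mentioned in the review: templates that guard on the literal string "true" stop matching as soon as the puppet side passes a real boolean, so the unquoting has to land together with a template change:)

    # role class today: quoted value, i.e. a string
    class { 'illustrative_mail_role':
        enable_mail_relay => "true",
    }

    # exim template guard written against that string
    <% if @enable_mail_relay == "true" -%>
    # ... relay configuration ...
    <% end -%>

    # with a bare boolean true, the == "true" comparison above is false,
    # so the guarded block silently disappears from the rendered config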
[16:26:35] (03PS1) 10Matanya: contacts: add ferm rule [operations/puppet] - 10https://gerrit.wikimedia.org/r/130374 [16:27:11] matanya: Feel free :) [16:27:18] thanks [16:27:21] thanks, i saw this coming:) [16:29:03] !log reedy updated /a/common to {{Gerrit|Idb2a86791}}: Increased htmlCacheUpdate throttle [16:29:09] Logged the message, Master [16:29:15] (03PS1) 10Reedy: Non Wikipedias to 1.24wmf2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130375 [16:32:53] (03PS7) 10Matanya: Make puppet cronjob to run SecurePoll/cli/purgePrivateVoteData.php [operations/puppet] - 10https://gerrit.wikimedia.org/r/74592 (https://bugzilla.wikimedia.org/51574) (owner: 10Reedy) [16:34:08] (03PS8) 10Matanya: Make puppet cronjob to run SecurePoll/cli/purgePrivateVoteData.php [operations/puppet] - 10https://gerrit.wikimedia.org/r/74592 (https://bugzilla.wikimedia.org/51574) (owner: 10Reedy) [16:34:19] joy [16:34:26] next! [16:37:31] Wikibase\PropertyParserFunction seems to have some mismatched profiling [16:37:52] AaronSchulz: make a bug ticket [16:38:02] * aude going home but can look later [16:38:14] (03PS1) 10Matanya: etherpad: add ferm rule [operations/puppet] - 10https://gerrit.wikimedia.org/r/130377 [16:39:46] (03PS1) 10Springle: Increase available file handles for TokuDB. [operations/puppet] - 10https://gerrit.wikimedia.org/r/130378 [16:40:37] i'm receiving mails from jenkins/gerrit on changes [16:41:55] *i'm not [16:42:02] (03CR) 10Springle: [C: 032] Increase available file handles for TokuDB. [operations/puppet] - 10https://gerrit.wikimedia.org/r/130378 (owner: 10Springle) [16:45:49] (03PS1) 10Matanya: wikimania_scholarships: add ferm rule [operations/puppet] - 10https://gerrit.wikimedia.org/r/130379 [16:47:56] manybubbles: yes, those two are intended to combine to a no-op. They were pushed with sync-file manually rather than scap [16:48:12] (03PS1) 10Matanya: zirconium: add firewall [operations/puppet] - 10https://gerrit.wikimedia.org/r/130380 [16:48:16] sorry, I'm not very familiar with that process :) [16:48:22] bblack: they weren't actually fetched, so far as I could tell. [16:48:27] or maybe not rebased [16:48:36] but they did noop, so I pushed them [16:49:26] well, what I mean is I pushed the first with sync-file, and some hours later pushed the second with sync-file. I had hoped that actually deployed the change in the interim [16:49:41] what would have been missing to make that happen, just for future ref? [16:51:01] PROBLEM - MySQL Slave Delay on db1046 is CRITICAL: CRIT replication delay 50482 seconds [16:51:11] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 50394 seconds [16:51:25] ACKNOWLEDGEMENT - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 50394 seconds Sean Pringle Catching up... - The acknowledgement expires at: 2014-05-01 16:50:55. [16:51:26] ACKNOWLEDGEMENT - MySQL Slave Delay on db1046 is CRITICAL: CRIT replication delay 50482 seconds Sean Pringle Catching up... - The acknowledgement expires at: 2014-05-01 16:50:55. 
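(On the "what would have been missing" question just above: a rough sketch of the usual flow on tin — the canonical steps are on the wikitech page linked in the reply that follows, and the exact commands here are illustrative, not a verified transcript:)

    ssh tin.eqiad.wmnet
    cd /a/common                      # operations/mediawiki-config checkout
    git fetch
    git log HEAD..origin/master       # review what is about to land
    git rebase origin/master          # skip this and sync-file pushes whatever old tree is checked out on tin
    sync-file wmf-config/InitialiseSettings.php 'description that ends up in the SAL'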
[16:51:55] manybubbles: ah, meeting schedule overlap today [16:52:04] ottomata: sok [16:52:04] analytics showcase is at the same time as the search checkin [16:52:47] bblack: I _think_ the rebase line in https://wikitech.wikimedia.org/wiki/How_to_deploy_code#Step_2:_get_the_code_on_tin [16:53:05] but I'm not sure because I didn't check the state of the repository before I fetched my changes [16:53:39] !log osmium install complete, ticket resolved, ready for ^d and ori to take over [16:53:46] Logged the message, RobH [16:54:07] manybubbles: woah, that page just melted my brain [16:54:20] bblack: fun [16:56:42] (03CR) 10Matanya: [C: 04-1] "depends on :" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130380 (owner: 10Matanya) [16:59:01] (03CR) 10BryanDavis: wikimania_scholarships: add ferm rule (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/130379 (owner: 10Matanya) [17:02:14] means no need for https bd808 ? [17:03:18] matanya: I don't think it's needed on zirconium [17:03:28] And that module is only used there [17:04:00] In labs I use the MW-Vagrant module [17:04:19] bd808: so why is 80 needed? [17:04:41] Varnish hits port 80 [17:04:56] ssl -> nginx -> varnish -> apache [17:05:10] oh, this way [17:05:13] ok, sure. [17:05:41] (03PS2) 10Matanya: wikimania_scholarships: add ferm rule [operations/puppet] - 10https://gerrit.wikimedia.org/r/130379 [17:12:29] (03Abandoned) 10Jkrauska: Add jkrauska to rhenium [operations/puppet] - 10https://gerrit.wikimedia.org/r/130231 (owner: 10Jkrauska) [17:13:08] !log raising number of replicas of enwiki's cirrus index from 1 to 2. cluster will probably complain while they allocate [17:13:15] Logged the message, Master [17:14:35] Hi there, haithams! [17:14:44] hi. [17:15:31] Not often we see you around IRC :) [17:15:54] yeah, I have not been a big fan of IRC (although i know I should :) [17:16:14] haithams: Would you like some help getting your account set up, finding a client, etc.? [17:16:56] sure. in the meanwhile, I wonder if andrewbogott is here? [17:17:25] haithams: I'm here, will be with you in a moment :) [17:17:44] cool. i will be here. [17:17:49] any sign of asher f ? [17:17:56] ? [17:19:11] marktraceur, what are the best IRC clients for Mac would you recommend? I tried a few, but I've not found a really good one yet. [17:19:25] haithams: I've seen people using Colloquy and LimeChat primarily [17:19:28] haithams: i use limechat since its open source [17:19:38] colloquy is prettier, but i had odd issues with it [17:19:46] Adium "has" IRC support but it's crap. [17:19:55] indeed, ryan lane used adium's irc support [17:19:58] i have no idea how. [17:20:10] (so hacky and ugly) [17:20:13] Alcohol helped I'm sure [17:20:14] ok, i will check those out. [17:20:25] limechat is available in the os x app store for free as well [17:20:35] (dunno about colloquy since i migrated off) [17:20:37] <^demon> +1 to limechat. [17:21:06] ok, limechat it is. [17:21:14] haithams: ok, do you want to write your own puppet patch or shall I? [17:22:02] andrewbogott : I've uploaded my public key here https://office.wikimedia.org/wiki/User:Hshammaa/stat1003_key [17:22:12] ok, stay tuned... [17:22:50] (03PS1) 10Jkrauska: Fix typo in key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130384 [17:23:30] any puppet mergers awake right now? 
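(Given the ssl -> nginx -> varnish -> apache chain described above — TLS is terminated upstream and varnish talks to the app over plain HTTP — the scholarships rule only needs port 80 open. A minimal sketch in the style of the other ferm patches in this log; the resource title and parameter names are assumed from the repo's ferm::service define, not verified:)

    # varnish reaches the backend on plain HTTP; no direct 443 needed on the host
    ferm::service { 'wikimania-scholarships-http':
        proto => 'tcp',
        port  => '80',
    }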
[17:23:53] (03CR) 10Chad: "Last time I did something like this I was told to abandon the key and generate a new one since it would make figuring out the private bits" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130384 (owner: 10Jkrauska) [17:24:50] Chad: this is my normal key.. just has a single character appended on the pub.. very small typo.. [17:25:09] <^demon> I know it's a normal key :) [17:25:25] <^demon> I was told the same bit last time I had a public key like that that I typo'd by 1 or 2 chars. [17:25:36] not sure I follow your 'figuring out the private bits' thing. [17:26:41] (03PS1) 10Andrew Bogott: Add a fresh key for user haithams. [operations/puppet] - 10https://gerrit.wikimedia.org/r/130385 [17:26:43] haithams: check my work? ^ [17:26:45] <^demon> I'm trying to find the old discussion. [17:27:10] um… HaithamS_ : https://gerrit.wikimedia.org/r/#/c/130385/ [17:27:40] andrewbogott : looks great! Thank you [17:28:11] (03CR) 10Andrew Bogott: [C: 032] Add a fresh key for user haithams. [operations/puppet] - 10https://gerrit.wikimedia.org/r/130385 (owner: 10Andrew Bogott) [17:28:25] <^demon> cajoel_: "private bits" isn't the right word. Anyway, I was told to just abandon the typo in the public key and regenerate. [17:28:32] <^demon> I can't find the old discussion. [17:29:02] I'd really like to debate that.. :) [17:29:25] <^demon> This was like...over a year ago. [17:29:27] would you merge for me if I generated a new one? If that's the case, I'll gladly capitulate.. [17:29:42] <^demon> I can't merge, I'm just commenting from my armchair ;-) [17:30:19] akosiaris: paravoid: Hm.. so what needs to happen for the jsduck upgrade to be effective? From what I can see the Jenkins node is still running jsduck 4.x [17:30:48] Krinkle: you need hashar to add it to jenkins [17:31:02] HaithamS_: It's not totally obvious to me that you ever actually had access to stat1003. did you? And/or do you now? [17:31:17] (I don't see that key being added to that box but I could be missing it someplace) [17:31:20] matanya: The puppet manifest for gallium already has jsduck on it. It's not a new package, and it wasn't pinned. [17:31:47] I guess it just needs a puppet run? I guess the automatic 30 min puppet runs don't do upgrades. [17:31:51] andrewbogott: i don't think he did [17:31:51] only ensure => present [17:31:55] should that be latest? [17:31:57] stat1003 is new [17:31:59] hrm, wildcards don't work in graphite anymore [17:32:04] his key was revoked before that [17:32:15] andrewbogott : well, I had access to stat1, I assume my account was migrated to stat1003 according to aotto. [17:32:27] yes Krinkle [17:32:39] i'll push a patch for you, ok? [17:32:42] https://github.com/wikimedia/operations-puppet/blob/3a09c54d1d4fccaad10626777425d9b4e5c63463/modules/contint/manifests/packages.pp#L87 [17:32:45] ottomata, HaithamS_, can you get together on this? [17:32:47] matanya: That'd be great. thank you [17:33:07] is bast1001 a different thing from stat1003? [17:33:19] HaithamS_: yes [17:33:22] andrewbogott: [17:33:29] answer: HaithamS_ shoudl ahve access to bast1001 and stat1003 [17:33:37] (sorry, in meeting) [17:33:42] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.77966101695% of data exceeded the critical threshold [500.0] [17:33:50] right, that's what's my understanding. [17:33:53] ensure => present seems evil since that means if we create a new machine with the same manifest, it'll run different versions. 
But then again, we don't want an upgrade to the package repository to result in upgrade acrorss the board when we're not expecting it. [17:34:39] ahh, just really slow [17:35:19] (03PS1) 10Andrew Bogott: Give haithams access to stat1003. [operations/puppet] - 10https://gerrit.wikimedia.org/r/130390 [17:35:21] (03CR) 10BryanDavis: "Looks good to me, but beware of applying until everything else running on zirconium has ferm rules as well. Nobody wants to see bugzilla o" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130379 (owner: 10Matanya) [17:37:28] (03CR) 10Andrew Bogott: [C: 032] Give haithams access to stat1003. [operations/puppet] - 10https://gerrit.wikimedia.org/r/130390 (owner: 10Andrew Bogott) [17:38:59] (03PS1) 10Matanya: contint: upgrade jsduck from 4.x to 5.x [operations/puppet] - 10https://gerrit.wikimedia.org/r/130391 [17:39:19] HaithamS: try now? [17:40:10] andrewbogott: can you please merge this one ^ ? [17:40:16] it is blocking Krinkle [17:41:00] I would rather just hand-upgrade that package rather than set it to 'latest' forever. [17:41:02] RECOVERY - MySQL Slave Delay on db1046 is OK: OK replication delay 0 seconds [17:41:03] What boxes? [17:41:07] latest is evil. [17:41:29] gallium [17:41:45] (03CR) 10RobH: [C: 04-1] "'latest' is evil, peg it to a specific version, or it can roll an unexpected update to a package to break things." [operations/puppet] - 10https://gerrit.wikimedia.org/r/130391 (owner: 10Matanya) [17:42:10] andrewbogott RobH i agree, and that is what the comment says [17:42:25] cool [17:42:40] update it by a merge and i'll roll back to present [17:42:52] but hand update is even better [17:43:06] matanya: just gallium? [17:43:09] andrewbogott : I got public key denied on bastion1001 ... dumb question, if i have two set of keys on my device, do i need to do something special, or would the keys match automatically? [17:43:12] PROBLEM - MySQL Slave Running on db1046 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Cant create federated table. Foreign data src error: databas [17:43:13] * matanya is looking [17:43:36] HaithamS: as long as both keys are in ~/.ssh you shouldn't have to specify. [17:43:45] Try again? I just forced a puppet run on bast1001 [17:44:05] matanya: gallium is updated [17:44:11] ssh bast1001.wikimedia.org [17:44:11] Permission denied (publickey). [17:44:12] thanks [17:44:21] sorry ... [17:44:24] Krinkle: ^^ [17:44:27] (03CR) 10Andrew Bogott: [C: 04-2] "I updated this by hand on gallium -- an easier and better fix than this patch :)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130391 (owner: 10Matanya) [17:44:50] (03Abandoned) 10Matanya: contint: upgrade jsduck from 4.x to 5.x [operations/puppet] - 10https://gerrit.wikimedia.org/r/130391 (owner: 10Matanya) [17:44:58] OK [17:45:42] PROBLEM - ElasticSearch health check on elastic1014 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 2032: active_shards: 6070: relocating_shards: 0: initializing_shards: 6: unassigned_shards: 0 [17:46:08] andrewbogott: lanthanum too [17:46:12] PROBLEM - ElasticSearch health check on elastic1001 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. 
status: red: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 2032: active_shards: 6070: relocating_shards: 0: initializing_shards: 6: unassigned_shards: 0 [17:46:13] PROBLEM - ElasticSearch health check on elastic1003 is CRITICAL: CRITICAL - elasticsearch (production-search-eqiad) is running. status: red: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 2032: active_shards: 6070: relocating_shards: 0: initializing_shards: 6: unassigned_shards: 0 [17:46:43] manybubbles: ^ s'ok or no-'sok? [17:46:58] I check, probably ok [17:47:03] it corresponds with me doing something [17:47:07] red does not sound ok, but also sounds like maybe a monitoring problem [17:47:08] matanya: ok, done [17:47:17] thanks a lot [17:47:17] matanya: andrewbogott: Thanks for everything. Result is now published here: https://doc.wikimedia.org/mediawiki-core/master/js/ [17:47:20] HaithamS: try ssh -vvv and see what it says about key attempts [17:47:23] andrewbogott : still no connection on neither bast1001 nor stat1003 [17:47:46] sok [17:48:05] Krinkle: please close the bug if happy :) [17:48:19] https://bugzilla.wikimedia.org/show_bug.cgi?id=55753 [17:48:20] matanya: will do [17:48:26] ottomata: yeah, it goes red sometimes when new shards are being created. I _think_ it is because they are unassigned but I'm not sure. [17:51:12] RECOVERY - MySQL Slave Running on db1046 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [17:51:42] RECOVERY - ElasticSearch health check on elastic1014 is OK: OK - elasticsearch (production-search-eqiad) is running. status: green: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 2033: active_shards: 6076: relocating_shards: 1: initializing_shards: 0: unassigned_shards: 0 [17:52:12] RECOVERY - ElasticSearch health check on elastic1003 is OK: OK - elasticsearch (production-search-eqiad) is running. status: green: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 2033: active_shards: 6076: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 0 [17:52:12] RECOVERY - ElasticSearch health check on elastic1001 is OK: OK - elasticsearch (production-search-eqiad) is running. status: green: timed_out: false: number_of_nodes: 16: number_of_data_nodes: 16: active_primary_shards: 2033: active_shards: 6076: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 0 [17:53:00] HaithamS: next time best to use a service like dpaste.org [17:53:02] PROBLEM - MySQL Slave Delay on db1046 is CRITICAL: CRIT replication delay 42080 seconds [17:53:22] HaithamS: it looks to me like your name on your local laptop is 'loaner' and you are trying to log in as 'loaner@bast1001.wikimedia.org' [17:53:45] Can you try ssh -v haithams@bast1001.wikimedia.org? [17:54:20] andrewbogott : sure [17:54:35] andrewbogott : same thing. looks like bast1001 is trying to connect to my _other_ private key. [17:55:01] You can specify which key to use with -i [17:55:06] but I'd expect it to try both [17:56:22] ok. it worked with -i [17:56:58] is there a work around this? because I usually use a data base query client to access, and can't specify keys there [17:57:31] * aude back  [17:57:56] hm. I don't know why it didn't find the right key originally. [17:58:22] You can set up a proxy command which can get you to stat1003 directly with a preset bastion hop and key. 
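(A minimal ~/.ssh/config sketch of that preset bastion hop; the key path is a placeholder and the stat1003 FQDN is assumed. -W needs a reasonably recent OpenSSH — otherwise 'nc %h %p', as in the article linked below, does the same job:)

    Host bast1001.wikimedia.org
        User haithams
        IdentityFile ~/.ssh/id_rsa_wmf      # placeholder: whichever key ops has on file
        IdentitiesOnly yes                  # stop ssh from offering the other key first

    Host stat1003.eqiad.wmnet               # assumed internal name
        User haithams
        IdentityFile ~/.ssh/id_rsa_wmf
        IdentitiesOnly yes
        ProxyCommand ssh -W %h:%p bast1001.wikimedia.org

With this in place, `ssh stat1003.eqiad.wmnet` hops through the bastion transparently, and the right key is used without -i.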
[17:58:46] some docs here: http://sshmenu.sourceforge.net/articles/transparent-mulithop.html [17:58:55] oh well. anyway, looks like the problem is solved now. Thanks a lot. I really appreciate your help. [18:00:33] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1% data above the threshold [250.0] [18:03:00] (03PS1) 10Aude: Bump cache epoch for Wikidata, due to changes in dom [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130398 [18:03:13] (03CR) 10Reedy: [C: 032] Non Wikipedias to 1.24wmf2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130375 (owner: 10Reedy) [18:03:21] (03Merged) 10jenkins-bot: Non Wikipedias to 1.24wmf2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130375 (owner: 10Reedy) [18:03:50] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf2 [18:03:55] Logged the message, Master [18:04:11] !log reedy synchronized docroot and w [18:04:17] Logged the message, Master [18:05:40] Reedy: want to deploy https://gerrit.wikimedia.org/r/#/c/130398/ ? [18:06:58] (03PS2) 10Reedy: Bump cache epoch for Wikidata, due to changes in dom [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130398 (owner: 10Aude) [18:07:04] Not gonna update the time? ;) [18:07:19] good enough :) [18:07:50] (03CR) 10Reedy: [C: 032] Bump cache epoch for Wikidata, due to changes in dom [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130398 (owner: 10Aude) [18:07:58] (03Merged) 10jenkins-bot: Bump cache epoch for Wikidata, due to changes in dom [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130398 (owner: 10Aude) [18:08:04] thanks [18:08:32] (03PS1) 10coren: Tool Labs: (partial) tomcat support [operations/puppet] - 10https://gerrit.wikimedia.org/r/130399 [18:08:51] Thanks andrewbogott, i was also able to login to bast1001 on a client software. [18:09:05] !log reedy synchronized wmf-config/ 'I52293b29a87e2c645735b37215e4113e561e47da' [18:09:10] Logged the message, Master [18:10:49] (03CR) 10coren: [C: 032] "What could go wrong?" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130399 (owner: 10coren) [18:13:00] (03PS2) 10Reedy: Update getRealmSpecificFilename comments [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/129667 [18:13:05] (03CR) 10Reedy: [C: 032] Update getRealmSpecificFilename comments [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/129667 (owner: 10Reedy) [18:13:32] (03Merged) 10jenkins-bot: Update getRealmSpecificFilename comments [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/129667 (owner: 10Reedy) [18:13:46] (03PS2) 10Reedy: Update unit tests to drop pmtpa [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130227 [18:13:50] (03CR) 10Reedy: [C: 032] Update unit tests to drop pmtpa [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130227 (owner: 10Reedy) [18:14:00] (03Merged) 10jenkins-bot: Update unit tests to drop pmtpa [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130227 (owner: 10Reedy) [18:16:33] (03PS1) 10coren: Tool Labs: minor tweaks to tomcat node manifest [operations/puppet] - 10https://gerrit.wikimedia.org/r/130401 [18:16:39] !log reedy synchronized multiversion/ [18:16:44] Logged the message, Master [18:22:46] (03CR) 10coren: [C: 032] "Trivial fixes." [operations/puppet] - 10https://gerrit.wikimedia.org/r/130401 (owner: 10coren) [18:30:22] andrewbogott: Can you upgrade jsduck on lanthanum as well? 
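(On the jsduck thread running through here: a sketch of the "peg it to a specific version" approach preferred in the review above over ensure => latest. The version string is hypothetical; the hand-upgrade done on gallium/lanthanum amounts to the same thing performed once, manually:)

    # 'present' means new hosts get whatever the repo has that day,
    # 'latest' means every host upgrades whenever the repo moves;
    # pinning makes the version explicit and upgrades deliberate
    package { 'jsduck':
        ensure => '5.3.4',   # placeholder version, bumped on purpose when contint is ready
    }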
[18:30:34] (03PS1) 10Odder: Change group name on the Persian Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130413 (https://bugzilla.wikimedia.org/64532) [18:30:36] Krinkle: I think I did… do you see otherwise? [18:30:52] andrewbogott: I forgot about the slave. Things are failing in Gerrit now because I just dropped jsduck 4 support. [18:31:05] Oh, you did run it on both [18:32:26] Krinkle: so does that mean things are working? Or is there more package wrangling needed? [18:32:48] I'll need to investigate further. Jumped the gun on thinking it's an old jsduck version [18:33:07] nvm for now. I'll get back if I need anything. thanks :) [18:42:28] (03CR) 10Calak: [C: 031] Change group name on the Persian Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130413 (https://bugzilla.wikimedia.org/64532) (owner: 10Odder) [19:14:31] can someone tell me which host is releases.wikimedia.org in again? [19:15:01] caesium [19:15:18] ori: dammit. I've been trying to login to 'cesium' and 'ceasium' [19:15:26] should've googled the right spelling [19:17:32] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 1.69491525424% of data exceeded the critical threshold [500.0] [19:19:38] (03PS1) 10Gergő Tisza: Enable MediaViewer survey on Dutch Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130428 [19:25:28] (03PS2) 10Rush: puppet repo local linter [operations/puppet] - 10https://gerrit.wikimedia.org/r/120976 [19:26:54] mutante: hey! I need to be added to this group called 'mobileupld' I think, so I can push releases to releases.wm.o. Do I need to submit a patch for that? [19:28:45] (03PS1) 10Rush: puppet repo localrun and local lint helpers [operations/puppet] - 10https://gerrit.wikimedia.org/r/130430 [19:31:27] (03Abandoned) 10Rush: puppet repo local linter [operations/puppet] - 10https://gerrit.wikimedia.org/r/120976 (owner: 10Rush) [19:46:35] (03CR) 10Andrew Bogott: "We maybe want to give this a few days in order to verify that no one is attached to the old local-lint file, but this is totally fine with" (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/130430 (owner: 10Rush) [19:47:50] (03CR) 10Rush: "the old one doesn't even run, so I doubt anyone has been using it, and it was last touched months ago." [operations/puppet] - 10https://gerrit.wikimedia.org/r/130430 (owner: 10Rush) [19:51:54] (03CR) 10Rush: puppet repo localrun and local lint helpers (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/130430 (owner: 10Rush) [19:54:44] (03CR) 10Andrew Bogott: [C: 031] "Objections withdrawn :)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130430 (owner: 10Rush) [19:57:20] (03PS10) 10Matanya: Torrus: add torrus to netmon1001 [operations/puppet] - 10https://gerrit.wikimedia.org/r/108314 [19:57:38] (03CR) 10Rush: [C: 032 V: 032] "ok spoke to andrew seems ok to merge. should be no effects to anyone." [operations/puppet] - 10https://gerrit.wikimedia.org/r/130430 (owner: 10Rush) [20:03:52] PROBLEM - Puppet freshness on osmium is CRITICAL: Last successful Puppet run was Thu Jan 1 00:00:00 1970 [20:04:30] even before puppet was invented [20:10:02] (03CR) 10John F. Lewis: [C: 031] "Looks good." 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130291 (https://bugzilla.wikimedia.org/64168) (owner: 10Withoutaname) [20:11:54] (03CR) 10Ori.livneh: [C: 032] Change group name on the Persian Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130413 (https://bugzilla.wikimedia.org/64532) (owner: 10Odder) [20:12:56] (03Merged) 10jenkins-bot: Change group name on the Persian Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130413 (https://bugzilla.wikimedia.org/64532) (owner: 10Odder) [20:13:12] !log ori updated /a/common to {{Gerrit|I56ae921ca}}: Change group name on the Persian Wikipedia [20:13:18] Logged the message, Master [20:13:39] !log ori synchronized wmf-config/InitialiseSettings.php 'I56ae921ca: Change group name on the Persian Wikipedia' [20:13:45] Logged the message, Master [20:13:54] twkozlowski: ^ [20:13:56] \o/ [20:15:49] ori: Yep, looks fine in Arabic [20:16:02] greg-g: OK if I graduate an existing Beta Feature from Thursday? Sorry for shortness of notice. [20:16:11] greg-g: (The formula editor one in VE.) [20:17:10] twkozlowski: thank you! [20:17:32] James_F: would we be ok with other BFs announcing they're going to graduate 2 days in advance? [20:18:03] greg-g: Ones which announced it on-wiki last week and in Tech/News? [20:18:22] James_F: heh, that changes my analysis :) [20:18:26] greg-g: :-) [20:18:34] pong [20:18:40] greg-g: But yeah, sorry; completely forgot to add it to [[Deployments]]. [20:18:55] no worries, for that kind of thing, I'm more worried about technews/ambassadors [20:19:09] * James_F nods. [20:19:14] (Can the DCE please help with this part? :) ) [20:19:49] Department of Collegiate Education? [20:20:23] greg-g: Maybe a little, but… engineering process is more an Engineering thing than a Product thing… [20:20:26] Director of Community Engagement? [20:20:34] twkozlowski: What marktraceur said. [20:20:50] The only employee at the WMF with a question mark in their title [20:21:03] What is a Director of Community Engagement? [20:21:19] James_F: When did we announce the formula thingy on Tech News? [20:21:53] twkozlowski: Argh, did I not make that change? [20:21:55] * James_F looks. [20:22:07] * James_F sighs. [20:22:15] I searched all the issues till a month ago, couldn't find anything. [20:22:20] (03PS8) 10Ori.livneh: Move beta scap source directory off of NFS [operations/puppet] - 10https://gerrit.wikimedia.org/r/127399 (https://bugzilla.wikimedia.org/63746) (owner: 10BryanDavis) [20:22:21] marktraceur: yeah [20:22:31] twkozlowski: It was meant to go into /18; I'll put it in /19. [20:22:40] okay, thanks. [20:22:49] James_F: does my analysis change again? :) [20:23:08] We didn't announce the Personal Bar thing last week because it will be only enabled on Thursday [20:23:19] greg-g: I don't think 10 days' lead time is necessary. [20:23:27] I never heard about anything re: VE, that's why I asked :) [20:23:29] somewhere between 2 and 10? :P [20:23:48] greg-g: Well, we're talking about the code going out in wmf3, so… [20:24:31] oh, not everywhere all at once, but riding the train.... [20:24:35] yeah, that should be fine [20:24:42] OK. [20:24:47] But will put a note in right now. :-) [20:26:01] twkozlowski: /18 got the announcement about the new Beta Feature, not the graduation of the old one. [20:27:25] It did? [20:27:38] "You will soon be able to set content language and direction with VisualEditor.". 
[20:27:39] Oh, you mean the content language part [20:28:18] (03PS2) 10Gage: Fix typo in key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130384 (owner: 10Jkrauska) [20:28:32] * twkozlowski a slow typer [20:30:05] (03CR) 10Gage: [V: 032] Fix typo in key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130384 (owner: 10Jkrauska) [20:30:29] (03CR) 10Gage: [C: 032] Fix typo in key [operations/puppet] - 10https://gerrit.wikimedia.org/r/130384 (owner: 10Jkrauska) [20:31:53] (03PS9) 10Ori.livneh: Move beta scap source directory off of NFS [operations/puppet] - 10https://gerrit.wikimedia.org/r/127399 (https://bugzilla.wikimedia.org/63746) (owner: 10BryanDavis) [20:34:32] (03CR) 10Ori.livneh: [C: 032] Move beta scap source directory off of NFS [operations/puppet] - 10https://gerrit.wikimedia.org/r/127399 (https://bugzilla.wikimedia.org/63746) (owner: 10BryanDavis) [20:38:32] (03CR) 10MarkTraceur: [C: 031] "Seems fine to me, can go out today." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130428 (owner: 10Gergő Tisza) [20:47:52] !log rebuilding search indexes for group1 wikis after the train upgraded cirrus for them [20:47:59] Logged the message, Master [21:11:59] !log populateImageSha1 fixer script finished on all wikis [21:12:06] Logged the message, Master [21:25:22] !log Running deleteEqualMessages.php on newwiki (bug 43917) [21:25:28] Logged the message, Master [21:27:36] (03PS1) 10Milimetric: Fix provision from scratch [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/130499 [21:27:57] new wiki? :D [21:28:23] yepwiki, nowiki, wuuwiki, wootwiki, newwiki [21:28:28] we've got some funny ones [21:28:36] herpwiki [21:29:03] zerowiki [21:29:07] http://noc.wikimedia.org/conf/all.dblist [21:29:08] :D [21:29:19] warwiki [21:30:01] at least we have one quality wiki, qualitywiki [21:30:20] "warwiki" i.e. where we keep all of our Java web apps [21:31:03] wait, what? [21:31:09] Why is there a PHP file in the MediaWIki namespace [21:31:13] https://new.wikipedia.org/wiki/%E0%A4%AE%E0%A4%BF%E0%A4%A1%E0%A4%BF%E0%A4%AF%E0%A4%BE%E0%A4%B5%E0%A4%BF%E0%A4%95%E0%A4%BF:Digittransform.php [21:31:23] * greg-g ain't clicking that [21:31:36] greg-g: It's SFW :p [21:31:48] it's a valid page name, .php means nothing, but its still weird [21:31:54] Krinkle: too cool for server scripts? :p [21:31:56] I wonder what gadget uses that [21:32:04] can't be good [21:32:05] :) [21:33:19] They probably thought if they can override MediaWiki messages, they could override server scripts the same way :p [21:35:55] JohnLewis: that's not even funny. You do realise that used to be possible, right? [21:36:38] Krinkle: Wait, seriously? [21:36:44] Just... kidding... [21:37:12] :p [21:49:37] HaithamS: Might want to register a nickserv account with that username, just a suggestion :) [21:51:26] hi, Flow deploy window, going to enable Flow on two mw talk pages for James_F [21:52:54] (03CR) 10Spage: [C: 032] "deploy window" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/129721 (owner: 10Jforrester) [21:59:26] spagewmf: Yay. [21:59:43] !log spage synchronized wmf-config/InitialiseSettings.php 'enable Flow on two mw talk pages for James_F' [21:59:49] Logged the message, Master [22:08:46] greg-g: Flow finished folks [22:09:26] JohnLewis : i think i just did. [22:09:31] spagewmf: thankya [22:10:08] did it work? [22:10:11] HaithamS: Account is not registered :) [22:10:18] oops. [22:10:34] HaithamS: try /ns register :) [22:11:01] how about now? [22:11:10] how do you know, btw. 
just curious :) [22:12:23] HaithamS: /ns info HaithamS [22:12:41] HaithamS: Just need to verify the email now and you're set :) [22:13:10] right! :) [22:13:15] Then /ns identify to login and you're done. You can then request a Wikimedia cloak after to hide your IP etc :) [22:14:32] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1% data above the threshold [250.0] [22:14:51] should be ok by now. Thanks for tips JohnLewis [22:15:28] HaithamS: Yep. Do you want a link to where you can request a Wikimedia cloak? [22:16:53] yeah, I think I know the page on wikitech. I just wanted to spend sometime here to see whether I am gonna stick around or not ... I've not been a big fan of IRC, but trying to adapt now :) [22:17:19] HaithamS: It's not a page on Wikitech (unless they have one :p) It's on Meta. [22:19:00] no worries, I should be able to look it up. Thanks for your help! [22:19:58] HaithamS: Welcome :) [22:20:32] Out of interest - SWAT is in an hour right? Need to make sure I have times right :P [22:21:04] JohnLewis: 40 minutes, but yeah :) [22:21:27] greg-g: Good :p [22:36:13] (03PS1) 10coren: Tool Labs: fix to tomcat-starter [operations/puppet] - 10https://gerrit.wikimedia.org/r/130510 [22:38:31] (03CR) 10coren: [C: 032] Tool Labs: fix to tomcat-starter [operations/puppet] - 10https://gerrit.wikimedia.org/r/130510 (owner: 10coren) [22:47:53] (03PS1) 10BBlack: Fix regex bug in 36e9fd2a (which was a fix for ae30ae0b) [operations/puppet] - 10https://gerrit.wikimedia.org/r/130516 [22:49:29] (03CR) 10BBlack: [C: 032 V: 032] Fix regex bug in 36e9fd2a (which was a fix for ae30ae0b) [operations/puppet] - 10https://gerrit.wikimedia.org/r/130516 (owner: 10BBlack) [22:59:23] ori, ebernhardson; I cant do swat today, so if one of you would take it that would be swell [22:59:49] Cancelling last minute mwalker? Good :p [23:01:07] it's not last minute; I'm just trying to see if someone else can take it [23:01:11] I have something at 430 [23:01:17] and there is a LOT of stuff in the swat today [23:01:23] mwalker: Ph fair enough :) [23:01:31] i'm in the middle of something, but i can do it if ebernhardson can't [23:01:50] * aude waves [23:02:10] i'll wait a minute for ebernhardson to reply on the off-chance that he can drive, and if not i'll start [23:02:26] (it is a big'un today) [23:03:09] i can drive [23:03:20] i just don't get ipnged when my name is mid-sentance ;) [23:03:22] Yup. [23:03:23] * ebernhardson should fix irssi [23:03:38] for once, we are well prepared with submodule patch eready [23:03:41] ready* [23:03:48] ebernhardson: <3, many thanks! here's a virtual IOU :) [23:03:54] * ebernhardson gasps at the list [23:03:56] agreed [23:03:57] :P [23:04:52] PROBLEM - Puppet freshness on osmium is CRITICAL: Last successful Puppet run was Thu Jan 1 00:00:00 1970 [23:10:33] greg-g: that seems unreasonable [23:10:36] we should cap it [23:10:45] it's very stressful [23:10:57] i hadn't had a chance to look at the list, or else i would have said something earlier [23:12:10] 18 patches... [23:12:10] yeah, I agree [23:12:35] ebernhardson: i can do "Include language-0 categories for betawikiversity" and ve [23:12:38] eek, yeah, didn't count all of the mv ones [23:12:47] one of the patches from odder, https://gerrit.wikimedia.org/r/#/c/130413/, says from jenkins: "https://integration.wikimedia.org/ci/job/beta-mediawiki-config-update-eqiad/213/console : Change has been deployed on the EQIAD beta cluster in 6s" [23:13:01] i assume that only means beta cluster? 
[23:13:02] yes, i deployed that earlier
[23:13:05] right
[23:13:19] ori: and deployed already? excellent
[23:13:21] i can do the other config change and the ve ones
[23:13:23] ebernhardson: yep
[23:13:35] ori: sure, then i can focus on the stack of tgr patches :)
[23:13:38] ori: Thanks!
[23:13:41] tgr: around?
[23:13:51] present
[23:13:53] cool
[23:13:56] the mwalker one, 130515, is working its way through jenkins right now
[23:13:56] *thanks guys* /me is off to adopt a kitty
[23:14:10] mwalker: Server kitties?
[23:14:16] ebernhardson, yepyep; I saw. Your point of contact for that is TheDJ
[23:14:26] mwalker: you know, adopting a kitty is like open source software...
[23:14:42] mwalker: ok cool, remember the multi-colored cats are always better ;)
[23:15:16] heh
[23:15:47] greg-g, indeed... in this case it's a tripod kitty -- much like open source software; always missing some key thing
[23:16:03] mwalker: awwww
[23:16:37] !log ebernhardson synchronized php-1.24wmf2/extensions/WikiEditor 'Update WikiEditor to 1.24wmf2'
[23:16:43] Logged the message, Master
[23:16:56] Stumpy the cat?
[23:17:02] thedj: the reverted revert to WikiEditor is now live on 1.24wmf2
[23:18:00] (PS2) Ori.livneh: Include language-0 categories for betawikiversity [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130291 (https://bugzilla.wikimedia.org/64168) (owner: Withoutaname)
[23:18:02] (CR) Ori.livneh: [C: 2] Include language-0 categories for betawikiversity [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130291 (https://bugzilla.wikimedia.org/64168) (owner: Withoutaname)
[23:18:15] (Merged) jenkins-bot: Include language-0 categories for betawikiversity [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130291 (https://bugzilla.wikimedia.org/64168) (owner: Withoutaname)
[23:18:38] tgr: is it safe to merge all the MultimediaViewer patches and update as one? i imagine so?
[23:18:42] ori: Thanks ;)
[23:18:43] *:)
[23:18:51] ebernhardson: yes
[23:18:52] (well, 2 since it's wmf1 and wmf2)
[23:18:54] ok
[23:19:20] tgr: mediawiki-config before or after the others, does it matter?
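Updating an extension on a deployed branch, as with the "Update WikiEditor to 1.24wmf2" sync above, normally means bumping that extension's submodule pointer in the wmf core branch and then syncing the directory; that is why one combined update per branch (wmf1 and wmf2) is enough. A minimal sketch under those conventions; the commit placeholder is illustrative and not from this log:

```bash
# In a checkout of the core 1.24wmf2 deployment branch:
cd extensions/WikiEditor
git fetch origin
git checkout <sha-of-the-backported-fix>   # illustrative placeholder for the cherry-picked change
cd ../..
git add extensions/WikiEditor
git commit -m "Update WikiEditor to 1.24wmf2"

# Once the submodule bump is merged and pulled on the deployment host:
sync-dir php-1.24wmf2/extensions/WikiEditor 'Update WikiEditor to 1.24wmf2'
```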
[23:19:28] does not matter
[23:19:44] tgr: jenkins has a -1 on this for failing doc-test: https://gerrit.wikimedia.org/r/#/c/130395/
[23:19:46] you might have to override some complaints from JSDuck
[23:19:49] ok :)
[23:21:38] (CR) EBernhardson: [C: 2] Enable MediaViewer survey on Spanish Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130234 (owner: Gergő Tisza)
[23:21:47] !log ori updated /a/common to {{Gerrit|I59e1fa87e}}: Include language-0 categories for betawikiversity
[23:21:49] (CR) EBernhardson: [C: 2] Enable MediaViewer survey on Dutch Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130428 (owner: Gergő Tisza)
[23:21:53] Logged the message, Master
[23:22:33] !log ori synchronized wmf-config/InitialiseSettings.php 'I59e1fa87e: Include language-0 categories for betawikiversity'
[23:22:40] Logged the message, Master
[23:24:01] (Merged) jenkins-bot: Enable MediaViewer survey on Spanish Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130234 (owner: Gergő Tisza)
[23:24:03] (Merged) jenkins-bot: Enable MediaViewer survey on Dutch Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130428 (owner: Gergő Tisza)
[23:25:30] !log ebernhardson synchronized wmf-config/InitialiseSettings.php 'Enable MediaViewer survey on Spanish and Dutch Wikipedia'
[23:25:37] Logged the message, Master
[23:27:43] tgr: hmm, now more things failing in jenkins when trying to +2/+2. https://gerrit.wikimedia.org/r/#/c/130396/
[23:27:53] now it fails -lint, -qunit and -doc-test
[23:27:57] but i don't see why, the patch is minimal
[23:28:24] same on https://gerrit.wikimedia.org/r/#/c/130492/
[23:28:41] and https://gerrit.wikimedia.org/r/#/c/130494/
[23:29:01] so basically, jenkins denied my override? not sure what to do with that
[23:29:13] ebernhardson: one of the survey patches depends on the other
[23:29:16] shouldn't be 'pt-br': '
[23:29:18] pt-br'
[23:29:18] could that be a problem?
[23:29:27] nevermind
[23:29:29] tgr: no, there is a 4th patch i didn't mention which is a dependency :)
[23:29:34] looked like too many quotes
[23:29:37] tgr: these 3 don't have any unmerged dependencies
[23:29:47] if i can read right (not always)
[23:29:58] aude: gerrit has weird linebreaks
[23:30:15] yeah
[23:30:22] lemme just try rebase and re +2
[23:31:41] ori: exact number can be argued about, but I'm going to add "Maximum of 8 patches per window", that's 4 people fixing things on both live versions, seems enough.
[23:32:44] hmm, no, zuul isn't picking up and retrying the rebased, re +2'd commit :S
[23:33:31] tgr: might be a merge conflict? https://gerrit.wikimedia.org/r/#/c/130396/ is reporting "unable to be merged with the current state of the repository"
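When Gerrit reports a change as "unable to be merged with the current state of the repository" and Zuul stops retrying it, the usual fix is the manual rebase ebernhardson does next. A hedged sketch using git-review, the client commonly used with Wikimedia's Gerrit; 130396 is the change number from the log, and the target branch name is assumed:

```bash
# Pull the stuck change into a local branch:
git review -d 130396

# Rebase it onto the current tip of its target branch:
git fetch origin
git rebase origin/master        # or the relevant wmf branch; resolve conflicts, then git rebase --continue

# Push the rebased commit back as a new patch set on the same change:
git review
```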
[23:33:40] will manually rebase, how fun :P
[23:33:58] ebernhardson: just skip them if they cause you trouble
[23:34:23] i'll rebase and reschedule
[23:34:40] don't see what it could possibly conflict with, though
[23:34:42] slowly getting there, thing is i already +2'd half the patches
[23:35:25] all the patches are independent, except one of the survey patches, and that is marked as such in gerrit
[23:35:35] so omitting some of them is fine
[23:37:08] !log ori synchronized php-1.24wmf2/extensions/VisualEditor 'I5818dce62'
[23:37:14] Logged the message, Master
[23:37:54] !log ori synchronized php-1.24wmf2/skins 'I66c56c577bad'
[23:37:58] ok, all merged finally :) forced a few
[23:38:01] Logged the message, Master
[23:38:20] ebernhar1son: VE + wmf-config changes done
[23:47:03] (PS1) Ori.livneh: Add wikidev and mortals to osmium [operations/puppet] - https://gerrit.wikimedia.org/r/130530
[23:48:39] many patches, so SWAT, wow
[23:49:31] so many bugs
[23:49:32] (CR) Ori.livneh: [C: 2] "Faidon, merging this since access for core team members is entailed in the original request. Please revert if it's not OK." [operations/puppet] - https://gerrit.wikimedia.org/r/130530 (owner: Ori.livneh)
[23:51:05] Oh, yeah, we're awful at this software stuff
[23:51:13] RECOVERY - Puppet freshness on osmium is OK: puppet ran at Tue Apr 29 23:51:10 UTC 2014
[23:52:29] !log ebernhardson synchronized php-1.24wmf2/extensions/MultimediaViewer/ 'I84f8e347f'
[23:52:36] Logged the message, Master
[23:52:44] tgr: ok, you're live on 1.24wmf2, do you want a moment to test before i send it to wmf1?
[23:53:04] yes, thx
[23:53:22] ok
[23:54:22] thanks ebernhardson :)
[23:54:31] aude: getting there :)
[23:55:43] ebernhardson: all good, thanks
[23:59:36] !log ebernhardson synchronized php-1.24wmf2/extensions/Wikidata/ 'I84c2283e07'
[23:59:43] Logged the message, Master
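A quick way to confirm which deployment branch a given wiki is actually serving, as in the "you're live on 1.24wmf2, do you want a moment to test before i send it to wmf1" exchange above, is the standard MediaWiki siteinfo API query (Special:Version shows the same information in a browser). The wiki chosen below is only an example:

```bash
# The "generator" field in the reply reports the running branch, e.g. "MediaWiki 1.24wmf2":
curl -s 'https://es.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=general&format=json' \
  | grep -o '"generator":"[^"]*"'
```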