[00:02:38] TimStarling: some bot was doing pass action=purge
[00:02:50] *doing mass
[00:02:51] yeah, I know, I commented on commons
[00:02:56] did you block it?
[00:03:00] no
[00:03:50] where did you see load increases?
[00:05:05] TimStarling: swift graphs, FileBackendStore profiling in graphite, and the wmf-config/swift.php profiling
[00:05:36] I figured it was purges due to the absence of patterns in the backend store profiling and the filejournal
[00:13:16] huh. apparently gerrit commentlink config changes are retroactive. /me expected not
[00:13:27] (I *think* they're not in bugzilla)
[00:15:24] TimStarling: you can check wmfOnLocalFilePurgeThumbnails-list.count
[00:15:47] I guess something is still going on
[00:46:31] TimStarling: Hi
[00:46:48] hi Dispenser
[00:47:14] I was just building a case for blocking you
[00:47:32] when did you start purging exactly?
[00:47:48] Jun 17, 2012 at 07:35 PM
[00:47:56] timezone?
[00:48:10] UTC? That was the first test cases
[00:48:33] first test runs*
[00:49:15] right, so you started doing it at the full rate at around 05:00 on the 18th?
[00:50:10] hah, a nearby culprit
[00:50:29] I hard to recall exactly when, but it might have been around then
[00:51:01] this is based on the massive increase in image backend server load at that time: http://ganglia.wikimedia.org/latest/?r=week&cs=&ce=&m=&c=Swift+pmtpa&h=ms-be3.pmtpa.wmnet&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS
[00:51:16] Honestly you should use an average since April 2011
[00:51:34] It not like I was quite about it
[00:51:57] it'd probably be OK if you were just refreshing links
[00:51:58] what's this purging for anyway?
[00:52:13] https://commons.wikimedia.org/wiki/Commons:Bots/Work_requests#3_million_null_edits see the collapsed section
[00:52:20] the main place you're causing trouble is on the image backend
[00:52:53] because you're deleting all the image thumbnails
[00:54:05] you can see it at the scalers as well: http://ganglia.wikimedia.org/latest/?r=week&cs=&ce=&m=load_one&s=by+name&c=Image+scalers+pmtpa&h=&host_regex=&max_graphs=0&tab=m&vn=&sh=1&z=small&hc=4
[00:54:15] but they are overprovisioned, unlike the file servers
[00:54:38] so what's the problem? just some part of swift is overloaded? which piece?
[00:55:05] So what speed should I be running it on? Its purging at 11,000 per hour.
[00:55:06] anyway, you can stop it now and we can talk about other ways to update those link tables that don't overload unrelated pieces of infrastructure
[00:55:27] zero
[00:55:52] I don't want you to purge the thumbnails at all, doesn't matter how slow you do it
[00:55:53] I know this WMF tactic, tell them to wait and hopefully it'll go away
[00:56:13] I understand that you're angry, but I don't know the wikitech-l posts you're talking about
[00:56:18] I don't read every post to wikitech-l
[00:56:32] * jeremyb doesn't understand why the switch was made from 1/sec? and why is it ok at all to do this in parallel?
[00:56:35] you say you haven't been quiet about it, but I haven't heard about it so maybe you weren't yelling in the right ears
[00:57:04] i also don't read every message on that list and also didn't hear about it
[00:57:09] but i'm on commons-l too
[00:57:42] maybe you should read your scrollback more carefully
[00:58:16] oh, it was just IRC then?
[00:58:59] I also put a mention in the Signpost
[00:59:02] I was camping over the weekend and quite busy with other things too. spent today catching up. i don't think i'm obligated to also catch up on all scrollback i miss
[00:59:36] certainly i think tim is justified in having expected a mailing list thread
[00:59:56] I don't read the signpost, and I don't read the channels when I'm away except for th
[01:00:00] but otoh, i haven't seen the scrollback so maybe there's something interesting in there.
[01:00:07] e last few lines
[01:00:25] so you tell me now what is the problem
[01:01:18] TimStarling: there was a template change over a year ago and ~1M files never had descriptions regenerated to match the new template. sounds like the template has had more edits since then and still no propagation
[01:03:42] TimStarling: 20 00:52:12 < Dispenser> https://commons.wikimedia.org/wiki/Commons:Bots/Work_requests#3_million_null_edits see the collapsed section
[01:03:54] I read it already
[01:03:59] only, the interesting bits are not collapsed
[01:04:11] k
[01:04:33] TimStarling: so what now? file a bug? write a maint script? resume at a reasonable rate?
[01:04:34] http://toolserver.org/~dispenser/temp/old_logs/20120616/coord-commonswiki.log [88 MB] list the ~900,000 errors before the script barfed
[01:10:39] TimStarling: https://graphite.wikimedia.org/render/?c=MySQL%2Bpmtpa&z=small&h=&m=load_one&ce=&cs=&s=by%2Bname&sh=1&vn=&tab=m&host_regex=&hc=4&r=hour&max_graphs=0&width=1136&height=609&_salt=1340154592.362&target=wmfOnLocalFilePurgeThumbnails-purge.tp50&target=wmfOnLocalFilePurgeThumbnails-list.tp50&from=-1weeks
[01:11:10] youch
[01:11:28] yeah, so the script is clearly still running
[01:11:55] do you know if it's API or index.php?
[01:12:05] Why isn't the previous run (once every 10 seconds on 3 different server) showing up their
[01:12:48] Dispenser: what username are you using to do the purges?
[01:12:54] * AaronSchulz isn't sure
[01:13:51] TimStarling: api
[01:14:09] https://graphite.wikimedia.org/render/?c=MySQL%2Bpmtpa&z=small&h=&m=load_one&ce=&cs=&s=by%2Bname&sh=1&vn=&tab=m&host_regex=&hc=4&r=hour&max_graphs=0&width=586&height=301&_salt=1340154824.841&target=API.purge.count&from=-1weeks
[01:14:21] right
[01:14:24] TimStarling: http://p.defau.lt/?1v67cp5i_uaNFSJpzBC3Yg
[01:14:25] we really need an API log
[01:15:00] What a helpful User-Agent!
[01:15:00] good UA string at least!
[01:15:03] Heh.
[01:15:09] * jeremyb was first
[01:15:31] yeah it would be extra good if we had a log of it
[01:16:01] I could give you a log of the pages purged
[01:16:06] actually list
[01:16:34] I could give you a list of pages successfully purged*
[01:16:50] did some fail?
[01:16:54] yes
[01:17:17] 504 Gateway Time-out
[01:17:22] Dispenser: you should look into http://docs.python-requests.org/
[01:17:47] * Log file or URL (TCP or UDP) to log API requests to, or false to disable
[01:17:47] * API request logging
[01:17:47] */
[01:17:47] $wgAPIRequestLog = false;
[01:17:52] sounds like a useful feature
[01:18:30] and if you want a sample?
[01:19:04] nah
[01:22:05] I'll just disable the module for now
[01:22:34] jeremyb: That looks neat.
[01:22:54] TimStarling: The API's purge module, you mean? Taking the Domas approach?
[01:23:11] !log tstarling synchronized wmf-config/CommonSettings.php
[01:23:18] Logged the message, Master
[01:23:18] the Domas approach would be to both disable it and insult everyone who is using it
[01:23:21] I'm only disabling it
[01:24:01] lol
[01:24:24] huh. /me wonders what a dolphin browser is
[01:24:29] So when can we expect it be available again?
[01:24:42] jeremyb: it's a way to see different dolphins
[01:25:04] oh, it's a UA
[01:25:11] * jeremyb stabs Reedy
[01:25:27] lmgtfy http://dolphin-browser.com/
[01:25:34] yeah, yeah
[01:25:50] i even had it installed once upon a time i think
[01:26:17] I'm writing a message on commons first
[01:26:30] !g 11973 | Reedy
[01:26:30] Reedy: https://gerrit.wikimedia.org/r/#q,11973,n,z
[01:26:48] I know
[01:28:43] Only 360,088 purges done :-(
[01:29:07] ok let's look at the bug now
[01:29:23] can you give me an example of an image description page that you haven't purged yet?
[01:29:38] https://commons.wikimedia.org/w/api.php?action=query&prop=extlinks&titles=File:20100926_Kompsatos_river_Rhodope_Thrace_Greece_Panoramic.jpg
[01:29:42] From my example
[01:30:48] right, so it has type:landmark?
[01:31:19] But on https://en.wikipedia.org/wiki/File:20100926_Kompsatos_river_Rhodope_Thrace_Greece_Panoramic.jpg it type:camera
[01:31:35] in "Camera location" row
[01:33:02] and the link comes from the location template?
[01:34:11] Yes http://commons.wikimedia.org/w/index.php?title=Template%3AObject_location&diff=52803602&oldid=44535632
[01:35:22] People put in junky types. The new system suppose to detect it.
[01:36:10] At least was until I realized I spelled classification wrong last week
[01:37:31] that template is not in the templatelinks for the image
[01:38:19] it's not listed as used in my preview of the image either
[01:39:08] I've noticed my complaints about inconsistent tables (e.g. red links from non-existant pages) went unheard. That's why I decide to go alone.
[01:40:12] I wrote most of the code involved, so I'm interested to know if there is any bug in it
[01:40:57] if {{object location}} is not in the templatelinks for the image, then the images will never be updated when you change it
[01:41:10] You've got 5 template to look for https://commons.wikimedia.org/wiki/Template:Location/doc
[01:41:47] {{Location}}, {{Location dec}}, {{Object location}}, {{Object location dec}}, {{Globe location}}
[01:42:38] so object location is not meant to be on that page?
[01:43:01] but I suppose location dec was also updated in 2011
[01:43:52] they also update the subtemplates as well
[01:45:04] sure, but what I want to know is: was a template updated while it was in templatelinks for the problematic page
[01:45:31] probably the answer is yes, but we may as well get the answer before we move on
[01:45:50] Template:Location/ExternalLink - Jan 2012; Template:Cc-by-sa-layout - March 2012
[01:51:44] there's no links changes in Template:Cc-by-sa-layout, just a change in some plain text
[01:53:52] Template:Location/ExternalLink is conclusive I guess
[01:56:11] hmm, maybe not
[01:57:07] that change doesn't seem to have any impact on anything visible
[01:57:13] If you look at the before and after, lots of links were switch to protocol relative
[02:00:29] ok
[02:03:01] what's a more recent template change?
[02:04:18] I can add refreshLinks jobs to the job queue for any changed templates
[02:04:58] that way we can test to see whether the job queue is at fault
[02:06:01] The Google Maps links on [[Template:Location/ExternalLink]] should be made protocol relative at some point
[02:08:30] would that affect all images that use any of those location templates?
[02:08:41] ok it's 2.8M images
[02:09:59] it took 33s just to count them, I wonder how long it will take to do the partitioning job
[02:10:17] maybe that is the problem, maybe edits to these templates typically fail and give errors
[02:14:50] assuming partitioning is possible, it would be partitioned into batches of 500, so 5567 batches
[02:17:45] not sure what can be protocol relative there that isn't already
[02:20:16] how about I just make a trivial change
[02:20:20] Basically all google links, although /mars and /moon have some problems with scripts (Still better than sending to http, IMHO)
[02:23:49] TimStarling: Ive seen a lot of pollution/incorrect data in the different link tables
[02:24:22] !log started socat for /var/log/mw/fatal.log on fenari
[02:24:31] Logged the message, Master
[02:24:57] I'm making a trivial change, changing those google.com links to protocol-relative might imply that I think that's a good idea
[02:25:47] grrr
[02:26:14] this was the wrong time to try this, the babysitter's going to leave in about 5 minutes
[02:26:26] !log LocalisationUpdate completed (1.20wmf5) at Wed Jun 20 02:26:26 UTC 2012
[02:26:28] oh well, there's always logs
[02:26:29] Editing a really high-use template usually results in an error message on page save.
[02:26:31] Logged the message, Master
[02:26:37] The green and yellow one.
[02:26:39] oh, no need, it's failed already
[02:27:14] Brooke: did you file a bug report?
[02:27:26] it should be fixable
[02:27:56] I don't think so. I'm not even sure it still does it. I haven't edited a very high-use temlpate in a long time.
[02:27:57] I just got a read timeout, it should be still running
[02:28:10] there's no error in the log yet
[02:28:11] You got a read timeout from editing a template?
[02:28:27] sure, but squid times out before PHP
[02:28:48] what's interesting is whether PHP will hit an error
[02:28:50] I think that's what I used to see. I never understood why it timed out.
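The batch arithmetic in the exchange above (roughly 2.8M pages split into batches of 500) can be sketched in Python. This is only an illustration of the rounding involved, not MediaWiki's BacklinkCache partitioning code; the helper name batch_count is hypothetical, and only the batch size of 500 and the approximate page total come from the discussion.

```python
BATCH_SIZE = 500  # batch size quoted in the discussion


def batch_count(total_pages: int, batch_size: int = BATCH_SIZE) -> int:
    """Hypothetical helper: number of batches needed to cover total_pages."""
    if total_pages < 0 or batch_size <= 0:
        raise ValueError("total_pages must be >= 0 and batch_size must be > 0")
    # Integer ceiling division: a partially filled last batch still counts.
    return (total_pages + batch_size - 1) // batch_size


if __name__ == "__main__":
    # A round 2.8M pages, used here as an assumed input.
    print(batch_count(2_800_000))
```

A round 2.8M pages works out to 5600 batches, while the quoted figure of 5567 batches corresponds to a page count somewhere between 2,783,001 and 2,783,500.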
[02:28:59] anyway, gtg
[02:29:27] [20-Jun-2012 02:28:44] Fatal error: Maximum execution time of 180 seconds exceeded at /usr/local/apache/common-local/php-1.20wmf5/includes/db/DatabaseMysql.php on line 285
[02:29:33] that's me
[02:29:45] and it fails in BacklinkCache->partition, very nice
[02:35:49] TimStarling: https://bugzilla.wikimedia.org/show_bug.cgi?id=37731
[02:48:41] !log LocalisationUpdate completed (1.20wmf4) at Wed Jun 20 02:48:41 UTC 2012
[02:48:46] Logged the message, Master
[03:23:08] !log on fenari, queueing refreshLinks jobs for some 2.8M commons image description pages that use location templates
[03:23:13] Logged the message, Master
[03:32:23] If your doing it manually you might want to use [[Template:Coor URL]], which _should_ be included in every page I extract
[03:33:25] 3,168,093 transclusion
[03:42:31] it's already done with Location/ExternalLink
[04:00:45] If you're forcing a manual update then why is the job queue only at 5,490?
[04:03:04] it only needed 5600 jobs, 500 articles per job
[04:12:28] !log experimentally stopping gmond on srv258 to check for effects on oscillating appserver stats
[04:12:33] Logged the message, Master
[04:19:00] !log on srv258: started gmond
[04:19:05] Logged the message, Master
[04:20:06] !log on nickel: restarting gmetad
[04:20:11] Logged the message, Master
[04:26:48] !log on nickel: ran gmetad with -d3, it spews errors when trying to write to the faulty summary info files
[04:26:53] Logged the message, Master
[04:35:56] !log on nickel: there were data sources for both "Apaches 8 CPU" and "Application servers", these were getting the same cluster name from the remote gmonds, and so different threads in gmetad were trying to write to the same summary files. Fixed temporarily, will fix in puppet shortly
[04:36:01] Logged the message, Master
[09:25:14] !log hashar synchronized wmf-config/InitialiseSettings.php '(bug 37327) Configure chr.wikipedia site logo'
[09:25:21] Logged the message, Master
[09:40:08] !log hashar synchronized wmf-config/mobile.php '$wgMobileResourceVersion does not exist anymore'
[09:40:14] Logged the message, Master
[09:43:45] !log hashar synchronized wmf-config/InitialiseSettings.php '(bug 37457) viwikibooks can import from fr/it wikibooks'
[09:43:49] Logged the message, Master
[10:42:31] /names/
[10:42:34] grmbl
[10:42:36] names/
[10:42:56] (lacking sleep :o )
[10:44:54] go back to sleep!
[10:45:01] and enjoy a nap esby
[10:46:00] Reedy: Hi. If you have time, see this please: https://bugzilla.wikimedia.org/show_bug.cgi?id=37740 . Sorry about the short notice.
[10:46:10] hashar: unfortunally i am at work :/
[10:50:02] esby: http://www.wikihow.com/Sleep-on-the-Job ;)
[10:50:16] at my previous jobs, we had people taking a quick nap after lunch
[10:50:35] probably better than being asleep for hours during the afternoon
[10:52:49] hashar: hehe
[12:20:16] Hi hashar . Can you see https://bugzilla.wikimedia.org/show_bug.cgi?id=37740 ?
[12:22:15] aharoni: let me finish a review and I take a look at it :)
[12:32:37] hmm
[12:32:42] that would be throttle.php
[12:32:44] let me fix that
[12:35:50] aharoni: can you review https://gerrit.wikimedia.org/r/12165 ?
[12:39:53] !log hashar synchronized wmf-config/throttle.php '(bug 37740) raise account throttle for an edit marathon'
[12:39:58] Logged the message, Master
[12:39:58] hashar: thank you.
[12:40:36] !log hashar synchronized wmf-config/InitialiseSettings.php 'touching InitialiseSettings.php to refresh cache'
[12:40:41] Logged the message, Master
[12:40:49] aharoni: should be fine
[12:41:02] aharoni: limit raised on any wiki
[12:41:13] aharoni: please ping me if there is any trouble
[12:41:43] OK
[13:19:32] !log Created EducationProgram database tables on enwiki
[13:19:38] Logged the message, Master
[13:35:54] Reedy: can you ping me when you are available ? :-]
[13:37:46] !log reedy synchronized php-1.20wmf5/extensions/EducationProgram/ 'Push out master EP'
[13:37:51] Logged the message, Master
[13:41:54] !log reedy synchronized php-1.20wmf5/extensions/EducationProgram/ 'Push out master EP'
[13:41:59] Logged the message, Master
[13:45:00] !log reedy synchronized wmf-config/InitialiseSettings.php 'Enable EducationProgram on enwiki *gulp*'
[13:45:06] Logged the message, Master
[13:48:20] Reedy: PHP fatal error in /usr/local/apache/common-local/php-1.20wmf5/extensions/EducationProgram/includes/EPRoleObject.php line 253:
[13:48:20] Call to undefined method EPCourses::getCoursesForUser()
[13:48:26] https://en.wikipedia.org/wiki/Special:ManageCourses
[13:48:53] Yup
[13:48:59] just told Jeroe about that in -dev
[13:49:23] lol
[13:49:54] It's not breaking anything but itself currently
[13:50:36] ah
[13:50:41] it's a plural problem
[13:50:42] s
[14:00:20] !log reedy synchronized php-1.20wmf5/extensions/EducationProgram/
[14:00:25] Logged the message, Master
[14:04:20] I think that's done at least
[14:38:47] Reedy: you there ?
[14:39:02] Reedy: you got live hacks in wmf-config
[14:39:08] no idea if they should be public
[14:46:20] hashar: not live hack
[14:46:23] just uncommitted
[14:46:43] CommonSettings was tim, but that's on noc, so it's public
[14:46:47] let me tidy up
[14:47:45] danke
[14:48:01] also we might want to update TrustedXFF extension on live cluster (wmf5 ? ) https://gerrit.wikimedia.org/r/#/c/11270/
[14:48:12] Tim updated the list of IP address for Opera Mini
[14:48:52] I don't want to disturb the wmf branches :-]
[14:51:07] our IP address has been blocked on all wikis.
[14:51:08] I just had to git pull to commit... just an fyi
[14:51:09] The block was made by Shizhao (meta.wikimedia.org). The reason given is Open proxy.
[14:51:11] Start of block: 08:58, 30 March 2012
[14:51:12] Expiry of block: 08:58, 30 March 2013
[14:51:14] You can contact Shizhao to discuss the block. You cannot use the "E-mail this user" feature unless a valid e-mail address is specified in your account preferences and you have not been blocked from using it. Your current IP address is 220.255.2.115. Please include all above details in any queries you make.
[14:51:51] Is it an open proxy? If so, why?
[14:54:13] Hildanknight, anyway #wikimedia-stewards is the right channel
[14:55:25] I'm from Singapore, not China, and am using wireless provided by my ISP.
[14:55:43] Hildanknight: I use HTTPS to connect
[14:56:21] Hildanknight: you are also hardblocked on enwiki due to a CU block
[14:56:47] hashar: if we want to do TorBlock, be nice to pull in https://gerrit.wikimedia.org/r/#/c/11215/ too
[14:58:29] http://en.wikipedia.org/w/index.php?title=Special:Log/block&page=User%3A220.255.2.115 --> No matching items in log?
[15:00:01] "use of specified attribute in attributes is deprecated. It always returns true" Any idea what is it talking about?
[15:00:39] Hildanknight, http://meta.wikimedia.org/w/index.php?title=Special:Log&page=User%3A220.255.2.115
[15:07:29] I am out for now
[15:07:41] I will not be connected this evening, will write documentation
[15:07:46] Reedy: I am out for now :/
[15:07:55] feel free to update TorBlock too
[15:07:57] oook
[15:08:03] sounds like something we want indeed
[15:09:55] Aaron|home: Notice: FSFileBackend::doStoreInternal: copy() failed but returned true. in /usr/local/apache/common-local/php-1.20wmf5/includes/filerepo/backend/FSFileBackend.php on line 208
[15:18:46] anyone with an access to our gerrit's etc/gerrit.config? need [commitlinks] section
[15:27:32] got it
[15:58:14] what, torblock made functioning again?
[16:00:18] no idea
[16:01:06] Reedy: can you take a look at http://en.wikipedia.org/wiki/Wikipedia%3AVillage_pump_%28technical%29#Wikipedia_Page_Not_Showing_Up_in_Google_or_Bing_Search
[16:01:19] I have no idea why that page is getting noindex'ed
[16:02:30] Nemo_bis, it's working again
[16:02:47] Platonides, uh-oh, couldn't believe it
[16:03:01] actually, we didn't fix it :P
[16:03:02] Platonides, what about Tim's idea of running our own service to list nodes?
[16:03:29] Platonides, ah, so we're still in the state described in the last commen to the bug
[16:04:23] it failed because the web went down
[16:04:32] the service was restored
[16:04:44] and our extension started working again
[16:39:43] Platonides, yes, but do we at least have a way to notice it's broken next time?
[17:08:49] !log preilly synchronized wmf-config/mobile.php 'add telenor'
[17:08:54] Logged the message, Master
[17:11:54] !log preilly synchronized wmf-config/mobile.php 'add Grameenphone Bangladesh'
[17:11:59] Logged the message, Master
[17:42:56] !log preilly synchronized docroot/bits
[17:43:01] Logged the message, Master
[17:58:54] !log preilly synchronized php-1.20wmf5/extensions/MobileFrontend 'testing with disable caching off'
[17:58:59] Logged the message, Master
[17:59:23] !log preilly synchronized php-1.20wmf4/extensions/MobileFrontend 'testing with disable caching off'
[17:59:28] Logged the message, Master
[18:02:36] !log preilly synchronized php-1.20wmf5/extensions/MobileFrontend 'test with disable caching on'
[18:02:41] Logged the message, Master
[18:03:01] !log preilly synchronized php-1.20wmf4/extensions/MobileFrontend 'test with disable caching on'
[18:03:06] Logged the message, Master
[18:08:11] Reedy: ready to deploy to 1.20wmf5 to *.wikipedia.org?
[18:10:30] Can do
[18:16:28] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Rest of pedias to 1.20wmf5
[18:16:33] Logged the message, Master
[18:17:15] lies
[18:17:25] AaronSchulz: /usr/local/bin/sync-wikiversions: line 24: sudo -u mwdeploy rsync -l 10.0.5.8::common/wikiversions.{dat,cdb} /usr/local/apache/common-local/: No such file or directory
[18:17:38] :O
[18:17:54] !log reedy synchronized wikiversions.cdb
[18:18:00] Logged the message, Master
[18:18:18] !log reedy synchronized wikiversions.dat
[18:18:23] Logged the message, Master
[18:21:09] nothing new showing
[18:21:54] most of the errors pointing at wmf4
[18:22:11] what's still on wmf4?
[18:22:34] nothing
[18:22:44] or shouldn't be
[18:22:49] just not enough other errors to clear them off
[18:23:29] all gone now
[18:24:16] * robla assumes AaronSchulz is quietly fixing the sync-wikiversions problem rather than completely ignoring Reedy ;-)
[18:24:23] heh
[18:24:38] I'll point php to php-1.20wmf5 in an hour or 2 if there's no issues
[18:24:51] Reedy: what still uses /php?
[18:25:17] stuff
[18:25:28] and things
[18:25:42] Feel free to remove it if you want to see what breaks ;)
[18:31:41] !log aaron rebuilt wikiversions.cdb and synchronized wikiversions files:
[18:31:46] Logged the message, Master
[18:42:09] does this wiki look like this on purpose? http://chy.wikipedia.org/wiki/Va%27ohtama
[18:42:40] (where "this way" is "looks like there's no stylesheet applied whatsoever"
[18:42:56] robla: i see the css
[18:42:56] It doesn't look excessively broken to me
[18:42:56] nevermind....shift reload fixed it for me
[18:43:03] Although the layout of the main page is a bit weird, granted
[18:44:06] yeah, I probably loaded them too quickly
[18:45:08] I had been manually clicking each one, and I finally did the cursory search for an extension to "open all links in a selection", which of course there are several
[18:45:21] (of course, also of varying quality)
[20:21:08] wiki slow
[20:21:21] nl
[20:21:37] +api
[20:24:14] probably related to the apache spam in -operations
[20:27:00] Hmmm... probablyu already known, but just got an error on WP trying to save a page:
[20:27:02] Request: POST http://en.wikipedia.org/w/index.php?title=User:Careymur/Mazie_Hirono_and_her_work_with_NOW&action=submit, from 10.64.0.125 via cp1014.eqiad.wmnet (squid/2.7.STABLE9) to 10.2.1.1 (10.2.1.1)
[20:27:02] Error: ERR_READ_TIMEOUT, errno [No Error] at Wed, 20 Jun 2012 20:26:33 GMT
[20:27:34] Also, Twinkle timed out - "Tagging page: Failed to save edit: error "Gateway Time-out" occurred while contacting the API."
[20:28:41] Working now.
[20:29:36] still very very slow here
[20:30:06] Same
[20:30:37] In fact, I'm about to error out again I bet... it started working like normal as soon as I pasted the error. xD
[20:30:56] same
[20:31:31] i dont use https and still it is slow... :S
[20:32:14] some apaches seem to be having problems
[20:32:48] it times out on both http and https
[20:34:43] s1 master load is dropping
[20:35:07] probably not the good kind of dropping ;)
[20:35:45] . o O (losing editors?)
[20:36:01] AaronSchulz: so it's dropping the bass then?
[20:36:32] well, when apaches go out driking, the db can get some holidays, too
[20:38:49] * AndrewN drops the bass
[20:41:05] And she times out again...
[20:47:25] Someone just reported a 503 on http://pt.m.wikipedia.org/w/index.php?title=Wikip%C3%A9dia:P%C3%A1gina_principal&mobileaction=toggle_view_mobile
[20:47:28] I'm getting that as well
[20:48:11] investigating
[20:48:25] some apache servers are having issues
[20:48:37] given that such url is probably not cached...
[20:48:44] TBloemink: also here
[20:50:51] Wed Jun 20 20:50:41 UTC 2012 mw55 enwiki Job::pop_type 10.0.6.48 1205 Lock wait timeout exceeded; try restarting transaction (10.0.6.48) SELECT * FROM `job` WHERE job_cmd = 'enotifNotify' LIMIT 1 FOR UPDATE
[20:50:54] binasher: spamming
[20:51:23] I know tim started some refreshLinks jobs last night
[20:52:40] not sure why that would cause problems now though, could just be irrelevant noise
[20:59:57] maybe some mails could not be sent and there is tons of enotifNotify jobs?
[21:00:22] I also did some changes to the job system
[21:00:42] my 2cents. I am heading bed for now :/
[21:00:44] there's now a complaint about slowness on enwiki http://en.wikipedia.org/w/index.php?title=Wikipedia:Village_pump_(technical)&curid=3252662&diff=498556338&oldid=498544315
[21:02:55] Hello there, I would need some help please.
[21:03:47] Jasper_Deng: We are aware of those issues and they are being investigated
[21:03:50] Bharel: If you're reporting a Wikimedia foundation site down, we know, and I'm fairly certain that the techy people are working on it.
[21:04:03] (in case anyone here wanted to comment on-wiki)
[21:04:33] !log lcarr synchronized wmf-config/mc.php 'removed broken srv268'
[21:04:38] Logged the message, Master
[21:05:00] hey
[21:05:05] Wooha that was fast. That's exactly what I wanted to report. Since you're already aware of it, there's no need :-)
[21:05:14] woo, we may have fixed it
[21:05:23] how do things look (hard refreshes please) to people ?
[21:05:31] LeslieCarr: so one bad mc box makes the site slow?
[21:05:56] LeslieCarr: Looks back to normal (for now).
[21:06:37] Interesting... multiple entiries for the same thing in my contribs...
[21:07:10] AaronSchulz: well possibly makes all noncached parts of sites dead ....
[21:07:45] but…. this is a good time to call out for anyone interested in making memcached not fragile :)
[21:07:48] we need your help
[21:07:53] only if the connection is being retried with x timeouts for each operation
[21:08:58] LeslieCarr: tim has done the MW side of stuff
[21:09:20] using php-memcache or php-memcached (one of the 2)
[21:09:50] ie MemcachedPeclBagOStuff.php
[21:12:29] http://pecl.php.net/package/memcache http://pecl.php.net/package/memcached
[21:12:29] srsly
[21:13:13] ala https://gerrit.wikimedia.org/r/#/c/7349/
[21:14:49] God these templates on enwiki, https://en.wikipedia.org/wiki/Template:Jct ~7 levels deep and I get to those database like calls
[21:16:28] Dispenser, then look at http://en.wikipedia.org/wiki/List_of_tz_database_time_zones
[21:17:05] the data comes from a db
[21:17:15] for the page any call to {{Tz/zone.tab_cols_linked}} means 4 calls to templates which eventually end up in a switch to determine the values
[21:17:31] O(4n²)
[21:46:54] * AaronSchulz clicks cancel on downloading a kubuntu update and gets 'Executable: python2.7 PID: 18041 Signal: Segmentation fault (11)' ;)
[21:49:28] uh? :S
[21:49:38] an exception would be bad, but a segfault??
[21:52:24] !log pointed /usr/local/apache/common/php at /usr/local/apache/common/php-1.20wmf5 on mediawiki-installation
[21:52:29] Logged the message, Master
[22:23:58] gn8 folks
[23:29:56] !log kaldari synchronized php-1.20wmf5/extensions/LastModified/E3Experiments/js/ext.E3Experiments.Timestamp.js 'updating clicktracking for LastModified and E3Experiments exts'
[23:30:01] Logged the message, Master