[00:07:29] PROBLEM - Offline Content Generation - Collection on rhodium is CRITICAL: Connection refused [00:18:39] !log ori synchronized php-1.23wmf5/extensions/PagedTiffHandler 'Update PagedTiffHandler to master for I52fe2ec25 / bug 57359' [00:18:54] Logged the message, Master [00:20:58] the exception log is full [00:21:19] Invalid IP given in XFF .. [00:21:22] is there a bug for that? [00:21:54] not that I know of [00:22:06] !log ori synchronized php-1.23wmf4/extensions/PagedTiffHandler 'Update PagedTiffHandler to master for I52fe2ec25 / bug 57359' [00:22:11] https://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&title=MediaWiki+errors&vl=errors+%2F+sec&x=0.5&n=&hreg[]=vanadium.eqiad.wmnet&mreg[]=fatal|exception&gtype=stack&glegend=show&aggregate=1&embed=1 [00:22:18] Logged the message, Master [00:22:37] ori-l: the tif no longer times out now [00:22:44] yay! [00:23:02] the tortilla hat :) [00:23:23] heheh [00:23:47] the exceptions are all for favico [00:23:50] favicon.ico [00:25:15] Reedy: https://gerrit.wikimedia.org/r/#/c/97149/ [00:25:35] trivial enough to self merge, but if someone else wants to.... :) [00:25:48] * aude waits for jenkins [00:26:36] ori-l: Do you have a minute to talk me through getting a local branch that tracks an upstream branch in core? [00:27:01] (my ultimate goal -- to commit get an extension version part of that branch) [00:27:02] andrewbogott: git branch --track wmf/1.23wmf5 origin/wmf/1.23wmf5 [00:27:21] where the first arg is my local branch? [00:27:35] yup [00:27:45] Should the local branch exist already, or does that create it? [00:28:08] (and, is there really a wmf5 already? Or is wmf4 the latest?) 
[00:28:15] there's a wmf5 [00:28:25] (my ultimate goal -- to commit get an extension version part of that branch) [00:28:27] That was ^R and typing --track [00:28:29] i didn't understand that [00:29:09] what's this favicon nonsense [00:29:17] ori-l: Ryan tells me that when I commit a patch to an extension I should make a corresponding patch in core so that 'submodule update' in core doesn't clobber the extension version [00:29:21] is that crazy talk? [00:29:59] ori-l: We use a php script to serv favicons [00:30:01] no, he's right [00:30:41] a production branch of mediawiki consists of mediawiki core and a set of submodules representing all the extensions that are currently deployed [00:30:49] * andrewbogott brushes cobwebs away from submodule knowledge [00:31:15] given in XFF 'unknown, 10.64.0.126' [00:31:17] lol. [00:32:17] ori-l: ok, so I feel like my checkout of core should have reference to a bunch of extensions, and I need to 'git submodule init' to get the submodule actually checked out... [00:32:43] if it's a production branch, yes; 'vanilla' core doesn't have submodules [00:33:07] oh, hah, I'm on the wrong branch [00:33:30] now it works just like I expected :) [00:33:44] :) [00:33:46] it's a bit tricky [00:34:20] !log Manually fixed img_metadata for File:Zentralbibliothek_Zürich_-_Heinrich_Bullingers_Westerhemd_-_000012135.tif by forcing /bin/identify usage [00:34:23] By the way, I note that you use marching branches rather than tagging a single main branch... [00:34:35] Logged the message, Master [00:34:44] Are new branches always derived from the head of the former one? So that e.g. wmf5 is a branch of wmf4 which is a branch from wmf3, etc. etc.? 
[00:34:50] Nope [00:34:55] From master and diverge from there [00:35:07] by our very own Reedy [00:35:10] ok, so if I want my submodule patch to actually matter I need to make it on master [00:35:12] and on a branch [00:35:15] but in theory everything in an earlier branch should be in a newer one [00:35:17] no [00:35:23] ... [00:35:45] no, which? [00:36:01] when a new branch is cut a set of php scripts populate the set submodules [00:36:04] the set of submodules [00:36:13] it's not something we like to talk about in public [00:36:35] like, if you were giving a presentation about our use of git, this wouldn't go on slide #1 [00:37:15] :) [00:37:23] So how does a new branch decide what version of to track? [00:37:37] Using the submodule commit in the branch of core [00:37:53] in… what branch of core? [00:37:54] it's always master branch, except in certain excptions [00:37:57] exceptions* [00:38:02] master of the extensions [00:38:34] whatever was master at the time the core branch was made [00:38:43] That suggests that, indeed, I should make my change to master (so that it gets picked up by future branches) and to current branch (so it gets picked up now.) [00:38:52] And yet ori-l tells me… 'no' [00:39:04] oh, you meant the master of the extension [00:39:09] if so, then yes, you are correct [00:39:13] No! [00:39:15] * andrewbogott scowls [00:39:26] OK, so, I misread. [00:39:37] You're saying that new branches just get HEAD of all the extensions. [00:39:46] Which seems optimistic, but makes my life easy :) [00:39:55] Not all of them, but most of them yes [00:40:21] So anything that I commit to OpenStackManager will eventually land on the wmf branches. Just not in the current one. 
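Editorial note: the branch-cutting behaviour described in this exchange can be pictured with a toy model. This is only a sketch under assumptions: `cut_branch` and the commit ids are hypothetical, and the real process (a set of PHP scripts, per the log) is not shown here. It assumes each new core branch simply pins every extension at whatever its master pointed to when the branch was cut, which the log says holds for most extensions but not all.

```python
from typing import Dict


def cut_branch(extension_master: Dict[str, str]) -> Dict[str, str]:
    """Snapshot the current master commit of every extension.

    The returned mapping is a copy, so later commits to an extension's
    master do not change a branch that was already cut.
    """
    return dict(extension_master)


# Hypothetical commit ids, for illustration only.
extension_master = {"OpenStackManager": "aaa111"}

wmf5 = cut_branch(extension_master)

# A patch is merged to the extension's master after wmf5 was cut.
extension_master["OpenStackManager"] = "bbb222"

wmf6 = cut_branch(extension_master)

print(wmf5["OpenStackManager"])  # still the older commit
print(wmf6["OpenStackManager"])  # picks up the new commit
```

In this model a patch merged to the extension's master reaches the next branch automatically, while an already-cut branch needs its own submodule update to see it.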
[00:40:52] It will be included in future branches [00:41:08] ori-l: heh, it's scary that parses, gallery views, and such can trigger loads of file metadata updates (say if all jpg files had a metadata version change) [00:42:20] Reedy, ori-l, aude, so in short, all I need to do is this, and I'm done: https://gerrit.wikimedia.org/r/#/c/97158/ [00:42:29] (not sure why I am asking about this in ops) [00:43:15] AaronSchulz: eeeeeep [00:44:13] andrewbogott: yes,  [00:44:25] ok then! [00:44:31] Thank you all! [00:44:41] uh, the merge to wmf5 won't go out unless you deploy it [00:44:41] marktraceur: ResourceLoaderFileModule::readStyleFile: style file not found: "/usr/local/apache/common-local/php-1.23wmf4/extensions/UploadWizard/resources/ext.uploadWizard.uploadCampaign.list.css" [00:44:48] * aude always reads "OSM" = openstreetmap :) [00:45:07] wmf5 is already out there [00:45:20] greg-g: That extension is only used on Labs' own weird private system which is not currently attached to the deployment system. [00:45:25] and if you merge something to wmf5 without deploying, you're saying "hey next person, you manage the fall out from this breakage" [00:45:27] So I'll just check out by hand when the time comes. [00:45:43] andrewbogott: then merge to master, not to wmf5? [00:46:20] ori-l: That sounds like a YuviProblem, but I'll take a look [00:46:23] um… I feel like we just established that 'merge to master' is meaningless since new branches automatically get... [00:46:38] ori-l: https://bugzilla.wikimedia.org/buglist.cgi?title=Special%3ASearch&quicksearch=upgradeRow&list_id=252640 [00:46:45] Wait, are you guys deploying something? :/ [00:46:49] greg-g: Other than that 'merge to master' thing I don't disagree with what you're saying though. [00:47:01] !log reedy synchronized php-1.23wmf4/extensions/MobileFrontend/ 'Fix history fatal' [00:47:14] andrewbogott: is this for wikitech wiki? 
[00:47:15] Logged the message, Master [00:47:22] aude: yes [00:47:34] unless something is odd, it's on 1.23wmf3 https://wikitech.wikimedia.org/wiki/Special:Version [00:47:38] It is currently on wmf/1.23wmf3 [00:47:42] not in sync with the rest of stuff [00:47:47] since it's separate [00:47:58] My plan was to get my OSM patch into the latest, then to update wikitech to the 'latest branch' which I take to be 1.23wmf5 [00:48:00] !log reedy synchronized php-1.23wmf5/extensions/MobileFrontend/ 'Fix history fatal' [00:48:12] * aude nods [00:48:13] gotcha, I came in late [00:48:14] Logged the message, Master [00:48:17] don't listen to me :) [00:48:41] !log reedy synchronized php-1.23wmf5/extensions/Wikibase/client/ 'Fix InfoAction fatal' [00:48:56] Logged the message, Master [00:49:02] greg-g, if trying to insert things into branches causes distress then I can just do nothing and wait for my extension to get updated in the next branch. [00:49:02] ori-l: Yuvi fucked up a filename in a module declaration, fixing now [00:49:07] Not ideal, but acceptable. [00:49:49] (someday soon wikitech will follow standard deployment scheme, at which point that's how it'll work anyway) [00:49:53] I guess I don't undersatnd why it needs to be merged to 1.23wmf5, can't it be a 1.23wikitechX or osmething since it's specific to that? [00:50:00] whatever [00:50:53] I have… no real opinion, just trying to do what Ryan most recently suggested. 
[00:53:11] gotcha [00:53:15] andrewbogott: thanks for taking care of it [00:53:21] andrewbogott: sorry for being a johnny come lately [00:53:28] (PS1) Reedy: Disable OAuth on fishbowl wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97166 [00:53:49] https://upload.wikimedia.org/wikipedia/commons/thumb/d/d5/Vladimir_Putin%27s_press_conference_on_2012-12-20.ogv/320px--Vladimir_Putin%27s_press_conference_on_2012-12-20.ogv.jpg [00:53:54] great looking thumbnail :p [00:53:56] (CR) CSteipp: [C: 1] Disable OAuth on fishbowl wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97166 (owner: Reedy) [00:54:40] (CR) Reedy: [C: 2] Disable OAuth on fishbowl wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97166 (owner: Reedy) [00:54:59] marktraceur: it's low-impact, get it reviewed and plan to deploy it next week [00:55:26] ori-l: Aye sah [00:57:00] !log reedy synchronized wmf-config/InitialiseSettings.php 'Disable OAuth on fishbowl wikis' [00:57:16] Logged the message, Master [00:57:46] (CR) Reedy: [V: 2] Disable OAuth on fishbowl wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97166 (owner: Reedy) [00:57:55] the XFF thing is exclusively enwiki [01:04:16] Logged the xff favicon thing at https://bugzilla.wikimedia.org/show_bug.cgi?id=57467 [01:06:04] thanks, I can't focus any more, heading off [01:06:08] thanks for filing that [01:06:28] the unknown should be another IP, right? 
[01:06:47] no idea [01:07:11] i think the zero folks were doing something XFF related [01:07:19] dr0ptp4kt yurik ^ [01:07:23] i'll cc them on the bug [01:07:41] cool [02:14:03] !log LocalisationUpdate completed (1.23wmf4) at Sat Nov 23 02:14:03 UTC 2013 [02:14:20] Logged the message, Master [02:18:02] PROBLEM - Puppet freshness on rhodium is CRITICAL: No successful Puppet run for 0d 16h 59m 39s [02:26:19] !log LocalisationUpdate completed (1.23wmf5) at Sat Nov 23 02:26:18 UTC 2013 [02:26:32] Logged the message, Master [03:12:24] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Nov 23 03:12:24 UTC 2013 [03:12:40] Logged the message, Master [03:17:45] (CR) Spage: "If we're disabling VE for Flow on every wiki, then doing it in CommonSettings.php seems clearer." (1 comment) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/96161 (owner: EBernhardson) [04:00:40] (PS3) Spage: Enable Flow discussions on a few test wiki pages [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/94106 [04:11:55] (CR) Spage: [C: -1] "Do not +2 until we have our window." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/94106 (owner: Spage) [04:23:07] PROBLEM - etherpad_lite_process_running on zirconium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^node node_modules/ep_etherpad-lite/node/server.js [04:27:18] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [04:33:27] PROBLEM - MySQL InnoDB on db1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [04:33:28] PROBLEM - MySQL Idle Transactions on db1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [04:33:28] PROBLEM - MySQL Recent Restart on db1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [04:33:28] PROBLEM - MySQL Slave Delay on db1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[04:34:28] RECOVERY - MySQL Recent Restart on db1021 is OK: OK 4040006 seconds since restart [04:34:28] RECOVERY - MySQL Slave Delay on db1021 is OK: OK replication delay 35 seconds [04:35:17] RECOVERY - MySQL InnoDB on db1021 is OK: OK longest blocking idle transaction sleeps for 0 seconds [04:35:17] RECOVERY - MySQL Idle Transactions on db1021 is OK: OK longest blocking idle transaction sleeps for 0 seconds [04:46:18] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (202733) [04:54:18] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [04:57:18] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (210185) [05:01:07] RECOVERY - etherpad_lite_process_running on zirconium is OK: PROCS OK: 1 process with regex args ^node node_modules/ep_etherpad-lite/node/server.js [05:02:18] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [05:03:37] etherpad.wikimedia.org was 503, but came back to life [05:05:26] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (200408) [05:06:17] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [05:10:26] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (204281) [05:12:17] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [05:15:07] not a fun night, eh [05:18:06] PROBLEM - Puppet freshness on rhodium is CRITICAL: No successful Puppet run for 0d 3h 0m 4s [05:18:26] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (203759) [05:21:26] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 
200,000 [05:21:40] job_queue flapping, fun times [05:32:26] PROBLEM - MySQL InnoDB on db1056 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [05:32:56] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000 [05:33:16] RECOVERY - MySQL InnoDB on db1056 is OK: OK longest blocking idle transaction sleeps for 0 seconds [05:33:26] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (203633) [05:36:18] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [05:41:27] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (200627) [05:42:18] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [06:10:16] (PS1) Faidon Liambotis: Add Twitter account to Varnish's error page [operations/puppet] - https://gerrit.wikimedia.org/r/97190 [06:12:11] hrmmmmmm [06:12:53] paravoid: does that actually have anything relevant to outages? [06:13:10] I think it does, on large outages [06:13:26] ok. well guillom would know :) [06:13:40] at first i assumed you were adding @wikimediatech [06:13:57] @wikimediatech needs some work to be more reliable than it is now [06:15:27] case in point! no tweets for ~9 days [06:16:30] Erik proposed that one too, dunno [06:16:36] it's too technical/terse [06:16:42] ori-l: you mentioned something about a rewrite? (was done but not converted yet?) [06:17:16] paravoid: watchmouse maybe? [06:17:19] rewrite of what? 
[06:17:26] ori-l: morebots [06:17:37] I did not [06:17:38] i keep insisting i won't do it [06:17:41] I was evaluating fedmsg [06:18:08] ori-l: i was thinking you had said something was already written [06:18:19] paravoid: right, i addressed ori-l :) [06:18:28] oh sorry [06:18:33] fedmsg looks good [06:18:45] paravoid: i was saying instead of twitter maybe use watchmouse (status.wm.o) [06:19:07] ori-l: it looks shiny, it isn't as much [06:19:10] I did evaluate it [06:20:15] very undocumented, things such as hardcoded environments from fedora's infrastructure in the python source ('prod', 'stg', 'dev'), terrible Debian package [06:20:33] nothing unfixable, but it was more than an apt-get and a little fiddling away and I didn't find the time [06:21:07] speaking of undocumented, i was finally curious enough about systemd to try and grok it [06:21:12] the docs are really bad [06:21:28] upstart is exemplary in this respect [06:21:40] like, just look at this: [06:21:41] http://www.freedesktop.org/wiki/Software/systemd/ [06:21:50] the first paragraph is presumably a joke, except it runs on for wayyy to long [06:22:04] and then you have to scroll fifty pages to get to the actual documentation [06:22:33] err, the spelling paragraph, i mean [06:23:08] you know Debian had multiple flamewars on whether it should switch to upstart or systemd, right? [06:24:11] only what you told me, but yeah [06:24:22] oh I did [06:24:25] I'm forgetful, sorry :) [06:25:18] I read a tiny bit of one of those threads [06:25:24] then gave it up as a bad job [06:27:19] ori-l: holtWintersForecast(diffSeries(stats.job-insert.count,stats.job-pop.count)) [06:27:44] see where I'm going with this? [06:28:25] clever [06:28:42] check_graphite [06:29:14] I want to alert when the trend is upwards for a while [06:29:23] well, just holtWintersForecast(stats.job-insert.count) should do [06:29:41] why? 
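Editorial note: the idea behind the graphite expression above, watching the difference between jobs inserted and jobs popped rather than the insert rate alone, can be sketched as follows. This is a minimal illustration under assumptions, not the check that was deployed: it swaps Holt-Winters forecasting for a plain "grew in each of the last N intervals" test, and the function names and sample numbers are hypothetical.

```python
from typing import List, Sequence


def net_growth(inserted: Sequence[int], popped: Sequence[int]) -> List[int]:
    """Per-interval backlog change: jobs inserted minus jobs popped."""
    if len(inserted) != len(popped):
        raise ValueError("series must have the same length")
    return [i - p for i, p in zip(inserted, popped)]


def sustained_growth(deltas: Sequence[int], window: int) -> bool:
    """True when the backlog grew in each of the last `window` intervals."""
    if window <= 0 or len(deltas) < window:
        return False
    return all(d > 0 for d in deltas[-window:])


# Hypothetical per-interval counts, not taken from the log.
busy_but_healthy = net_growth([5000, 5000, 5000], [5000, 5000, 5000])
falling_behind = net_growth([900, 1200, 1500, 1500, 1600],
                            [950, 1250, 1400, 1350, 1300])

print(sustained_growth(busy_but_healthy, window=3))
print(sustained_growth(falling_behind, window=3))
```

A high insert rate that is matched by the pop rate yields zero net growth and stays quiet, whereas a backlog that keeps growing for several intervals trips the check even at modest insert rates.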
[06:30:04] we may have a large number of jobs inserted but processing them just fine [06:30:46] we may also be processing jobs quickly if they're cheap [06:31:07] hmmm [06:31:13] exactly [06:31:29] hence the difference [06:33:55] hm, I guess I'd like the diff in a larger series [06:33:56] sure, let's try it [06:34:01] than just each data point [06:43:31] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [06:44:50] hm [06:45:01] I broke the ganglia job queue graph [06:45:06] when I removed the check from hume [06:45:21] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical [06:58:39] (PS3) Ori.livneh: rewrite nginx module [operations/puppet] - https://gerrit.wikimedia.org/r/96961 [07:00:29] (CR) Ori.livneh: "> it's preferable to manage notifies on the ssl servers" [operations/puppet] - https://gerrit.wikimedia.org/r/96961 (owner: Ori.livneh) [07:03:31] (CR) Ori.livneh: "Also: I was able to test the module in general and the notify control in particular on Vagrant. Still a bit anxious about weird conflicts " [operations/puppet] - https://gerrit.wikimedia.org/r/96961 (owner: Ori.livneh) [07:32:51] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000 [08:18:08] PROBLEM - Puppet freshness on rhodium is CRITICAL: No successful Puppet run for 0d 6h 0m 6s [08:54:29] jeremyb: @wikimedia does have updates about outages; when I happen to be around on IRC and either I see the outage or someone thinks of pinging me :) [09:04:51] we need another you to cover the non-you timezones [09:05:04] or to cover sick-you [09:05:25] or we could just steal the TARDIS [09:05:32] and opensource it! [09:05:44] thanks for volunteering [09:05:55] but if we steal it isn't that good enough? you only need one... 
[09:06:22] brb [09:06:50] well we are a bit against using closed source tools generally [09:32:50] (PS1) Odder: Add an alias for NS_PROJECT on Bengali Wikisource [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97206 [09:33:30] apergos: it's enough to !log and add a #Wikimedia #Wikipedia hashtag [09:34:01] sadly twitter doesn't have a feature to allow subscription to topics, unlike the old identi.ca, but there isn't much one can do about it [09:37:03] well having someone actually translate our rather terse log entries to something more readable would be nice [09:37:17] and [09:37:25] * apergos is sad about identi.ca's demise too [09:37:40] ok, time to find breakfast-like items [09:45:36] PROBLEM - etherpad_lite_process_running on zirconium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^node node_modules/ep_etherpad-lite/node/server.js [09:49:36] RECOVERY - etherpad_lite_process_running on zirconium is OK: PROCS OK: 1 process with regex args ^node node_modules/ep_etherpad-lite/node/server.js [09:52:36] PROBLEM - etherpad_lite_process_running on zirconium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^node node_modules/ep_etherpad-lite/node/server.js [10:00:38] RECOVERY - etherpad_lite_process_running on zirconium is OK: PROCS OK: 1 process with regex args ^node node_modules/ep_etherpad-lite/node/server.js [10:13:04] !log stopped etherpadlite, removed all but the last 5000 lines of its log file, (root running out of space), restarted [10:13:19] Logged the message, Master [10:13:45] !log ...on zirconium [10:13:53] still not awake [10:13:59] Logged the message, Master [10:23:40] apergos, euro staff slowly turns into night creatures. 
mornings aren't for us (even if they're in afternoon):P [10:24:51] actually it would be fine if I hadn't stayed up late last night in The Matrix [10:24:53] (of rfps) [10:25:52] also haven't had morning tea/hot chocolate/whatever yet [10:26:38] did wake up in time to look in on neon, it's a happy camper, which makes me happy :-) [11:18:51] PROBLEM - Puppet freshness on rhodium is CRITICAL: No successful Puppet run for 0d 9h 0m 49s [14:04:07] !log testing 5.5.34 mariadb.org debs on db60 [14:04:21] Logged the message, Master [14:11:15] oohhh [14:15:07] apergos: found a way to pin stuff to keep puppet happy [14:15:26] oh excellent [14:15:34] so can at least test without messing around [14:15:37] yep [14:15:45] care to share? :-) [14:17:41] set http_proxy to brewster. added upstream mariadb.org repo. pinned that version as preferred. installed. removed everything.. and puppet thinks all is still well [14:19:45] PROBLEM - Puppet freshness on rhodium is CRITICAL: No successful Puppet run for 0d 12h 1m 44s [14:20:09] apergos: First you need a REALLY big hammer... [14:20:44] hold still reedy I think I've got one right there, and funny how everything does look like a nail... :-P [14:21:29] The following packages will be REMOVED [14:21:29] libmariadbclient18 mariadb-client-5.5 mariadb-client-core-5.5 mariadb-server mariadb-server-5.5 mariadb-server-core-5.5 [14:21:29] The following packages will be upgraded: [14:21:29] libmysqlclient18 mysql-common [14:21:32] gj ubuntu [14:22:28] Ah. mariadb source was commented out from saucy upgrade [14:22:36] I can't remember if I had to pin it too :/ [14:22:50] seemingly not :) [14:24:27] that reminds me, time to see how the f20 roadmap is coming along [14:25:25] still on track [15:35:40] (CR) Dzahn: [C: -1] "didn't you mean https://twitter.com/wikimediatech instead of https://twitter.com/wikimedia ? 
that's the more technical feed that includes" [operations/puppet] - https://gerrit.wikimedia.org/r/97190 (owner: Faidon Liambotis) [15:36:23] (CR) Dzahn: "didn't you mean https://twitter.com/wikimediatech instead of https://twitter.com/wikimedia ? that's the more technical feed that includes" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/97190 (owner: Faidon Liambotis) [15:41:33] (CR) Dzahn: "besides, unfortunately it seems it stopped logging on Nov 14 ? is a bot script down again?" [operations/puppet] - https://gerrit.wikimedia.org/r/97190 (owner: Faidon Liambotis) [16:24:34] mutante: yes, the twitter doesn't work on the grid, (at least AFAICT) only tools-login. but the twitter api library package is installed... haven't gotten around to investigating more [16:24:43] mutante: also see above about choice of twitter account [16:26:39] guillom: sure, but i had just gone back several days looking for relevant tweets and saw nothing. maybe it wasn't a representative sample or there wasn't a big enough outage. but if e.g. only 2 in 5 outages make it to the feed then I don't think we want to be linking to it that way. (no idea if that's an accurate value, essentially just made up a number) [16:26:57] jeremyb: thanks for looking. so did people want to have just "wikimedia" on it instead of "wikimediatech".. on the error page? [16:27:12] jeremyb: could put "wikimediatwork" too :o [16:28:29] i see logs now.. so you noticed it was broken too and that's why.. nod [16:29:25] can we just fix the SAL/tech one? [16:30:01] or can they all be in one account but with different hash tags and then you filter? shrug [16:31:45] (CR) Dzahn: "ok, i saw the IRC backlog about this now. so it was chosen because the tech account is currently not updating. 
still think we should fix t" [operations/puppet] - https://gerrit.wikimedia.org/r/97190 (owner: Faidon Liambotis) [16:39:54] mutante: errr, no [16:40:08] mutante: it was chosen separate from account working or not [16:40:37] mutante: 06:10:16 UTC [16:40:42] this channel [16:41:34] mutante: anyway, tech account not working right i knew (I even !log'd when i made relevant changes). didn't realize it broke again though (i had moved it to tools-login which fixes twitter) [16:42:03] not sure who originally set it up there [16:42:17] hmm.ok. oh well. i think SAL is more relevant in an outage than the latest nice images on commons [16:43:10] jeremyb: thanks! see, i didn't even realize it was moved, i usually just check SAL on wikitech but never on twitter [16:44:19] mutante: maybe someone moved it back to the grid or it happened automatically somehow. you're not supposed to run long-term stuff on tools-login... [16:44:32] (when i checked last night it was on the grid) [16:47:43] on tools-login? can't it run where the bots run? ok, i'm sure it will be figured out .. 
not worrying more about it right now [17:20:12] PROBLEM - Puppet freshness on rhodium is CRITICAL: No successful Puppet run for 0d 15h 2m 10s [19:28:11] (PS1) Dereckson: Undeploy ExpandTemplates extension [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97331 [19:43:30] PROBLEM - Puppet freshness on searchidx1001 is CRITICAL: No successful Puppet run for 1d 10h 25m 6s [19:54:51] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000 [20:07:57] jeremyb: It was examples of me not being around, or not noticing it, or not being notified, or all of the above, then :) [20:12:50] RECOVERY - Puppet freshness on searchidx1001 is OK: puppet ran at Sat Nov 23 20:12:44 UTC 2013 [20:19:11] guillom: not the outage was too minor :) [20:19:46] jeremyb: Well, since I wasn't there, I can't say that :) [20:20:20] PROBLEM - Puppet freshness on rhodium is CRITICAL: No successful Puppet run for 0d 18h 2m 18s [21:35:16] (CR) Umherirrender: "Same question as on Ifad0cad215cd0e91746dd08c3f8888c9fbe01c20:" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97331 (owner: Dereckson) [21:53:57] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000 [22:03:31] (CR) Dereckson: "As noted in bug 57484 description, this change should only deployed after I7ef63488dc3ad3885bcf99ff52852e1c6981942b has reached all wmf br" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/97331 (owner: Dereckson) [22:11:14] (PS2) Nemo bis: Add Twitter account to Varnish's error page [operations/puppet] - https://gerrit.wikimedia.org/r/97190 (owner: Faidon Liambotis) [22:33:51] PROBLEM - SSH on amslvs1 is CRITICAL: Server answer: [22:34:51] RECOVERY - SSH on amslvs1 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [23:20:21] PROBLEM - Puppet freshness on rhodium is CRITICAL: No successful Puppet run for 0d 21h 2m 19s