[00:01:06] LeslieCarr: is it reasonable to guess that the scripts you're running on ms-be* should get done by next week?
[00:01:53] i would say yes, looks like about an average of 4-5 days per machine
[00:01:55] and i'm on be4 now
[00:02:01] just started
[00:02:06] so another 8-10 days out ?
[00:02:57] great, thanks
[00:03:23] is this comment more-or-less correct? https://bugzilla.wikimedia.org/show_bug.cgi?id=34695#c9
[00:05:22] LeslieCarr: ^
[00:08:28] we're doing it on feb 10th now
[00:08:41] but other than that, yup
[00:09:50] !log upgrading gluster on labstore1-4
[00:09:52] Logged the message, Master
[00:10:28] LeslieCarr: thanks! I'm glad to hear the date you're using is a little later. That's hopefully past the point where we fixed all of the known problems
[00:10:49] I'm sending an update to PediaPress now
[00:12:01] !log upgrading gluster on all instances
[00:12:04] Logged the message, Master
[00:19:46] LeslieCarr: are you getting paged by cinga right now?
[00:22:33] I got paged
[00:22:36] for esams
[00:22:47] why is it not reporting in the channel?
[00:25:12] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:32:06] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Could not connect: 10.0.8.39:11000 (timeout)
[00:32:24] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 335 bytes in 7.669 seconds
[00:33:27] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[01:05:15] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:12:36] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 335 bytes in 4.841 seconds
[01:19:03] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Could not connect: 10.0.8.39:11000 (timeout)
[01:20:33] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[01:33:36] PROBLEM - Puppet freshness on amslvs1 is CRITICAL: Puppet has not run in the last 10 hours
[01:43:21] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 233 seconds
[01:46:57] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:49:03] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 6 seconds
[01:52:57] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 335 bytes in 0.354 seconds
[02:16:39] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Could not connect: 10.0.11.32:11000 (timeout)
[02:19:30] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[02:27:09] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:33:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 335 bytes in 8.854 seconds
[02:35:06] RECOVERY - Puppet freshness on amslvs1 is OK: puppet ran at Fri Apr 27 02:34:37 UTC 2012
[02:49:39] PROBLEM - Puppet freshness on db1004 is CRITICAL: Puppet has not run in the last 10 hours
[02:56:06] RECOVERY - Puppet freshness on gallium is OK: puppet ran at Fri Apr 27 02:56:04 UTC 2012
[03:09:36] RECOVERY - Puppet freshness on spence is OK: puppet ran at Fri Apr 27 03:09:25 UTC 2012
[03:45:16] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: Puppet has not run in the last 10 hours
[04:17:09] PROBLEM - swift-container-auditor on ms-be4 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[04:20:54] PROBLEM - LVS HTTP on upload.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:22:15] RECOVERY - LVS HTTP on upload.esams.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 656 bytes in 0.218 seconds
[04:24:24] LeslieCarr: did you see the questions about cinga pages above? (4 hrs back)
[04:24:33] nope
[04:24:35] afaik
[04:24:39] what was the question
[04:24:47] sick today so been fairly out of it
[04:25:55] yuck. /me sends some tea ;-)
[04:26:03] 27 00:19:46 < binasher> LeslieCarr: are you getting paged by cinga right now?
[04:26:06] 27 00:22:33 < Ryan_Lane> I got paged
[04:26:09] 27 00:22:36 < Ryan_Lane> for esams
[04:26:11] 27 00:22:47 < Ryan_Lane> why is it not reporting in the channel?
[04:26:18] (is UTC)
[04:26:18] oh, icinga bot must be down
[04:26:26] i haven't seen the bot in a while
[04:26:30] sigh
[04:30:03] RECOVERY - swift-container-auditor on ms-be4 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[04:35:32] !log added an account for myself on observium
[04:35:35] Logged the message, Master
[04:55:24] PROBLEM - LVS HTTP on upload.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:56:45] RECOVERY - LVS HTTP on upload.esams.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 653 bytes in 0.219 seconds
[06:48:41] New patchset: ArielGlenn; "generate list of upload directories for wikis (need for image dumps)" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/5947
[06:52:23] New review: ArielGlenn; "(no comment)" [operations/dumps] (ariel); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5947
[06:52:26] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/5947
[06:54:57] New patchset: ArielGlenn; "rsync upload dirs for public wikis, optionally by shard" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/5948
[07:16:55] PROBLEM - MySQL Replication Heartbeat on db1043 is CRITICAL: CRIT replication delay 208 seconds
[07:17:13] PROBLEM - MySQL Slave Delay on db1043 is CRITICAL: CRIT replication delay 227 seconds
[07:32:58] PROBLEM - Puppet freshness on amslvs4 is CRITICAL: Puppet has not run in the last 10 hours
[07:35:22] RECOVERY - MySQL Replication Heartbeat on db1043 is OK: OK replication delay 0 seconds
[07:35:49] RECOVERY - MySQL Slave Delay on db1043 is OK: OK replication delay 0 seconds
[07:48:34] PROBLEM - LVS HTTP on upload.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:51:25] RECOVERY - LVS HTTP on upload.esams.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 653 bytes in 0.219 seconds
[08:38:26] PROBLEM - LVS HTTP on upload.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:39:47] RECOVERY - LVS HTTP on upload.esams.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 656 bytes in 0.218 seconds
[08:56:45] New patchset: Mark Bergsma; "Add a 'sites' LVS service parameter, which determines for which sites it's active" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5953
[08:57:03] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5953
[08:57:38] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5953
[08:57:41] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5953
[08:58:32] PROBLEM - LVS HTTP on upload.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:59:53] RECOVERY - LVS HTTP on upload.esams.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 656 bytes in 0.219 seconds
[09:05:42] New patchset: Mark Bergsma; "Take payments out of esams LVS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5954
[09:06:00] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5954
[09:06:15] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5954
[09:06:18] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5954
[09:07:50] RECOVERY - Puppet freshness on amslvs2 is OK: puppet ran at Fri Apr 27 09:07:38 UTC 2012
[09:11:08] RECOVERY - Puppet freshness on amslvs4 is OK: puppet ran at Fri Apr 27 09:10:51 UTC 2012
[09:11:53] PROBLEM - swift-container-auditor on ms-be4 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[09:31:00] New patchset: Dzahn; "add a table for LXDE wikis,fix the sort arrows in HTML,move listtable config,other minor fixes" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5955
[09:31:01] New patchset: Dzahn; "move user agent and API query strings into config" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5956
[09:33:59] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Could not connect: 10.0.8.23:11000 (timeout)
[09:35:38] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[09:36:23] RECOVERY - swift-container-auditor on ms-be4 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[09:37:39] New patchset: Dzahn; "move user agent and API query strings into config, fix tabs/whitespace" [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5956
[09:37:57] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5955
[09:37:59] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5955
[09:38:37] New review: Dzahn; "(no comment)" [operations/debs/wikistats] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5956
[09:38:39] Change merged: Dzahn; [operations/debs/wikistats] (master) - https://gerrit.wikimedia.org/r/5956
[10:29:36] PROBLEM - Host srv278 is DOWN: PING CRITICAL - Packet loss = 100%
[10:30:30] RECOVERY - Host srv278 is UP: PING OK - Packet loss = 0%, RTA = 0.83 ms
[10:34:24] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused
[11:02:27] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.028 second response time
[11:12:55] New patchset: Mark Bergsma; "Update boot-time preseeding for Precise" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5960
[11:13:12] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5960
[11:13:44] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5960
[11:13:47] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5960
[11:29:05] New review: ArielGlenn; "(no comment)" [operations/dumps] (ariel); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5948
[11:29:08] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/5948
[11:30:51] New patchset: ArielGlenn; "write errors to stderr" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/5961
[12:25:38] !log fixing integration.mw testswarm and applying fixed erb template by hashar
[12:25:42] Logged the message, Master
[12:44:07] New review: Dzahn; "on gallium puppet runs and mysql keeps running now :)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5796
[12:50:39] PROBLEM - Puppet freshness on db1004 is CRITICAL: Puppet has not run in the last 10 hours
[13:33:08] New patchset: Pyoungmeister; "giving otto sudo on bayes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5965
[13:33:13] :)
[13:33:25] New review: ArielGlenn; "(no comment)" [operations/dumps] (ariel); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5961
[13:33:25] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/5961
[13:33:26] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5965
[13:33:37] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5965
[13:33:40] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5965
[13:34:33] ottomata: merged. you should have sudo in 30 minutes or less
[13:36:27] thank youuuu
[13:36:40] yep yep
[13:38:21] New patchset: Ottomata; "statistics.pp - adding subversion to base stat packages" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5966
[13:38:39] New patchset: Ottomata; "misc/statistics.pp - fixing cron timing on mediawiki git pull." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5967
[13:38:55] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5966
[13:38:56] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5967
[14:09:57] hm.
[14:10:16] I need mod_gzip for apache on stat1.wikimedia.org
[14:10:21] looking at the new apache puppet stuff
[14:10:26] i don't see it there
[14:10:28] can I add it?
[14:10:30] mark?
[14:10:42] oh it defaults :)
[14:26:02] New patchset: Ottomata; "misc/statistics.pp - fixing stats.wikimedia.org site." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5969
[14:26:18] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5969
[14:52:11] New patchset: Ottomata; "generic-definitions.pp - Modifying git::clone so that it determins suffix from $origin url instead of $title." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5970
[14:52:28] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5970
[14:53:38] ok, boy oh boy do I need some reviews!
[14:55:28] omg
[14:55:31] ocwiki is violating 'Disrupting the services by placing an undue burden on a Project website or the networks or servers connected with a Project website;'
[14:55:33] :)))
[14:55:38] shutdown!!!!
[14:56:15] apergos: udp2log works now, just dunno how to make the site send data there :|
[14:56:42] I changed all addreses in CommonSettings, hopefully it won't spam your prod log
[14:56:59] but I guess there should be some variable to enable udp logging
[14:57:47] well someone is in here now who actually knows how it works insteead of having to go dig around in the config files and the code
[14:58:15] i know a tiny tiny bit about how it works, but probably not more than apergos
[14:58:30] or what site you are trying to use to send data from
[14:58:35] * apergos looks at domas
[14:58:37] but, look reviews to do!
[14:58:39] https://gerrit.wikimedia.org/r/#q,status:open+owner:Ottomata,n,z
[14:58:58] the ones in rt.2162 topic branch are the most pressing
[15:00:01] ottomata: wikimedia labs - beta.wmflabs.org
[15:00:07] it uses same config as on prod
[15:00:14] poking time: apergos is already my friend. hmmm…notpepter
[15:00:18] there is a server which listen on 8420 udp
[15:00:27] and write everything to certain file
[15:00:33] everything works apart of that it doesn't work
[15:00:37] :P
[15:00:40] ga
[15:00:41] ha
[15:00:50] when I use netcat it writes to it
[15:00:57] but mediawiki doesn't send any logs in there
[15:01:25] ooh
[15:01:32] it started to work for some reason
[15:01:42] probably it was in cache
[15:01:51] ok nevermind
[15:04:07] New review: Mark Bergsma; "Could you make a puppet definition similar to git::clone for doing a pull?" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/5897
[15:05:09] mark, sure!
[15:05:21] what do you think about git::clone { … ensure => latest
[15:05:21] ?
[15:05:49] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5966
[15:05:50] i could do that either by setting up a cron to pull, but it might be better to just run a exec git pull every time puppet runs
[15:05:52] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5966
[15:06:15] ottomata: hmm
[15:06:48] only drawback I see is that it makes puppet report a change on every run
[15:06:56] that's true, can do cron then
[15:07:07] you are talking about a regular pull , right?
[15:07:10] if you do cron, make it silent on successful runs
[15:07:13] yeah
[15:07:14] ottomata: just to check, your sudo on bayes is wroking properly, correct?
[15:07:28] but I think I prefer it in puppet anyway
[15:07:29] notpeter: yup! thank you!
[15:07:33] ty
[15:07:38] ottomata: let's try the ensure => latest
[15:07:44] ok, i can do that either way
[15:07:51] that way there isn't an extra define for git::pull
[15:08:02] so wait, would you prefer the cron or the puppet exec?
[15:08:37] if cron, how often? should I add options to the define for that?
[15:08:52] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5969
[15:09:09] I would prefer the puppet exec
[15:09:14] ok, cool, me too
[15:09:15] how about this
[15:09:17] would be cleaner
[15:09:25] you can do it with "onlyif" with git fetch
[15:09:36] probably git fetch returns a code that says whether it fetched anything or not
[15:09:37] (assumption)
[15:09:49] then puppet only runs "git merge" or "git pull" if fetch returns that
[15:09:55] that would be cleanest
[15:10:04] ok cool, yeah
[15:10:08] * mark checks
[15:10:13] i'll see if I can check without running a fetch too
[15:10:15] might be another way
[15:10:29] New patchset: Pyoungmeister; "puppetizing ezachte's sudo on bayes for better visibility" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5971
[15:10:34] http://stackoverflow.com/questions/3258243/git-check-if-pull-needed
[15:10:35] ?
[15:10:36] reading...
[15:10:46] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5971
[15:11:09] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5971
[15:11:12] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5971
[15:13:20] git-fetch man page doesn't mention a return code, alas
[15:14:01] yeah it's always 0
[15:14:35] i think this works
[15:14:36] git diff --quiet remotes/origin/HEAD
[15:14:46] --quiet
[15:14:46] Disable all output of the program. Implies --exit-code.
[15:14:48] after a fetch then
[15:14:49] yes
[15:14:55] looks good
[15:14:58] do I need to do fetch first?
[15:15:01] of course
[15:15:27] so onlyif => "git fetch && git diff --quiet remotes/origin/HEAD"
[15:15:32] otherwise you operate on the last fetch, of the previous pull
[15:15:38] aye right
[15:15:43] and since you already merged then, there will be no changes
[15:15:51] git pull is simply git fetch && git merge
[15:17:05] right, i haven't used git without pull much, thought maybe by specifying remotes/origin it would actually ask the tracked remote rather than fetching to the um 'local' remote
[15:17:24] yes that's good
[15:17:55] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Could not connect: 10.0.11.43:11000 (timeout)
[15:17:57] or just git diff --quiet origin/master
[15:18:07] that will do without fetch?
[15:18:13] sorry, no
[15:18:16] I misread
[15:18:21] no you always need to do a fetch
[15:18:24] ok cool
[15:18:41] fetch or pull bring your local repository up to date
[15:18:56] i think we want unless => ...
[15:18:59] the difference is that one just downloads changes and updates refs, the other one also merges into your local branch
[15:19:08] depends on the exit code yeah
[15:19:11] exits with 1 if there were differences and 0 means no differences
[15:19:13] yeah
[15:19:16] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[15:19:36] logic in bash with exit codes always flips me around though, since 0 == true
[15:19:44] it is confusing
[15:19:47] fortunately puppet offers both
[15:20:46] ok will do that, do I need to amend my previous git clone commit?
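[Editor's note] The up-to-date check the two settle on above (fetch, then `git diff --quiet remotes/origin/HEAD`, where exit code 1 means a merge/pull is still needed) can be sketched as a standalone shell demo. The repository names and file contents here are throwaway illustrations, not the production setup:

```shell
#!/bin/sh
# Demo of the pull-needed check discussed above. All paths are scratch.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# An "upstream" repository with one commit.
git init -q upstream
cd upstream
git config user.email ops@example.org
git config user.name ops
echo 'include base' > site.pp
git add site.pp
git commit -q -m 'initial'
cd ..

# A local working copy, freshly cloned and therefore in sync.
git clone -q upstream local
cd local
git fetch -q origin
# --quiet implies --exit-code: 0 = no differences, 1 = differences.
if git diff --quiet remotes/origin/HEAD; then echo 'in sync'; fi

# A new commit upstream: after the next fetch the same diff exits 1,
# which is exactly when the puppet exec (guarded by unless =>) should
# run git pull.
echo 'include base, ganglia' > ../upstream/site.pp
git -C ../upstream commit -q -am 'update'
git fetch -q origin
git diff --quiet remotes/origin/HEAD || echo 'pull needed'
```

Note the fetch before each diff, as mark insists: `remotes/origin/HEAD` only reflects the remote as of the last fetch, so without it the diff would compare against stale refs and always report "in sync".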
[15:21:16] up, whatever is most convenient
[15:21:24] dunno what the dependencies of the commits were
[15:21:34] as long as the end result is good I don't care ;)
[15:22:18] oh your comment was on my mediawiki_clone commit
[15:22:27] ummm, go ahead and approve that one, i think it will be cleaner if i just do this in a new commit
[15:22:38] alright
[15:22:47] that one is in a different topic branch, might get me all confused if I have to figure out how to connect them
[15:22:55] best to sync up prod and pull before I commit again
[15:23:01] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5967
[15:23:04] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5969
[15:23:05] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5967
[15:28:15] hmm I'm not sure I agree with your suffix change in git::clone
[15:28:54] it's already not backwards compatible as our existing use doesn't use/require an origin ending in .git
[15:28:59] no? what's up
[15:29:03] hm
[15:29:24] oh you are right
[15:29:26] crap, thought i checked that
[15:29:29] is it ok if we do?
[15:29:36] there are only 2 places that use it right now
[15:29:43] both in puppetmaster.pp
[15:29:46] and you may want to determine the local repo name to be different from the origin's name
[15:29:57] what do you mean, if we do?
[15:30:04] make that a requirement
[15:30:04] the origins in puppetmaster aren't bare repos
[15:30:11] so they're not .git
[15:30:17] https://gerrit.wikimedia.org/r/p/operations/puppet
[15:30:39] origin => "https://gerrit.wikimedia.org/r/p/operations/puppet";
[15:31:00] but doesn't a .git exist anyway? hmm
[15:31:14] on my local puppet clone
[15:31:15] remote.origin.url=ssh://gerrit.wikimedia.org:29418/operations/puppet.git
[15:31:25] hmm right, they are now
[15:31:32] but you are right
[15:31:33] if they weren't
[15:31:35] would this not work?
[15:31:37] I thought they were cloning from another host
[15:31:52] we shoudl support cloning from both bare and other full repos
[15:32:10] you could introduce a suffix var that allows you to override it
[15:32:12] I think that would fix it
[15:32:20] a param I mean
[15:32:30] which does the right thing for the most common case, and allows overriding if not
[15:32:44] ok, hmm, yeah, taht would be better
[15:32:53] can I tell git clone to clone into a dir rather than creating the dir with the suffix?
[15:32:53] and then I'm ok with that being based on the $origin instead of $title
[15:33:11] I wonder why I did it that way
[15:33:14] I had a reason for it
[15:33:33] it would almost be better if $directory was the actual directory where the cloned working copy ends up
[15:33:38] rather than $directory/suffix
[15:33:46] yeah
[15:33:56] you just need to strip the suffix from it then for cwd =>
[15:34:48] k, lemme think about this for a minute, my change the clone exec around and make the directory stuff smarter
[15:34:53] will amend and see what you think
[15:35:01] ok
[16:08:23] New patchset: Pyoungmeister; "removing bellin and blondel from mysql.pp and site.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5976
[16:08:40] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5976
[16:09:31] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5976
[16:09:34] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5976
[16:10:27] ottomata: I'm merging what I'm asusming is your code on sockpuppet
[16:11:25] sockpuppet?
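[Editor's note] A rough sketch of where the git::clone discussion above lands: a define whose `ensure => latest` mode clones once, then only merges when a fetch shows the remote branch is ahead, guarded with `unless` so the pull exec stays silent on in-sync runs. This is a hypothetical reconstruction for illustration, not the actual patchset that became r/5977; parameter names follow the chat:

```puppet
# Hypothetical git::clone sketch (not the production define).
# $directory is the directory the working copy ends up in, as
# proposed at 15:33 above.
define git::clone($directory, $origin, $ensure='present') {
    exec { "git_clone_${title}":
        command => "git clone ${origin} ${directory}",
        creates => "${directory}/.git",
        path    => "/usr/bin:/bin",
    }

    if $ensure == 'latest' {
        exec { "git_pull_${title}":
            cwd     => $directory,
            command => "git pull --quiet",
            # Fetch first, then diff against the remote HEAD:
            # exit 0 = in sync (skip), exit 1 = pull needed (run).
            unless  => "git fetch && git diff --quiet remotes/origin/HEAD",
            path    => "/usr/bin:/bin",
            require => Exec["git_clone_${title}"],
        }
    }
}
```

The `unless` guard is the part settled at 15:18-15:19: puppet runs the command only when the guard exits non-zero, which matches `git diff --quiet` exiting 1 on differences.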
[16:11:50] our puppetmaster host
[16:11:59] they were in the deploy queue
[16:12:01] just letting you know
[16:12:19] change to docroot
[16:12:25] ah perfect
[16:12:26] thank you
[16:12:30] and some other jun
[16:12:32] k
[16:12:34] yup
[16:12:34] yeah, everything that has been approved is good
[16:12:42] I assumed so
[16:12:42] working on one that is not good yet, but it ahsn't been merged so A-ok
[16:12:46] just lettin' ya know
[16:17:26] New review: Hashar; "Thanks !!! ;-)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5796
[16:19:57] notpeter, can you approve this one?
[16:19:57] https://gerrit.wikimedia.org/r/#change,5897
[16:20:02] it is harmless and only affects my stuff
[16:20:08] and i think i will avoid conflicts later if I can pull it in locally now
[16:20:19] oh, mark commented on that one
[16:20:27] mark, I am working on that in another topic branch for another commit
[16:20:38] this one:
[16:20:38] https://gerrit.wikimedia.org/r/#q,5970,n,z
[16:20:42] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/5897
[16:20:43] I will amend that one
[16:20:46] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5897
[16:20:50] thank you!
[16:43:45] hm, mark, i'm going to try to abandoned that git clone title change and do a new topic branch
[16:45:43] !log starting innobackupex from db1040 to db1022 for new eqiad s6 snapshot slave
[16:45:45] Logged the message, notpeter
[16:50:18] Change abandoned: Ottomata; "Better changes to come in a different topic branch. Abandoning this one after talking with Mark." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5970
[16:51:45] New patchset: Ottomata; "git::clone define improvements." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5977
[16:51:57] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." [operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/5977
[16:52:08] uh oh!
[17:23:40] New patchset: Ottomata; "git::clone define improvements." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5977
[17:23:58] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5977
[17:24:04] oook!
[17:24:11] mark, wanna check that one out (if you are still around :) )
[17:24:13] ?
[17:25:52] ottomata: I'll review it when I can be around to check the puppet changes on the other systems
[17:26:03] i'm going off now
[17:26:05] ok cool
[17:26:06] thanks
[17:26:37] thanks for your work on that, always nice when people help making generic stuff better
[17:33:59] yeah, its fun
[17:34:03] one of my fav things to do, actually
[17:39:17] <^demon> ottomata: That's so awesome, I was wanting that like 4-5 months ago :)
[17:40:10] yay!
[17:45:15] New patchset: Dzahn; "check_all_memcached - do not rely on NFS, instead fetch mc.php from noc http" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5979
[17:45:32] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5979
[17:47:06] New patchset: Dzahn; "check_all_memcached - do not rely on NFS, instead fetch mc.php from noc http (and add check for unset $wgMemCachedServers)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5979
[17:47:24] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5979
[17:49:31] !change 5979 | notpeter
[17:49:31] notpeter: https://gerrit.wikimedia.org/r/5979
[17:53:09] New patchset: Dzahn; "check_all_memcached - do not rely on NFS, instead fetch mc.php from noc http (and add check for unset $wgMemCachedServers)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5979
[17:53:27] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/5979
[17:54:34] mutante: sweet!
[18:16:01] New review: Catrope; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/5979
[18:22:49] New review: Dzahn; "ok, thanks for +1 catrope. works on spence." [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/5979
[18:22:51] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/5979
[18:50:10] apergos, you probably aren't around, eh?
[18:50:25] was wondering about rsync module on nfs1
[19:00:27] PROBLEM - MySQL Slave Delay on db12 is CRITICAL: CRIT replication delay 202 seconds
[19:00:36] PROBLEM - MySQL Replication Heartbeat on db36 is CRITICAL: CRIT replication delay 212 seconds
[19:00:36] PROBLEM - MySQL Replication Heartbeat on db1047 is CRITICAL: CRIT replication delay 212 seconds
[19:01:03] PROBLEM - MySQL Replication Heartbeat on db42 is CRITICAL: CRIT replication delay 238 seconds
[19:01:30] PROBLEM - MySQL Replication Heartbeat on db12 is CRITICAL: CRIT replication delay 265 seconds
[19:02:06] RECOVERY - MySQL Replication Heartbeat on db36 is OK: OK replication delay 0 seconds
[19:02:06] RECOVERY - MySQL Replication Heartbeat on db1047 is OK: OK replication delay 0 seconds
[19:02:24] RECOVERY - MySQL Replication Heartbeat on db42 is OK: OK replication delay 0 seconds
[19:04:30] RECOVERY - MySQL Replication Heartbeat on db12 is OK: OK replication delay 0 seconds
[19:04:48] RECOVERY - MySQL Slave Delay on db12 is OK: OK replication delay 8 seconds
[19:13:17] New review: Demon; "We should use the theme configuration first to get the colors we want. Then we can do any remaining ..." [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/3285
[19:24:54] RECOVERY - mysqld processes on db1022 is OK: PROCS OK: 1 process with command name mysqld
[19:28:21] PROBLEM - MySQL Replication Heartbeat on db1022 is CRITICAL: CRIT replication delay 5849 seconds
[19:29:29] New patchset: Demon; "Re-attempting links for RT and CodeReview." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/6005
[19:29:46] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/6005
[19:30:00] notpeter, I just reviewed faidon's nginx udplog module patch
[19:30:08] i also tested on a labs instance
[19:30:12] got data on 3 different udp sockets
[19:30:14] ottomata: Tim Starling did so yesterday too
[19:30:17] cool
[19:30:24] peter contacted him privately and he responded privately
[19:30:28] perfect, ok
[19:30:35] so it should be good to deploy, ja?
[19:30:38] he probably didn't know about your efforts
[19:30:45] no, its good
[19:30:45] PROBLEM - MySQL Slave Delay on db1022 is CRITICAL: CRIT replication delay 4563 seconds
[19:30:46] i'm glad he did
[19:30:54] i don't konw the nginx context of that code
[19:30:55] yes, I'm waiting for Ryan to finish so I can get some help on the operational side
[19:30:58] so a 2nd pair of eyes is good
[19:30:58] ok cool
[19:31:02] thanks!
[19:31:08] faidon == paravoid, right?
[19:31:13] yes :)
[19:31:16] k
[19:31:17] :)
[19:31:36] I have my real name in /whois
[19:31:47] ah cool
[19:31:58] notpeter: you've packaged things and added them to the repo before, right?
[19:32:25] Ryan_Lane: I have, although more under the old system
[19:33:00] paravoid: TimStarling told me yesterday that he had reviewed http://www.mediawiki.org/wiki/Special:Code/MediaWiki/115067 , but I don't see anything on CR. Did he do that via email?
[19:33:23] oh, heh....seeing backlog now
[19:33:35] nevermind!
[19:39:18] RECOVERY - MySQL Slave Delay on db1022 is OK: OK replication delay 0 seconds
[19:39:30] sorry db1022, you gonna die again
[19:39:45] RECOVERY - MySQL Replication Heartbeat on db1022 is OK: OK replication delay 0 seconds
[19:40:19] notpeter: stop with your favoritism!
[19:41:53] nevar!
[19:49:38] !log de-pooling ssl4
[19:49:41] Logged the message, Master
[19:50:45] if you wanna !log and refer to gerrit change ids in it, you can now [[gerrit:1234]] and you get working links in SAL (added gerrit interwiki prefix on wikitech)
[19:51:05] bye for weekend.. and May 1st is public holiday over here. laters
[19:58:08] starting innobackupex from db1017 to db59 for new s1 slave, again
[19:58:18] !log starting innobackupex from db1017 to db59 for new s1 slave, again
[19:58:20] Logged the message, notpeter
[19:59:20] PROBLEM - mysqld processes on db1022 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld
[19:59:20] PROBLEM - mysqld processes on db60 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld
[19:59:31] !log starting innobackupex from db12 to db60 for new s1 slave, again
[19:59:33] Logged the message, notpeter
[20:00:09] !log starting innobackupex from db1040 to db1022 for new eqiad s6 snapshot slave, again
[20:00:12] Logged the message, notpeter
[20:09:50] New patchset: Ottomata; "rsyncd.conf.downloadprimary - adding writeable rsync module at /data/xmldatadumps/public/other/pagecounts-ez" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/6038
[20:10:07] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/6038
[20:10:41] not sure who is best to ask to review that one
[20:10:48] apergos was gonna but he's offline probably for the weekend
[20:10:50] notpeter?
[20:22:56] waa waa waaa
[20:22:58] crickets.
[20:23:03] let's see, who's next to poke...
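[Editor's note] For context on the change awaiting review (r/6038), a writeable rsync module of the shape its commit message describes would look roughly like this. Only the module path comes from the log; the module name and the other settings are assumptions about the actual rsyncd.conf.downloadprimary file:

```ini
# Hypothetical rsyncd.conf fragment: a module remote hosts can
# write pagecounts-ez data into (read only = no makes it writeable).
[pagecounts-ez]
    path = /data/xmldatadumps/public/other/pagecounts-ez
    comment = writeable module for pagecounts-ez data
    read only = no
```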
[20:23:36] hm, nobody really
[20:30:50] PROBLEM - MySQL disk space on db60 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=97%): /var/lib/ureadahead/debugfs 0 MB (0% inode=97%):
[20:50:20] PROBLEM - MySQL Replication Heartbeat on db1018 is CRITICAL: CRIT replication delay 209 seconds
[20:50:20] PROBLEM - MySQL Slave Delay on db1018 is CRITICAL: CRIT replication delay 209 seconds
[20:53:11] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Could not connect: 10.0.8.23:11000 (timeout)
[20:54:32] RECOVERY - MySQL Replication Heartbeat on db1018 is OK: OK replication delay 0 seconds
[20:54:41] RECOVERY - MySQL Slave Delay on db1018 is OK: OK replication delay 0 seconds
[20:54:41] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[21:10:56] RECOVERY - MySQL disk space on db60 is OK: DISK OK
[21:20:30] bah, he committed a patchset for something I submitted yesterday
[21:20:40] whatevs
[21:35:50] PROBLEM - Host srv278 is DOWN: PING CRITICAL - Packet loss = 100%
[21:37:29] RECOVERY - Host srv278 is UP: PING OK - Packet loss = 0%, RTA = 0.50 ms
[21:40:38] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused
[21:43:29] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.054 second response time
[21:45:50] !log rebooting ssl4 after upgrading (incl. a kernel update)
[21:45:53] Logged the message, Master
[21:47:14] PROBLEM - Host ssl4 is DOWN: PING CRITICAL - Packet loss = 100%
[21:48:44] RECOVERY - Host ssl4 is UP: PING OK - Packet loss = 0%, RTA = 0.63 ms
[22:09:25] New patchset: Lcarr; "adding fwconfigtool's packages" [operations/software] (master) - https://gerrit.wikimedia.org/r/6045
[22:09:43] New review: Lcarr; "(no comment)" [operations/software] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/6045
[22:09:45] Change merged: Lcarr; [operations/software] (master) - https://gerrit.wikimedia.org/r/6045
[22:15:20] !log upgraded ssl4 to nginx 0.7.65-5wmf1 and added it back to the pool
[22:15:23] Logged the message, Master
[22:17:29] PROBLEM - check_job_queue on spence is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 9,999 jobs: , frwiki (12934)
[22:18:05] PROBLEM - check_job_queue on neon is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 9,999 jobs: , frwiki (12469)
[22:32:56] RECOVERY - check_job_queue on spence is OK: JOBQUEUE OK - all job queues below 10,000
[22:33:32] RECOVERY - check_job_queue on neon is OK: JOBQUEUE OK - all job queues below 10,000
[22:51:41] PROBLEM - Puppet freshness on db1004 is CRITICAL: Puppet has not run in the last 10 hours
[22:57:32] LeslieCarr: fwiw, I'm not ignoring you :)
[23:01:48] i know
[23:02:13] i opened up a ticket so i'd just have it down
[23:31:56] RECOVERY - mysqld processes on db1022 is OK: PROCS OK: 1 process with command name mysqld
[23:34:11] PROBLEM - MySQL Replication Heartbeat on db1022 is CRITICAL: CRIT replication delay 4166 seconds
[23:35:05] PROBLEM - check_all_memcacheds on spence is CRITICAL: MEMCACHED CRITICAL - Could not connect: 10.0.11.47:11000 (timeout)
[23:36:08] PROBLEM - MySQL Slave Delay on db1022 is CRITICAL: CRIT replication delay 3034 seconds
[23:36:35] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[23:39:53] RECOVERY - MySQL Replication Heartbeat on db1022 is OK: OK replication delay 0 seconds
[23:40:20] RECOVERY - MySQL Slave Delay on db1022 is OK: OK replication delay 0 seconds
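[Editor's note: the ssl4 work above (de-pool, nginx + kernel upgrade, reboot, re-pool) is the standard cycle for an SSL-terminating frontend. As a rough sketch only, such a box runs an nginx server block along these lines; the certificate paths, server_name, and backend address are placeholders and do not reflect the actual production configuration.]

```
# Hypothetical nginx fragment for an SSL terminator like ssl4 --
# all names and paths are illustrative, not taken from the log.
server {
    listen 443 ssl;
    server_name _;
    ssl_certificate     /etc/ssl/certs/wildcard.example.org.pem;
    ssl_certificate_key /etc/ssl/private/wildcard.example.org.key;

    location / {
        # hand the decrypted request to the plain-HTTP cache layer
        proxy_pass http://127.0.0.1:80;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```

De-pooling before the reboot (as logged at 19:49 and 22:15) keeps the load balancer from sending traffic to the host while nginx is down.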