[00:27:27] !log olivneh synchronized php-1.21wmf3/extensions/EventLogging 'Updating EventLogging on test2' [00:27:41] Logged the message, Master [00:28:46] !log aaron synchronized php-1.21wmf3/extensions/TimedMediaHandler 'deployed c1ac05640377f4f99cbe2a094e80d3d25d63b93d' [00:29:00] Logged the message, Master [00:29:54] !log aaron synchronized php-1.21wmf2/extensions/TimedMediaHandler 'deployed 18e51f3b06b84d1d5fbf47272d9ebfc5008dc879' [00:30:08] Logged the message, Master [00:43:08] !log olivneh synchronized php-1.21wmf2/extensions/EventLogging [00:43:23] Logged the message, Master [00:55:10] New review: Aaron Schulz; "I don't like how this has to enumerate the standard ones...what if those change? Is there a way arou..." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/29768 [01:01:59] !log olivneh synchronized php-1.21wmf3/extensions/EventLogging 'Updating EventLogging on test2' [01:02:13] Logged the message, Master [01:04:35] New patchset: Ori.livneh; "Re-enable EventLogging on enwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30747 [01:05:45] Change merged: Ori.livneh; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30747 [01:09:12] !log olivneh synchronized php-1.21wmf2/extensions/EventLogging [01:09:26] Logged the message, Master [01:09:41] !log olivneh synchronized wmf-config/InitialiseSettings.php [01:09:55] Logged the message, Master [01:24:40] seems to be an outage for at least some people [01:24:40] including myself :) [01:29:32] Prodego: what kind? [01:33:28] jeremyb: probably something in the middle, I just get a could not connect error [01:33:31] from chrome [01:33:49] huh [01:34:11] ok, there's other kinds of outages you could have now. like power! [01:34:56] true, true [01:39:17] Hmm [01:39:27] I think theres possibly an apache or 2 out of sync.. 
[01:40:05] Every so often for wikidata, I'm getting the apache error page: Not Found - The requested URL /wiki/Wikidata:Main_Page was not found on this server. - Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request. [01:41:39] I'm currently failing to connect to irc.wikimedia.org [01:41:50] * Connecting to ekrem.wikimedia.org (208.80.152.178) port 6667... [01:41:50] * Connection failed. Error: Connection timed out [01:42:48] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 235 seconds [01:43:10] hrmmm [01:43:17] mobile site is red in watchmouse [01:43:29] IRC is green! [01:43:44] irc WFM [01:43:58] ditto [01:45:10] idk about mobile. WFM [01:46:06] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [01:53:54] PROBLEM - Apache HTTP on srv229 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:00:57] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 310 seconds [02:06:03] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [02:24:05] !log LocalisationUpdate completed (1.21wmf2) at Tue Oct 30 02:24:04 UTC 2012 [02:24:24] Logged the message, Master [02:49:48] !log LocalisationUpdate completed (1.21wmf3) at Tue Oct 30 02:49:48 UTC 2012 [02:50:03] Logged the message, Master [02:57:21] RECOVERY - Puppet freshness on ms1002 is OK: puppet ran at Tue Oct 30 02:57:12 UTC 2012 [03:35:59] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [03:35:59] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [03:35:59] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [03:49:39] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 28 seconds [04:49:30] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [06:19:17] PROBLEM - Apache HTTP on 
srv278 is CRITICAL: Connection refused [06:37:28] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.045 second response time [07:26:50] hello [07:29:40] υο [07:29:44] er [07:29:45] anyways [07:33:15] !log Replaced 2 bits @ esams servers with 4 new servers cp3019-cp3022 [07:33:29] Logged the message, Master [07:36:37] PROBLEM - LVS HTTP IPv6 on bits-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [07:37:03] huh [07:37:26] there are a lot of ipv6 someloss esams emails this morning [07:38:13] no, it's because i'm an idiot [07:38:30] * apergos raises an eyebrow [07:39:14] New patchset: Mark Bergsma; "Add IPv6 addresses to new bits servers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30757 [07:39:53] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30757 [07:39:55] RECOVERY - LVS HTTP IPv6 on bits-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 3913 bytes in 3.246 seconds [07:41:20] I see [07:42:58] hm? [07:43:10] you're up early [07:43:10] the jobrunners have some really odd graphs (suspiciously dropping off at midnight utc) and yet when I go look at the counts for a few large projects they are low, and some jobs seem to be processed. [07:43:16] well couldn't sleep much [07:43:25] sorry for that [07:43:26] that was to mark :P [07:43:29] ah [07:43:47] it's not particuarly early for you apergos [07:43:55] mm guess not [07:44:07] !log Fixed IPv6 addresses on new esams bits servers [07:44:22] Logged the message, Master [07:44:54] I saw "bond0" in the add_ip6 puppet stanza, and immediately discarded it as part of the link aggregation setup [07:45:07] hey mark [07:45:11] hi [07:45:25] exciting death of NY, eh ? [07:45:33] is it dead? [07:45:36] hmm? 
[07:45:47] not exactly, but a lot of network issues [07:45:48] oh didn't notice yet [07:45:51] 111 8th is having generator issues on some floors [07:45:56] AC2 cable's down [07:46:01] oh boy [07:46:04] whoops [07:46:16] lots of (our) providers are having outages [07:46:30] oh wow [07:46:47] ah water in the metro tunnels. nice [07:47:26] huh this really did have a huge impact, how about that [07:47:35] yeah, i thought that the news was overreacting, but nope [07:47:40] it's serious [07:50:04] :( [07:52:26] New patchset: Mark Bergsma; "Swap out ganglia aggregators for Bits caches esams group" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30758 [07:52:54] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30758 [07:53:09] PROBLEM - LVS HTTP IPv6 on bits-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [07:53:23] wtf [07:53:27] bad mark [07:53:37] * apergos raises the other eyebrow [07:54:43] oh [07:54:48] ARGH. [07:56:28] RECOVERY - LVS HTTP IPv6 on bits-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 3902 bytes in 0.243 seconds [07:59:07] PROBLEM - SSH on lvs1001 is CRITICAL: Server answer: [08:00:46] RECOVERY - SSH on lvs1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [08:05:16] I think we've established that I need more coffee. [08:07:47] three time's a charm :p [08:08:40] heh [08:08:46] * apergos goes to get some tea [08:38:27] New patchset: Mark Bergsma; "Repurpose cp3001/cp3002" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30759 [08:39:32] New patchset: Mark Bergsma; "Repurpose cp3001/cp3002" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30759 [08:40:10] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30759 [08:42:03] did you replace bits already? [08:42:07] wow [08:45:56] sure [08:46:05] why not [08:46:18] no more session bug? [08:46:32] session bug? 
[08:46:46] it wasn't "why", it was "wow you're fast" :) [08:46:54] a bit too fast, before my coffee ;) [08:47:07] i wanted to do it yesterday, right before the ops meet [08:47:15] but then the stupid geoip linking issue delayed me a bit [08:47:20] what was that problem with the concurrent sessions that we were having? [08:47:31] i don't know [08:47:35] i'll look at it again next time it happens [08:47:47] I don't know the details, I just knew that I'd have to restart both servers at the same time or depool/pool esams [08:47:58] might be threads queuing up for one very popular object or something [08:48:54] i'm inclined to install upload there now as well [08:49:08] and perhaps try my varying thumbs idea [08:55:18] hehe [08:58:14] New patchset: Mark Bergsma; "Puppetise domain-maplist, and add wikidata/wikivoyage" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30760 [09:00:19] why is gerrit down so much [09:00:45] is it a requirement for every java program to suck massively or something? 
[09:00:52] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30760 [09:07:03] !log Fixed wikidata.org language cnames issue [09:07:16] Logged the message, Master [09:19:50] !log nikerabbit synchronized wmf-config/InitialiseSettings.php 'Narayam and Webfonts: bug 41460, bug 39200, bug 41359' [09:20:06] Logged the message, Master [09:30:38] [ 9.980557] bnx2x 0000:01:00.0: eth0: Warning: Unqualified SFP+ module detected, Port 0 from LEONI part number L45593-C100-D10 [09:37:48] aaaaargh [09:41:37] !log nikerabbit synchronized wmf-config/InitialiseSettings.php 'Temporarily enable beta mappings for am wikis' [09:41:51] Logged the message, Master [09:43:59] it does work ;) [09:44:09] it's in the bits servers now serving production traffic [09:49:09] just a warning [09:49:10] that's good [09:56:38] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours [09:56:38] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours [09:56:38] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [10:04:19] paravoid: will you be available for some review later this afternoon or tomorrow ? 
That is for the Zuul puppet classes I have been working on on labs [10:06:35] yeah [10:07:02] should not cause too much trouble, except the git::clone stuff :-] [10:08:16] !log nikerabbit synchronized php-1.21wmf2/extensions/NewUserMessage/NewUserMessage.class.php 'I0f93ee53' [10:08:29] Logged the message, Master [11:27:17] PROBLEM - Puppet freshness on srv229 is CRITICAL: Puppet has not run in the last 10 hours [12:07:29] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [12:26:50] PROBLEM - SSH on srv229 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:32:04] RECOVERY - SSH on srv229 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [12:47:01] New patchset: J; "Bug 41528 - need more memory for video thumbs" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30773 [12:58:37] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [13:05:58] PROBLEM - Memcached on srv229 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:10:01] PROBLEM - SSH on srv229 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:14:04] RECOVERY - Memcached on srv229 is OK: TCP OK - 0.002 second response time on port 11000 [13:14:49] RECOVERY - SSH on srv229 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [13:23:13] PROBLEM - SSH on srv229 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:24:07] PROBLEM - Memcached on srv229 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:33:52] RECOVERY - Memcached on srv229 is OK: TCP OK - 0.002 second response time on port 11000 [13:34:51] PROBLEM - Apache HTTP on mw21 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:35:57] RECOVERY - Apache HTTP on mw21 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 1.506 second response time [13:35:57] RECOVERY - SSH on srv229 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [13:37:24] PROBLEM - Puppet freshness on analytics1001 is
CRITICAL: Puppet has not run in the last 10 hours [13:37:24] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [13:37:24] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [13:41:00] PROBLEM - SSH on srv229 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:42:03] PROBLEM - Memcached on srv229 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:55:15] PROBLEM - Apache HTTP on mw39 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:56:45] RECOVERY - Apache HTTP on mw39 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.378 second response time [14:00:30] PROBLEM - NTP on srv229 is CRITICAL: NTP CRITICAL: No response from NTP server [14:01:47] New patchset: Mark Bergsma; "Add bits@eqiad as esams bits backend as well, in round-robin" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30777 [14:03:33] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30777 [14:08:01] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:09:30] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.526 seconds [14:17:03] New patchset: Mark Bergsma; "Random director is easier" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30779 [14:17:25] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30779 [14:26:36] PROBLEM - Apache HTTP on mw55 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:26:56] New patchset: Mark Bergsma; "Double bits cache memory to 2 GB" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30781 [14:27:17] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30781 [14:28:15] RECOVERY - Apache HTTP on mw55 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.076 second response time [14:31:42] 
PROBLEM - Apache HTTP on mw38 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:33:12] RECOVERY - Apache HTTP on mw38 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 4.775 second response time [14:38:32] !log powercycling srv229, dead for unknown reason so far [14:38:45] Logged the message, Master [14:38:58] I tried stopping and restarting memcached on it but it never got done with the stop part [14:39:12] (took ages to get on the host, etc) [14:39:42] use the server admin log? [14:40:08] I would have logged it if it ever completed [14:40:22] but it wasn't going to, a powercycle was next anyways [14:40:52] it was certainly in swapdeath but dunno why [14:41:54] RECOVERY - SSH on srv229 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [14:43:00] RECOVERY - Memcached on srv229 is OK: TCP OK - 0.006 second response time on port 11000 [14:43:00] New patchset: Hashar; "/etc/wikimedia-cluster containing $::cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30784 [14:44:12] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:44:16] New review: Mark Bergsma; "In that case it should be called "site" for consistency. cluster is very ambiguous." [operations/puppet] (production); V: 0 C: -2; - https://gerrit.wikimedia.org/r/30784 [14:44:47] mark: indeed :) [14:44:55] mark: I thought about $::datacenter hehe [14:45:03] RECOVERY - NTP on srv229 is OK: NTP OK: Offset -0.05753302574 secs [14:45:23] New review: Mark Bergsma; "A better idea btw would be to make one file containing some variables (realm, site, etc.), which can..." 
[operations/puppet] (production); V: 0 C: -2; - https://gerrit.wikimedia.org/r/30784 [14:45:52] i merged your -lGeoIP change btw [14:46:09] seen that, thanks :-] [14:46:20] thanks for fixing my issue too ;) [14:46:26] got that while setting up varnish on a Precise labs instance [14:46:36] got that while setting up varnish on precise in production yesterday [14:46:43] so i had to fix it, hehe [14:48:01] might find other bugs though [14:48:01] New patchset: Hashar; "/etc/wikimedia-site containing $::site" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30784 [14:48:06] I haven't reloaded my varnish instance yet [14:48:29] it's working fine [14:48:31] it's serving european traffic since this morning [14:48:43] anyway, check my other comment on that gerrit change [14:48:57] New review: Hashar; "renamed the file to /etc/wikimedia-site which now provides $::site. Will add a new file containing ..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/30784 [14:49:04] templates for the win! [14:49:15] something like /etc/wikimedia-conf.php ? [14:49:18] not even needed [14:49:43] you can just do file { "/etc/wikimedia-vars": content => "WIKIMEDIA_REALM=$::realm\nWIKIMEDIA_SITE=$::site\n" } ;) [14:49:53] .php ?! [14:50:10] that is going to be loaded from the MediaWiki configuration files (commonsettings.php / initialisettings.php ...) [14:50:22] we could provide both a shell and a php version [14:50:27] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [14:50:36] meh [14:50:42] can't mediawiki just parse the very simple shell version? [14:51:00] then we have to write a parsing function [14:51:10] whereas we could just include("/etc/wikimedia.php"); [14:51:27] now I want a python version as well [14:51:34] let s do it! 
[14:51:35] and a C object [14:51:52] perhaps a java servlet as well [14:51:52] yeah so hmm [14:52:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.563 seconds [14:53:42] <^demon> mark: How about putting that in /etc/defaults/? [14:53:55] New patchset: J; "Dont overwrite $wgMaxShellMemory in labs" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30785 [14:54:21] it's not really for a specific program though [14:54:30] lsb_release isn't in defaults either [14:54:48] RECOVERY - Puppet freshness on srv229 is OK: puppet ran at Tue Oct 30 14:54:34 UTC 2012 [14:55:21] <^demon> Hmm, ok. /etc makes enough sense. [14:55:31] so shell format [14:55:34] and we parse it in PHP ? [14:55:42] RECOVERY - Host storage3 is UP: PING OK - Packet loss = 0%, RTA = 0.36 ms [14:55:50] I am a bit afraid of our custom function choking whenever the file is not the expected format [14:55:50] i do agree that if it needs to be used from php, separate files would be easier [14:56:09] just read contents and strip odd chars [14:56:29] (we could use a neutral format such as json, yaml or xml) but I guess it does not play nice with bash :/ [14:56:39] <^demon> Why not just plain text? [14:56:56] <^demon> I see no harm in something like /etc/wikimedia-realm being just a text file. [14:57:07] like key\tvalue [14:57:07] RECOVERY - Apache HTTP on srv229 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.041 second response time [14:57:28] just use your original idea of separate files if it becomes complicated [14:57:37] <^demon> Separate files is best imho. [14:57:40] <^demon> Easiest to maintain. [14:57:40] my idea was to facilitate bash scripts, but that's not the only use [14:57:42] <^demon> Less to parse. [14:57:43] agreed [14:57:51] fine [14:57:56] should we put them in /etc/wikimedia/ ? [14:58:05] named conf or something? 
[14:58:21] that's a directory to maintain, more complicated ;) [14:58:26] /etc/wikimedia-site is fine for now [15:00:14] mark: means you merge in https://gerrit.wikimedia.org/r/#/c/30784/ ? ; ) [15:00:38] yes [15:00:49] New review: Mark Bergsma; "u" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/30784 [15:01:07] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30784 [15:02:17] thanks! [15:09:41] RECOVERY - SSH on storage3 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [15:18:31] New patchset: Mark Bergsma; "Add missing servers strontium/palladium" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30787 [15:18:48] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30787 [15:27:26] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:30:41] mark: mediawiki-config already use $site to describe the project name ;-] [15:30:59] whereas puppet / ops assume site is the datacenter name :-] [15:31:01] lovely semantic issue [15:31:52] hehe [15:32:03] I guess we will load wikimedia-site in our variable [15:32:11] yes [15:32:12] named something like $datacenter or $wikimedia-dc [15:32:48] if we were a big company, we would create a dictionary of the data and hire a team of consultant to write us an ETL [15:32:58] that would take care of the transformation between groups :-] [15:33:04] of course that would cost millions of dollars [15:33:11] and takes a few years [15:33:16] Heh, Wikimedia DC has a whole different meaning... [15:33:28] It's a chapter for the District of Columbia. [15:41:45] also a site can span multiple datacenters [15:42:05] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.039 seconds [16:00:05] notpeter, can you help me with the Solr package? 
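The /etc/wikimedia-site discussion above settled on one value per file rather than a shell-format variables file that MediaWiki's PHP config would have to parse. A minimal sketch of the two options that were debated (illustrative only; the log does not show the code that was actually deployed):

```python
def read_single_value(text: str) -> str:
    """Read a one-value file like /etc/wikimedia-site: take the contents
    and strip surrounding whitespace, per 'just read contents and strip
    odd chars' at [14:56:09]."""
    return text.strip()

def parse_shell_vars(text: str) -> dict:
    """Parse a shell-format file such as the proposed /etc/wikimedia-vars
    (WIKIMEDIA_REALM=..., WIKIMEDIA_SITE=...) -- the custom parser the
    discussion wanted to avoid having to write and harden."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        out[key.strip()] = value.strip()
    return out
```

Separate single-value files keep every consumer trivial: bash reads `$(cat /etc/wikimedia-site)`, PHP reads `trim(file_get_contents(...))`, and there is no custom parser to choke when the file is not in the expected format, which was exactly the worry raised at [14:55:50].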
[16:00:21] New patchset: Anomie; "Add switching for eqiad-specific configuration" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30792 [16:03:27] New review: J; "from reading the source code there is currently no way to just add headers:" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/29768 [16:07:08] paravoid: you around? [16:07:09] yes [16:07:36] paravoid: I might need your help in the next 30 minutes [16:07:44] for? [16:07:50] paravoid: on VUMI USSD stuff on Silver [16:07:53] paravoid: will you be available? [16:07:55] ah, hi Patrick [16:07:58] I will [16:08:25] and you just reminded me to reply to Dan, which I forgot to do earlier. [16:08:28] paravoid: sorry, I didn't realize I lost my nick [16:09:30] paravoid: can you run sudo supervisorctl status [16:09:52] paravoid: and send me the output on pastebin.mozilla.org [16:10:14] all of them RUNNING [16:10:58] paravoid: can I have the output please [16:11:27] http://pastebin.com/Rjd0sQLm [16:11:48] paravoid: not the pastebin I requested [16:12:10] no, does it matter?! [16:12:24] paravoid: you forced me to see ads [16:12:28] paravoid: not nice ;-) [16:12:43] dude, have you heard of adblock? :-) [16:12:55] paravoid: I shouldn't have to use adblock [16:15:04] what are you looking for? [16:15:05] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:37] relaying commands that you tell me to run doesn't make much sense [16:16:01] paravoid: that is why I need root [16:16:08] either you should get access and do it, or we (ops) should understand what's going on there and try to fix things [16:16:23] you know, "operate" :-) [16:16:43] paravoid: also for the record I don't like "AdBlock" because it can access your data on all websites and access your tabs and browsing activity [16:17:03] paravoid: we are on a conference call with TATA in India right now [16:17:14] paravoid: would you like to join the call?
[16:18:12] heh, I guess that's what I get for asking, isn't it? :-) [16:18:36] paravoid: ha ha ha [16:18:56] paravoid: International Callers [16:18:56] 0091-22-67934444 / +91-22-67914444/55 [16:19:09] Participant Pin Code [16:19:09] 2225886 [16:20:08] do you need me there? [16:20:16] sbernardin: hi & welcome! :) [16:20:58] paravoid: I only need root access [16:21:05] hi paravoid...Thanks [16:21:12] paravoid: But if you want to hear the call you can too [16:21:23] hi sbernardin ! [16:21:31] Everyone! sbernardin is our new data center contractor in Tampa. [16:21:53] paravoid: otherwise I'm forced to relay commands that I tell you to run but that doesn't make much sense [16:21:59] Please welcome him [16:22:00] sbernardin: welcome [16:22:01] welcome, sbernardin! [16:22:58] welcome, sbernardin [16:23:20] Thanks everyone...happy to be on board! [16:27:16] New patchset: Demon; "Set isGithubRepo = true for github replication" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30796 [16:29:47] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.025 seconds [16:30:18] new people? [16:30:29] welcome sbernardin [16:30:33] New patchset: Demon; "Resource references should be capitalized" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30797 [16:31:29] Change abandoned: Demon; "This didn't work like I'd hoped." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30797 [16:33:54] paravoid: ping [16:34:12] woosters: is anybody from operations available right now? 
[16:34:24] woosters: I'm on a call with TATA in India [16:34:37] woosters: nevermind paravoid responded [16:34:37] let me try to get paravoid [16:34:41] ok [16:34:51] woosters: he already responded thanks [16:38:14] !log stopped stray jobrunner on srv278 [16:38:29] Logged the message, Master [16:38:30] slow bots get beaten [16:44:51] New patchset: Mark Bergsma; "Prepare for Varnish upload @ esams: parameterize storage sizes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30799 [16:45:26] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30799 [16:54:02] !log reedy synchronized wmf-config/ExtensionMessages-1.21wmf3.php [16:54:15] Logged the message, Master [17:01:08] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:08:27] Jeff_Green: nice work on apache-fast-test [17:11:20] New patchset: Mark Bergsma; "Define backends for upload-backend in esams" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30802 [17:14:14] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.879 seconds [17:14:15] Reedy: ah you're using it? great! [17:14:34] Yup, mutante mentioned it [17:14:58] ya. that was written out of pure fear . . . of production apache conf changes.
[17:15:06] Had a suspicion 2 apaches were out of sync, that confirmed which it was [17:15:14] Which makes finding out which to kick MUCH easier [17:15:19] yup [17:15:28] turns out we're frequently out of sync [17:15:31] New patchset: Mark Bergsma; "Define backends for upload-backend in esams" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30802 [17:15:54] !log reedy synchronized wmf-config/omgtestfile [17:16:09] Logged the message, Master [17:17:53] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30802 [17:19:49] !log reedy synchronized wmf-config/CommonSettings.php 'wgMaxAnimatedGifArea = 2.5e7' [17:20:02] Logged the message, Master [17:24:11] kaldari: hi there! [17:24:11] (this is faidon) [17:25:34] New review: Demon; "This won't change anything on merge, it's just pre-configuring something before I deploy the hack." [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/30796 [17:26:39] garg stupid git tricks [17:26:40] paravoid: Oh hey :) [17:27:05] what does one do about the dreaded ahead-of-origin/production by 2 commits [17:27:16] Heh, I never put the IRC handle together with the face [17:28:09] paravoid: Anyway, like I said, we don't need very accurate IPv6 look-up, just something that will return a country value that is hopefully close to their actual country. [17:28:56] We can even add the lookup support ourselves, if Ops is willing to review the code and sign-off on it. [17:33:01] at this point it would be less risky for us to do that than changing CentralNotice back to the old banner-loading scheme [17:35:23] Jeff_Green: so, how's the ganglia [17:35:53] the part you fixed is great, the part I need to fix via puppet is enmired in git stupidity [17:36:23] fun times [17:38:05] so, mind if i lock down the payments <-> admin zone again ? 
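The apache-fast-test exchange above ([17:15:06]-[17:15:28]) boils down to: send the same request to every apache and flag the one whose answer differs. A hypothetical sketch of that comparison step (the log only says the tool confirmed which apache was out of sync, not how it is implemented):

```python
from collections import defaultdict

def find_out_of_sync(responses):
    """Given {host: response_body} collected by fetching the same URL
    from every apache, return the hosts whose body differs from the
    majority -- the out-of-sync backends to kick."""
    groups = defaultdict(list)
    for host, body in responses.items():
        groups[body].append(host)
    majority = max(groups.values(), key=len)
    return sorted(h for hosts in groups.values()
                  if hosts is not majority for h in hosts)
```

With `{"srv1": "a", "srv2": "a", "srv3": "b"}` this returns `["srv3"]`; when all backends agree it returns an empty list, which is the state "turns out we're frequently out of sync" says was not being reached.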
[17:38:16] kaldari: hey, sorry, was in a phonecall [17:38:17] also, i'll open up port 80 to maxmind [17:38:51] so, I don't recommend using this for the reasons I stated, but in the end, the decision is yours/fr's [17:39:13] have in mind that it's not about being inaccurate, the database is also incomplete [17:39:30] the database's README: http://geolite.maxmind.com/download/geoip/database/GeoLiteCityv6-beta/README [17:39:38] BETA GeoLiteCityv6 BETA [17:39:41] Here is the first GeoLiteCityv6 database to resolve IPv6 addresses. The current [17:39:45] IPv6 support is rather poor and is currently only a GeoLiteCity database with [17:39:53] teredo and 6to4 support on a city level. [17:39:57] <^demon> paravoid: Any chance you could poke https://gerrit.wikimedia.org/r/#/c/30796/ for me? [17:39:59] I wouldn't use something that it's stated as "rather poor" [17:40:30] We don't use city resolution at all, only country. How functional do you estimate the country-level resolution? [17:40:45] LeslieCarr: sure, go for it [17:41:03] ottomata: i did a restart on analytics1003 ganglia --- i want to not touch it whatsoever for a few hours [17:41:05] kaldari: I think we shouldn't risk it [17:42:15] paravoid: kaldari: if we don't use that data, then we return country code XX for all IPv6 and default to "Unknown country" [17:42:31] what about what asher sent? [17:42:59] pgehres: paravoid is suggesting we move back to geoiplookup.wikimedia.org and rely on the dual-stack behaviour [17:43:33] K4-713: my only fear with that is getting back to the insane lookup times [17:43:46] oops, sorry K4-713, meant for kaldari [17:43:48] ottomata: well that failed :-/ [17:44:03] * K4-713 goes back to her stack trace [17:44:04] paravoid: that's even more risky for us, but we may trial it at some point during the fundraiser [17:44:21] asher's thing you mean? 
[17:44:29] yeah [17:44:36] ottomata: it looks like it stopped right when "Oct 30 17:43:01 analytics1003 CRON[12767]: (root) CMD ([ -f /var/lib/puppet/state/puppetdlock ] && find /var/lib/puppet/state/puppetdlock -ctime +1 -delete)" happened [17:44:37] well, what he mentioned [17:44:46] we just came up with that idea last week [17:45:26] kaldari: we can certainly try it /me shurgs [17:46:28] K4-713: did you guys ever figure out the issue with meta/upload.wikimedia.org. I wonder if it's a similar issue to the geoiplookup.wikimedia.org load problems. [17:46:50] kaldari: part of that issue seemed to be wikiminiatlas [17:46:57] I think we got as far as seeing that there was an issue... [17:47:10] how? [17:47:10] what data did you use? [17:47:25] Watchmouse browser load timing data. [17:47:54] have you seen the recent thread about watchmouse data? [17:48:14] I have probably seen them all, yes. Which is why the process has come to something of a halt. [17:48:52] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:49:36] kaldari: so, an example that I used the other day, was that a friend got a new IPv4 and a new IPv6 space the same day [17:49:50] that was about a month ago [17:50:33] MaxMind now correctly geolocates the IPv4, but can't find the IPv6 at all [17:50:51] I expect more of such cases too [17:51:06] if MaxMind doesn't find the IPv6, what does it return? [17:51:23] I'm trying on their web interface now [17:51:33] since we don't have code to test against that, although I can easily make that too [17:52:00] the web lookup says "The ip address you passed is not in our database" [17:52:46] New patchset: Pyoungmeister; "temp removing self from nagios" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30807 [17:53:19] OK, so in those cases we would probably have it return 'XX' or something to know it failed... 
[17:53:21] !log Installed Precise on new upload esams servers cp3003-cp3010 [17:53:34] Logged the message, Master [17:53:43] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30807 [17:53:58] paravoid: I agree that using dual stack is far preferable, but here's the downside for doing that... [17:54:13] kaldari: bits.mw.org/geoiplookup returns "Geo = {}" when I try it from my IPv6-enabled desktop [17:55:18] ...but that lookup is not ipv6 enabled [17:56:12] mark: I know, I'm just saying that the current code returns when it fails [17:56:27] fails to find a match [17:56:33] I think {} is better than returning "XX" :) [17:57:49] New patchset: Jgreen; "move fundraising db config to role/fundraising.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30808 [17:58:59] Adding the extra hostname lookup is apparently a significant variable on load time. Right now, the banner loading process starts as soon as the SiteNotice div is loaded in the page. If the GeoIPLookup hasn't completed by that time, they are not assigned to a country and they don't receive any banners. 
[17:59:00] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30808 [18:00:14] In some browsers it will also block loading of further content [18:01:48] Even if we were to not support IPv6 at all, we might still want to keep the lookups on bits in order to have reliably fast lookups [18:02:11] so that banners load quickly for everyone [18:02:54] it's a trade-off for us either way [18:03:30] it would be nice if maxmind had a "confidence" parameter [18:03:32] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.029 seconds [18:04:44] I think it does [18:05:27] but we can't/want to check v6 and if confidence is low retry on v4 [18:05:40] let me discuss with other fundraising folks and see what other people think about this [18:05:41] /* confidence factor for Country/Region/City/Postal */ [18:05:41] unsigned char country_conf, region_conf, city_conf, postal_conf; [18:05:44] int accuracy_radius; [18:06:51] would also love to get further input from Jeff_Green if he has any thoughts on it [18:07:36] kaldari: sorry--i've been heads down on a puppet/git/ganglia issue. reading backscroll [18:08:33] Jeff_Green: basically the suggestion is moving geoiplookup back to it's own hostname so that we can use dual-stack fallback for IPv6 users [18:08:57] my concern is how this will affect banner loading for everyone else [18:11:28] paravoid: And I assume that removing IPv6 DNS entries for bits isn't an option [18:12:15] kaldari: could you use javascript to choose where to do the lookup? [18:12:42] probably, that's a sort of brilliant idea [18:12:45] :) [18:12:50] ha [18:13:27] hahaha [18:13:34] i'll trade. can you find puppet and terminate (in the murder sense) it for me? [18:13:55] lemme confer with the fundraising devs and see if we can do that. It seems like it would work though [18:14:17] New review: Hashar; "Overall we have a real mess in our configuration file which Brad is cleaning up." 
[operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/30792 [18:14:20] fwiw, I fixed the Varnish code to read the IPv6 database [18:14:40] Jeff_Green: I am friends with the inventor of puppet, so I'll pass along the suggestion to him ;) [18:14:55] kaldari: thank you [18:14:55] geoipupdate doesn't update the free GeoLite databases though, so some simple puppet machinery is needed to update that too [18:14:58] he's also from Nashville [18:15:12] which might explain some things [18:15:25] i was just running that in my head... [18:15:28] it may explain all the twang anyway [18:17:12] paravoid: thanks, I'm sure we'll need to migrate it to fully support IPv6 at some point, even if it isn't this year [18:24:15] !log giving sbernardin racktables permissions [18:24:23] Logged the message, Master [18:24:51] cmjohnson1: i think it should work now for Steve. He was lacking permissions [18:25:17] ah..okay [18:25:26] Wikimedia Foundation : Main page : Configuration : Permissions [18:25:26] binasher: is the CentralNotice schema change in place for today's deployment? [18:25:36] cmjohnson1: allow {$username_sbernardin} [18:25:49] kaldari: i'm doing it right now [18:25:49] cool [18:25:54] binasher: cool, thanks [18:26:22] kaldari: yeah, and the API is the same for GeoIP and GeoLite (free) databases [18:26:43] although the current scheme of AJAX requests sounds a bit wasteful to me, but if you like it... 
:) [18:28:54] paravoid: Oh no, I hate the current loading scheme :) [18:30:40] paravoid: I'm looking forward to implementing the all-in-one solution on varnish (which may also require a bit of assistance from Ops) [18:31:40] !log completed centralnotice patch-bucketing.sql migration [18:31:52] we just don't want to make such a dramatic change immediately before the fundraiser starts [18:31:53] Logged the message, Master [18:32:08] nod [18:37:02] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:39:15] New patchset: jan; "Add puppet config for PHP" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/29975 [18:41:26] New patchset: jan; "Add puppet config for PHP" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/29975 [18:45:11] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/26571 [18:45:21] New patchset: preilly; "update ACLs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30814 [18:45:30] binasher: ^^ [18:46:04] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30814 [18:46:10] New review: Hashar; "wmf-config/transcoding-wmflabs.php still has a $wgMaxShellMemory set to 3000000 (3GB?). Maybe you c..." [operations/mediawiki-config] (master); V: 0 C: -1; - https://gerrit.wikimedia.org/r/30785 [18:48:30] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 3.991 seconds [18:58:55] New review: J; "wmf-config/transcoding-wmflabs.php is not in git and not loaded anywhere, so yes, that should be rem..." 
[operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/30785 [19:06:00] New patchset: Jgreen; "moving fundraising db config from role/db.pp to role/fundraising.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30818 [19:08:05] New review: jan; "I have reduced the number of classes and have created the two classes php::most_used and php::nearly..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/29975 [19:08:30] New patchset: jan; "Refactor the webserver classes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30147 [19:14:17] binasher: I forgot to request the CentralNotice schema change as test as well. Is there any chance of getting that in place? [19:14:39] ...schema change to testwiki as well... [19:17:05] is Asher at lunch now? [19:17:20] New patchset: Jgreen; "moving fundraising db config from role/db.pp to role/fundraising.pp (typo)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30818 [19:17:36] kaldari: is it an expensive change? [19:18:20] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30818 [19:18:22] Reedy: https://gerrit.wikimedia.org/r/#/c/29614/10/patches/patch-bucketing.sql [19:18:50] kaldari: so... [19:18:50] Just on meta? [19:18:56] kaldari: how do we proceed? [19:19:05] it's already in place on meta, just need it on test wiki as well [19:19:42] paravoid: We're going to move ahead with Jeff's solution... 
[19:20:11] paravoid: which means that we don't need IPv6 data in the lookups [19:20:40] paravoid: although we should plan to add it once MaxMind's data is more reliable [19:21:37] Reedy: I'll buy you a wikibeer for a schema update to test.wiki :) [19:22:04] oh, if it's on testwiki [19:22:14] or if you think it's fine to do it myself, I can do it [19:23:05] Yeah, just do it [19:23:08] The first 2 tables have 2 rows in them [19:23:20] the second has 172 [19:23:22] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:23:39] and it's just adding columns, so it'll take only a few seconds per item [19:24:57] done [19:25:26] :) [19:27:10] kaldari: crap! we have a IndexPager bug *grumble* [19:30:30] New patchset: Matthias Mullie; "GeoCrumbs maintenance" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30822 [19:31:28] puppet is exploding badly [19:36:04] can I get a second set of eyes on stafford/puppet, it's exploding and I'm concerned it's somehow related to the change I jsut merged [19:36:23] I don't *think* so but it worries me [19:38:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.021 seconds [19:39:07] New patchset: Anomie; "Add ability for switching for eqiad-specific configuration" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30792 [19:40:30] New patchset: Kaldari; "Update settings for CentralNotice banner loading" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30823 [19:40:54] New review: Anomie; "So this version goes to using $realm for all the switching, which is logically equivalent to the old..." 
[operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/30792 [19:43:35] New patchset: Kaldari; "Update settings for CentralNotice banner loading" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30823 [19:44:22] Change merged: Kaldari; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30823 [19:44:40] New patchset: Anomie; "Add ability for switching for eqiad-specific configuration" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30792 [19:45:32] New review: Anomie; "Rebase" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/30792 [19:46:46] !log kaldari synchronized wmf-config/CommonSettings.php 'updating settings for CentralNotice banner loading' [19:46:59] Logged the message, Master [19:48:28] New patchset: Jgreen; "fundraising db config cleanup" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30825 [19:49:50] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30825 [19:54:53] LeslieCarr: I finally got the rest of the hosts onto the fundraising ganglia multicast IP and it's working [19:56:36] RECOVERY - Puppet freshness on storage3 is OK: puppet ran at Tue Oct 30 19:56:12 UTC 2012 [19:57:42] New patchset: Jgreen; "more fundraising db config cleanup" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30826 [19:58:14] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours [19:58:14] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours [19:58:14] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [19:58:48] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30826 [19:59:36] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay seconds [19:59:41] RECOVERY - MySQL disk space on storage3 is OK: DISK OK [20:05:43] Jeff_Green: 
huzzah [20:05:44] :) [20:06:14] puppet gave me a heart attackackackackack though [20:06:32] New patchset: Jgreen; "last bit of fundraising db conf cleanup" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30828 [20:06:47] PROBLEM - Host storage3 is DOWN: PING CRITICAL - Packet loss = 100% [20:07:42] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30828 [20:08:28] RECOVERY - Host storage3 is UP: PING OK - Packet loss = 0%, RTA = 0.61 ms [20:10:29] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:19:21] New patchset: Kaldari; "Update settings for CentralNotice bannerloading" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30829 [20:19:57] Change merged: Kaldari; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30829 [20:21:44] !log kaldari synchronized wmf-config/CommonSettings.php 'updating wgCentralBannerDispatcher for CentralNotice' [20:21:57] Logged the message, Master [20:22:20] awight, mwalker: config updates complete [20:22:34] about to run scap if you guys are ready [20:22:41] yep yep [20:22:43] k [20:23:27] hold on to your seats [20:23:39] scap launched! [20:25:20] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.031 seconds [20:28:15] bigdelete on the way warning in #-tech [20:28:28] can someone say they're around to watch, etc.? [20:29:42] LeslieCarr: ^ ? [20:29:53] New patchset: Faidon; "Add IPv6 GeoIP support to Varnish" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30836 [20:30:11] or paravoid is around? ;) [20:31:01] paravoid: what about the "most IPv6 people are tunneled so geo will be wrong" concern? [20:31:16] ok [20:31:44] LeslieCarr: so he/she should pull the trigger? [20:31:46] jeremyb: it still applies. I don't recommend to use this. 
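Putting together mark's earlier suggestion (use JavaScript to choose where to do the lookup) with the IPv6 caveat paravoid notes above, the dual-stack fallback could look roughly like this Python stand-in for the client logic. Both hostnames and the `fetch` callable are illustrative assumptions, not real endpoints:

```python
def geo_lookup_with_fallback(fetch,
                             dual_stack_host="geoiplookup.example.org",
                             v4_only_host="geoiplookup.ipv4.example.org"):
    """Try the dual-stack lookup host first; if the (possibly IPv6)
    lookup finds no country -- e.g. an address missing from MaxMind's
    v6 database -- retry against an IPv4-only hostname.

    `fetch` is a caller-supplied function: hostname -> dict (or None).
    """
    geo = fetch(dual_stack_host) or {}
    if geo.get("country"):
        return geo
    # Fall back: a v4-only hostname forces the client to connect
    # over IPv4, whose geolocation data is more complete.
    return fetch(v4_only_host) or {}
```

The cost, as discussed above, is a possible extra hostname lookup on the slow path, which is exactly the banner-load-time trade-off fundraising was weighing.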
[20:31:58] jeremyb: I implemented this nevertheless, so as to give the fr-tech guys more options [20:32:13] heh, okey [20:36:08] honestly, I'm afraid to say "pull the trigger" jeremyb -- i'd rather wait until another ops person who's better at DB's is around [20:36:22] unless he's got a rate limiting script or something [20:37:30] !log kaldari Started syncing Wikimedia installation... : [20:37:30] LeslieCarr: erm? it's just clicking delete once on a page that is too big for a normal sysop to be permitted to delete. (there's a separate permission "bigdelete") [20:37:33] ah ok [20:37:36] LeslieCarr: so no rate limiting is possible [20:37:39] Logged the message, Master [20:37:51] i don't see a binasher or domas [20:38:06] idk who else knows DBs that well [20:38:08] hrm [20:38:14] Ryan? [20:38:52] idk, you tell me ;) [20:38:58] hehe [20:54:07] !log stopping mysql on db1013 for cloning [20:54:19] Logged the message, Master [20:54:57] Jeff_Green: do you know who I need to talk to in order to get something on bits purged? [20:55:12] not really [20:55:17] probably mark or asher? [20:55:40] mark: do you happen to know anything about purging resource loader entities on bits? [20:56:59] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:57:11] Jeff_Green: how comfortable are you with DBs? [20:57:22] 6! [20:57:46] errr, hrmmm [20:58:01] that's an index to some hashtable i guess ;) [20:58:15] on a scale of "1 to thrilling" [20:58:38] PROBLEM - mysqld processes on db1013 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [20:59:32] mark: actually; sorry ignore me, it looks like I'm still waiting for a scap run to complete for the resource loader things to invalidate [20:59:42] Jeff_Green: a steward's been waiting to run a page delete which may break the DB.
looking for someone to be around in case the DB falls over [21:00:11] i can't do it today--it's 5P here and i've got about 30 mins before I have parenting duties [21:00:25] oh, right you're in my TZ [21:00:31] ya--sorry [21:00:39] you're in NYC? [21:00:43] ya [21:01:01] you have power and you're online--that's good! [21:01:17] my microwave clock has not needed a reset [21:01:25] nice [21:01:36] but i saw some flickers [21:04:57] New patchset: Dzahn; "update star.wikipedia.org SSL cert, key stays the same (RT-3639)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30889 [21:06:59] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30889 [21:08:59] !log kaldari synchronized php-1.21wmf2/extensions/CentralNotice/CentralNotice.php 'updating centralNotice to autoload ApiCentralNoticeAllocations class' [21:09:13] Logged the message, Master [21:09:55] mutante: are you interested in DB babysitting? (see above) otherwise i guess i wait for someone else to come online [21:12:03] * jeremyb wonders why logmsgbot and gerrit-wm are both duplicated [21:13:39] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.018 seconds [21:14:50] !log kaldari synchronized php-1.21wmf2/extensions/CentralNotice 'updating centralNotice to autoload ApiCentralNoticeAllocations class' [21:15:06] Logged the message, Master [21:20:16] New patchset: Jgreen; "fixed typo ptmpa to pmtpa" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30894 [21:22:37] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30894 [21:23:41] !log kaldari Finished syncing Wikimedia installation... 
: [21:23:57] Logged the message, Master [21:24:32] !log kaldari synchronized php-1.21wmf2/extensions/CentralNotice 'updating centralNotice to autoload ApiCentralNoticeAllocations class' [21:24:46] Logged the message, Master [21:30:30] !log kaldari synchronized php-1.21wmf3/extensions/CentralNotice 'updating centralNotice to autoload ApiCentralNoticeAllocations class' [21:30:43] Logged the message, Master [21:31:14] Reedy: wmf3 is updated now as well. Thanks for the assistance! [21:31:47] New patchset: Jgreen; "changing substring match for fr-tech" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30895 [21:32:37] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30895 [21:35:27] New patchset: Reedy; "Cleaning InitialiseSettings.php" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/28656 [21:36:30] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/28656 [21:37:22] New patchset: Reedy; "Cleaning InitialiseSettings.php" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/28625 [21:38:58] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/28625 [21:40:32] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30436 [21:43:39] !log reedy synchronized wmf-config/InitialiseSettings.php [21:43:49] Logged the message, Master [21:45:05] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:47:02] !log restarting nginx on ssl hosts for cert update [21:47:15] Logged the message, Master [21:54:22] Change merged: Asher; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/29824 [21:57:29] !log asher synchronized wmf-config/mc.php 'testing MemcachedPeclBagOStuff on test2wiki' [21:57:42] Logged the message, Master [21:58:16] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 
Bad Request - 336 bytes in 0.192 seconds [21:58:35] testwiki: Memcached error for key "testwiki:messages:en:lock" on server "10.0.12.11:11000": SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY [21:58:58] poor test2wiki [21:59:12] that commit had the wrong ports heh [21:59:46] no one notices that? :D [21:59:53] New patchset: Asher; "fix memcached ports" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30903 [22:00:08] Change merged: Asher; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30903 [22:00:44] !log asher synchronized wmf-config/mc.php 'testing MemcachedPeclBagOStuff on test2wiki (serverlist fix)' [22:00:58] Logged the message, Master [22:01:00] better [22:02:02] okay, my deployment window [22:02:32] !log asher synchronized wmf-config/CommonSettings.php 'memcached logging group' [22:02:40] ok, all yours MaxSem [22:02:52] Logged the message, Master [22:02:56] :) [22:03:40] AaronSchulz: was it intended for both memcached clients to log to memcached-serious.log? [22:03:54] just the pecl one uses -serious [22:04:17] AaronSchulz: that doesn't seem to be the case [22:04:33] the php one is logging there too. on the other hand, at least its logging somewhere now [22:06:20] yeah, I see there are some debug calls in the old client to use -serious that I didn't notice [22:07:13] though we see the ip/ports so it doesn't really matter [22:07:58] sure are lots of timeouts [22:08:55] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [22:10:17] Aaron|home: when would you feel comfortable moving everything over to das pecl? [22:11:28] maybe we can move order a larger wiki first, and then do everything else [22:12:10] that could be wen+thur? 
[22:13:10] that sounds good to me [22:14:57] i'm going to failover the RE mastership on cr2-eqiad -- with graceful-restart routing it should be no impact [22:15:10] !log failing over RE mastership on cr2-eqiad to upgrade re1 [22:15:16] binasher: we might want to do the multiwrite thing though, so we can switch back [22:15:24] Logged the message, Mistress of the network gear. [22:15:30] Aaron|home: i think i'd like to try changing some settings [22:17:13] applyDefaultParams doesn't actually set defaults for everything [22:17:13] !log kaldari synchronized php-1.21wmf3/extensions/CentralNotice 'autoloading ApiCentralNoticeAllocations for CentralNotice' [22:17:13] Logged the message, Master [22:17:13] i think we should set compress_threshold to something [22:17:27] binasher: sure, and can change testwiki to use a multiwrite cache after that [22:17:52] do you just want to multiwrite testwiki to test multiwriting? [22:19:39] yes, I assume we will want to multiwrite to other wikis for a while before switching the read order and then removing the old caches from writes [22:19:43] PROBLEM - Host wikiversity-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::7 [22:20:38] LeslieCarr: is that alert scary? ^^^ [22:20:55] RECOVERY - Host wikiversity-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 112.46 ms [22:21:14] not too scary [22:21:16] looking [22:21:28] i think i've never seen it quite like that before [22:22:25] yeah [22:23:28] New patchset: Dzahn; "protoproxy - use star.wp SSL cert rather than "test-star"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30913 [22:23:28] ipv6? pfft [22:24:35] binasher: so compress_threshold is set to 1500 [22:24:37] who uses that? it's an old protocol, more than 14 years old... [22:24:50] yeah, we're switching to ipv7 [22:26:03] Aaron|home: ah, so it is.
hmm, large [22:26:33] kaldari, you're deploying in mobile team's window - please ask next time [22:27:04] MaxSem: sorry about that! [22:27:35] Aaron|home: setting Memcached::OPT_NO_BLOCK sounds desirable [22:27:40] should be finished now, will ping you next time [22:28:11] or whoever I need to ping for that window :) [22:29:15] Aaron|home: setting Memcached::OPT_SERVER_FAILURE_LIMIT to a positive integer might be good too, it defaults to 0 [22:29:50] Memcached::OPT_RETRY_TIMEOUT would have to be set to a positive integer at the same time [22:30:06] kaldari, no problems - we're testing right now so you didn't break anything, but better ask to avoid hurting Wikipedia:) [22:30:22] MaxSem: agreed [22:30:23] LeslieCarr: ipv7.net :) [22:31:56] binasher: so if you set(x) and then get(x) you may not see the changes? [22:32:55] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:34:27] New review: Dzahn; "RT-3639" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/30913 [22:34:28] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30913 [22:34:36] Aaron|home: there's a small chance of that [22:35:04] not that that should be happening in the same request [22:35:22] yeah, well I bet it is in lots of places ;) [22:35:28] PROBLEM - Host srv278 is DOWN: PING CRITICAL - Packet loss = 100% [22:36:23] not really down, but lost network? [22:36:49] RECOVERY - Host srv278 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms [22:36:50] actually, not down at all. [22:36:58] yeah, better [22:39:54] binasher: I've forgotten (again). Where's the database stats/profiling information? 
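binasher's question above ("if you set(x) and then get(x) you may not see the changes?") is about `Memcached::OPT_NO_BLOCK`, which buffers writes instead of applying them synchronously. A toy model in plain Python (no real memcached involved; this only illustrates the read-your-writes caveat):

```python
class BufferedWriteCache:
    """Toy model of a non-blocking cache client: set() queues the
    write, and the 'server' only sees it after a flush. This is why
    an immediate get() after set() can return stale data."""

    def __init__(self):
        self._server = {}   # what the server has actually applied
        self._pending = []  # writes buffered client-side

    def set(self, key, value):
        self._pending.append((key, value))  # queued, not applied

    def get(self, key):
        return self._server.get(key)  # reads bypass the write buffer

    def flush(self):
        for key, value in self._pending:
            self._server[key] = value
        self._pending.clear()
```

Here `set("x", 1)` followed by `get("x")` returns `None` until `flush()` runs, mirroring the small race that binasher and Aaron agree is unlikely within one request but not impossible.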
[22:40:08] Reedy: ishmael.wikimedia.org [22:40:22] thanks [22:41:37] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused [22:46:52] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 3.633 seconds [22:50:14] New patchset: Aaron Schulz; "Defined memcached multiwrite backend and switched the testwikis to use it." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30918 [22:50:48] binasher: ^ [22:52:29] Aaron|home: newMemcached is the old memcached? heh [22:52:50] well "new" as in "constructor-ish" [22:53:19] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.045 second response time [22:55:03] paravoid: https://gerrit.wikimedia.org/r/#/c/30919 [22:55:41] let me know if that IPv6 detection seems sane [22:59:55] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [23:03:15] Aaron|home: how do you feel about adding 'persistent' => true to the memcached-pecl array? although i'd be ok testing that later on [23:04:13] Tim wasn't planning on using that, but it should work (I remember ironing bugs testing it locally) [23:04:28] do you remember why? [23:04:32] I suppose we can try it [23:04:55] binasher: he thought we still had connection pooling disabled at the time [23:05:37] oh, i think i talked to him about it and was concerned about the total number of connections but it doesn't seem to be a problem based on total number of apache children we currently have [23:05:46] actually I thought that too at the time, though it turns out we were/are still using pooling with the old client [23:05:51] and when it would become a problem, it's probably udp time [23:06:09] we aren't currently, wgMemCachedPersistent = false [23:06:22] binasher: did you switch it off? [23:06:29] a while ago [23:06:39] when you were looking at the timeout value, right?
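The multiwrite migration plan discussed earlier (write every key to both the old and new memcached pools, read in a preferred order, and later flip that order before dropping the old pool from writes) can be sketched as follows, with plain dicts standing in for the two pools:

```python
class MultiWriteCache:
    """Writes go to every tier; reads fall through the tiers in
    order. During a migration the new pool is listed first for reads
    while both pools keep receiving writes, so the switch-over stays
    reversible, as Aaron notes above."""

    def __init__(self, tiers):
        self.tiers = list(tiers)  # e.g. [new_pool, old_pool]

    def set(self, key, value):
        for tier in self.tiers:
            tier[key] = value

    def get(self, key):
        for tier in self.tiers:
            if key in tier:
                return tier[key]
        return None
```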
[23:06:39] there's something fundamentally broken with it in the old client [23:08:16] !log replaced wikipedia SSL cert with new one by GeoTrust [23:08:31] Logged the message, Master [23:12:34] Aaron|home: i guess there's no point to using persistent connections in the new client right now either [23:13:21] we still have MaxRequestsPerChild 4 set, and apache children live for < 1 minute [23:14:48] yeah, we can look at it later [23:14:56] heads up: we've started receiving a certificate warning on mobile WP [23:14:56] Aaron|home: do you know if TimStarling ever enforced a hard 1.5mil pp-node limit? [23:14:59] mutante, ^^^ [23:15:16] Change merged: Asher; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30918 [23:16:02] the highest pp-node-count for an article is now frwiki: 1235773 Ore [23:16:14] MaxSem: before or after the change? [23:16:19] mutante, right now [23:16:20] and the next 100 highest are all on zhwiki [23:16:29] MaxSem: what does it say [23:16:29] Change merged: CSteipp; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30822 [23:16:34] binasher: yes [23:16:49] wrong host name [23:17:19] MaxSem: wth...did they not include the m. in this one? thanks..checking [23:17:33] mutante, the new cert is just *.wikipedia.org [23:17:40] arrg [23:17:57] it was supposed to replace the older one which had both [23:17:58] so it should be safe to set MaxRequestPerChild back to what it was [23:18:20] MaxSem: are you still in your deploy window? just checking [23:19:08] binasher, yes - for 40 minutes, about to run scap [23:19:08] i'm surprised to see lots of jobqueue deadlocks on JobQueueDB::claim just for wikidatawiki [23:19:32] Yeah.. I noticed those earlier [23:19:42] is wikidatawiki one of the only wikis using the new code? [23:19:55] err [23:19:56] When did we merge it..
[23:20:11] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:20:26] testwiki, test2wiki, mediawikiwiki and wikidatawiki are all on 1.21wmf3.. [23:21:04] New patchset: Dzahn; "do not use new SSL cert for mobile, does not include the .m... grrr" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30921 [23:21:08] MaxSem: fixing, should be gone soon [23:21:31] thanks [23:21:38] binasher: I think so, yeah [23:22:04] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30921 [23:25:46] getting reports of slow page saves and occasional timeouts on save on en.wiki: Wikipedia:Village_pump_(technical)#Very_slow_page_loading [23:25:51] Example error: [23:25:57] Request: POST http://en.wikipedia.org/w/index.php?title=Talk:Derek_McCulloch&action=submit, from 216.38.130.162 via cp1011.eqiad.wmnet (squid/2.7.STABLE9) to 10.64.0.132 (10.64.0.132) [23:25:57] Error: ERR_READ_TIMEOUT, errno [No Error] at Tue, 30 Oct 2012 23:22:30 GMT [23:28:16] Aaron|home: job_cmd_token index gap lock is at issue: RECORD LOCKS space id 0 page no 5563409 n bits 144 index `job_cmd_token` of table `wikidatawiki`.`job` trx id 4B9B23CD9 lock_mode X locks gap before rec insert intention waiting [23:28:16] * MaxSem runs scap [23:28:31] kaldari: thanks, going to take a look [23:28:39] insert? [23:28:59] * Aaron|home was looking at the wiki and all of its 9 jobs [23:29:20] nope, it's all on the JobQueueDB::claim update query [23:30:48] where is the "insert intention waiting" coming from? [23:33:21] kaldari: a lot of what's in that village pump report doesn't seem ops/perf related, but like very strange software behavior [23:33:40] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.966 seconds [23:34:31] binasher: I'll see if there's anything specific to those articles... [23:35:00] "While I'm editing, it suddenly decides to do a save without my telling it to.
Or, it will throw me out of the edit box altogether and not save anything." [23:36:28] binasher: yeah, no idea about that one [23:36:45] nothing currently looks amiss on the enwiki or es masters [23:37:17] yeah, the graphs look ok [23:38:32] eh, Warning: include_once(): Failed opening '/home/wikipedia/common/php-1.21wmf3/extensions/TocTree/TocTree.php' for inclusion [23:38:34] during scap [23:38:36] btw, Aaron|home: thank you for this! https://ishmael.wikimedia.org/sample/?hours=260&host=db63 [23:38:38] Reedy, ^^^ [23:38:54] that's based on the time of all queries running against the enwiki master, not actually slow ones [23:38:55] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [23:38:55] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [23:38:55] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [23:38:55] MaxSem: it's alright, ignore them [23:39:09] Reedy, I've already aborted:P [23:39:21] restarting... [23:40:12] Reedy, and Warning: file_put_contents(/home/wikipedia/common/wmf-config/ExtensionMessages-1.21wmf3.php): failed to open stream: Permission denied in /home/wikipedia/common/php-1.21wmf3/maintenance/mergeMessageFileList.php on line 119 [23:40:17] ? [23:40:33] Can ignore that one aswell [23:40:47] Though, I should ask someone to chown it [23:42:00] Can someone please run: chown mwdeploy:wikidev /home/wikipedia/common/wmf-config/ExtensionMessages-1.21wmf3.php [23:43:41] Reedy: done [23:43:47] thanks [23:43:55] np [23:44:10] !log maxsem Started syncing Wikimedia installation... : [23:44:25] Logged the message, Master [23:50:09] binasher: what do you think is with the deadlocks? 
[23:51:17] maybe it should select from a slave to get job_id and update on that (PK) [23:51:36] well select the other stuff while at it of course to make the object [23:52:18] if the select uses job_random, there shouldn't be too many jobs doing the same row [23:52:41] binasher: It doesn't seem like something silly, like not being in autocommit mode, I keep double checking the code [23:57:28] binasher: looks like it's some templates that are causing the extreme slowness, trying to isolate further.
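Aaron's suggestion above -- select a candidate job_id first (possibly from a slave), then claim it by primary key -- narrows the UPDATE to a single row, avoiding the gap lock InnoDB takes when the claim query ranges over the `job_cmd_token` index. A hedged sketch of that two-step claim using SQLite and a simplified `job` table (the real MediaWiki schema has more columns and runs on MySQL):

```python
import sqlite3

def claim_job(conn, token):
    """Two-step claim: SELECT one unclaimed job_id, then UPDATE by
    primary key with a job_token guard, so a concurrent claimer that
    won the race just leaves our rowcount at 0 instead of deadlocking
    on a range lock."""
    row = conn.execute(
        "SELECT job_id FROM job WHERE job_token = '' "
        "ORDER BY job_random LIMIT 1").fetchone()
    if row is None:
        return None  # queue is empty
    (job_id,) = row
    cur = conn.execute(
        "UPDATE job SET job_token = ? "
        "WHERE job_id = ? AND job_token = ''",
        (token, job_id))
    return job_id if cur.rowcount == 1 else None
```

Using `job_random` in the SELECT, as noted above, also spreads concurrent runners across different rows so they rarely even contend for the same claim.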