[12:01:38] New review: Dzahn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/12997
[12:01:38] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12997
[12:25:46] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours
[12:43:02] New patchset: Hashar; "detect cluster with /etc/wikimedia-realm" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12583
[12:43:23] New review: Hashar; "patchset 2 fix a typo in a comment." [operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/12583
[12:46:44] New review: Hashar; "(no comment)" [operations/apache-config] (master) C: -1; - https://gerrit.wikimedia.org/r/9874
[13:18:21] hey notpeter, you round?
[13:18:30] q for you about the lucene udp2log
[13:27:53] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours
[13:31:02] PROBLEM - Puppet freshness on ms-be5 is CRITICAL: Puppet has not run in the last 10 hours
[13:50:35] New patchset: Mark Bergsma; "An empty capabilities field is not allowed" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/13010
[13:50:36] New patchset: Mark Bergsma; "Fix __hash__ of MPReachNLRIAttribute" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/13011
[13:50:37] New patchset: Mark Bergsma; "Add Advertisement.__repr__ for easier debugging" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/13012
[13:50:38] New patchset: Mark Bergsma; "Add IPv6 BGP prefix advertisement support to PyBal" [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/13013
[13:51:33] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13010
[13:51:35] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/13010
[13:52:21] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13011
[13:52:24] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/13011
[13:53:15] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13012
[13:53:17] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/13012
[13:54:19] New review: Mark Bergsma; "(no comment)" [operations/debs/pybal] (mp-bgp); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13013
[13:54:21] Change merged: Mark Bergsma; [operations/debs/pybal] (mp-bgp) - https://gerrit.wikimedia.org/r/13013
[13:56:34] New review: Hashar; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/8344
[14:07:15] !log shutting down unused cp1037-cp1040 per RT-3189
[14:07:24] Logged the message, Master
[14:08:52] hmm.. getting "Retrieving file" as the only content on console output of one of them
[14:33:50] New patchset: Jgreen; "added file expiration-purge to dump_fundraisingdb" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13017
[14:34:22] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13017
[14:34:24] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 242 seconds
[14:34:31] New review: Jgreen; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13017
[14:34:33] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13017
[14:41:36] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 14 seconds
[14:49:08] New patchset: Matthias Mullie; "lower AFTv4 odds to kickstart AFTv5 at 1% (inverse odds)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13019
[15:44:55] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100%
[15:50:28] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.32 ms
[15:57:21] New patchset: Mark Bergsma; "Disable GRO for esams LVS servers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13028
[15:57:58] New patchset: Mark Bergsma; "Use the cp-varnish recipe for ms-be5 (for now)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13029
[15:58:34] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/13028
[15:58:34] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13028
[15:58:34] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/13028
[15:58:35] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13028
[15:58:35] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13029
[15:58:42] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/13029
[15:58:44] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13029
[15:59:45] New patchset: Hashar; "all `images` dirs were ignored" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13030
[16:00:26] New patchset: Hashar; "import images/sul/ images" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13031
[16:01:09] New review: Reedy; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13031
[16:01:27] New review: Reedy; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13030
[16:01:29] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13031
[16:01:30] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13030
[16:03:22] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100%
[16:07:19] New patchset: Hashar; "ignore xfc and gif files in /images/" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13032
[16:07:37] New review: Hashar; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13032
[16:07:39] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13032
[16:14:37] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.54 ms
[16:21:58] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100%
[16:23:01] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[16:53:31] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 1.26 ms
[17:06:25] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100%
[17:32:40] RECOVERY - SSH on ms-be5 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[17:32:49] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 1.90 ms
[17:38:57] New patchset: Bhartshorne; "adding a second swift cluster to labs to test the swift upgrade process." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13040
[17:39:30] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13040
[17:39:30] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13040
[17:39:33] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13040
[17:47:47] New patchset: Bhartshorne; "updating ms-be partman config after mark switched the SSDs to AHCI so they can boot." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13043
[17:48:18] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13043
[17:48:54] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13043
[17:48:56] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13043
[17:53:13] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100%
[17:58:46] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.52 ms
[18:01:55] PROBLEM - SSH on ms-be5 is CRITICAL: Connection refused
[18:05:37] New patchset: Pyoungmeister; "removing access to DBs for erosen" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13045
[18:06:08] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13045
[18:06:34] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/13045
[18:06:36] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13045
[18:09:16] RECOVERY - SSH on ms-be5 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[18:12:56] RobHalsell: woosters told me you'd be looking at https://rt.wikimedia.org/Ticket/Display.html?id=2996 . let me know if you have any questions
[18:13:19] robh is still sick today, he pinged me
[18:13:27] tfinc - sorry about that
[18:13:52] if his illness continues who else can look at this ?
[18:14:09] all that we need for now is the new dns entries
[18:14:17] i will have someone look at it if he is still out tomorrow
[18:14:20] yep
[18:14:22] thanks
[18:14:36] i understand u need it in 2 weeks time
[18:14:52] so, will make sure they are up before that
[18:15:28] New patchset: Pyoungmeister; "adding shell on fenari for erosen RT 3119" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13046
[18:16:01] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13046
[18:17:17] New review: Kaldari; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13019
[18:17:19] Change merged: Kaldari; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13019
[18:18:11] Change abandoned: Matthias Mullie; "Other method: inverse AFTv4 lottery" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12849
[18:21:10] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/13046
[18:21:12] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13046
[18:34:23] New patchset: Bhartshorne; "bumping / to 60G for swift backend nodes to store all the logs they generate - consider moving /var/log/ to a separate partition later." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13047
[18:34:55] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13047
[18:34:55] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13047
[18:39:00] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100%
[18:41:58] New patchset: Hashar; "(bug 37076) `lint` tool require php5-lint" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13048
[18:42:30] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13048
[18:43:08] paravoid: thanks for packaging php5-parsekit. The related puppet change is in https://gerrit.wikimedia.org/r/13048
[18:43:18] not urgent though :-]
[18:44:33] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.50 ms
[18:44:58] New review: Faidon; "Package['php5-parsekit'] is required but not defined" [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/13048
[18:46:23] hashar: ^ :)
[18:46:41] ahrarha
[18:47:13] :-)
[18:47:16] hashar: please provide IPA transcription for that statement ;)
[18:47:23] paravoid: isn't puppet clever enough to find out it needs to run apt-get install ? :-D
[18:47:42] nope
[18:47:42] PROBLEM - SSH on ms-be5 is CRITICAL: Connection refused
[18:48:00] you depend on puppet resources, not system resources
[18:48:04] ahh
[18:48:07] the usual mistake
[18:48:12] you know
[18:48:15] when you say File["/foo/bar"] you depend on the file { } definition, not the actual file
[18:48:18] wow, that commit msg is *so* much longer than the diff
[18:48:25] our conversations are more or less like watching a movie for the second time :-]
[18:49:09] paravoid: but some dirs have file resources autocreated if needed
[18:49:11] could I: file { "foo": source=> "bla", require { php5-parsekit: ensure => present } }
[18:49:51] i think not
[18:50:46] no
[18:51:28] so basically
[18:51:41] New patchset: Hashar; "(bug 37076) `lint` tool require php5-lint" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13048
[18:51:44] I am not even thinking about what I am doing while amending changes :-D
[18:52:03] the git commands just flow directly from some part of the brain straight to my fingers on the keyboard
[18:52:09] without me even noticing about it
[18:52:16] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13048
[18:52:16] (was talking and looking at my wife while doing it)
[18:52:18] weird
[18:52:26] does not make me better at puppetization though :-(
[18:52:48] paravoid: here the update https://gerrit.wikimedia.org/r/#/c/13048/2/manifests/misc-servers.pp,unified
[18:52:52] +» package { "php5-parsekit": ensure => present; }
[18:52:54] do you own puppets?
[18:53:16] jeremyb: my daughter has plenty :-]
[18:53:29] good
[18:56:34] RobHalsell: you around today ?
[18:56:44] LeslieCarr: sick
[18:56:49] oh
[18:57:18] PROBLEM - Puppet freshness on nfs2 is CRITICAL: Puppet has not run in the last 10 hours
[18:57:22] hashar: /me wonders why you don't use a name like php-lint? or lint-php? not the end of the world...
[18:57:42] that is the debian way to name packages
[18:57:50] so you could later install php6-parsekit
[18:58:03] (parsekit being the php extension name in pecl)
[18:58:11] that is a kit to parse … PHP
[18:58:11] no, not the package. the files installed by puppet
[18:58:26] which is also the only way to lint a php file I think
[18:59:00] though maybe we could use token_get_all()
[18:59:05] ahh
[18:59:31] ohh you mean 'lint' name ?
[18:59:36] that is historical I guess
[18:59:48] yeah. i guess you were just puppetizing what was already there
[18:59:49] feel free to submit a change to rename it and set a symlink for back compat
[19:00:22] well it looks like (at least within puppet at a glance) it's just used by scap
[19:01:03] RECOVERY - SSH on ms-be5 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[19:01:05] anyway, if that's the way it already is in prod then i guess it makes sense to do it as is for now. can submit a separate change
[19:01:12] PROBLEM - Puppet freshness on nfs1 is CRITICAL: Puppet has not run in the last 10 hours
[19:01:23] jeremyb: I try to submit one feature per change
[19:01:31] easier to have them reviewed and applied
[19:01:48] yeah, sure
[19:01:48] renaming lint might just create side effects in some script / cron which are not puppetized yet
[19:01:54] yeah
[19:14:53] hoarazrhoazrh ar
[19:14:57] (no IPA)
[19:15:07] so hmm
[19:15:16] either I use puppet and learn varnish
[19:15:22] or i install squid and hack a local conf
[19:15:33] well that is for labs anyway, moving out :-)
[19:57:29] RECOVERY - Puppet freshness on ms-be5 is OK: puppet ran at Tue Jun 26 19:57:02 UTC 2012
[19:58:32] RECOVERY - swift-account-server on ms-be5 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[19:58:41] RECOVERY - swift-object-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater
[19:58:41] RECOVERY - swift-container-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[19:58:50] RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater
[19:58:50] RECOVERY - swift-object-auditor on ms-be5 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor
[19:58:50] RECOVERY - swift-container-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator
[19:58:59] RECOVERY - swift-account-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor
[19:59:08] RECOVERY - swift-account-reaper on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper
[19:59:08] RECOVERY - swift-object-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator
[19:59:26] RECOVERY - swift-object-server on ms-be5 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server
[19:59:44] RECOVERY - swift-container-server on ms-be5 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[20:00:37] New patchset: Bhartshorne; "updating ms-be5 with the correct listing of disks" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13057
[20:01:09] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13057
[20:02:52] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13057
[20:02:55] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13057
[20:05:55] RECOVERY - swift-account-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator
[20:09:29] PROBLEM - swift-object-auditor on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor
[20:11:19] New review: Reedy; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/12377
[20:11:44] RECOVERY - NTP on ms-be5 is OK: NTP OK: Offset 0.0007705688477 secs
[20:13:53] hey woosters
[20:14:02] hey
[20:14:05] try drdee_
[20:14:13] evan does not need internproxy?
[20:14:39] i think he is fine if he has fenari
[20:14:52] anyone know why dumps.wikimedia.org is throwing 403's for everything ?
[20:15:00] internproxy is more restrictive
[20:15:03] is anyone else seeing the same issue ?
[20:15:06] i actually think that we should think about deprecating internproxy
[20:15:24] and use it as a general E2 / E3 data crunch machine
[20:15:35] evan is staff member
[20:15:36] 403s? no
[20:15:46] tfinc: looks ok for me..
[20:16:01] [tfinc@Fluffy:~]$ curl 'http://dumps.wikimedia.org' -I
[20:16:02] HTTP/1.1 403 Forbidden
[20:16:03] internproxy was built for the summer of research researchers
[20:16:05] and it is working for me
[20:16:25] well i'm getting it on *every* request now
[20:16:29] did i trigger some filter ?
[20:16:34] maybe
[20:16:43] do we have filters ?
[20:16:44] I mean we don't filter on the host
[20:16:59] Are you at the office? Get someone else to try?
[20:17:54] CT just tried it and it worked
[20:18:06] with curl?
[20:18:40] apergos: he did it in the web browser. i did it with curl as you see in the backscroll
[20:18:53] reedy@fenari:~$ curl 'http://dumps.wikimedia.org' -I
[20:18:53] HTTP/1.1 200 OK
[20:18:57] and now it works
[20:19:02] trying from toronto, it works as well
[20:19:08] same exact command
[20:19:28] do we throttle lighted in some way?
[20:19:33] lighttpd
[20:19:40] 2 conns per ip iirc
[20:19:48] and there is a bw limit
[20:19:51] Change abandoned: Hashar; "abandon for now." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12388
[20:19:55] we did this because we had some hoggers
[20:20:26] do you have other downloads going simultaneously?
[20:20:34] yup. 2012-06-26 20:20:18: (mod_evasive.c.183) 84.227.0.44 turned away. Too many connections.
[20:20:40] that's not my ip but confirms that we throttle
[20:21:06] well it's on the main index page
[20:21:21] been announced there since we added that measure some while back
[20:22:17] it would be better to still be able to see that index page when someone triggers it
[20:22:27] currently we serve 403's for every single page
[20:22:32] when it happens
[20:22:40] presumably the person triggering it already saw the notice, as they already have two downloads going
[20:23:12] if you bz it or rt it (I don't care which) and assign it to me I'll add that
[20:23:26] I won't right now because it's 11:30 at night and I'm in a different (non-work) meeting
[20:23:46] no rush
[20:23:48] thanks for the info apergos
[20:23:52] uh huh
[20:24:27] afkin this channel now... have a good day all
[20:28:41] RECOVERY - swift-object-auditor on ms-be5 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor
[20:33:32] New review: Hashar; "(no comment)" [operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/12185
[20:42:35] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100%
[20:43:10] New patchset: Pyoungmeister; "switching erosen account with admins::globaldev" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13063
[20:43:43] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13063
[20:43:50] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/13063
[20:43:52] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13063
[20:43:54] peter, where are you switching that?
[20:43:56] in which node?
[20:44:01] notpeter^^
[20:44:02] ?
[20:45:18] notpeter ^^
[20:45:31] fenari
[20:45:38] his user isn't being generated properly
[20:46:36] what. the fuck.
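Editor's note on the dumps.wikimedia.org throttling discussed from 20:19 to 20:23: the channel describes a cap of two connections per IP, a 403 on every page once the cap is hit, and a wish that the index page stay reachable. The sketch below is a toy Python model of that behaviour, not lighttpd's mod_evasive and not the production configuration. The class name, the exempt path, the download path and the example IP are all hypothetical.

```python
from collections import defaultdict


class PerIpLimiter:
    """Toy model of a per-IP connection cap (hypothetical, not mod_evasive)."""

    def __init__(self, max_conns_per_ip=2, exempt_paths=("/",)):
        self.max_conns_per_ip = max_conns_per_ip
        self.exempt_paths = set(exempt_paths)
        self.active = defaultdict(int)  # ip -> number of open connections

    def open(self, ip, path):
        """Return the HTTP status for a new request, taking a slot if admitted."""
        if path in self.exempt_paths:
            return 200  # index page is always served and does not take a slot
        if self.active[ip] >= self.max_conns_per_ip:
            return 403  # over the cap: turned away
        self.active[ip] += 1
        return 200

    def close(self, ip):
        """Release one slot when a download finishes."""
        if self.active[ip] > 0:
            self.active[ip] -= 1


limiter = PerIpLimiter()
# Three simultaneous downloads from one address: the third is refused.
statuses = [limiter.open("192.0.2.10", "/example/dump.xml.bz2") for _ in range(3)]
print(statuses)                         # [200, 200, 403]
# The index page is still served to the throttled client.
print(limiter.open("192.0.2.10", "/"))  # 200
```

Exempting the index path is the change requested at 20:22:17. Whether the real server can express that exemption is not established in this log.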
[20:47:05] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms
[21:03:17] New patchset: Bhartshorne; "changing xfs inode size for swift to 512, the default suggested" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13091
[21:03:49] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13091
[21:03:54] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13091
[21:03:56] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13091
[21:21:37] New patchset: Bhartshorne; "putting swift host ms-be5 back in rotation - now with SSDs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13093
[21:21:50] woosters: ^^^^
[21:21:57] also AaronSchulz ^^^
[21:22:07] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13093
[21:22:13] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13093
[21:22:15] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13093
[21:23:58] maplebed :-)
[21:57:00] PROBLEM - Puppet freshness on mw56 is CRITICAL: Puppet has not run in the last 10 hours
[22:02:13] maplebed: btw, did you start the container listing script on enwiki too?
[22:02:27] no, I haven't yet. moops.
[22:07:14] AaronSchulz: it's running now.
[22:22:25] I get errors trying to import images: Importing Fotothek-df_ge_0000192-Verona.jpg...failed. (Could not create directory "mwstore://local-NFS/local-public/archive/b/b8".)
[22:22:36] The mwscript URL should translate into /mnt/upload6/wikipedia/commons/archive/b/b8, right?
[22:23:28] I don't think so
[22:23:35] I think it would go somewhere on swift
[22:23:48] only thumbs are on swift.
[22:23:53] (so far)
[22:23:59] then yes
[22:24:10] something is broken on mediawiki there
[22:24:19] hashar had been looking at mwstore:// recently
[22:24:34] JeLuF: are you running it as apache?
[22:24:35] only a small part of the images cause import errors, most are fine
[22:24:35] not sure if it was the same issue
[22:24:44] Reedy: no, as jeluf
[22:24:52] run it as apache
[22:25:04] I've had weird errors, and running as apache dealt with them
[22:25:17] sudo -u apache mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Foo uploads/
[22:26:14] b8 has rwxr-xr-x+ permissions, b9 has rwxrwxrwx
[22:26:31] drwxr-xr-x+ 2 apache apache 2902 2012-06-26 17:50 b8
[22:26:31] drwxr-xr-x+ 2 apache apache 3205 2012-06-26 20:23 b9
[22:26:59] typo, b7 has rwxrwxrwx
[22:27:00] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours
[22:27:01] sorry
[22:27:25] yeah, not exactly sure why..
[22:30:38] ok, I'm not going to touch the permissions, running it as apache works fine
[22:31:36] So, just 100,000 files left to import :)
[22:31:57] heh
[22:36:12] could wrong permission of some sort (not those) stop the purging of files?
[22:36:20] which quite often happens
[22:56:12] Change abandoned: Jalexander; "done on https://gerrit.wikimedia.org/r/#/c/12863/" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/12706
[23:00:59] AaronSchulz: an interesting data point that may claim that the listing I'm doing has no effect on when you need to do a listing: http://pastebin.com/rAXXnwgH
[23:01:23] I'm gonna dig for other similar examples and see what they look like.
[23:24:00] OK, this is a ridiculous question, but why can't I ping github.com from labs?
[23:24:10] Or maybe just my labs instance
[23:25:03] Yes, just that one instance
[23:25:05] Awesome
[23:25:42] And anyway, I've joined the wrong channel. /embarrassed
[23:25:54] works for me
[23:26:08] you probably have some bad firewall rule
[23:28:57] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours
[23:31:19] New patchset: Tim Starling; "ExtensionDistributor conf update" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13097
[23:32:37] New review: Reedy; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/13097
[23:32:40] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13097
[23:43:31] maplebed: Labs instances seem unable to reach the outside world. Any idea if that's for some reason intentional?
[23:43:39] And/or has always been the case and I didn't notice?
[23:43:41] yes, I'm pretty sure it is.
[23:43:46] and I think it's always been the case.
[23:43:52] (I don't think it can have always been the case, since I was running an IRC proxy on labs a few weeks ago.)
[23:44:01] (which talked to freenode.)
[23:44:07] no hosts in our internal network can reach the public internet (almost)
[23:44:23] Hm.
[23:44:55] So e.g. doing a git checkout on a labs machine: not allowed?
[23:45:11] I've definitely done that plenty.
[23:46:12] PROBLEM - SSH on iron is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:47:00] maplebed, if Leslie is within earshot can you ask her to confirm?
[23:47:05] she's sick today.
[23:47:21] but our publicly IPed hosts have different rules from 'the public internet' in general.
[23:47:34] i.e. the difference is not whether it has a public IP but whether it's our host.
[23:48:09] Right, but... freenode, github, definitely not ours.
[23:48:19] yes.
[23:48:36] I think instances with public IPs are understandably different. maybe you were doing it on one of those?
[23:49:03] New patchset: Jalexander; "Add WikimediaShopLink settings" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/13099
[23:49:11] there are no instances with public IPs. just IPs forwarded. i think
[23:49:41] you sure?
[23:49:48] I don't think that's true.
[23:49:57] (but I'm fuzzy on many labs things.)
[23:50:15] There are instances with public IPs, but I was definitely not working on one.
[23:50:28] and i just tested it, i can get to the outside from gerrit.pmtpa.wmflabs
[23:50:51] ok, but that /is/ a machine with a public ip.
[23:51:13] not according to ip addr
[23:51:21] jeremyb: Can you confirm that you can't get to the outside from non-public labs machines?
[23:51:27] (Just to make sure I'm not hallucinating?)
[23:51:52] !rt foo
[23:51:52] http://rt.wikimedia.org/Ticket/Display.html?id=foo
[23:52:00] that bot that just responded is on labs
[23:52:27] Um... on an instance w/out a public IP?
[23:52:29] andrewbogott: I was just able to get to the public world from a privately-iped labs host.
[23:52:30] anyway, really have to go. can troubleshoot in ~60-90 mins
[23:52:38] (I tried on su-fe1 in the swiftupgrade project)
[23:52:50] my test was 'telnet google.com 80'
[23:53:14] mine was: curl -vs google.com >/dev/null
[23:53:15] second test, ping 128.32.136.9, also successful.
[23:53:16] On my instance I get a hang with the same command.
[23:53:21] * jeremyb disconnects
[23:53:42] so I guess the answer is just that you're special.
[23:53:43] :D
[23:53:50] Me and everyone in the wikimedia-labs channel.
[23:53:51] err... your instance is special.
[23:53:54] yeah, that's what I meant.
[23:58:03] RECOVERY - SSH on iron is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
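Editor's note on the outbound-connectivity tests at 23:52 ('telnet google.com 80' and 'curl -vs google.com'): one instance hung on the same command, so a probe with an explicit timeout may be easier to compare across instances. Below is a minimal Python sketch of such a probe, assuming only the standard library. The function name `can_reach` and the default timeout are hypothetical choices, not anything used in the channel.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Any failure (DNS error, refusal, timeout) is reported as False, so a
    firewalled instance returns False after the timeout instead of hanging.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_reach("google.com", 80)` roughly mirrors the telnet test. It only checks that a TCP handshake completes, so it cannot tell a bad firewall rule from a routing problem.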