[00:10:24] LionelScheepmans: i think it's fairly exact, but it's also possible [00:10:36] LionelScheepmans: can you tell us more about why it's so big? or link to it? [00:31:59] New patchset: Asher; "moving db26 from s1 to s7" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2076 [00:32:16] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2076 [00:32:17] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2076 [00:45:06] !log asher synchronized wmf-config/db.php 'pulling db32 - this will be the new enwiki pmtpa snapshot host' [00:45:07] Logged the message, Master [00:45:59] jeremyb, that's ok. I don't know why, but the size indicated by my computer was 101.8 while the size indicated by the uploader was around 98! Computing can be strange sometimes :) [00:46:30] hrmmm [00:46:59] ok! [00:49:47] LionelScheepmans: commons? name? [00:54:06] jeremyb : http://commons.wikimedia.org/wiki/File:Un_anthropologue_venu_des_p%C3%A8res_blancs.ogg [00:54:37] LionelScheepmans: could use some English ;-P [00:55:30] I didn't understand what you mean, jeremyb? [00:55:47] LionelScheepmans: the Description field is not translated [00:57:05] Ohhh yes, I'm going to do it right now ;) [00:57:12] are you sure it said 101.8, not 101,807,717? [00:57:17] they are not the same [01:00:56] I don't know exactly, jeremyb [01:00:58] $ python -c 'print "{:,}".format((2**20)*100)' [01:00:58] 104,857,600 [01:01:22] that should be the limit give or take a byte [01:03:52] But anyway, it's done! The only problem is that I made a mistake with the sound in the last part of the movie; it's a bit low... Tomorrow I will replace the file with a better version...
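Note on the sizes quoted above: a figure of 101.8 from one tool and a figure in the high 90s from another is what one would expect if the first counts decimal megabytes (10^6 bytes) and the second counts binary mebibytes (2^20 bytes). A minimal Python 3 sketch of that arithmetic follows. It assumes the file is 101,807,717 bytes and the upload limit is 100 * 2^20 bytes, as suggested in the conversation; the helper name `describe_size` is hypothetical.

```python
MB = 10**6    # decimal megabyte, often used by file managers
MIB = 2**20   # binary mebibyte (1,048,576 bytes)

# Assumed from the conversation: limit of 100 * 2**20 bytes.
UPLOAD_LIMIT = 100 * MIB
# Assumed from the conversation: the exact byte count of the file.
FILE_SIZE = 101_807_717


def describe_size(size_bytes):
    """Return the same byte count expressed in MB and in MiB."""
    return {
        "MB": round(size_bytes / MB, 2),
        "MiB": round(size_bytes / MIB, 2),
    }


print("{:,}".format(UPLOAD_LIMIT))   # 104,857,600
print(describe_size(FILE_SIZE))      # {'MB': 101.81, 'MiB': 97.09}
print(FILE_SIZE <= UPLOAD_LIMIT)     # True
```

Under those assumptions the file is "101.8" in one unit and about 97 in the other, and it fits under the limit either way.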
[01:05:07] bc says: scale=2; 101807717/2^20=97.09 [01:20:12] New patchset: Lcarr; "Pushing labs configuration into the main config module + making labs mirror the prod puppet repo configuration" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2077 [01:21:46] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2077 [01:23:53] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2077 [01:25:50] New review: Lcarr; "approving to wipe it out :)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2072 [01:25:51] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2077 [01:25:51] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2072 [01:30:24] New patchset: Lcarr; "removing conflicting apachesite definition" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2078 [01:30:40] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2078 [01:30:57] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2078 [01:43:54] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours [01:44:45] New patchset: Lcarr; "Revert "removing conflicting apachesite definition"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2079 [01:44:59] New patchset: Lcarr; "Revert "Pushing labs configuration into the main config module + making labs mirror the prod puppet repo configuration"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2080 [01:45:15] New patchset: Lcarr; "Revert "addingin cron job to sync software repo in labs"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2081 [01:45:36] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - 
https://gerrit.wikimedia.org/r/2081 [01:46:06] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2081 [01:46:15] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2080 [01:46:19] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2079 [01:46:19] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2079 [01:46:49] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2080 [01:46:56] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2081 [01:49:22] Got an error about "did you forget to run maintenance.php after upgrade?" when accessing ka.wikibooks.org, but it went away on refresh of the page [02:06:17] !log LocalisationUpdate completed (1.18) at Wed Jan 25 02:06:17 UTC 2012 [02:06:19] Logged the message, Master [02:08:39] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 809s [02:24:09] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1739s [02:34:09] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [02:38:49] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [03:04:29] PROBLEM - DPKG on db57 is CRITICAL: DPKG CRITICAL dpkg reports broken packages [03:18:09] PROBLEM - DPKG on db58 is CRITICAL: DPKG CRITICAL dpkg reports broken packages [04:18:53] RECOVERY - Disk space on es1004 is OK: DISK OK [04:20:53] RECOVERY - MySQL disk space on es1004 is OK: DISK OK [04:39:53] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No [04:53:03] PROBLEM - 
ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2413* [05:19:26] PROBLEM - ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2425* [05:39:26] PROBLEM - ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2425* [05:49:33] !log Disabling the processing of recurring payments in CiviCRM until we can add the appropriate payment_method to the queue msgs [05:49:36] Logged the message, Master [06:07:28] !log disabled queue consumption of payments in jenkins until stuck message can be removed from queue [06:07:30] Logged the message, Master [09:07:45] New patchset: ArielGlenn; "add new host for xml dump rsync mirror" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2082 [09:08:01] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2082 [09:09:03] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2082 [09:09:03] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2082 [09:41:48] New patchset: Pyoungmeister; "REVIEW REQUESTED major cleanup and refactoring and some parameterization of udp2log class, but should be no substantive changes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2083 [09:42:02] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." 
[operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/2083 [09:43:22] PROBLEM - Puppet freshness on copper is CRITICAL: Puppet has not run in the last 10 hours [09:43:22] PROBLEM - Puppet freshness on ms3 is CRITICAL: Puppet has not run in the last 10 hours [09:43:22] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [09:45:22] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours [09:48:22] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [09:49:22] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours [09:53:22] PROBLEM - Puppet freshness on ms-fe2 is CRITICAL: Puppet has not run in the last 10 hours [09:55:22] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [09:57:12] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 438405 MB (3% inode=99%): [09:57:22] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours [10:00:52] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 410996 MB (3% inode=99%): [10:01:22] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [10:07:22] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours [11:00:22] RECOVERY - MySQL slave status on es1004 is OK: OK: [11:33:52] TimStarling (or any other shellies) if you're around: https://bugzilla.wikimedia.org/show_bug.cgi?id=33900#c6 (It's for a workshop that is apparently on now) [11:38:57] PROBLEM - HTTP on formey is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:41:51] Hi :) Could someone have a look at https://bugzilla.wikimedia.org/show_bug.cgi?id=33900 as it needs a fix ASAP for the event today :) [11:42:27] PROBLEM - HTTPS on formey is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:43:54] !log formey oom (I
guess), unresponsive from mgmt console, powercycling. [11:43:55] Logged the message, Master [11:45:02] ugh [11:45:14] a bunch of errors about unreachable nameservers [11:48:47] RECOVERY - HTTP on formey is OK: HTTP OK HTTP/1.1 200 OK - 3596 bytes in 0.007 seconds [11:50:08] I'm getting an error message trying to download an extension via the distributor -- [11:50:10] Download MediaWiki extension [11:50:10] Invalid response from remote subversion client. [11:51:03] try again in about 2 minutes [11:51:15] k. [11:52:27] RECOVERY - HTTPS on formey is OK: OK - Certificate will expire on 08/22/2015 22:23. [11:53:17] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours [11:56:51] Still no-go on svn; I'll try again in 10-20 minutes. [11:59:31] that's worrying [12:23:11] svn working now. [12:23:41] ok great [13:19:53] PROBLEM - ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2450* [13:28:38] ^demon: https://bugzilla.wikimedia.org/show_bug.cgi?id=33900#c6 (It's for a workshop that is apparently on now) [13:29:34] <^demon> We do that in CommonSettings, yes? [13:30:11] <^demon> Already done it seems [13:30:12] <^demon> http://noc.wikimedia.org/conf/highlight.php?file=CommonSettings.php [13:30:22] <^demon> ctrl+f for 124.30.195.210 [13:31:33] <^demon> Oh, they want it changed from 210->211 [13:31:34] <^demon> Can do [13:33:24] !log demon synchronized wmf-config/CommonSettings.php 'Change IP address for bnwiki account creation throttle per bug 33900' [13:33:25] Logged the message, Master [13:38:40] * Nemo_bis wonders how long the workshop is [14:30:49] New review: Dzahn; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2083
[15:13:11] I broke the Internet!!111 ;-) JFYI: Database error - function "LinksUpdate::incrTableUpdate". The database reported the error "1205: Lock wait timeout exceeded; try restarting transaction (10.0.6.41)". From "http://commons.wikimedia.org/wiki/Template:Location/ExternalLink". This happened when updating a heavily used geolocation template [15:13:38] but it doesn't seem to cause any problems [16:37:59] New patchset: Pyoungmeister; "REVIEW REQUESTED major cleanup and refactoring and some parameterization of udp2log class, but should be no substantive changes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2083 [16:38:25] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." [operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/2083 [16:47:27] New patchset: Pyoungmeister; "REVIEW REQUESTED major cleanup and refactoring and some parameterization of udp2log class, but should be no substantive changes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2083 [17:56:59] New patchset: RobH; "added new simple shell script for ipmi mgmt" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2084 [17:57:15] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2084 [18:04:57] New patchset: Ottomata; "Documentation and brainstorming, mostly of Observation class." [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2085 [18:12:20] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2084 [18:14:28] WMF localisation team office hour happening now on #wikimedia-office [18:45:36] RECOVERY - Host knsq8 is UP: PING OK - Packet loss = 0%, RTA = 109.38 ms [18:45:38] RECOVERY - Host knsq29 is UP: PING OK - Packet loss = 0%, RTA = 109.68 ms [18:45:38] RECOVERY - Host knsq24 is UP: PING OK - Packet loss = 0%, RTA = 109.29 ms [18:45:38] TimStarling, RobH, or anyone, are you on that?
[18:45:56] seems to be bouncing back up [18:46:02] checkin [18:46:08] http://ganglia.wikimedia.org/2.2.0/?c=Bits%20caches%20eqiad&h=niobium.wikimedia.org&m=load_one&r=hour&s=by%20name&hc=4&mc=2 [18:46:14] niobium is getting hammered [18:46:26] RECOVERY - Host wikisource-lb.esams.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 109.36 ms [18:46:40] johnduhart: bet it's the varnish bug again… logging in to restart that [18:47:39] root@niobium:~# uptime [18:47:39] 18:47:29 up 78 days, 1:24, 2 users, load average: 12670.67, 7487.67, 3122.59 [18:47:40] teehee [18:47:45] !log restarted varnish on niobium [18:47:49] damn [18:47:59] this just random or you guys doin network stuffs? [18:48:04] haha [18:48:08] i'll keep talking on just this channel [18:48:09] LeslieCarr: lol [18:48:20] doh, logging bot dead again [18:48:28] !log morebots are you there? [18:48:29] so basically we turned multicast on for some logging down in tampa [18:48:29] I'll bring it back [18:48:36] we have old crappy core switches down there [18:48:43] so they flipped out and stopped forwarding packets [18:49:54] !log Restarted morebots [18:50:01] Logged the message, Mr. Obvious [18:50:19] varnish has a bug where, when its connection goes down, it spirals out of control [18:50:34] and sadly i restarted it before asher could check it out and try to fix the bug (doh!) [18:50:41] !log restarted varnish on niobium [18:50:43] Logged the message, Mistress of the network gear. [18:50:52] heh [18:50:56] RECOVERY - Host wikiquote-lb.esams.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 109.60 ms [18:50:56] o_O [19:38:49] New review: Rich Smith; "(no comment)" [analytics/reportcard] (master) C: 1; - https://gerrit.wikimedia.org/r/2073 [19:41:26] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2073 [19:41:26] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2073 [19:42:03] New review: Diederik; "Ok."
[analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2074 [19:42:03] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2074 [19:43:04] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2085 [19:43:04] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2085 [19:44:58] !log updating TrustedXFF host list using fenari [19:45:00] Logged the message, Master [19:46:34] New patchset: Bhartshorne; "configuring lvs to front swift" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2087 [19:46:47] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." [operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/2087 [19:53:22] PROBLEM - Puppet freshness on copper is CRITICAL: Puppet has not run in the last 10 hours [19:53:22] PROBLEM - Puppet freshness on ms3 is CRITICAL: Puppet has not run in the last 10 hours [19:53:22] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [19:55:22] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours [19:58:22] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [19:59:22] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours [20:03:22] PROBLEM - Puppet freshness on ms-fe2 is CRITICAL: Puppet has not run in the last 10 hours [20:05:21] !log on srv197: temporarily disabled puppet and enabled core dumps in apache2.conf for segfault flood investigation [20:05:22] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [20:05:23] Logged the message, Master [20:07:22] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours [20:11:22] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet 
has not run in the last 10 hours [20:15:37] New patchset: Bhartshorne; "configuring lvs to front swift" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2087 [20:15:53] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2087 [20:16:00] New patchset: Asher; "thread_pool_max was way too high (setting changed meaning in 3.0, became max * pools) try to catch backend failures quicker and better serve stale content mobile frontend probes now test backend varnish, not apaches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2088 [20:16:16] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2088 [20:16:19] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2087 [20:17:05] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2087 [20:17:05] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2087 [20:18:34] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2088 [20:20:30] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2088 [20:20:31] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2088 [20:20:57] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours [20:53:17] PROBLEM - Host knsq22 is DOWN: PING CRITICAL - Packet loss = 100% [20:54:47] RECOVERY - Host knsq22 is UP: PING WARNING - Packet loss = 80%, RTA = 109.52 ms [20:55:17] PROBLEM - Host knsq16 is DOWN: PING CRITICAL - Packet loss = 100% [20:55:34] New patchset: Ottomata; "Moved Observation class to observation.py, created test directory and added unit tests for new Observation class. 
Need to work on UserAgentObservation next" [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2089 [20:56:07] PROBLEM - Host knsq21 is DOWN: PING CRITICAL - Packet loss = 100% [20:56:07] PROBLEM - Host ns2.wikimedia.org is DOWN: PING CRITICAL - Packet loss = 100% [20:57:07] RECOVERY - Host knsq21 is UP: PING WARNING - Packet loss = 37%, RTA = 110.94 ms [20:57:42] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2089 [20:57:42] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2089 [20:58:57] PROBLEM - Host knsq18 is DOWN: PING CRITICAL - Packet loss = 100% [21:00:37] RECOVERY - Host knsq16 is UP: PING WARNING - Packet loss = 50%, RTA = 110.15 ms [21:01:17] RECOVERY - Host knsq18 is UP: PING OK - Packet loss = 0%, RTA = 109.54 ms [21:04:17] RECOVERY - Host ns2.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 109.46 ms [21:08:06] New patchset: Asher; "thread_pool_min must always be < max" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2090 [21:08:24] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2090 [21:08:25] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2090 [21:10:45] New patchset: Asher; "what a differnce a % makes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2091 [21:11:01] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2091 [21:11:02] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2091 [21:11:02] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2091 [21:23:16] !log on srv197: compiled and installed a local version of wmerrors for segfault investigation [21:23:17] Logged the message, Master [21:25:53] PROBLEM - DPKG on db1029 is 
CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:32:33] PROBLEM - Host ms-fe.pmtpa.wmnet is DOWN: PING CRITICAL - Packet loss = 100% [21:33:43] PROBLEM - Apache HTTP on srv197 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:35:53] RECOVERY - DPKG on db1029 is OK: All packages OK [21:42:53] PROBLEM - Host amssq52 is DOWN: PING CRITICAL - Packet loss = 100% [21:44:23] RECOVERY - Host amssq52 is UP: PING WARNING - Packet loss = 61%, RTA = 108.91 ms [21:46:03] PROBLEM - Host knsq17 is DOWN: PING CRITICAL - Packet loss = 100% [21:46:43] RECOVERY - Host knsq17 is UP: PING WARNING - Packet loss = 28%, RTA = 109.30 ms [21:47:33] PROBLEM - Host bits.esams.wikimedia.org_https is DOWN: PING CRITICAL - Packet loss = 100% [21:48:23] PROBLEM - Host bits.esams.wikimedia.org is DOWN: PING CRITICAL - Packet loss = 100% [21:50:03] PROBLEM - BGP status on csw2-esams is CRITICAL: CRITICAL: No response from remote host 91.198.174.244, [21:50:03] RECOVERY - Host bits.esams.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 109.36 ms [22:00:03] RECOVERY - BGP status on csw2-esams is OK: OK: host 91.198.174.244, sessions up: 4, down: 0, shutdown: 0 [22:01:43] RECOVERY - Host bits.esams.wikimedia.org_https is UP: PING OK - Packet loss = 0%, RTA = 114.58 ms [22:02:15] !log awjrichards synchronized wmf-config/InitialiseSettings.php 'Setting up FeaturedFeeds config; disabled by default' [22:02:16] Logged the message, Master [22:03:03] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours [22:05:04] !log awjrichards synchronized wmf-config/CommonSettings.php 'Setting up FeaturedFeeds config; disabled by default' [22:05:05] Logged the message, Master [22:07:17] !log awjrichards synchronized wmf-config/CommonSettings.php 'fixed spelling mistake fore FeaturedFeeds configuration' [22:07:19] Logged the message, Master [22:31:53] New patchset: Lcarr; "Make test puppet repo act like production (pull from git)" [operations/puppet] 
(production) - https://gerrit.wikimedia.org/r/2096 [22:32:35] !log awjrichards synchronizing Wikimedia installation... : Dark-deploying FeaturedFeeds [22:32:36] Logged the message, Master [22:35:08] sync done. [22:38:44] New patchset: Lcarr; "Make test puppet repo act like production (pull from git)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2096 [22:42:31] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/2096 [22:46:21] New patchset: Pyoungmeister; "adding cp1001-cp1020 as text squids" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2097 [22:49:06] New patchset: Bhartshorne; "first stab at getting ms-be into puppet" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2098 [22:50:15] !log awjrichards synchronizing Wikimedia installation... : Enabling FeaturedFeeds everywhere [22:50:16] Logged the message, Master [22:52:00] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2097 [22:52:01] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2097 [22:52:01] sync done. [22:54:12] New patchset: Lcarr; "Make test puppet repo act like production (pull from git)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2096 [22:55:10] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2098 [22:55:10] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2098 [22:56:21] New review: Mark Bergsma; "The git::clone generic definition doesn't actually work this way. Check the comments in the previous..." 
[operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2096 [22:58:40] New patchset: Pyoungmeister; "also needed" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2099 [23:00:33] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2096 [23:00:47] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2096 [23:01:04] Change abandoned: Pyoungmeister; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2099 [23:05:30] !log awjrichards synchronized wmf-config/CommonSettings.php 'Checking if wmgDisplayFeedsInSidebar === false rather than true, since it defaults to true in the install file' [23:05:32] Logged the message, Master [23:12:58] RECOVERY - Puppet freshness on owa1 is OK: puppet ran at Wed Jan 25 23:12:36 UTC 2012 [23:12:58] RECOVERY - Puppet freshness on copper is OK: puppet ran at Wed Jan 25 23:12:51 UTC 2012 [23:21:57] New patchset: Lcarr; "Make test puppet repo act like production (pull from git)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2096 [23:22:58] RECOVERY - Puppet freshness on owa2 is OK: puppet ran at Wed Jan 25 23:22:32 UTC 2012 [23:22:58] RECOVERY - Puppet freshness on zinc is OK: puppet ran at Wed Jan 25 23:22:32 UTC 2012 [23:22:58] RECOVERY - Puppet freshness on magnesium is OK: puppet ran at Wed Jan 25 23:22:32 UTC 2012 [23:22:58] RECOVERY - Puppet freshness on owa3 is OK: puppet ran at Wed Jan 25 23:22:46 UTC 2012 [23:22:58] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 25 23:22:56 UTC 2012 [23:22:59] RECOVERY - Puppet freshness on ms3 is OK: puppet ran at Wed Jan 25 23:22:57 UTC 2012 [23:24:58] RECOVERY - Puppet freshness on ms-fe1 is OK: puppet ran at Wed Jan 25 23:24:56 UTC 2012 [23:26:33] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2096 
[23:26:50] are we having issues with irc.wikimedia? [23:29:18] PROBLEM - Host srv189 is DOWN: PING CRITICAL - Packet loss = 100% [23:29:28] RECOVERY - Puppet freshness on ms-fe2 is OK: puppet ran at Wed Jan 25 23:29:12 UTC 2012 [23:31:38] New patchset: Mark Bergsma; "Fix purge exec" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2100 [23:31:57] New patchset: Mark Bergsma; "Add platform support for Dell PowerEdge C2100 in base.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2101 [23:32:08] New patchset: Pyoungmeister; "nagios monitoring group and also some realphebitizing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2102 [23:32:22] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2100 [23:32:23] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2100 [23:33:16] New patchset: Mark Bergsma; "Add platform support for Dell PowerEdge C2100 in base.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2101 [23:33:43] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2101 [23:35:19] !log awjr synchronized CiviCRM on aluminium to r1211 [23:35:21] Logged the message, Master [23:35:47] !log re-enabled queue consumption for payments through Jenkins [23:35:49] Logged the message, Master [23:35:58] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Wed Jan 25 23:35:38 UTC 2012 [23:59:09] gn8 folks
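Note for anyone post-processing this log: the monitoring bot's service alerts follow a regular shape ("PROBLEM - <service> on <host> is <state>:" and later "RECOVERY - <service> on <host> is OK:"), so outage durations can be estimated by pairing them. The following is a minimal Python 3 sketch, assuming that shape holds and that all timestamps fall on the same day; the names `ALERT_RE` and `outage_durations` are hypothetical, and host-level alerts ("Host X is DOWN") are deliberately not handled.

```python
import re
from datetime import datetime, timedelta

# Matches service alerts only; the service name may not contain
# '[', ']' or ':' so a match cannot run across two messages.
ALERT_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\] (PROBLEM|RECOVERY) - "
    r"([^\[\]:]+?) on (\S+) is (\w+):"
)


def outage_durations(log_text):
    """Pair each PROBLEM with the next RECOVERY for the same service/host."""
    open_since = {}
    durations = []
    for match in ALERT_RE.finditer(log_text):
        stamp, kind, service, host, _state = match.groups()
        when = datetime.strptime(stamp, "%H:%M:%S")
        key = (service, host)
        if kind == "PROBLEM":
            # Keep the first PROBLEM time if the alert repeats.
            open_since.setdefault(key, when)
        elif key in open_since:
            durations.append((service, host, when - open_since.pop(key)))
    return durations


# Four lines copied from the formey incident earlier in this log.
sample = (
    "[11:38:57] PROBLEM - HTTP on formey is CRITICAL: CRITICAL - Socket timeout after 10 seconds "
    "[11:42:27] PROBLEM - HTTPS on formey is CRITICAL: CRITICAL - Socket timeout after 10 seconds "
    "[11:48:47] RECOVERY - HTTP on formey is OK: HTTP OK HTTP/1.1 200 OK - 3596 bytes in 0.007 seconds "
    "[11:52:27] RECOVERY - HTTPS on formey is OK: OK - Certificate will expire on 08/22/2015 22:23."
)

for service, host, duration in outage_durations(sample):
    print(service, host, duration)
# HTTP formey 0:09:50
# HTTPS formey 0:10:00
```

The durations measure the gap between alert messages, not the real outage window, since the checks only run periodically.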