[00:10:25] Feb 2 00:09:31 10.0.11.9 apache2[15047]: PHP Fatal error: Call to a member function setTimestamp() on a non-object in /usr/local/apache/common-local/php-1.18/extensions/FeaturedFeeds/FeaturedFeeds.body.php on line 257 [00:16:11] AaronSchulz, it's always supposed to be set in the constructor.. [00:16:19] I'd log it as a bug as MaxSem isn't about atm [00:16:30] or you can log it :) [00:16:44] I didn't find it :p [00:17:42] Done [00:24:12] New patchset: Bhartshorne; "adding in country filters for mobile using new udp_filter framework" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2185 [00:24:28] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2185 [00:24:46] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2185 [00:24:47] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2185 [00:27:28] New patchset: Bhartshorne; "typo correction" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2186 [00:27:54] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2186 [00:27:55] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2186 [00:48:40] New patchset: Asher; "fix innodb statistic collection (at long last), update more frequently, add query rate" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2187 [00:57:41] PROBLEM - Mobile WAP site on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:05:41] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours [01:05:41] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours [01:09:11] RECOVERY - Mobile WAP site on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 1642 bytes in 9.440 seconds [01:10:52] zzz =____= 
[01:20:22] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2187 [01:20:22] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2187 [01:22:49] !log running "mwscriptwikiset purgeDeletedFiles.php all.dblist --starttime=20120126000000" in a screen on fenari [01:22:52] Logged the message, Master [01:43:21] PROBLEM - Mobile WAP site on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:53:11] PROBLEM - Disk space on srv222 is CRITICAL: DISK CRITICAL - free space: / 98 MB (1% inode=60%): /var/lib/ureadahead/debugfs 98 MB (1% inode=60%): [01:58:08] New patchset: Bhartshorne; "disabling 'check-apache' command for swift since it actually checks ssh, not apache." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2188 [01:58:26] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2188 [01:58:26] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2188 [01:58:26] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2188 [02:06:49] !log LocalisationUpdate completed (1.18) at Thu Feb 2 02:06:48 UTC 2012 [02:06:50] Logged the message, Master [02:07:11] RECOVERY - Mobile WAP site on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 1642 bytes in 9.412 seconds [02:15:51] RECOVERY - Disk space on srv222 is OK: DISK OK [02:18:31] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1452s [02:26:31] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1932s [02:38:01] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 9s [02:41:11] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 19s [02:52:21] PROBLEM - Mobile WAP site on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:03:51] RECOVERY - Puppet freshness on spence is OK: puppet ran at Thu Feb 2 03:03:27 UTC 2012 [03:04:42] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours [03:22:43] RECOVERY - Mobile WAP site on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 1642 bytes in 9.630 seconds [03:32:53] PROBLEM - mysqld processes on db56 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [03:37:33] RECOVERY - Puppet freshness on mw8 is OK: puppet ran at Thu Feb 2 03:37:21 UTC 2012 [04:16:10] RECOVERY - Disk space on es1004 is OK: DISK OK [04:18:40] RECOVERY - MySQL disk space on es1004 is OK: DISK OK [04:44:20] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No [08:51:51] PROBLEM - MySQL Slave Delay on db24 is CRITICAL: CRIT replication delay 682 seconds [09:04:42] Request: POST 
http://www.mediawiki.org/wiki/Special:Watchlist, from 208.80.152.69 via sq61.wikimedia.org (squid/2.7.STABLE9) to () [09:04:42] Error: ERR_CANNOT_FORWARD, errno [No Error] at Thu, 02 Feb 2012 09:04:25 GMT [09:06:43] Umm [09:07:02] I'm getting more of these [09:07:09] Request: GET http://wikimediafoundation.org/wiki/Template_talk:Staff_and_contractors, from 208.80.152.87 via sq65.wikimedia.org (squid/2.7.STABLE9) to () [09:07:10] Error: ERR_CANNOT_FORWARD, errno (11) Resource temporarily unavailable at Thu, 02 Feb 2012 09:06:51 GMT [09:07:16] same [09:07:22] Error: ERR_CANNOT_FORWARD, errno [No Error] [09:07:29] <-- Vancouver, Canada [09:07:46] apergos ? [09:07:49] nothing is happening [09:08:06] huh? [09:08:09] can we fire closedmouth into the sun [09:08:21] apergos, multiple reports of errors, seems global [09:08:24] ok [09:09:56] I would bring up the blamewheel... but yeah... [09:10:18] hello [09:10:20] i blame arbcom [09:10:34] I get several error messages on Commons [09:10:46] Request: POST http://commons.wikimedia.org/w/index.php?title=File:ND_guitar.jpg&action=delete, from 91.198.174.55 via sq65.wikimedia.org (squid/2.7.STABLE9) to () [09:10:46] Error: ERR_CANNOT_FORWARD, errno [No Error] at Thu, 02 Feb 2012 09:09:54 GMT [09:10:52] commons too eh [09:11:29] Request: POST http://commons.wikimedia.org/w/index.php?title=File:Gnomon_Logo.svg&action=submit, from 91.198.174.42 via sq39.wikimedia.org (squid/2.7.STABLE9) to () [09:11:29] Error: ERR_CANNOT_FORWARD, errno [No Error] at Thu, 02 Feb 2012 09:11:19 GMT [09:12:01] PROBLEM - MySQL Slave Delay on db13 is CRITICAL: CRIT replication delay 1339 seconds [09:12:11] PROBLEM - MySQL Slave Delay on db54 is CRITICAL: CRIT replication delay 317 seconds [09:12:31] apergos, some requests seem to go through, and some others don't. 
[09:12:44] things on s2 are failing I guess [09:13:44] now i'm getting errno (11) Resource temporarily unavailable [09:13:50] guillom: I am presuming that it means it is broken, it is being worked upon, be patient [09:13:55] I get nothing [09:14:08] sDrewthedoff, right; ap.ergos is on it [09:15:02] back [09:15:12] front [09:15:13] and down again [09:15:19] k i'm off to bewd [09:15:22] bed [09:15:32] better be fixed by morning!! [09:15:35] ;P [09:16:38] here it is morning [09:16:46] btw it's saying "via sq65.wikimedia.org (squid/2.7.STABLE9) to ()" [09:16:49] if that means anything [09:16:57] PROBLEM - LVS HTTP on appservers.svc.pmtpa.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:17:23] apergos, I'm going to tweet something about it, unless you tell me it's going to be fixed imminently [09:17:30] please do so [09:17:31] happening all across the property [09:17:34] OlEnglish, that's the Squid cache id [09:17:34] ok [09:17:45] how the hell am I going to find something to do? [09:17:47] ;-) [09:18:06] yup i'm seeing both on en:wp and meta: i'm getting Request: GET http://en.wikipedia.org/wiki/Chew_Stoke, from 208.80.152.86 via sq66.wikimedia.org (squid/2.7.STABLE9) to () [09:18:06] OlEnglish, probably not the source of the problem [09:18:07] Error: ERR_CANNOT_FORWARD, errno (11) Resource temporarily unavailable at Thu, 02 Feb 2012 09:16:53 GMT [09:18:15] sDrewth: How about you look that up on wikip... oh wait... [09:18:26] please fix topic [09:18:46] matanya: it is [09:19:30] p858snake|l: no source, no commons, no meta, no pedia ... my life is empty and I may need to talk to my family [09:19:48] if things get really desperate [09:19:53] don't be crazy, just turn on the TV [09:19:59] talking to people in rl? that is desperate [09:20:13] sDrewth, lol [09:20:15] SEE! My point [09:20:22] Can you discuss that somewhere other than the tech channel? The ops dudes actually use this for work and stuff. 
[09:22:07] chillax, bro [09:22:42] hi [09:22:47] RECOVERY - ps1-d2-pmtpa-infeed-load-tower-A-phase-Z on ps1-d2-pmtpa is OK: ps1-d2-pmtpa-infeed-load-tower-A-phase-Z OK - 1200 [09:22:47] PROBLEM - Apache HTTP on mw8 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:23:04] guillom, ? [09:23:27] tegra, they don't know how long, someone is onto it [09:23:32] pinging someone with a question mark is not helpful [09:23:47] PROBLEM - MySQL Replication Heartbeat on db24 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:23:48] yes sorry [09:23:51] tegra, ? [09:24:01] i read topic :) [09:24:10] ok [09:24:11] Request: GET http://outreach.wikimedia.org/wiki/Special:RecentChanges, from 91.198.174.43 via sq66.wikimedia.org (squid/2.7.STABLE9) to () [09:24:14] Error: ERR_CANNOT_FORWARD, errno [No Error] at Thu, 02 Feb 2012 09:11:51 GMT [09:24:31] ah, topic... [09:24:46] mmm [09:24:46] should be on the first place [09:25:17] PROBLEM - Apache HTTP on srv270 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:26:27] PROBLEM - Apache HTTP on mw35 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:26:39] seems work now. [09:26:43] !log tstarling synchronized php-1.18/includes/SiteStats.php [09:26:46] Logged the message, Master [09:26:54] !log disabling SiteStatsInit::articles [09:26:56] Logged the message, Master [09:27:27] PROBLEM - MySQL Replication Heartbeat on db54 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:27:57] RECOVERY - LVS HTTP on appservers.svc.pmtpa.wmnet is OK: HTTP OK HTTP/1.1 200 OK - 57953 bytes in 0.155 seconds [09:28:01] thx a.pergos and g.uillom [09:28:09] !log restarting all apaches [09:28:10] Logged the message, Master [09:29:00] wikipedia seems slow atm. [09:29:17] PROBLEM - Apache HTTP on srv247 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:29:45] ok now. 
[09:30:17] PROBLEM - Apache HTTP on mw14 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:30:18] PROBLEM - Apache HTTP on mw31 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:30:27] PROBLEM - Apache HTTP on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:32:19] !log on db54, killed all SiteStats queries [09:32:21] Logged the message, Master [09:33:18] looks good [09:33:57] RECOVERY - Apache HTTP on mw8 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.026 second response time [09:34:14] !log tstarling synchronized php-1.18/includes/SiteStats.php 'disabling even more' [09:34:15] Logged the message, Master [09:35:35] !log killing statistics queries on all s2 slave servers [09:35:37] Logged the message, Master [09:36:27] RECOVERY - Apache HTTP on srv270 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.021 second response time [09:37:37] RECOVERY - Apache HTTP on mw35 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.033 second response time [09:38:21] thank you tim [09:38:37] RECOVERY - MySQL Replication Heartbeat on db54 is OK: OK replication delay 0 seconds [09:40:27] RECOVERY - Apache HTTP on srv247 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.024 second response time [09:41:37] RECOVERY - Apache HTTP on mw31 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.037 second response time [09:41:47] RECOVERY - Apache HTTP on mw14 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.025 second response time [09:42:07] RECOVERY - Apache HTTP on mw13 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.026 second response time [09:43:37] RECOVERY - MySQL Slave Delay on db54 is OK: OK replication delay 0 seconds [09:44:28] closedmouth: no offense meant, I just have enough trouble figuring out the strictly tech stuff all by itself. 
:) [09:45:27] RECOVERY - MySQL Slave Delay on db13 is OK: OK replication delay 0 seconds [09:46:17] RECOVERY - MySQL Replication Heartbeat on db24 is OK: OK replication delay 0 seconds [09:48:07] RECOVERY - MySQL Slave Delay on db24 is OK: OK replication delay 0 seconds [09:57:19] is there a tech meet in Berlin this spring? [09:57:31] Yes [09:57:35] Dates to be announced soon [09:57:56] Or at least that's the rumors I hear [09:58:01] it's not mentioned on http://meta.wikimedia.org/wiki/Wikimedia_Conference_2012 [09:58:18] is it another date than the chapters conference? [09:58:22] This year, the tech meeting is decoupled from the chapters conference [09:58:24] yes [09:58:37] PROBLEM - Disk space on srv223 is CRITICAL: DISK CRITICAL - free space: / 158 MB (2% inode=60%): /var/lib/ureadahead/debugfs 158 MB (2% inode=60%): [09:58:40] btw, I'm coming to fosdem [09:59:34] Cool [09:59:43] There's a bunch of Wikimedians going [10:00:48] !log reinserted the deleted site_stats row for plwiki [10:00:49] Logged the message, Master [10:01:23] what's the name of this German centralized data project? 
[10:01:55] ah, http://meta.wikimedia.org/wiki/WikiData_WMDE [10:11:27] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 446525 MB (3% inode=99%): [10:14:15] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 432545 MB (3% inode=99%): [10:22:55] RECOVERY - Disk space on srv223 is OK: DISK OK [10:27:41] New patchset: Mark Bergsma; "Discard mailman bounces" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2189 [10:28:09] New patchset: Mark Bergsma; "Add removal support for system roles" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2175 [10:28:25] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2175 [10:28:26] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2175 [10:28:43] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2189 [10:28:44] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2189 [10:31:31] New patchset: Mark Bergsma; "It's :blackhole: instead of :discard:" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2190 [10:31:59] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2190 [10:31:59] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2190 [11:17:18] New patchset: ArielGlenn; "rsyncd stanza for folks mirroring all public content from dumps.wm.o" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2191 [11:18:38] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2191 [11:18:39] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2191 [11:23:02] PROBLEM - Puppet freshness on lvs1006 is 
CRITICAL: Puppet has not run in the last 10 hours [11:23:02] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours [11:27:20] New patchset: Mark Bergsma; "Add new role::cache::squid::upload class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2192 [11:27:37] New patchset: Mark Bergsma; "Rename service IPs to be consistent" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2193 [11:27:53] New patchset: Mark Bergsma; "Add TODO" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2194 [11:28:08] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2193 [11:28:27] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2192 [11:28:28] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2192 [11:28:54] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2193 [11:28:55] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2193 [11:29:16] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2194 [11:29:16] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2194 [11:40:42] RECOVERY - MySQL slave status on es1004 is OK: OK: [11:48:16] New patchset: Hashar; "GraphViz needed for Jenkins installation" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2196 [11:51:52] okay, who broke code review [11:51:53] 4 PHP Catchable fatal error: Method MailAddress::__toString() must return a string value in /usr/local/apache/common-local/php-1.18/includes/UserMailer.php on line 128 [12:23:09] PROBLEM - Puppet freshness on mw65 is CRITICAL: Puppet has not run in the last 10 hours [12:59:49] PROBLEM - Disk space 
on srv221 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=60%): /var/lib/ureadahead/debugfs 0 MB (0% inode=60%): [13:07:34] New patchset: Mark Bergsma; "Make lvs::realserver support hashes as well as arrays for service IPs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2204 [13:07:52] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2204 [13:08:40] New patchset: Mark Bergsma; "Merge LVS changes from test into production" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2204 [13:08:58] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2204 [13:09:22] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2204 [13:09:23] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2204 [13:11:06] !log catrope synchronized php-1.18/extensions/LocalisationUpdate/LocalisationUpdate.class.php 'Deploy live-hacked version that will hopefully fix bug 33768' [13:11:08] Logged the message, Master [13:11:09] RECOVERY - Disk space on srv221 is OK: DISK OK [13:12:54] !log Running l10nupdate by hand to hopefully fix bug 33768 [13:12:55] Logged the message, Mr. Obvious [13:13:12] siebrand: Here goes ---^^ this will hopefully fix things, or it might make them worse [13:13:46] I've backed up the l10nupdate cache just in case [13:17:58] New patchset: Mark Bergsma; "Use a dynamic lookup for now" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2205 [13:18:15] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2205 [13:18:22] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2205 [13:18:22] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2205 [13:23:51] * siebrand crosses fingers. [13:24:03] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours [13:24:24] RoanKattouw: you've initiated the LU run manually now? [13:24:30] Yes [13:24:33] It's taking forever [13:24:44] RoanKattouw: define forever? :) [13:27:03] Starting l10nupdate at Thu Feb 2 13:13:59 UTC 2012. [13:27:12] So it's been 13 minutes now [13:28:02] New patchset: Mark Bergsma; "Refactor, remove duplication" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2206 [13:28:20] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2206 [13:29:22] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2206 [13:29:22] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2206 [13:33:37] New patchset: Mark Bergsma; "A hash is unordered, so sort values to avoid constant Puppet changes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2207 [13:33:55] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2207 [13:33:55] It's probably taking this long because it needs to write a lot more data now [13:34:20] Aha, it's syncing now, and failing to do so [13:34:32] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2207 [13:34:32] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2207 [13:34:53] !log LocalisationUpdate completed (1.18) at Thu Feb 2 13:34:53 UTC 2012 [13:34:55] Logged the message, Master [13:35:20] !log Running sync-l10nupdate again to investigate rsync errors [13:35:22] Logged the message, Mr. Obvious [13:38:55] !log Fixing ownership of /usr/local/apache/common-local/php-1.18/cache/l10n on srv191, srv199, srv219-224 [13:38:57] Logged the message, Mr. Obvious [13:43:20] New patchset: Mark Bergsma; "Remove old system roles, preparing for migration of role classes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2208 [13:43:23] PROBLEM - Mobile WAP site on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:43:31] ZOMg [13:43:37] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2208 [13:43:39] srv222: rsync: write failed on "/usr/local/apache/common-local/php-1.18/cache/l10n/l10nupdate-ab.cache": No space left on device (28) [13:43:41] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2208 [13:43:41] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2208 [13:44:06] !log srv219-224 have a full disk according to rsync [13:44:08] Logged the message, Mr. Obvious [13:44:14] PROBLEM - HTTP on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:45:31] !log Deleting /tmp/mw-cache-1.17 on srv219 and srv223 [13:45:33] Logged the message, Mr. 
Obvious [13:49:22] !log Deleting /home/wikipedia/common/php-1.17-test , has been unused for a long time [13:49:24] Logged the message, Mr. Obvious [13:54:46] New patchset: Mark Bergsma; "Cleanup Squid monitoring" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2209 [13:55:23] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2209 [13:55:23] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2209 [13:58:38] hi [13:59:30] !log catrope synchronizing Wikimedia installation... : Deleted php-1.17-test on fenari, running scap to delete it on the Apaches as well [13:59:31] Logged the message, Master [14:00:20] hi where can i find .net prog. chanels? [14:03:20] sync done. [14:07:38] New patchset: Mark Bergsma; "Rename squid role caches and nagios groups for consistency with varnish" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2210 [14:07:55] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2210 [14:08:08] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2210 [14:08:08] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2210 [14:09:52] !log Scalers now have disk space available because php-1.17-test is gone [14:09:53] Logged the message, Mr. Obvious [14:10:17] !log Finally fixed ownership of cache/l10n on scalers , sync-l10nupdate only throws the expected errors, no more perms errors on the scalers [14:10:18] Logged the message, Mr. 
Obvious [14:12:42] <^demon> CR is triggering PHP Catchable fatal error: Method MailAddress::__toString() must return a string value in /usr/local/apache/common-local/php-1.18/includes/UserMailer.php on line 128 [14:18:25] New patchset: Mark Bergsma; "Migrate varnish cache::bits and cache::mobile classes to role::cache in role/cache.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2211 [14:18:43] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2211 [14:19:25] !log catrope synchronized php-1.18/extensions/LocalisationUpdate/LocalisationUpdate.class.php 'r110570' [14:19:27] Logged the message, Master [14:20:10] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2211 [14:20:10] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2211 [14:21:50] !log hashar synchronized php-1.18/includes/UserMailer.php 'work around bug 34158' [14:21:52] Logged the message, Master [14:22:27] ^demon: Nikerabbit I have synced the change against blank page on code review. [14:22:40] <^demon> I don't like that fix :\ [14:22:43] <^demon> I left CR. [14:23:11] that is a backport from trunk [14:23:19] does not fix bug 34158 though [14:23:28] <^demon> Yes, I left a comment that I didn't like the fix in trunk. [14:23:29] i have no idea why CR try to notify an empty address :( [14:23:54] <^demon> Well that's what's worth finding out. [14:24:08] <^demon> We should throw an exception instead so we get a stack trace. 
[14:28:26] New review: Dzahn; "moved puppetmaster monitoring out of site.pp" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2166 [14:28:26] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2166 [14:30:27] New review: Dzahn; "yep, graphviz needed" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2196 [14:30:28] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2196 [14:47:31] New review: Dzahn; "you are deleting nightly.css but still using it?" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2174 [14:49:02] New review: Demon; "(no comment)" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/2174 [15:01:05] PROBLEM - Disk space on srv220 is CRITICAL: DISK CRITICAL - free space: / 1 MB (0% inode=64%): /var/lib/ureadahead/debugfs 1 MB (0% inode=64%): [15:03:32] wow, 1MB left – that's enough to copy a WHOLE floppy over ;) [15:10:29] !log reedy synchronized php-1.18/extensions/CodeReview/api/ 'r110574' [15:10:30] Logged the message, Master [15:11:55] RECOVERY - Disk space on srv220 is OK: DISK OK [15:14:31] New patchset: Mark Bergsma; "Split ridiculously long lines in lvs.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2212 [15:14:48] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2212 [15:15:05] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2212 [15:15:06] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2212 [15:17:25] RECOVERY - HTTP on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 9.383 seconds [15:17:49] sigh, srv220 is out of space AGAIN? [15:20:45] RoanKattouw: I guess the etchs. 
have to find another server for their porn-collection ;) [15:23:05] PROBLEM - Disk space on srv223 is CRITICAL: DISK CRITICAL - free space: / 1 MB (0% inode=64%): /var/lib/ureadahead/debugfs 1 MB (0% inode=64%): [15:25:24] New review: Hashar; "Please note nightly.css was *renamed* not deleted :" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/2174 [15:27:13] New patchset: Mark Bergsma; "Remove old LVS services and old service IPs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2213 [15:28:36] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2213 [15:28:37] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2213 [15:31:09] New patchset: Mark Bergsma; "Remove old payments LVS service as well, rename the new one" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2214 [15:31:26] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2214 [15:31:47] New patchset: Mark Bergsma; "Qualify all $site references in lvs.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2215 [15:33:45] RECOVERY - Disk space on srv223 is OK: DISK OK [15:35:58] New patchset: Dzahn; "redirect http://irc.wikimedia.org to IRC meta page (currently "it works" page)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2216 [15:36:16] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2216 [15:37:03] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2214 [15:37:04] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2214 [15:37:33] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2215 [15:37:33] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2215 [16:05:50] New review: Dzahn; "re: asterisk in : ah,ok so Apache will expand that? re: renamed file: yes, i see. nevermi..." [operations/puppet] (production); V: 1 C: 1; - https://gerrit.wikimedia.org/r/2174 [16:06:09] New patchset: Mark Bergsma; "Migrate esams text squids to new role classes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2217 [16:06:31] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2217 [16:07:56] New patchset: Mark Bergsma; "Migrate esams squids to new role classes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2218 [16:08:24] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2217 [16:08:25] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2217 [16:13:51] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2218 [16:13:52] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2218 [16:21:00] New patchset: Mark Bergsma; "Migrate remaining squids to new role classes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2219 [16:21:57] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - 
https://gerrit.wikimedia.org/r/2219 [16:21:58] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2219 [16:22:16] mutante: the star in match anything but / [16:22:34] mutante: I need a better long term fix though [16:22:46] alright [16:22:51] mutante: will do that later on by factorising common code and use an Include [16:24:15] gotcha [16:24:21] New patchset: Mark Bergsma; "Revert "Migrate remaining squids to new role classes"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2220 [16:24:39] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2220 [16:24:39] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2220 [16:24:44] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2220 [16:24:44] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2220 [16:25:10] who knows enough German and git to fix https://github.com/filbertkm/Scholarships/blob/master/languages/de.php from https://github.com/filbertkm/Scholarships ? [16:25:24] knows German [16:25:35] i'll take a look [16:26:03] German translation is in a sorry state - it's last year's and needs to be updated to match https://github.com/filbertkm/Scholarships/blob/master/languages/en.php [16:26:16] you can fork it on github if you have an account there [16:26:35] New patchset: Mark Bergsma; "Revert "Revert "Migrate remaining squids to new role classes""" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2221 [16:26:51] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2221 [16:26:52] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2221 [16:26:52] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2221 [16:27:34] saper: ok, it doesnt even look that bad at first glance [16:27:53] hashar: then lets merge your change? [16:28:11] New patchset: Mark Bergsma; "Let's not convert live text squids into upload squids" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2222 [16:28:12] mutante: no, it's just not up to date and there are some new messages [16:28:24] mutante: probably fine yes :) [16:28:28] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2222 [16:28:33] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2222 [16:28:34] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2222 [16:28:37] aude is quite fast at merging and deploying changes (it's her repo) [16:29:29] New review: Dzahn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2174 [16:29:30] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2174 [16:29:59] hashar: done [16:31:23] thx! [16:31:30] mutante: don't bother running puppet [16:31:34] it is not urgent [16:31:52] unless it is fast to do, this way we can verify the change in production :D [16:34:19] hashar: Applying configuration version '1328200389' .. [16:34:39] and finished [16:38:06] New patchset: Hashar; "fix nightly.css relative path" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2223 [16:38:20] mutante: I have screwed the nightly.css relative path by forgetting a ../ [16:38:25] change 2223 should fix it [16:38:32] the rest is fine. 
Thanks for the merge! [16:42:35] New review: Dzahn; "done:)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2223 [16:42:36] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2223 [16:44:42] hashar: it's applied [16:46:01] and of course I have changed the wrong part :-(((((((((( [16:46:59] uh? where [16:47:09] you just wanted one level up [16:47:11] I have changed the apache IndexStyleSheet directive [16:47:13] which is not used [16:47:17] oh [16:47:21] should have changed the ones in the .html files [16:47:43] RECOVERY - RAID on db13 is OK: OK: 1 logical device(s) checked [16:47:52] i see, do you want to use IndexStyleSheet ? [16:48:02] na it is not needed anymore. I am pushing a change [16:48:35] then remove it? [16:48:45] k [16:49:15] New patchset: Hashar; "fix up nightly.css" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2224 [16:49:17] yeah I am removing the apache directive to avoid confusion [16:49:31] I have used that at the very beginning [16:49:33] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2224 [16:49:35] yep [16:49:40] and replaced it with the HEADER.html [16:49:46] so 2224 should fix it now [16:50:20] Are there known problems with translated messages? [16:50:35] Like wmf cluster running on old messages? [16:51:32] New review: Dzahn; "looks good, not using IndexStyleSheet" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2224 [16:51:33] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2224 [16:53:41] hashar: check again [16:54:31] mutante: all four URLs work!! [16:54:35] great [16:55:07] thanks! [16:55:24] saper: i forked it on github... 
will add some translations [16:55:32] hashar: yw [16:56:03] PROBLEM - Host text.pmtpa.wikimedia.org is DOWN: PING CRITICAL - Packet loss = 100% [16:56:11] New patchset: Mark Bergsma; "Cleanup: remove obsolete old role classes text-squid and upload-squid" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2225 [16:56:29] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2225 [16:57:07] mutante: cool [16:57:22] you will see some commits from me on pl and some emergency changes on de [16:57:46] saper: so i will look at the diff between "en" and "de" and add the missing messages [16:58:22] thanks, should be fast [17:02:55] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2225 [17:02:56] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2225 [17:08:14] New patchset: Mark Bergsma; "Migrate swift cluster role classes into role/swift.pp with role::prefix" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2226 [17:08:32] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2226 [17:08:43] PROBLEM - Backend Squid HTTP on knsq25 is CRITICAL: Connection refused [17:09:04] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2226 [17:09:53] PROBLEM - Puppetmaster HTTPS on virt0 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 8140: HTTP/1.1 403 Forbidden [17:14:22] Hi folks! [17:14:51] I need help. [17:15:05] I just got my access to pywikipedia SVN [17:15:19] I installed TortoiseSVN for Windows, checked out Pywikipedia and userinfo. Then I successfully committed my USERINFO. So the system works. 
But when I wanted to commit to Pywikipedia, I got this error message and action failed: Commit failed (details follow): Access to '/svnroot/pywikipedia/!svn/act/9ba1fec7-3f3a-1c43-84b7-fe6a3a100ce1' forbidden Where is the error? [17:16:47] RECOVERY - Mobile WAP site on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 1642 bytes in 8.909 seconds [17:17:17] New patchset: Mark Bergsma; "Migrate swift cluster role classes into role/swift.pp with role::prefix" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2226 [17:17:35] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2226 [17:19:00] binbot, sounds like you haven't been given access to the pywikipedia repo [17:19:47] PROBLEM - Disk space on srv219 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=64%): /var/lib/ureadahead/debugfs 0 MB (0% inode=64%): [17:20:44] That's what I am afraid of, too. [17:21:42] Reedy: Sumana told me my access has been created, and I asked her and she told me to come here. [17:25:08] binbot: make sure you checked out pywikipedia using your svn+ssh url, not the https url [17:25:48] if you already checked out using https, you need to use svn switch --relocate (or whatever the tortoise equivalent is) [17:25:52] or delete and check out again. [17:26:05] (this bites me time and time again) [17:27:31] Oh, you must be right! After I committed to userinfo, I thought it would be OK now, and didn't listen to the protocol again. [17:30:57] RECOVERY - Disk space on srv219 is OK: DISK OK [17:32:48] Daniel_WMDE: Full success! [17:32:55] Thank you very much! [17:37:37] New patchset: Bhartshorne; "adding in the final country filter for mobile for diederik" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2228 [17:37:55] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2228 [17:38:12] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2228 [17:38:13] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2228 [17:43:57] PROBLEM - Host cp1002 is DOWN: PING CRITICAL - Packet loss = 100% [17:45:24] Reedy, was my eol-style not good? [17:45:41] I have set it in SVN as written in manual [17:45:50] Yeah, but you're committing an extensionless file [17:45:56] And svn doesn't add it to those [17:45:58] Not a big deal :) [17:46:30] Ahh. [17:46:48] There is a USERINFO = svn:eol-style=native line in, and I thought it would do it. [17:47:03] hmm [17:47:18] How can I set it for userinfo next time? [17:47:37] I don't think you can do it automatically [17:47:41] svn propset svn:eol-style native nameOfFile [17:47:41] Surely there won't be a next time? [17:48:04] Could you please check http://www.mediawiki.org/wiki/Special:Code/pywikipedia/9852 for EOL? [17:48:16] This must be good, as .py is in settings [17:48:21] Look at the bottom, yeah, it's fine [17:48:24] But I am not sure any more [17:48:56] Thank you. [17:48:57] PROBLEM - HTTP on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:49:16] I was told that __version__ = '$Id$' would become a version number in script, but it didn't [17:49:31] need svn:keywords Id setting [17:50:24] I have this line in SVN settings: *.h = svn:keywords=Author Date Id Rev URL;svn:eol-style=native [17:50:31] Should .py look the same? [17:50:45] I thought it would be the repository's task [17:52:49] RoanKattouw: thanks, I found it in Tortoise [17:53:02] Next time I update it should be OK. [17:57:29] New patchset: ArielGlenn; "up bw limit for rsyncers on downloads.wm.o" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2229 [17:57:45] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2229 [17:58:27] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2229 [17:58:28] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2229 [18:14:58] Another problem: [18:15:39] I wanted to commit again, and Tortoise says it can't be done because my working copy is already locked on C: [18:15:59] But when I try to release the lock it says nothing is locked, nothing to release [18:16:21] svn cleanup? [18:26:31] New patchset: Dzahn; "simple SMS pager script as a starter" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2230 [18:29:07] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 1; - https://gerrit.wikimedia.org/r/2230 [18:29:42] New patchset: Dzahn; "simple SMS pager script as a starter, missed a /" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2230 [18:29:59] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2230 [18:41:43] PROBLEM - Mobile WAP site on ekrem is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:59] vvv: Thank you, succeeded! [18:52:06] np [18:52:29] I just don't understand why Tortoise asks me for login name at each commit [18:52:52] did you checkout with svn+ssh://username@svn..... [18:53:33] RECOVERY - Host cp1002 is UP: PING OK - Packet loss = 0%, RTA = 30.89 ms [19:08:04] !log asher synchronized wmf-config/db.php 'add db55 - new s5 slave' [19:08:06] Logged the message, Master [19:15:31] !log asher synchronized wmf-config/db.php 'raising db55 weight' [19:15:32] Logged the message, Master [19:15:47] A bot to control another bot. How efficient. 
[19:16:48] New patchset: Asher; "fix string formatting of mysql version" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2231 [19:17:05] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2231 [19:17:09] TBloemink: You know that it does that for the visibility in here, don't you? :P [19:20:07] :S [19:20:19] TBloemink: in the next step, another bot should replace the human who does the stuff ;) [19:20:26] Efficient [19:37:41] RECOVERY - Mobile WAP site on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 1642 bytes in 5.761 seconds [19:40:16] Reedy: thank you, relocated. This is my first day with real SVN. [19:43:03] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2231 [19:43:04] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2231 [19:51:16] binasher, would it be worth doing a db graph for the ES boxes like the one you've done for core? [19:51:32] Reedy: definitely [19:52:01] i'm going to iterate on that [19:52:21] cool [19:52:31] PROBLEM - Backend Squid HTTP on cp1003 is CRITICAL: Connection refused [19:52:31] PROBLEM - Frontend Squid HTTP on cp1006 is CRITICAL: Connection refused [19:52:31] PROBLEM - Backend Squid HTTP on cp1009 is CRITICAL: Connection refused [19:52:31] PROBLEM - Backend Squid HTTP on cp1015 is CRITICAL: Connection refused [19:52:31] PROBLEM - Frontend Squid HTTP on cp1012 is CRITICAL: Connection refused [19:52:32] PROBLEM - Frontend Squid HTTP on cp1018 is CRITICAL: Connection refused [19:52:34] and possibly visualize it differently [19:52:48] was just thinking the ES boxes were even more unknown [19:52:59] its been pointed out that many people can't scroll horizontally from their mice.. 
silly non-trackpad users [19:53:08] I can [19:53:12] Click mouse wheel [19:53:15] move left/right [19:53:24] the ES layout completely changed not too long ago too [19:54:07] <^demon> binasher: You mean everyone hasn't joined the wonderful world of the magic trackpad? [19:54:51] or even the magic retro trackball! [19:56:21] PROBLEM - Backend Squid HTTP on cp1010 is CRITICAL: Connection refused [19:56:31] PROBLEM - Frontend Squid HTTP on cp1007 is CRITICAL: Connection refused [19:56:41] PROBLEM - Frontend Squid HTTP on cp1019 is CRITICAL: Connection refused [19:56:41] PROBLEM - Backend Squid HTTP on cp1004 is CRITICAL: Connection refused [19:57:11] PROBLEM - Frontend Squid HTTP on cp1001 is CRITICAL: Connection refused [19:57:21] PROBLEM - Frontend Squid HTTP on cp1013 is CRITICAL: Connection refused [19:58:31] PROBLEM - Backend Squid HTTP on cp1016 is CRITICAL: Connection refused [19:58:31] PROBLEM - Frontend Squid HTTP on cp1009 is CRITICAL: Connection refused [19:58:31] PROBLEM - Frontend Squid HTTP on cp1020 is CRITICAL: Connection refused [19:58:31] PROBLEM - Backend Squid HTTP on cp1011 is CRITICAL: Connection refused [19:58:41] PROBLEM - Frontend Squid HTTP on cp1003 is CRITICAL: Connection refused [19:58:42] PROBLEM - Backend Squid HTTP on cp1012 is CRITICAL: Connection refused [19:58:51] PROBLEM - Backend Squid HTTP on cp1006 is CRITICAL: Connection refused [19:58:51] PROBLEM - Frontend Squid HTTP on cp1002 is CRITICAL: Connection refused [19:59:01] PROBLEM - Backend Squid HTTP on cp1017 is CRITICAL: Connection refused [19:59:12] PROBLEM - Frontend Squid HTTP on cp1014 is CRITICAL: Connection refused [19:59:31] PROBLEM - Frontend Squid HTTP on cp1008 is CRITICAL: Connection refused [19:59:50] cp? 
[20:00:11] PROBLEM - Backend Squid HTTP on cp1005 is CRITICAL: Connection refused [20:00:11] PROBLEM - Frontend Squid HTTP on cp1004 is CRITICAL: Connection refused [20:00:11] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [20:00:11] PROBLEM - Frontend Squid HTTP on cp1010 is CRITICAL: Connection refused [20:00:11] PROBLEM - Backend Squid HTTP on cp1013 is CRITICAL: Connection refused [20:00:21] PROBLEM - Backend Squid HTTP on cp1007 is CRITICAL: Connection refused [20:00:31] PROBLEM - Backend Squid HTTP on cp1019 is CRITICAL: Connection refused [20:00:41] PROBLEM - Frontend Squid HTTP on cp1016 is CRITICAL: Connection refused [20:02:11] DaBPunkt, cache proxy [20:02:11] PROBLEM - Frontend Squid HTTP on cp1015 is CRITICAL: Connection refused [20:02:31] PROBLEM - Backend Squid HTTP on cp1018 is CRITICAL: Connection refused [20:02:44] Reedy: so sq was renamed to cp? [20:03:05] For new installs, yes [20:03:12] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [20:03:16] Ah, caching proxy [20:03:22] PROBLEM - Backend Squid HTTP on cp1014 is CRITICAL: Connection refused [20:03:42] PROBLEM - Backend Squid HTTP on cp1020 is CRITICAL: Connection refused [20:03:51] PROBLEM - Frontend Squid HTTP on cp1017 is CRITICAL: Connection refused [20:04:01] PROBLEM - Frontend Squid HTTP on cp1005 is CRITICAL: Connection refused [20:04:01] PROBLEM - Frontend Squid HTTP on cp1011 is CRITICAL: Connection refused [20:05:02] PROBLEM - Backend Squid HTTP on cp1008 is CRITICAL: Connection refused [20:06:19] new caching [20:06:35] this is false positive as these servers are not actually serving content to the public, disregard. 
[20:07:02] renamed cuz we are using both squid and varnish [20:07:07] so generic cache proxy hence cp [20:07:50] or rp (reverse proxy) [20:08:34] what happened to hashar [20:18:02] jeremyb: looks like a virus ;) [20:18:20] i know [20:19:12] RECOVERY - HTTP on ekrem is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 8.128 seconds [20:22:04] New patchset: Diederik; "Added support for not having to define a filter, (the -f option)." [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/2232 [20:23:34] New patchset: Diederik; "Updated the documentation with the new -f or --force option." [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/2233 [20:23:36] New patchset: Diederik; "Updated control file to work on emery server." [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/2234 [20:23:38] New patchset: Diederik; "Fixed link." [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/2235 [20:31:10] API's down? [20:31:20] yeah... [20:31:29] nothing's loading [20:32:45] ouch…. I lost a translation :-'( [20:32:55] eh? [20:33:42] * mys_721tx forgets to save the translation locally [20:36:06] Is the API back up now? [20:36:13] I saw a weird dip on the network graphs [20:37:03] Hmm, maybe that's just Ganglia weirdness [20:37:36] Anyway, whatever happened, things should be back up now [20:57:00] New patchset: Pyoungmeister; "squid class not getting included for some reason. maybe this is a workaround?" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2236 [20:57:18] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2236 [20:58:34] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2236 [20:58:35] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2236 [21:07:02] !log reedy synchronized wmf-config/CommonSettings.php 'Drop FundraiserPortal config' [21:07:04] Logged the message, Master [21:07:41] !log reedy synchronized wmf-config/InitialiseSettings.php 'Drop FundraiserPortal config' [21:07:42] Logged the message, Master [21:16:39] New patchset: ArielGlenn; "add dataset1001 to the dataset2 stanza in site.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2237 [21:16:57] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2237 [21:17:37] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2237 [21:17:37] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2237 [21:25:46] New patchset: ArielGlenn; "and fix the expression for dataset2/1001" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2238 [21:26:03] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2238 [21:26:26] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2238 [21:26:26] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2238 [21:37:35] !log started rsync from dataset2 to dataset1001 in screen session as root on dataset1001 [21:37:37] Logged the message, Master [21:38:12] oooh [21:38:52] i never screen my rsyncs [21:39:10] always nohup and background and redirection [21:40:30] I do, so someone can find and shoot it easily if there's a need [21:40:51] there shouldn't be but that's how everything is around here [21:41:01] "this shouldn't cause any problems"... until it does [21:41:11] need food [21:41:12] brb [21:50:42] * jeremyb would just !log pid and log file path and if special kill/restart instructions [21:50:53] * jeremyb understands it's a personal pref thing [21:51:03] uh huh [21:51:17] there's plenty of stuff that runs happier in a screen [21:51:24] so we get used to doing stuff that way [21:52:23] right, but rsync is not one of those ;) [21:52:41] I'm just saying, that we get used to doing things generally that way [21:52:56] sure. it's also a matter of preference [21:54:07] uh huh [22:19:31] !log asher synchronized wmf-config/db.php 'pulling db24 from s2 for upgrade' [22:19:33] Logged the message, Master [22:42:03] New patchset: Pyoungmeister; "attempting to force absolute lookup of class per mr. 
feldman's suggestion" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2239 [22:44:22] !log reedy synchronized wmf-config/ 'Disable VariablePage completely' [22:44:23] Logged the message, Master [22:44:37] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2239 [22:44:37] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2239 [22:53:31] New patchset: Asher; "upgrading db24" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2240 [22:53:48] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2240 [22:55:03] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2240 [22:55:03] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2240 [22:59:55] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours [22:59:55] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours [22:59:55] PROBLEM - Puppet freshness on mw65 is CRITICAL: Puppet has not run in the last 10 hours [23:02:28] !log K4-713 synchronized production CiviCRM to r1288 on Aluminium [23:02:29] Logged the message, Master [23:08:14] !log asher synchronized wmf-config/db.php 'pulling db12 from enwiki, temporarily moving watchlist/recentchanges to db54' [23:08:15] Logged the message, Master [23:10:23] ashley: does db54 have those crazy query covering indexes? [23:10:33] gahh..I mean binasher [23:10:50] heh [23:10:51] no [23:11:01] tough, it's temporary? :p [23:11:07] are they documented somewhere? [23:11:13] db12 will be back.. er.. 
in a bit [23:11:16] show indexes from tablename; [23:11:17] ;) [23:11:18] New patchset: Ottomata; "user_agent1.py - need to import Observation" [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2241 [23:11:20] New patchset: Ottomata; "Reworking Observation so that it observes every combination of properties." [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2242 [23:11:31] don't make me diff enwiki schemas [23:11:33] * binasher cries [23:11:43] should only be on 2 tables [23:11:56] binasher: it's one of the uses of the MW load balancer query grouping [23:12:00] Though, it needs documented *somewhere*... [23:12:04] not that it's critical [23:12:33] which tables? [23:13:21] recentchanges and watchlist [23:13:25] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2242 [23:13:43] or is that recent_changes, I can't remember ;) [23:13:59] no _ [23:13:59] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2241 [23:13:59] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2242 [23:14:00] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2241 [23:14:01] ahh, 'recentchanges' [23:14:15] * AaronSchulz can never remember that, since some tables have _ and others don't [23:14:22] fuck yeah, consistency [23:14:40] consistent people are boring anyway [23:22:55] PROBLEM - MySQL Slave Delay on db24 is CRITICAL: CRIT replication delay 1219 seconds [23:28:53] New patchset: Asher; "upgrading db12" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2243 [23:29:11] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2243 [23:30:52] New patchset: Bhartshorne; "adding ganglia logtailer and a log tailing module to swift proxy servers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2244 [23:31:10] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2244 [23:31:26] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2244 [23:31:26] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2244 [23:32:00] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2243 [23:32:01] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2243 [23:34:35] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours [23:41:42] New patchset: Bhartshorne; "whoops, copy/paste error" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2245 [23:42:01] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2245 [23:42:01] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2245 [23:43:05] PROBLEM - Disk space on srv223 is CRITICAL: DISK CRITICAL - free space: / 13 MB (0% inode=64%): /var/lib/ureadahead/debugfs 13 MB (0% inode=64%): [23:45:55] RECOVERY - MySQL Slave Delay on db24 is OK: OK replication delay 0 seconds [23:49:00] New patchset: Bhartshorne; "suppressing ganglia-logtailer messages until they're less spammy and specfiying full path because /usr/sbin is not in cron's search path" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2246 [23:49:17] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2246 [23:49:17] Change merged: 
Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2246 [23:49:17] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2246 [23:51:06] !log updated production civicrm to r1291 [23:51:08] Logged the message, Master [23:53:25] PROBLEM - Disk space on srv219 is CRITICAL: DISK CRITICAL - free space: / 92 MB (1% inode=64%): /var/lib/ureadahead/debugfs 92 MB (1% inode=64%):