[00:00:14] Looks fine for me [00:00:49] Note, there are very few tools that directly connect to the centralauth db though ;) [00:01:27] hm. This could be tricky then, we have 2 or 3 tools, an SUL util, global contribs, etc, which are responding either "Unable to connect to centralauth database" or simply returning nothing (in this case, our SUL Util) [00:03:00] don't they run on TS and not directly to the CA db? [00:03:19] indeed [00:03:40] Nagios is bitching on TS also [00:03:50] yeah, i'm discussing with mlpearc in our channel, I think we're gonna notify #wikimedia-toolserver [00:08:20] !log preilly synchronized php-1.18/extensions/MobileFrontend/MobileFrontend.php 'weekly update to Mobile Frontend' [00:08:21] Logged the message, Master [00:08:44] !log push weekly mobile frontend update [00:08:46] Logged the message, Master [00:14:55] RECOVERY - RAID on mw1110 is OK: OK: no RAID installed [00:15:05] RECOVERY - DPKG on mw1110 is OK: All packages OK [00:18:05] RECOVERY - Disk space on mw1110 is OK: DISK OK [00:19:06] !log asher synchronized wmf-config/db.php 'adding new enwiki slave db52, with a low weight' [00:19:07] Logged the message, Master [00:20:55] RECOVERY - Disk space on srv287 is OK: DISK OK [00:21:25] RECOVERY - Puppet freshness on spence is OK: puppet ran at Fri Jan 20 00:21:14 UTC 2012 [00:27:31] RECOVERY - Host virt1 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms [00:28:01] RECOVERY - Puppet freshness on bast1001 is OK: puppet ran at Fri Jan 20 00:27:56 UTC 2012 [00:32:25] !log asher synchronized wmf-config/db.php 'setting db52 to full weight' [00:32:26] Logged the message, Master [00:38:55] !log asher synchronized wmf-config/db.php 'lowering db52 weight' [00:38:57] Logged the message, Master [00:43:41] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [00:44:11] PROBLEM - DPKG on ms-fe1 is CRITICAL: Connection refused by host [00:44:11] PROBLEM - DPKG on ms-fe2 is CRITICAL: Connection refused by host [00:45:15] !log asher synchronized wmf-config/db.php 'doubling db52 weight' [00:45:16] Logged the message, Master [00:47:21] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [00:47:31] PROBLEM - Disk space on ms-fe1 is CRITICAL: Connection refused by host [00:49:31] PROBLEM - RAID on ms-fe1 is CRITICAL: Connection refused by host [00:49:31] PROBLEM - RAID on ms-fe2 is CRITICAL: Connection refused by host [00:51:11] PROBLEM - Disk space on ms-fe2 is CRITICAL: Connection refused by host [00:51:31] PROBLEM - SSH on ms-fe1 is CRITICAL: Connection refused [00:51:31] PROBLEM - SSH on ms-fe2 is CRITICAL: Connection refused [00:57:41] PROBLEM - RAID on virt1 is CRITICAL: CRITICAL: Degraded [01:01:31] RECOVERY - MySQL replication status on db1025 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [01:08:54] !log asher synchronized wmf-config/db.php 'pulling db52' [01:08:55] Logged the message, Master [01:17:21] !log updated representative/zipcode mapping and some contact info for a handful of reps/senators for CongressLookup r109598 [01:17:22] Logged the message, Master [01:28:15] New patchset: Bhartshorne; "added new config for ms-fe hosts." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/1985 [01:28:51] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/1985 [01:28:52] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1985 [01:33:58] PROBLEM - check_job_queue on spence is CRITICAL: JOBQUEUE CRITICAL - check plugin (check_job_queue) or PHP errors - --wiki [01:44:48] RECOVERY - SSH on ms-fe1 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [01:54:38] RECOVERY - SSH on ms-fe2 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [01:57:19] RECOVERY - DPKG on ms-fe2 is OK: All packages OK [01:57:28] RECOVERY - DPKG on ms-fe1 is OK: All packages OK [01:57:48] RECOVERY - check_job_queue on spence is OK: JOBQUEUE OK - all job queues below 10,000 [01:58:07] hi [02:00:03] how to delete a page from etherpad.wikimedia.org guys? [02:00:28] RECOVERY - Disk space on ms-fe1 is OK: DISK OK [02:01:08] RECOVERY - Disk space on ms-fe2 is OK: DISK OK [02:04:30] I don't know if it's possible [02:04:51] Ryan_Lane: really? dont tell me! [02:05:15] Ryan_Lane: people on etherpad tell me it can be done via api [02:05:21] !log LocalisationUpdate completed (1.18) at Fri Jan 20 02:05:21 UTC 2012 [02:05:23] Logged the message, Master [02:05:35] api and deletePad(padID) deletes a pad [02:45:15] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1338s [02:45:25] PROBLEM - MySQL replication status on db1025 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1348s [02:51:25] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1708s [03:31:56] RECOVERY - Puppet freshness on mw1096 is OK: puppet ran at Fri Jan 20 03:31:35 UTC 2012 [04:16:54] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [04:18:15] RECOVERY - Disk space on es1004 is OK: DISK OK [04:19:24] RECOVERY - MySQL disk space on es1004 is OK: DISK OK [04:21:44] RECOVERY - MySQL replication status on db1025 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 33s [04:34:46] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No [04:41:16] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1205s [04:45:56] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1485s [04:50:06] PROBLEM - MySQL replication status on db1025 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1735s [05:31:16] RECOVERY - MySQL replication status on db1025 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [07:03:33] commit emails aren't coming through to me [07:10:59] did you check the normal spots, like the spam can? [07:11:57] lol that would be stupid [07:12:06] maybe the code slush is really working.... just you wish [07:16:05] 10:22 AM (6 hours ago) is the last follow up I got, but that is normally since i'm not active on CR [07:47:42] Nikerabbit: File a bug? [07:54:42] p858snake|l: no it's not about followups [07:54:51] Joan: to where? [08:47:04] Bugzilla? 
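The db52 changes logged above (added at low weight, raised to full weight, lowered, doubled, finally pulled) follow the usual pattern for warming up a newly provisioned replica before handing it real traffic. A minimal sketch of what a weight-proportional replica pick looks like, assuming a simple weighted-random scheme; the host names and numbers are illustrative and are not the actual wmf-config/db.php contents:

    import random

    # Hypothetical weight map: a new replica starts at a low weight so it takes
    # little traffic while its caches warm up, then is ramped up.
    replica_weights = {
        "db36": 100,   # illustrative hosts/weights, not the real enwiki config
        "db38": 100,
        "db52": 10,    # newly added replica, deliberately low
    }

    def pick_replica(weights):
        # Weighted-random pick: a host with weight 10 gets roughly 10/210 of the queries.
        hosts = list(weights)
        return random.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]

    print(pick_replica(replica_weights))   # e.g. "db36"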
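On the earlier question of deleting a pad from etherpad.wikimedia.org: the deletePad(padID) call mentioned above is part of the Etherpad Lite HTTP API. A hedged sketch, assuming the Wikimedia instance runs Etherpad Lite with the standard /api/1/ endpoint and that you hold the server-side API key; the endpoint URL, key, and pad name below are placeholders:

    import json
    import urllib.parse
    import urllib.request

    ETHERPAD = "https://etherpad.wikimedia.org"   # assumed endpoint
    APIKEY = "server-side-secret"                 # placeholder; Etherpad Lite keeps this in APIKEY.txt on the server

    def delete_pad(pad_id):
        # Etherpad Lite HTTP API: GET /api/1/deletePad?apikey=...&padID=...
        qs = urllib.parse.urlencode({"apikey": APIKEY, "padID": pad_id})
        with urllib.request.urlopen(f"{ETHERPAD}/api/1/deletePad?{qs}") as resp:
            reply = json.load(resp)
        # A successful call answers {"code": 0, "message": "ok", ...}
        if reply.get("code") != 0:
            raise RuntimeError(reply.get("message"))
        return reply

    # delete_pad("SomeObsoletePad")   # example call (hypothetical pad name)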
[09:51:37] suggestion for new Wikimedia error page: http://kvartirakrasivo.ru/404/index.php [09:52:29] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 443032 MB (3% inode=99%): [09:54:19] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 433037 MB (3% inode=99%): [10:44:07] RECOVERY - MySQL slave status on es1004 is OK: OK: [11:23:47] PROBLEM - Disk space on srv220 is CRITICAL: DISK CRITICAL - free space: / 189 MB (2% inode=60%): /var/lib/ureadahead/debugfs 189 MB (2% inode=60%): [11:29:37] PROBLEM - Disk space on srv223 is CRITICAL: DISK CRITICAL - free space: / 166 MB (2% inode=60%): /var/lib/ureadahead/debugfs 166 MB (2% inode=60%): [11:41:32] PROBLEM - Disk space on srv220 is CRITICAL: DISK CRITICAL - free space: / 189 MB (2% inode=60%): /var/lib/ureadahead/debugfs 189 MB (2% inode=60%): [11:51:32] RECOVERY - Disk space on srv220 is OK: DISK OK [11:57:42] RECOVERY - Disk space on srv223 is OK: DISK OK [12:00:40] i just got a mail because of changed user talk page. content only a single marker: <enotif_body> [12:01:22] from apache by srv267.pmtpa.wmnet [12:08:15] nice [12:11:52] RECOVERY - Host srv199 is UP: PING OK - Packet loss = 0%, RTA = 1.01 ms [12:11:52] ACKNOWLEDGEMENT - Host sq46 is DOWN: PING CRITICAL - Packet loss = 100% daniel_zahn hardware problem - RT 2301 [12:30:42] PROBLEM - Apache HTTP on srv199 is CRITICAL: Connection refused [12:30:42] PROBLEM - RAID on srv199 is CRITICAL: Connection refused by host [12:34:42] PROBLEM - Disk space on srv199 is CRITICAL: Connection refused by host [12:41:12] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [12:42:42] PROBLEM - DPKG on srv199 is CRITICAL: Connection refused by host [12:47:52] PROBLEM - Memcached on srv199 is CRITICAL: Connection refused [12:57:22] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [13:00:42] RECOVERY - Apache HTTP on srv199 is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.005 seconds [13:10:52] RECOVERY - RAID on srv199 is OK: OK: no RAID installed [13:12:52] RECOVERY - DPKG on srv199 is OK: All packages OK [13:14:32] RECOVERY - Disk space on srv199 is OK: DISK OK [13:35:14] RECOVERY - Memcached on srv199 is OK: TCP OK - 0.003 second response time on port 11000 [14:22:35] PROBLEM - SSH on srv272 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:23:24] PROBLEM - Disk space on srv272 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:24:04] PROBLEM - DPKG on srv272 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[14:27:26] !log reedy synchronized php/cache/interwiki.cdb 'Updating interwiki cache' [14:27:27] Logged the message, Master [14:29:58] RECOVERY - SSH on srv272 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [14:29:59] RECOVERY - DPKG on srv272 is OK: All packages OK [14:33:08] RECOVERY - Disk space on srv272 is OK: DISK OK [14:41:54] !log reedy synchronized php/cache/interwiki.cdb 'Updating interwiki cache' [14:41:55] Logged the message, Master [14:53:58] PROBLEM - Disk space on srv224 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=60%): /var/lib/ureadahead/debugfs 0 MB (0% inode=60%): [15:04:02] <^demon> !log fixed post-commit hook on formey email notifs to point to correct smtp server [15:04:03] Logged the message, Master [15:13:58] RECOVERY - Disk space on srv224 is OK: DISK OK [17:22:12] we turned logos to //upload.wikimedia.org/wikipedia/commons/0/0a/Wikipedia-logo-v2-*.png during logo v2 release [17:22:20] when will they be switched back? [17:22:24] to local Wiki.png [17:26:52] has anyone seen milos around? [17:37:57] liangent: Why would we? Local uploads should be discouraged or is already disabled anyway [17:54:05] PROBLEM - Backend Squid HTTP on knsq9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:54:25] PROBLEM - Frontend Squid HTTP on knsq9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:58:45] PROBLEM - SSH on knsq9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:15:05] PROBLEM - RAID on searchidx2 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [18:18:25] PROBLEM - Host knsq9 is DOWN: PING CRITICAL - Packet loss = 100% [18:25:25] RECOVERY - RAID on searchidx2 is OK: OK: State is Optimal, checked 4 logical device(s) [18:33:30] RECOVERY - SSH on knsq9 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [18:33:40] RECOVERY - Host knsq9 is UP: PING OK - Packet loss = 0%, RTA = 109.26 ms [18:35:10] RECOVERY - Backend Squid HTTP on knsq9 is OK: HTTP OK HTTP/1.0 200 OK - 629 bytes in 0.440 seconds [18:37:10] RECOVERY - Frontend Squid HTTP on knsq9 is OK: HTTP OK HTTP/1.0 200 OK - 650 bytes in 0.220 seconds [19:11:52] http://*.wikipedia.org/stats is pointing to knams.wikimedia.org which doesn't exist for ages [19:32:26] Greetings all. Can anyone explain to me how the parser invalidates cache for a redirecting page when the underlying page gets changed? [19:35:03] Or if it does so at all? Realted to http://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Parser_cache_not_invalidated_for_redirect_pages [19:36:15] Franamax: I guess that's more a #mediawiki question, cause it's not WMF specific [19:37:54] zzz =_= [19:39:02] hoo:Hmm, I was hoping it was specific to en:wiki setup. My understanding was that invalidation gets tossed to the job queue to regenerate redirects, and maybe it's not working so well at high volume. [19:39:49] well, the job queue is there to make it scale [19:40:38] It seems to happen on super-busy pages, in this case the WP:RD/L WP:RD/C etc. redirects [19:43:49] and it's known to work on other pages? [19:44:14] Heh, who knows? No-one complains about that. :) [19:44:55] before job runners were fixed last time, Roan found a WP:ANI redirect which was cached and several weeks old [19:45:32] queue is ok, so problems can't be that bad [19:46:18] yes, it's quite low [19:46:26] I was wondering if ANI woould have the same problem, the RD pages change on the same scale [19:47:06] The WP:VPT example is weeks old - so could the job just have been dropped somewhere? 
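A toy model of the mechanism Franamax describes above, where an edit enqueues cache invalidations for dependent pages (such as redirects) instead of purging them inline, and a separate runner drains the queue. This is only a sketch of the idea, not MediaWiki's actual job-queue or HTMLCacheUpdate code, and the page names are illustrative:

    from collections import deque

    # Hypothetical link table: redirect page -> its target.
    redirects_to = {
        "WP:RD/L": "Wikipedia:Reference desk/Language",
        "WP:RD/C": "Wikipedia:Reference desk/Computing",
    }
    page_touched = {}      # page -> timestamp of last cache invalidation
    job_queue = deque()    # deferred invalidation work

    def on_edit(target, now):
        # Purging every dependent page inline would make edits to heavily
        # linked pages slow, so the invalidations are queued instead.
        for redirect, tgt in redirects_to.items():
            if tgt == target:
                job_queue.append((redirect, now))

    def run_one_job():
        # If the runners fall behind, or a job is lost, readers keep getting
        # the stale cached copy of the redirect -- the symptom reported above.
        if job_queue:
            page, ts = job_queue.popleft()
            page_touched[page] = ts

    on_edit("Wikipedia:Reference desk/Language", now=1327000000)
    run_one_job()
    print(page_touched)    # {"WP:RD/L": 1327000000}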
[19:48:12] I mean the example I give at VPT was for a weeks-old invalid cached page, so it wasn't just an overloaded job queue. [19:50:35] New patchset: Bhartshorne; "adding sharding to proxy config, sharding two containers in the eqiad cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1986 [19:50:50] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." [operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/1986 [19:54:25] New patchset: Bhartshorne; "adding sharding to proxy config, sharding two containers in the eqiad cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1986 [19:54:41] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1986 [19:54:57] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/1986 [19:54:57] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1986 [20:00:13] Also, why do (seemingly) only anon users see the problem? Based on checks by logged-in users at the RefDeks, the WP:RD/x redirects are never stale for them. [20:04:50] New patchset: Asher; "attempt to fix db22 puppet breakage" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1987 [20:05:06] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1987 [20:05:16] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1987 [20:05:16] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1987 [20:17:29] I guess there's no more interest in the question/problem I raised. hoo, NemoBis, thanks for your comments! :) [20:18:03] You're welcome ;) [20:25:46] ������ [20:42:17] PROBLEM - SSH on knsq9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:46:03] New patchset: Asher; "ensure nrpe.d directory exists" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1988 [20:46:19] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1988 [20:46:41] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1988 [20:46:42] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1988 [20:58:17] PROBLEM - Backend Squid HTTP on knsq9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:02:07] RECOVERY - SSH on knsq9 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [21:34:54] New patchset: Lcarr; "adding in bonding information to correct hashing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1989 [21:35:08] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." [operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/1989 [21:38:15] New patchset: Lcarr; "adding in bonding information to correct hashing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1989 [21:38:27] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." 
[operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/1989 [21:47:32] New patchset: Lcarr; "adding in bonding information to correct hashing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1989 [21:49:15] New patchset: Lcarr; "adding in bonding information to correct hashing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1989 [21:49:30] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1989 [21:56:29] New patchset: Lcarr; "Changing default bonding xmit_hash_policy to layer2+3" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1989 [21:58:40] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 33789 - Enable botadmin usergroup on ml.wikipedia' [21:58:42] Logged the message, Master [22:00:08] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 33789 - Enable botadmin usergroup on ml.wikipedia' [22:00:10] Logged the message, Master [22:02:50] New patchset: Lcarr; "Changing default bonding xmit_hash_policy to layer2+3" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1989 [22:03:06] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1989 [22:03:47] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 33789 - Enable botadmin usergroup on ml.wikipedia' [22:03:48] Logged the message, Master [22:08:18] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/1989 [22:11:41] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1989 [22:11:41] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1989 [22:34:50] New patchset: Asher; "running pt-heartbeat daemon on core cluster dbs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1990 [22:35:06] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1990 [22:35:58] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1990 [22:35:59] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1990 [22:40:11] New patchset: Asher; "fix path" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1991 [22:40:29] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1991 [22:43:19] New patchset: Lcarr; "adding in the bonding xmit-hash-policy to interface commands" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1992 [22:43:34] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1992 [22:45:56] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/1992 [22:47:47] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1992 [22:47:47] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1992 [23:03:51] PROBLEM - Host ms6 is DOWN: PING CRITICAL - Packet loss = 100% [23:05:01] PROBLEM - Host sq70 is DOWN: PING CRITICAL - Packet loss = 100% [23:05:11] PROBLEM - Host niobium is DOWN: PING CRITICAL - Packet loss = 100% [23:09:27] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1991 [23:09:27] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1991 [23:10:26] !log reedy synchronized wmf-config/InitialiseSettings.php 'Wrap some stupidly long lines' [23:10:27] Logged the message, Master [23:13:41] RECOVERY - Host sq70 is UP: PING OK - Packet loss = 0%, RTA = 0.31 ms [23:17:52] PROBLEM - Host ms2 is DOWN: PING CRITICAL - Packet loss = 100% [23:18:11] RECOVERY - Host niobium is UP: PING OK - Packet loss = 0%, RTA = 28.74 ms [23:18:51] PROBLEM - Host sq69 is DOWN: PING CRITICAL - Packet loss = 100% [23:18:51] PROBLEM - Host ms1 is DOWN: PING CRITICAL - Packet loss = 100% [23:20:11] PROBLEM - Host sq67 is DOWN: PING CRITICAL - Packet loss = 100% [23:21:11] PROBLEM - Host cp3002 is DOWN: PING CRITICAL - Packet loss = 100% [23:21:21] PROBLEM - Host sq68 is DOWN: PING CRITICAL - Packet loss = 100% [23:21:21] RECOVERY - Host ms2 is UP: PING OK - Packet loss = 0%, RTA = 2.11 ms [23:21:56]
Error 503 Service Unavailable [23:21:56] Service Unavailable [23:21:56] Guru Meditation: [23:21:56] XID: 1254067489
[23:22:10] on http://bits.wikimedia.org/fi.wikibooks.org/load.php?debug=false&lang=fi&modules=jquery.autoEllipsis%2CcheckboxShiftClick%2CcollapsibleTabs%2Ccookie%2CdelayedBind%2ChighlightText%2CmakeCollapsible%2CmessageBox%2CmwPrototypes%2Cplaceholder%2Csuggestions%2CtabIndex|mediawiki.action.watch.ajax|mediawiki.language%2Cuser%2Cutil|mediawiki.legacy.ajax%2Cmwsuggest%2Cwikibits|mediawiki.page.ready&skin=vector&version=20120120T085346Z&* [23:22:21] RECOVERY - Host sq69 is UP: PING OK - Packet loss = 0%, RTA = 0.74 ms [23:23:49] Is it only me or has bits serious problems? [23:24:12] not just you [23:24:16] folks are looking into it now [23:25:39] is something wrong? [23:25:50] problem at bits? [23:25:58] sDrewth: yes [23:26:04] k [23:27:26] "Failed to load resource: the server responded with a status of 502 (Bad Gateway)" [23:27:44] according to CT, load issues possibly caused by downtime in our European DC [23:28:47] bits.wikimedia.org is overload now [23:28:51] RECOVERY - Host sq67 is UP: PING OK - Packet loss = 0%, RTA = 1.16 ms [23:29:16] because some work is happening to the bits servers in ESAM [23:29:25] thanks Erik [23:29:58] it may be that i'm using the SSL, but it's not just a skin issue. I'm not even seeing edit links on pages. [23:30:26] ditto [23:30:28] pakaran: CSS might be missing [23:30:36] my question was answered though as I came in :P [23:31:41] RECOVERY - Host sq68 is UP: PING OK - Packet loss = 0%, RTA = 0.36 ms [23:31:59] DragonFire_aw, do you use chrome too? [23:32:06] i'm thinking about trying to switch to ff if not [23:32:10] fireFox [23:32:15] :( [23:32:27] i have chrome [23:32:31] i just rarely use it [23:32:34] Eloquence: I hope that comment to subject is accurate [23:33:01] RECOVERY - Host ms1 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms [23:33:21] PROBLEM - Host ms3 is DOWN: PING CRITICAL - Packet loss = 100% [23:33:24] sDrewth, yep, thanks [23:34:09] fixed? [23:35:09] should be fine now [23:35:33] switched over from running bits from EQIAD to PMTPA [23:35:38] aok here now [23:35:51] working now [23:37:11] and peace returns [23:39:13] yayy [23:47:51] RECOVERY - Host ms3 is UP: PING OK - Packet loss = 0%, RTA = 1.92 ms [23:49:24] New patchset: Bhartshorne; "deploying new rewrite.py with sharded container support" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1993 [23:49:39] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1993 [23:52:31] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/1993 [23:52:31] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1993 [23:54:12] New review: Diederik; "Three suggestions:" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/1794
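For context on the "sharded container support" deployed in rewrite.py above: one common way to split a hot Swift container is to hash each object name and route it to a sub-container keyed by a short hash prefix. The sketch below shows that general idea only; whether it matches the scheme in the rewrite.py change is not visible in the log, and the container and object names are assumptions:

    import hashlib

    def shard_container(container, object_name, hex_digits=2):
        # Key the sub-container on a short prefix of the MD5 of the object
        # name, spreading one huge container across 16**hex_digits smaller ones.
        prefix = hashlib.md5(object_name.encode("utf-8")).hexdigest()[:hex_digits]
        return f"{container}.{prefix}"

    # Illustrative names only, e.g. "thumb-container.3f"
    print(shard_container("thumb-container", "Example.jpg/120px-Example.jpg"))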