[00:04:55] sync done.
[00:05:58] !log awjrichards synchronized wmf-config/CommonSettings.php 'Bumping version number for MobileFrontend resources'
[00:06:01] Logged the message, Master
[00:07:20] !log awjrichards synchronized wmf-config/InitialiseSettings.php 'Enabling zero rated mobile access everywhere'
[00:07:23] Logged the message, Master
[00:15:42] !log awjrichards synchronized php/extensions/MobileFrontend/MobileFrontend.body.php 'r114221'
[00:15:45] Logged the message, Master
[00:16:56] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:23:05] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 335 bytes in 1.843 seconds
[00:23:23] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 8.24693041322 (gt 8.0)
[00:25:08] !log awjrichards synchronizing Wikimedia installation... : Reverting MobileFrontend to r113973
[00:25:12] Logged the message, Master
[00:37:31] sync done.
[00:58:47] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:05:05] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 335 bytes in 0.035 seconds
[01:38:59] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:45:08] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 335 bytes in 0.028 seconds
[01:47:50] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 3.8184787931
[02:13:06] RECOVERY - Packetloss_Average on emery is OK: OK: packet_loss_average is 3.31827025
[02:17:56] !log LocalisationUpdate completed (1.19) at Tue Mar 20 02:17:55 UTC 2012
[02:18:01] Logged the message, Master
[02:18:48] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:24:57] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 335 bytes in 5.518 seconds
[03:10:29] Joan: ping
[03:10:48] jeremyb: Hi.
[03:11:15] Joan: what makes you think your request is being consistently served by a lagged DB?
[03:11:26] @replag s1
[03:11:27] jeremyb: [s1] db36: 21382s, db12: 0s, db32: 0s, db38: 0s, db52: 0s, db53: 0s
[03:11:41] ok, that looks good
[03:12:26] and of course, do provide more info about what exactly you're doing and what the response is
[03:12:29] jeremyb: The fact that they're all hanging.
[03:12:38] I don't really think this is very complicated.
[03:12:47] One server is lagged at a crazy, crazy high value.
[03:12:52] And that breaks shit.
[03:12:54] sure
[03:12:56] yes
[03:13:23] > I'm not sure of the exact configuration, but it seems like nearly every API
[03:13:27] request is being handled by the lagged server (db36)? Or perhaps my scripts
[03:13:30] just have terrible luck.
[03:13:32] what does that mean?
[03:13:47] It means that it's possible that all my scripts are hitting db36 for some reason.
[03:14:03] But I think the way the slaves are assigned, db36 is handling all(?) API requests.
[03:14:06] I'm not sure.
[03:14:13] 20 03:11:15 < jeremyb> Joan: what makes you think your request is being consistently served by a lagged DB?
[03:14:34] i'm not a huge fan of running in circles. your reporting is deficient
[03:14:35] Because all the scripts are hanging.
[03:14:38] ;/
[03:14:48] 20 03:12:26 < jeremyb> and of course, do provide more info about what exactly you're doing and what the response is
[03:14:53] If they were hitting a different server, they'd presumably run correctly.
[03:15:27] what specifically are you doing to make yourself back off when there's high lag?
[03:19:00] http://p.defau.lt/?BubGLpHHmmJPivVldAZunw seems to be the branch it's hitting.
[03:21:20] {u'servedby': u'srv234', u'error': {u'info': u'Waiting for 10.0.6.46: 21948 seconds lagged', u'code': u'maxlag'}}
[03:21:42] So... the master's lagged?
[03:21:53] no. a master can never be lagged
[03:22:11] not possible. not even theoretically
[03:23:24] hrmmm, now i see you already discussed in #-operations
[03:23:28] * jeremyb catches up
[03:28:22] Joan: haha, you're sleeping for 6 hrs at a time? ;)
[03:29:14] What?
[03:29:19] oh, nvm. if lagtime > self.wiki.maxwaittime:
[03:29:24] Right.
[03:29:41] It just keeps sleeping for 120 seconds, with the crazy idea that the API lag will decrease one day.
[03:29:44] 21600 secs is 6 hrs
[03:30:01] What I don't understand is why one lagged server is affecting the response code for the others.
[03:30:22] you should consider that it may be by design
[03:30:27] makes perfect sense to me
[03:30:59] anyway, getting sleepy here
[03:32:24] If the design is to ruin API scripts, yes, it makes perfect sense.
[03:37:41] Joan: err... I think we can all agree that long-lasting table alters are uncommon? so, it's unlikely the maxlag feature was designed specifically around that scenario. (unless maybe it's had some recentish updates)
[03:38:19] I thought the procedure here was to depool the server.
[03:38:44] Joan: the solution is 1) wait for asher to wake up and 2) get him to fix the maxlag response. maybe that means pull from db.php, maybe he has some other fix
[03:38:49] Joan: maybe...
[03:39:00] I'm not sure how six hours of lag is acceptable, particularly when it affects the other servers.
[03:39:25] > it affects the other servers
[03:39:27] {{fact}}
[03:39:55] {u'servedby': u'srv234', u'error': {u'info': u'Waiting for 10.0.6.46: 21948 seconds lagged', u'code': u'maxlag'}}
[03:40:07] what other servers?
[03:40:22] All of them?
[03:40:24] http://lists.wikimedia.org/pipermail/wikitech-l/2012-March/059057.html
[03:40:32] They're all returning the error.
[03:40:37] According to my script, at least.
[03:41:18] again, I have to say it's by design. having only some return the "error" would be a bug
[03:41:31] but I'm only 98.5% certain about by design
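A maxlag-aware backoff of the sort jeremyb is asking about might look roughly like the following. This is only a sketch: it assumes the standard MediaWiki API maxlag parameter and the error shape quoted at 03:21 and 03:39; the endpoint, the requests library, the function name, and the thresholds are illustrative choices, not what Joan's script actually does.

    import time
    import requests  # assumed HTTP library; any client works

    API = "https://en.wikipedia.org/w/api.php"  # illustrative endpoint

    def api_get(params, maxlag=5, max_wait=21600, sleep=120):
        """Retry an API query while replication lag exceeds `maxlag` seconds."""
        params = dict(params, format="json", maxlag=maxlag)
        waited = 0
        while True:
            reply = requests.get(API, params=params).json()
            error = reply.get("error", {})
            if error.get("code") != "maxlag":
                return reply
            # e.g. info == "Waiting for 10.0.6.46: 21948 seconds lagged"
            if waited >= max_wait:
                raise RuntimeError("gave up on replication lag: " + error.get("info", ""))
            time.sleep(sleep)
            waited += sleep

    # usage: api_get({"action": "query", "meta": "siteinfo", "siprop": "dbrepllag"})

With maxlag set, every web server answers with the same maxlag error while any pooled replica is lagged beyond the threshold, which is the by-design behaviour described above; a client's only options are to keep waiting or to drop the parameter.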
[03:44:54] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours
[03:46:52] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: Puppet has not run in the last 10 hours
[03:54:49] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours
[03:54:49] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours
[05:21:33] PROBLEM - Puppet freshness on aluminium is CRITICAL: Puppet has not run in the last 10 hours
[05:35:08] !log asher synchronized wmf-config/db.php 'pulling db36 during db migration'
[05:35:13] Logged the message, Master
[06:11:24] dungodung|away, yes, a mixed approach is the solution probably :)
[06:21:51] !log tstarling synchronized php-1.19/includes/User.php
[06:21:54] Logged the message, Master
[06:56:17] PROBLEM - Packetloss_Average on emery is CRITICAL: CRITICAL: packet_loss_average is 13.9469902676 (gt 8.0)
[06:58:23] RECOVERY - Packetloss_Average on emery is OK: OK: packet_loss_average is 0.84471624
[07:01:05] PROBLEM - Puppet freshness on db59 is CRITICAL: Puppet has not run in the last 10 hours
[07:10:41] RECOVERY - check_job_queue on spence is OK: JOBQUEUE OK - all job queues below 10,000
[07:27:29] jhow
[07:35:26] RECOVERY - Puppet freshness on brewster is OK: puppet ran at Tue Mar 20 07:35:20 UTC 2012
[07:41:09] PROBLEM - Puppet freshness on amslvs4 is CRITICAL: Puppet has not run in the last 10 hours
[08:53:21] !log tfinc synchronized wmf-deployment/extensions/ZeroRatedMobileAccess/ZeroRatedMobileAccess.body.php 'Fixes file pages showing data charge warnings'
[08:53:24] Logged the message, Master
[10:04:41] hello ppls
[10:05:28] Any known reason why the book creator is failing to provide ODT outputs?
[10:23:43] mkay, filed bug.
[10:24:13] https://bugzilla.wikimedia.org/show_bug.cgi?id=35353
[12:33:47] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[12:46:32] PROBLEM - swift-container-auditor on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[12:56:44] PROBLEM - Disk space on srv219 is CRITICAL: DISK CRITICAL - free space: / 179 MB (2% inode=61%): /var/lib/ureadahead/debugfs 179 MB (2% inode=61%):
[13:05:17] RECOVERY - Disk space on srv219 is OK: DISK OK
[13:11:17] RECOVERY - swift-container-auditor on ms-be2 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[13:46:32] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours
[13:48:38] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: Puppet has not run in the last 10 hours
[13:56:26] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours
[13:56:26] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours
[14:08:16] !log reedy synchronized php-1.19/extensions/ZeroRatedMobileAccess/ZeroRatedMobileAccess.i18n.php 'r114268'
[14:08:19] Logged the message, Master
[14:09:47] PROBLEM - swift-container-auditor on ms-be3 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[14:14:52] !log reedy synchronizing Wikimedia installation... : scapping for r114268
[14:14:55] Logged the message, Master
[14:20:15] RECOVERY - swift-container-auditor on ms-be3 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[14:29:17] sync done.
[14:33:03] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 35355 - Reset of permissions for Hindi Wikipedia (hiwiki)'
[14:33:06] Logged the message, Master
[14:41:06] PROBLEM - Packetloss_Average on emery is CRITICAL: CRITICAL: packet_loss_average is 18.9332980342 (gt 8.0)
[14:55:39] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 8.10398981982 (gt 8.0)
[14:57:04] mark: can you email the log output? thx
[14:57:46] !log reedy synchronized wmf-config/InitialiseSettings.php 'Remove more group dupes'
[14:57:49] Logged the message, Master
[15:12:18] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 1.44617946429
[15:22:57] RECOVERY - Disk space on search1015 is OK: DISK OK
[15:23:06] PROBLEM - Puppet freshness on aluminium is CRITICAL: Puppet has not run in the last 10 hours
[15:25:50] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 31209 - Enable the WikiLove extension for incubator'
[15:25:54] Logged the message, Master
[15:30:17] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 35161 - Incubator configuration updates'
[15:30:21] Logged the message, Master
[15:49:03] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 35296 - Namespace names changing on Komi Wikipedia'
[15:49:06] Logged the message, Master
[15:50:17] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 35296 - Namespace names changing on Komi Wikipedia'
[15:50:21] Logged the message, Master
[15:52:28] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 35296 - Namespace names changing on Komi Wikipedia'
[15:52:31] Logged the message, Master
[16:04:37] A user reported some edits from 2001 to 2003 seem to be missing on ptwiki. Any chance they were lost somehow? Maybe a software update?
[16:05:58] helderwiki, yes
[16:06:57] can someone explain what "RT #2665 created" means on a bugzilla wiki creation request page? :)
[16:07:35] Wizardist, that the job has been assigned internally to WMF staff (or will be soon)
[16:07:47] helderwiki, see e.g. https://en.wikipedia.org/wiki/User:Graham87/Page_history_observations https://en.wikipedia.org/wiki/User:Graham87/Import
[16:07:56] Nemo_bis: (from #mediawiki:) Why old edits such as
[16:07:57] seems like a jira ticket id :)
[16:07:58] http://pt.wikipedia.org/wiki/Planeta?diff=1517&oldid=1030&uselang=en
[16:08:00] are not in the list of contributions of the user?
[16:08:02] http://pt.wikipedia.org/wiki/Special:Contribs/Jorge?dir=prev&limit=10&uselang=en
[16:09:33] helderwiki, has the user been renamed?
[16:10:10] Nemo_bis: I don't know
[16:10:24] it's usually either the username not normalized, or 0/NULL as user id for the revision
[16:10:46] but it shows the name on diff
[16:11:05] doesn't matter
[16:11:16] hmm
[16:12:01] There are other examples mentioned (in Portuguese) at
[16:12:03] https://pt.wikipedia.org/wiki/Wikip%C3%A9dia:Esplanada/geral/Onde_est%C3%A1_o_nosso_passado%3F_(17mar2012)?uselang=en
[16:12:18] Wizardist: RT = ticket for usage in the internal ticket system of the Operations department
[16:16:37] helderwiki, analyse the history with the API
[16:17:03] I'll provide them the second link you suggested
[16:17:16] and also a link to your subpage about bug 323 =)
[16:17:38] since they may be interested
[16:17:57] what kind of analysis do you suggest using the API
[16:18:00] ?
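The API check Nemo_bis is hinting at could be as simple as listing the page's oldest revisions and looking at how each one is attributed. A sketch only: it assumes prop=revisions with rvprop=user|userid (a revision stored with a username but user id 0, and no anon flag, is the "0/NULL as user id" case mentioned above); the requests library and the output format are illustrative.

    import requests  # assumed HTTP library

    API = "https://pt.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": "Planeta",            # the page from the diff linked above
        "rvprop": "ids|timestamp|user|userid",
        "rvdir": "newer",               # oldest revisions first
        "rvlimit": 50,
        "format": "json",
    }
    pages = requests.get(API, params=params).json()["query"]["pages"]
    for page in pages.values():
        for rev in page.get("revisions", []):
            # a named user with userid 0 (and not flagged anon) will not show
            # up in that user's Special:Contributions
            if rev.get("userid", 0) == 0 and "anon" not in rev and "user" in rev:
                print(rev["revid"], rev["timestamp"], repr(rev["user"]))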
[16:22:23] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 10.0887963964 (gt 8.0)
[16:26:35] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 0.0953002702703
[16:28:58] Reedy: are you around by any chance?
[16:29:38] Ya
[16:32:00] Reedy, more to do on https://bugzilla.wikimedia.org/show_bug.cgi?id=35355
[16:32:18] Bleh
[16:32:20] stewards are on it right now and would need the changes to complete the work
[16:34:54] Why can't people work out exactly what they want before asking for things?
[16:34:59] The "and one more thing" annoys me
[16:35:52] Reedy, not my fault, bug filed too soon
[16:36:08] also, the group renamed wrongly in the past is a shell's fault
[16:37:27] I might as well just clear the bad groups from the database
[16:37:51] Reedy, that would be sweet
[16:37:53] Also, that isn't exactly shell at fault
[16:38:05] well, a mix I suppose
[16:38:05] MediaWiki had a 16 character limit
[16:39:35] that's the former groups removed
[16:40:32] !log reedy synchronized wmf-config/InitialiseSettings.php 'Remove hiwiki botadmin from wgGroupsRemoveFromSelf'
[16:40:36] Logged the message, Master
[16:42:18] !log reedy synchronized wmf-config/abusefilter.php 'Bug 35355 - Reset of permissions for Hindi Wikipedia'
[16:42:22] Logged the message, Master
[16:43:34] Nemo_bis: that's it all done again
[16:44:01] Reedy, thank you very much!
[16:46:54] Reedy, look at https://hi.wikipedia.org/w/index.php?title=special:ListUsers&group=autopatrolled
[16:46:59] first row cached?
[16:47:35] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 9.32344088496 (gt 8.0)
[16:47:45] possibly
[16:47:50] let me try invalidating it
[16:49:16] Yup, cache invalidated for the user and it's gone
[16:50:02] good
[16:52:46] Reedy, more errors :( don't act on the bug yet, they're being investigated to give you a full list...
[16:53:02] * Nemo_bis blames billinghurst
[16:53:06] bleh
[16:53:16] * Reedy goes and deletes hiwiki
[16:53:24] that's a good option too
[16:56:20] Hi. I get 504 Gateway Time-out when using Special:Nuke at commons.
[16:57:52] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 0.718140810811
[16:59:02] mafk, on any user?
[16:59:13] I now get
[16:59:23] Wikimedia Foundation Error Our servers are currently experiencing a technical problem. This is probably temporary and should be fixed soon. etc
[16:59:40] They were just ~40 copyvio uploads
[16:59:51] That shouldn't be enough to stress the database, I think :)
[17:00:18] Nuke's query has been changed recently IIRC
[17:00:22] dunno if deployed
[17:01:04] mafk, but is it on request or on actual deletion?
[17:01:15] !log reedy synchronized wmf-config/InitialiseSettings.php 'Test something for sewikimedia'
[17:01:18] Logged the message, Master
[17:01:41] Nemo_bis: uhm?
[17:02:05] mafk, after you put the username or after you click "delete selection"?
[17:02:23] !log reedy synchronized wmf-config/InitialiseSettings.php 'Revert that then'
[17:02:27] Logged the message, Master
[17:02:35] Yes, I click 'delete selected' and then the favicon starts rolling etc
[17:02:40] PROBLEM - Puppet freshness on db59 is CRITICAL: Puppet has not run in the last 10 hours
[17:02:51] but the images are slooowly deleted (one per minute or so)
[17:03:08] even blue links on the deletion log.. I think the archiving is slow.
[17:03:26] ah, so I guess it's the same problem as with file restores?
[17:05:19] I've not restored a file for months.
[17:05:26] So, idk :)
[17:23:54] hello
[17:24:56] does any of you know whether the "alter table"/database migration of enwiki is complete regarding rev_sha1?
[17:30:58] !log aaron synchronized php-1.19/includes/filerepo/file/LocalFile.php 'deployed r114285'
[17:31:01] Logged the message, Master
[17:42:43] PROBLEM - Puppet freshness on amslvs4 is CRITICAL: Puppet has not run in the last 10 hours
[17:42:57] Does anyone know about the error in the donation interface with credit card?
[17:44:01] http://wikimediafoundation.org/w/index.php?title=WMFJA085/en&utm_source=donate&utm_medium=sidebar&utm_campaign=20101204SB002&language=en&uselang=en&country=JP&referrer=http%3A%2F%2Fmeta.wikimedia.org%2Fwiki%2FDonation
[17:45:05] select an amount and click donate by credit card, then an Internal error is returned
[17:45:35] PASTEBIN
[17:45:53] channel notices are really not the best way to do this, use pastebin
[17:47:54] Why did those messages send as notices, though.
[17:48:10] Anyways, aokomoriuta - somebody else emailed about the same issue.
[17:48:38] sorry for not using pastebin >all
[17:49:19] RD: emailed to where?
[17:49:54] aokomoriuta: to OTRS
[17:50:14] I've been looking for the right person to nag about it ;)
[17:50:26] aokomoriuta: credit card forms are currently disabled
[17:50:36] aokomoriuta: how did you get to that form?
[17:51:04] pgehres: I have an email about the same thing from today.
[17:51:21] I also got an OTRS email, and he/she said
[17:51:35] (I am a member of the info-ja queue)
[17:52:03] you should send anything about payments.wikimedia.org to the donations queue
[17:52:16] The one is there now
[17:52:20] https://ticket.wikimedia.org/otrs/index.pl?Action=AgentTicketZoom&TicketID=6490755
[17:52:30] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 35193 - Enable sub page feature in Telugu Wikisource'
[17:52:34] Logged the message, Master
[17:52:41] * pgehres needs to learn how to use OTRS better
[17:52:47] thanks RD
[17:52:54] Sure
[17:53:42] The page linked from Wikipedia seems to be the same, doesn't it?
[17:55:13] the credit card forms are disabled due to an issue with our payment processor, there shouldn't be any links to the forms on the donation pages, but clearly there are
[17:56:45] How can we remove/solve this link?
[17:57:56] pgehres: I'll email this guy back and explain, and ask where he found the link if you want
[17:58:10] well, I apparently forgot to remove the link on the forms on foundationwiki. I have turned them off there, but we will have to wait for the cache to clear
[17:58:34] ok
[17:58:48] OK, thank you!
[17:59:02] RD: if you don't mind that would be great. PayPal is still on if he wants to make a donation using that, or we can let him know when they are back on
[17:59:10] ok
[17:59:47] RD, aokomoriuta: If you ever have any fundraising issues, feel free to find me.
[18:00:01] Sure. Is Megan still on IRC?
[18:00:30] pgehres: ok.
[18:00:49] yes, but she isn't on all the time. I try to stay logged in most of the time, even if I am not actually at my computer
[18:01:03] Alright cool
[18:02:12] anyway, RD: this IRC client's default setting sends notices when I try to send a lot of text, and I forgot to turn it off...
[18:02:31] PROBLEM - swift-account-server on ms-be4 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[18:02:34] !log flipped Template:CC-status on wmfwiki since credit cards are still disabled on payments.wikimedia.org
[18:02:37] Logged the message, Master
[18:04:19] PROBLEM - swift-container-server on ms-be3 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[18:05:22] PROBLEM - swift-account-server on ms-be3 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[18:07:10] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 10.4438136036 (gt 8.0)
[18:07:17] pgehres: I confirmed the link has been removed now. Thank you again!
[18:11:48] thank you all and sorry for bothering you with wrong notice messages!
[18:12:03] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 0.917548660714
[18:15:48] RECOVERY - swift-account-server on ms-be4 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[18:16:15] RECOVERY - swift-account-server on ms-be3 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[18:17:09] RECOVERY - swift-container-server on ms-be3 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[18:20:36] PROBLEM - swift-container-server on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[18:20:54] PROBLEM - swift-account-server on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[18:26:54] RECOVERY - swift-container-server on ms-be2 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[18:27:12] RECOVERY - swift-account-server on ms-be2 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[18:34:27] !log preilly synchronized php-1.19/extensions/ZeroRatedMobileAccess/ZeroRatedMobileAccess.body.php 'changes for zero needed for carrier testing header of landing page'
[18:34:32] Logged the message, Master
[18:56:00] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 11.0074254955 (gt 8.0)
[18:57:57] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 0.967665929204
[19:00:57] PROBLEM - Disk space on srv223 is CRITICAL: DISK CRITICAL - free space: / 221 MB (3% inode=61%): /var/lib/ureadahead/debugfs 221 MB (3% inode=61%):
[19:07:15] RECOVERY - Disk space on srv223 is OK: DISK OK
[19:41:17] !log preilly synchronized php-1.19/extensions/ZeroRatedMobileAccess/ZeroRatedMobileAccess.body.php 'changes for zero needed for carrier testing header of landing page'
[19:41:20] Logged the message, Master
[19:52:48] !log pushing change for zero.wikipedia.org to redirect to the english message
[19:52:52] Logged the message, Master
[20:00:22] RoanKattouw: how do I purge things again?
[20:00:33] From where, Squid?
[20:00:34] and why does mwscript not have a freaking --help option?
[20:00:36] yes
[20:00:47] mwscript purgeList.php enwiki
[20:01:00] PROBLEM - Disk space on srv220 is CRITICAL: DISK CRITICAL - free space: / 112 MB (1% inode=61%): /var/lib/ureadahead/debugfs 112 MB (1% inode=61%):
[20:01:00] Enter a list of URLs separated by newlines, end with Ctrl+D on an empty line
[20:01:41] thanks
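For scripted use, the same purgeList.php invocation quoted above can be fed its URL list on stdin instead of typing it interactively. A minimal sketch, assuming a deployment host where the mwscript wrapper exists as shown in the log; the URLs are purely illustrative:

    import subprocess

    urls = [
        "http://en.wikipedia.org/wiki/Main_Page",        # illustrative
        "http://en.wikipedia.org/wiki/Special:Version",  # illustrative
    ]

    # purgeList.php reads newline-separated URLs on stdin; closing stdin
    # replaces the interactive Ctrl+D mentioned above.
    subprocess.run(
        ["mwscript", "purgeList.php", "enwiki"],
        input="\n".join(urls) + "\n",
        text=True,
        check=True,
    )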
[20:15:42] RECOVERY - Disk space on srv220 is OK: DISK OK
[20:59:52] !log preilly synchronized php-1.19/extensions/ZeroRatedMobileAccess/ZeroRatedMobileAccess.body.php 'changes for zero needed for carrier testing header of landing page only for mswiki'
[20:59:55] Logged the message, Master
[21:07:41] PROBLEM - Host magnesium is DOWN: PING CRITICAL - Packet loss = 100%
[21:11:44] RECOVERY - Packetloss_Average on emery is OK: OK: packet_loss_average is 0.0
[21:22:06] good night
[21:22:49] good night ;)
[21:39:38] RECOVERY - RAID on virt1 is OK: OK: State is Optimal, checked 2 logical device(s)
[21:41:17] PROBLEM - Packetloss_Average on emery is CRITICAL: CRITICAL: packet_loss_average is 9.52236754386 (gt 8.0)
[21:53:53] PROBLEM - swift-container-auditor on ms-be3 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[21:54:13] !log awjrichards synchronizing Wikimedia installation... : Pushing MobileFrontend changes per http://www.mediawiki.org/wiki/Extension:MobileFrontend/Deployments#20_March.2C_2012
[21:54:16] Logged the message, Master
[21:55:59] RECOVERY - Packetloss_Average on emery is OK: OK: packet_loss_average is 1.02894858407
[22:04:07] [18:01] (Jeff|mobile) s1 is lagging almost 25 hours?
[22:07:17] sync done.
[22:07:36] Jeff|mobile, see wikitech-l - we're in the middle of schema updates
[22:09:42] !log awjrichards synchronized wmf-config/CommonSettings.php 'Bumping resource version # for MobileFrontend'
[22:09:46] Logged the message, Master
[22:10:36] Thanks
[22:10:41] RECOVERY - swift-container-auditor on ms-be3 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[22:10:42] https://jira.toolserver.org/browse/MNT-1225
[22:36:47] !log awjrichards synchronized php/extensions/MobileFrontend/api/ApiQueryExtracts.php 'r114319'
[22:36:50] Logged the message, Master
[22:46:07] !log reedy synchronized wikipedia.dblist 'test'
[22:46:10] Logged the message, Master
[22:53:21] bleh. anything related to MobileFrontend in error log?
[23:10:36] parser ooms
[23:12:21] duh
[23:12:27] how many images are on Wikimedia's file servers? (ballpark)
[23:13:21] Sean_Colombo: Ask apergos at a more reasonable hour (he's 9 hours ahead of San Francisco)
[23:13:24] -3.4
[23:14:02] RoanKattouw: thanks, will do
[23:14:37] we've been messing around with a bunch of image compression stuff, and it might be helpful to you guys too (if img space/bandwidth are a big enough concern)
[23:14:56] we found about 13% savings on average across all of our images
[23:15:14] data: http://lyrics.wikia.com/User:ImageBot
[23:15:16] OK
[23:15:29] I can tell you roughly how much data we have, but not easily how many files we have
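For a ballpark file count, the API does expose per-wiki totals through siteinfo statistics. A quick sketch (the requests library is an assumption, and this counts files per wiki, so Commons covers the bulk but not all of the file servers; summing every wiki would give the true total):

    import requests  # assumed HTTP library

    API = "https://commons.wikimedia.org/w/api.php"
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "statistics",
        "format": "json",
    }
    stats = requests.get(API, params=params).json()["query"]["statistics"]
    print("files on Commons:", stats["images"])  # "images" is the file count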
[23:40:08] PROBLEM - MySQL Replication Heartbeat on db1033 is CRITICAL: CRIT replication delay 199 seconds
[23:40:08] PROBLEM - MySQL Slave Delay on db1033 is CRITICAL: CRIT replication delay 199 seconds
[23:48:50] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours
[23:50:38] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: Puppet has not run in the last 10 hours
[23:57:50] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours
[23:57:50] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours