[00:01:14] Reedy: this in mediawiki-config like the docroot for noc, right [00:01:43] or just local [00:01:59] yeah, mediawiki-config [00:12:13] New patchset: Dzahn; "add GPG key for hexmode to sign mw packages" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32165 [00:12:37] Change merged: Dzahn; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32165 [00:13:03] hexmode: if you want to update it in the future, you can just send gerrit patches [00:13:33] :) [00:15:10] Mmmmm DEADBEEF [00:20:02] so... sync-common now just calls "scap-1" and thats all? [00:20:14] did that also change recently? [00:21:04] Not very recently.. [00:21:12] ok [00:21:50] New patchset: Anomie; "(bug 41133) Allow per-realm dblists and wikiversions.dat" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32167 [00:21:50] New patchset: Anomie; "Add ability for switching for eqiad-specific configuration" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30792 [00:24:10] New patchset: Anomie; "(bug 41133) Allow per-realm dblists and wikiversions.dat" [operations/mediawiki-multiversion] (master) - https://gerrit.wikimedia.org/r/32168 [00:27:58] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [00:28:02] New patchset: Anomie; "(bug 41133) Allow per-realm dblists and wikiversions.dat" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32167 [00:33:59] New patchset: CSteipp; "Update WV config based on their current LocalSettings config" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32096 [00:35:10] New review: CSteipp; "typo" [operations/mediawiki-config] (master); V: 0 C: -1; - https://gerrit.wikimedia.org/r/32096 [00:37:26] New patchset: CSteipp; "Update WV config based on their current LocalSettings config" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32096 [00:40:25] PROBLEM - MySQL Slave Running on 
db1003 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Cant create database frwikivoyage: database exists on quer [00:40:38] PROBLEM - MySQL Slave Running on db1035 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Cant create database frwikivoyage: database exists on quer [00:41:01] PROBLEM - MySQL Replication Heartbeat on db1003 is CRITICAL: CRIT replication delay 250 seconds [00:41:01] * AaronSchulz looks at binasher [00:41:10] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 258 seconds [00:41:10] PROBLEM - MySQL Slave Running on db1010 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Cant create database frwikivoyage: database exists on quer [00:41:17] maaaaan [00:41:31] mutante: "if not exists" is your friend :) [00:42:07] * AaronSchulz loves to use that in his sql patches, heh [00:42:08] but i was supposed to create them manually this time, we discused this here [00:42:13] PROBLEM - MySQL Replication Heartbeat on db1010 is CRITICAL: CRIT replication delay 321 seconds [00:43:06] i'm fixing repl [00:43:28] mutante: forgot set sql_log_bin = 0 when running the create db on a master? 
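[Editor's note: the two safeguards mentioned above (`IF NOT EXISTS`, and `SET sql_log_bin = 0` before creating a database by hand on a master) can be sketched as follows. This is a minimal illustration, not the procedure actually used in the channel. The helper name `manual_create_db_statements` is hypothetical, and it only builds the statement list rather than connecting to a server.]

```python
import re


def manual_create_db_statements(db_name, skip_binlog=True):
    """Build the statements for creating a database by hand on a master.

    Hypothetical helper for illustration only; not a Wikimedia tool.
    """
    # Only allow plain identifiers so the name can be backtick-quoted safely.
    if not re.fullmatch(r"[A-Za-z0-9_]+", db_name):
        raise ValueError(f"unsafe database name: {db_name!r}")

    statements = []
    if skip_binlog:
        # Keep this session's statements out of the binary log, so they are
        # not replayed on slaves that already have the database.
        statements.append("SET sql_log_bin = 0;")
    # IF NOT EXISTS turns the statement into a no-op where the database
    # already exists, instead of an error that stops the slave SQL thread.
    statements.append(f"CREATE DATABASE IF NOT EXISTS `{db_name}`;")
    if skip_binlog:
        statements.append("SET sql_log_bin = 1;")
    return statements


for line in manual_create_db_statements("frwikivoyage"):
    print(line)
```

[Either safeguard alone would probably have avoided the "database exists" replication error reported by the monitoring bot. Emitting both is simply the more defensive choice.]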
[00:43:44] RECOVERY - MySQL Slave Running on db1003 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [00:44:20] RECOVERY - MySQL Replication Heartbeat on db1003 is OK: OK replication delay 0 seconds [00:44:28] RECOVERY - MySQL Slave Running on db1010 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [00:45:09] binasher: no, using the same files from earlier [00:45:15] SET SQL_LOG_BIN = 0; [00:45:31] RECOVERY - MySQL Slave Running on db1035 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [00:45:31] RECOVERY - MySQL Replication Heartbeat on db1010 is OK: OK replication delay 0 seconds [00:45:53] mutante: the files didn't include the "create database" though, did they? [00:46:08] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds [00:47:37] binasher: no they didnt, gotcha, my notes said to do it before dropping anything and last time they were created by addwiki [00:56:31] !log demon synchronized php-1.21wmf3/extensions/ExtensionDistributor/ExtensionDistributor_body.php 'Debugging' [00:56:37] Logged the message, Master [01:05:44] !log demon synchronized php-1.21wmf3/extensions/ExtensionDistributor/ExtensionDistributor_body.php 'Debugging' [01:05:50] Logged the message, Master [01:13:20] !log demon synchronized php-1.21wmf3/extensions/ExtensionDistributor/ExtensionDistributor_body.php 'Revert debugging' [01:13:33] Logged the message, Master [01:19:57] binasher: done, incl. db34 [01:28:52] PROBLEM - Puppet freshness on ms-fe3 is CRITICAL: Puppet has not run in the last 10 hours [01:34:20] New patchset: Dzahn; "re-enabling frwikivoyage Apache config" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/32176 [01:36:05] Change merged: Dzahn; [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/32176 [01:37:35] omg... [01:37:50] now i cant pull the apache config on fenari? 
[01:37:51] error: insufficient permission for adding an object to repository database .git/objects [01:37:54] fatal: failed to write object [01:39:11] 12K drwxrwxr-x 154 hashar wikidev 12K 2012-11-06 23:49 objects [01:39:20] owned by hashar?! [01:39:40] less than an hour ago..man [01:39:52] lol [01:39:58] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 288 seconds [01:39:58] PROBLEM - MySQL Slave Delay on db78 is CRITICAL: CRIT replication delay 242 seconds [01:40:05] root ftw [01:40:13] mysql slave ........ [01:40:38] @replag [01:40:50] none of the servers we touched :p [01:40:54] good ;) [01:41:11] sooo, fix apache git repo... [01:43:16] RECOVERY - MySQL Slave Delay on db78 is OK: OK replication delay 0 seconds [01:45:42] dzahn is doing a graceful restart of all apaches [01:46:02] !log dzahn gracefulled all apaches [01:46:10] Logged the message, Master [01:46:10] sooo.done.. [01:46:25] woop [01:48:14] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 13 seconds [01:48:54] !log reedy synchronized all.dblist 'frwikivoyage' [01:48:59] Logged the message, Master [01:58:55] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [01:58:55] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [01:58:55] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [02:00:54] PROBLEM - MySQL Slave Delay on db78 is CRITICAL: CRIT replication delay 305 seconds [02:00:54] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 306 seconds [02:03:57] New patchset: Hoo man; "(bug 41840) Disable 'editsection' pref for wikibase" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32179 [02:08:27] New review: Danny B.; "The correct way is to fix the wrong code of ns-0 which causes that and not to disable the preference..." 
[operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/32179 [02:12:15] New review: Hoo man; "I don't think this needs any further discussion, this is the least bad way to workaround this until ..." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/32179 [02:17:01] New review: Hoo man; "The pref. is currently set by 2 users on wikidatawiki:" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/32179 [02:17:34] New review: Danny B.; "You want to change the behavior with much bigger impact than to properly fix the single-namespace is..." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/32179 [02:19:18] New review: Danny B.; "Argumenting with the current number of users with such preference is invalid, because obviously when..." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/32179 [02:20:04] PROBLEM - Varnish traffic logger on cp1035 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [02:23:20] New patchset: Reedy; "Readonly for wikivoyage wikis" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32180 [02:28:55] !log LocalisationUpdate completed (1.21wmf3) at Wed Nov 7 02:28:51 UTC 2012 [02:28:58] Logged the message, Master [02:31:46] RECOVERY - Varnish traffic logger on cp1035 is OK: PROCS OK: 3 processes with command name varnishncsa [02:34:57] Change abandoned: Hoo man; "Worked around in the common.css" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32179 [02:41:58] PROBLEM - Puppet freshness on arsenic is CRITICAL: Puppet has not run in the last 10 hours [02:49:00] !log LocalisationUpdate completed (1.21wmf2) at Wed Nov 7 02:49:00 UTC 2012 [02:49:10] Logged the message, Master [02:49:41] New patchset: CSteipp; "Disable wikivoyage sites in apache" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/32181 [02:56:43] PROBLEM - Varnish traffic 
logger on cp1035 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [03:05:04] RECOVERY - Varnish traffic logger on cp1035 is OK: PROCS OK: 3 processes with command name varnishncsa [03:10:47] RECOVERY - Puppet freshness on erzurumi is OK: puppet ran at Wed Nov 7 03:10:30 UTC 2012 [03:12:02] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [03:34:25] PROBLEM - Varnish traffic logger on cp1035 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [03:39:22] RECOVERY - Varnish traffic logger on cp1035 is OK: PROCS OK: 3 processes with command name varnishncsa [03:48:23] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 11 seconds [03:48:40] RECOVERY - MySQL Slave Delay on db78 is OK: OK replication delay 0 seconds [04:28:40] PROBLEM - Varnish traffic logger on cp1035 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [04:42:42] PROBLEM - Varnish traffic logger on cp1025 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [04:45:15] RECOVERY - Varnish traffic logger on cp1035 is OK: PROCS OK: 3 processes with command name varnishncsa [04:49:23] RECOVERY - Varnish traffic logger on cp1025 is OK: PROCS OK: 3 processes with command name varnishncsa [04:54:17] PROBLEM - Varnish traffic logger on cp1025 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [05:21:17] RECOVERY - Varnish traffic logger on cp1025 is OK: PROCS OK: 3 processes with command name varnishncsa [06:18:03] New patchset: Stefan.petrea; "Openssl Package needed to be installed for jenkins At wmf-analytics we are trying to build libanon for CI (the same applies to libcidr and udp-filters, but will treat those in separate reviews)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32190 [06:35:19] New patchset: Stefan.petrea; "Openssl Package needed to be installed for jenkins At wmf-analytics we are trying to build libanon for CI (the same 
applies to libcidr and udp-filters, but will treat those in separate reviews)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32192 [06:36:23] Change abandoned: Stefan.petrea; "branch isn't as good as I wanted it to be(some files got deleted and stuff, I had just 2 files modif..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32190 [06:49:46] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [07:07:26] New review: Dzahn; "per chris" [operations/apache-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/32181 [07:07:26] Change merged: Dzahn; [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/32181 [07:13:49] dzahn is doing a graceful restart of all apaches [07:14:26] !log dzahn gracefulled all apaches [07:14:35] Logged the message, Master [07:31:02] PROBLEM - Host srv278 is DOWN: PING CRITICAL - Packet loss = 100% [07:32:10] RECOVERY - Host srv278 is UP: PING OK - Packet loss = 0%, RTA = 0.22 ms [07:53:25] New patchset: Matthias Mullie; "Update WV config based on their current LocalSettings config" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32096 [08:18:37] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours [08:18:37] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours [08:18:37] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [09:45:41] PROBLEM - SSH on kaulen is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:45:42] RECOVERY - SSH on kaulen is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [09:56:45] PROBLEM - SSH on kaulen is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:56:46] RECOVERY - SSH on kaulen is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [10:05:11] PROBLEM - SSH on kaulen is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:05:11] PROBLEM - HTTP on kaulen 
is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:05:12] RECOVERY - SSH on kaulen is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [10:05:20] RECOVERY - HTTP on kaulen is OK: HTTP OK HTTP/1.1 200 OK - 461 bytes in 0.010 seconds [10:25:37] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32060 [10:28:48] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [10:31:17] New patchset: Mark Bergsma; "Make every frontend only talk to its own backend, for now" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32197 [10:31:37] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32197 [10:33:04] New patchset: Mark Bergsma; "Revert "mysql 5.5 compat"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32198 [10:33:24] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32198 [10:59:05] New patchset: Mark Bergsma; "Install upload Varnish on cp3004 as well" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32202 [10:59:25] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32202 [11:17:03] !log Pooled cp3004 with weight 1 [11:17:03] Logged the message, Master [11:30:04] PROBLEM - Puppet freshness on ms-fe3 is CRITICAL: Puppet has not run in the last 10 hours [11:31:36] New patchset: J; "Bug 41826 add wgmEnableTimedText" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32205 [11:33:27] New patchset: J; "Bug 41826 add wgmEnableTimedText" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32205 [11:48:28] mark: care to explain? [11:49:10] you're now hashing not just on the URL, but also on a header... 
which is not available when purging [11:49:31] so instead you should not adjust the hashing, but add a new header on which you can Vary: [11:49:39] that has drawbacks as well, but it doesn't break purging [11:50:07] thanks, that's way less dramatic and way more helpful than "you just broke purging" [11:50:38] but less brain exercise [11:53:34] mark: by the way -- this is unrelated by always confused me -- how come static assets from bits have x-cache-hit/miss headers from squids? is the setup varnish -> squid -> mw? [11:53:44] or am i misreading it? [11:54:30] they do? [11:55:27] let me check and produce an example before i make a complete idiot of myself [11:55:34] ok [11:57:03] ah nevermind, strontium is varnish, right? [11:57:12] yes [11:57:57] !log Increased weight of cp3004 to 9 [11:58:02] Logged the message, Master [11:58:15] nm then [12:00:24] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [12:00:24] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [12:00:24] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [12:35:16] it's funny how I replied about vary: cookie/accept-language in that RT ticket [12:35:26] and how it's a bad idea even if you normalize it [12:35:55] 5 days ago [12:42:40] PROBLEM - Puppet freshness on arsenic is CRITICAL: Puppet has not run in the last 10 hours [12:52:39] !log Starting load testing of cp3003 and cp3004 [12:52:42] Logged the message, Master [13:02:43] did gerrit die? 
[13:06:03] probably [13:06:10] it'll come back again in a couple of minutes [13:06:58] it's back [13:11:12] !log reedy synchronized php-1.21wmf3/extensions/TimedMediaHandler/ [13:11:18] Logged the message, Master [13:12:40] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [13:15:56] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/31009 [13:15:59] Reedy: looks like I overlooked a variable typo in the code that only gets triggered in production: can i get one more sync of TMH [13:16:34] lol [13:16:53] I should write a script to do this.. [13:17:18] :) [13:17:55] It's a brainless activity to do it, and with the extension name you can make a generic Update FOO to master message [13:18:51] i might do that this afternoon [13:19:05] can even make it auto approve itself... [13:26:39] !log reedy synchronized php-1.21wmf3/extensions/TimedMediaHandler/ [13:26:47] Logged the message, Master [13:33:38] heh, 9 lines of bash [13:33:47] just need to finish the auto approve bit [14:21:27] Reedy: nice, looks like I would need your new script right away, can you sync TMH once more? [14:44:57] j^: we [14:44:57] whee [14:45:00] first part works first time [14:45:03] New patchset: Silke Meyer; "Added puppet files for Wikidata on labs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30593 [14:46:11] !log reedy synchronized php-1.21wmf3/extensions/TimedMediaHandler/ [14:46:13] Logged the message, Master [14:57:52] are tmh1/tmh2 running the latest code? 
somehow tmh2 looks a bit idle from http://ganglia.wikimedia.org/latest/?c=Video%20scalers%20pmtpa&h=tmh2.pmtpa.wmnet&m=load_one&r=hour&s=by%20name&hc=4&mc=2 [15:06:57] New patchset: Mark Bergsma; "Set thread_pools to 2, and reduce max thread count to a saner level" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32222 [15:07:29] New patchset: Hashar; "rake disable colors on non TTY" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/30062 [15:07:38] New review: Hashar; "rebased" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/30062 [15:08:06] pooor gerrit interface :/ [15:08:06] Service Temporarily Unavailable [15:08:16] ^demon: Gerrit having issue again apparently [15:08:21] why does gerrit's stability suck so much [15:08:22] though it only last for a few seconds [15:08:38] I am wondering if it is related to the Precise upgrade [15:08:40] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32222 [15:08:52] you're relating everything to precise upgrades [15:09:01] <^demon> It's unrelated to precise. [15:09:10] <^demon> It started a few days before we upgraded. [15:09:15] <^demon> And I've not tracked down why yet. [15:23:25] !log reactivating tele2 ipv6 peering and re-exporting routes to it [15:23:31] Logged the message, Master [15:56:36] New patchset: Hydriz; "Removed duplicate Wikivoyage section for wgSitename." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32223 [16:01:34] New review: Hydriz; "-1 for now to get some small issues fixed first." 
[operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/31949 [16:17:41] New patchset: J; "Bug 41826 add wgmEnableLocalTimedText" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32205 [16:39:27] PROBLEM - Host cp3019 is DOWN: PING CRITICAL - Packet loss = 100% [16:40:13] RECOVERY - Host cp3019 is UP: PING OK - Packet loss = 0%, RTA = 117.29 ms [16:41:30] New patchset: Alex Monk; "(bug 41841) Add wikidata and wikivoyage to wgNoFollowDomainExceptions" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32226 [16:49:02] New patchset: Demon; "Enable TMH on commonswiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32231 [16:49:30] oh wow [16:49:30] already? [16:49:30] New review: Demon; "For merging in ~10m when the deployment window opens." [operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/32231 [16:50:05] <^demon> paravoid: In 10 minutes, yeah. [16:50:51] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [16:51:47] Scary stuffs! [16:51:55] indeed [16:52:03] ^demon: before we do deployment to commons, we need TMH master synced to 1.21wmf3, https://gerrit.wikimedia.org/r/#/c/32205/ merged to mediawiki-config [16:52:19] <^demon> Turns out I'm enabling it right at lunch time, which is *fantastic* [16:52:22] <^demon> :) [16:52:22] A lot of moving parts [16:53:50] New patchset: Reedy; "Move everything else over to 1.21wmf3" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32232 [16:54:10] New review: J; "would also want to enable transcoding on commons. " [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/32231 [16:54:16] Change abandoned: Reedy; "(no reason)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32180 [16:55:26] <^demon> Reedy: Gonna have to rebase 32232, since it depends on abandoned 32180. 
[16:55:39] Yup, just fixing it [16:56:06] New patchset: Reedy; "Move everything else over to 1.21wmf3" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32232 [16:56:14] j^: have we done any estimates regarding the scalability of tmh1/2? [16:56:20] that's better [16:56:24] how many transcoding hits can we sustain? [16:56:34] <^demon> Rebase button on change with abandoned dependency should rebase against destination branch. [16:56:38] <^demon> Would make most sense imho, rather than giving a stupid error [16:56:52] mmm, very much so [16:59:20] paravoid: its hard to predict, but initially it will have a backlog, and given the changes in support we might see more video uploads [16:59:55] +there is room for using them a bit better [17:00:26] but might need to add more transcoding boxes if a lot of new videos are uploaded [17:02:43] New patchset: Demon; "Enable TMH on commonswiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32231 [17:02:53] <^demon> j^: PS2 has your transcode fix. [17:03:13] New patchset: Mark Bergsma; "Reduce thread limits" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32234 [17:03:40] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32234 [17:04:24] Change merged: Demon; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32205 [17:05:55] New patchset: Demon; "Enable TMH on commonswiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32231 [17:06:16] Change merged: Demon; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32231 [17:07:49] thanks ^demon! [17:08:03] <^demon> No problem. Syncing config now. [17:08:20] !log demon synchronized wmf-config/CommonSettings.php 'Enabling TMH for commonswiki' [17:08:25] Logged the message, Master [17:08:43] Can i get 'transcode-status' permissions on commons? 
[17:08:43] !log demon synchronized wmf-config/InitialiseSettings.php 'Enabling TMH for commonswiki' [17:08:46] Logged the message, Master [17:09:30] j^: mdale: what's the reason for restricting 'transcode-status' on any given wiki? [17:09:37] <^demon> j^: We can't assign individual permissions. Only group on commons with that permission right now is admins. [17:10:10] morebots: the special page is costly to generate [17:10:31] ^demon: are staff admins? [17:10:46] <^demon> If you're part of the global staff group, which not everybody is. [17:10:49] <^demon> (I'm not, for one) [17:11:16] hmm if its something your granfathered into... I should be good ;) [17:11:40] I can check on en .. one sec. [17:12:09] There's likely commons admins around [17:12:36] nope coming up blank for me. [17:12:36] http://en.wikipedia.org/wiki/Special:TimedMediaHandler [17:12:49] <^demon> Generally speaking: hiding a special page because it's expensive sounds like the wrong route to go. I'd much rather cache it and show it to other users (and maybe restrict re-generating it to a permission, if it's really that bad) [17:13:02] ^demon: agreed. [17:13:04] completely hiding seems strange [17:13:16] or do you mean you don't have permission to view X hiding? [17:13:49] <^demon> Man, this page is taking forever to load. There *really* needs to be some level of caching here. [17:13:49] ^demon: that is what we do for the per-page transcode status.. show to all .. restrict reseting transcode jobs to users with that status [17:14:29] j^ wrote it.. maybe not using all the MediaWiki caching tricks ? [17:15:18] or maybe we need more indexes on the transcode state table? [17:15:45] <^demon> Possibly. I don't have TMH installed locally yet so I haven't run explain on it. [17:15:47] * robla enters new bug for caching this page [17:15:59] the table can't be that large already, can it? 
:/ [17:16:18] dont think its that slow [17:16:23] just did not spend time testing it [17:16:47] if you want to dig up the queries, we can quickly EXPLAIN them and verify [17:17:00] <^demon> It was really slow for me on enwiki just now. [17:17:06] <^demon> I'm seeing some queries inside of a foreach(), which can't be fast :\ [17:17:09] and its not using cache since i wanted to see the current state [17:17:28] RECOVERY - SSH on arsenic is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [17:17:48] <^demon> We're doing a count(*) on the image table inside of a foreach. [17:17:55] <^demon> I'll be $5 that's it. [17:17:55] o_0 [17:18:15] <^demon> Line 156-161. [17:18:44] https://bugzilla.wikimedia.org/show_bug.cgi?id=41854 [17:18:57] "Bug 41854 - Cache expensive elements in Special:TimedMediaHandler " [17:21:24] img_media_type isn't indexed [17:21:53] and then there's an unknown extra conditional [17:22:08] I'll just dump it on the bug robla opened [17:23:09] http://commons.wikimedia.org/wiki/Commons:Media_help is a bit outdated now [17:23:36] array( 'transcode_key' => $key ), [17:23:36] isn't indexed [17:24:55] we should add an index for it in that case [17:25:15] Yeah, just left a comment to that effect [17:25:25] It'll be cheap to do with few rows in the table [17:25:39] RECOVERY - Puppet freshness on arsenic is OK: puppet ran at Wed Nov 7 17:25:32 UTC 2012 [17:26:06] <^demon> We should look at doing this slightly differently. I'd rather query more things at once and then sort at the application level, rather than iterating and doing queries. 
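[Editor's note: the "query more things at once" idea can be illustrated with a small, self-contained sketch. It uses Python's built-in `sqlite3` with an in-memory table rather than MediaWiki's database layer. The `transcode_key` column name comes from the discussion above, but the `done` column, the sample rows, and the key list are assumptions made up for the example, not TimedMediaHandler's real schema or code.]

```python
import sqlite3

# In-memory stand-in for a transcode table (illustrative schema only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcode (transcode_key TEXT, done INTEGER)")
conn.execute("CREATE INDEX transcode_key_idx ON transcode (transcode_key)")
conn.executemany(
    "INSERT INTO transcode VALUES (?, ?)",
    [("160p.ogv", 1), ("360p.ogv", 1), ("360p.ogv", 0), ("480p.webm", 1)],
)

# Hypothetical list of enabled transcode keys.
KEYS = ["160p.ogv", "360p.ogv", "480p.webm", "720p.webm"]


def counts_per_key_loop(conn, keys):
    """One COUNT(*) query per key: the query-inside-a-loop pattern."""
    result = {}
    for key in keys:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM transcode WHERE transcode_key = ?", (key,)
        ).fetchone()
        result[key] = n
    return result


def counts_grouped(conn, keys):
    """A single GROUP BY query; keys with no rows are filled in by the caller."""
    result = dict.fromkeys(keys, 0)
    query = "SELECT transcode_key, COUNT(*) FROM transcode GROUP BY transcode_key"
    for key, n in conn.execute(query):
        if key in result:
            result[key] = n
    return result


# Both approaches agree; the grouped version issues one query instead of len(KEYS).
assert counts_per_key_loop(conn, KEYS) == counts_grouped(conn, KEYS)
print(counts_grouped(conn, KEYS))
```

[The grouped version trades a few lines of application-level bookkeeping for a query count that no longer grows with the number of keys.]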
[17:26:35] There's a few more obfuscated queries based on where the source comes from [17:27:34] Change merged: CSteipp; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32096 [17:29:00] <^demon> I did explains on the image queries: http://p.defau.lt/?iyE2Kf2tqwL2AKcFATLAUw [17:29:44] i uploaded a WebM video to commons http://commons.wikimedia.org/wiki/File:2012-07-18_Market_Street_-_San_Francisco.webm [17:31:21] adding an indexed file_extension field to image table would make that query easier. [17:31:39] PROBLEM - Host arsenic is DOWN: PING CRITICAL - Packet loss = 100% [17:32:52] https://gerrit.wikimedia.org/r/32236 adds an index on trancode_key [17:34:16] i get logged out of gerrit all the time today [17:36:36] !log Creating transcode_key_idx on all transcode tables on all wikis [17:36:43] Logged the message, Master [17:37:11] <^demon> Reedy: Done on enwiki yet? [17:37:24] yup, it's onto ga via foreachwiki [17:37:38] j^ Market Street file not playing in IE7, is that expected? IE7 shows the first frame only. [17:37:39] <^demon> Oooh, I think I found a way to eliminate 4x queries in one of the iterations :) [17:37:46] <^demon> The new index works for this. [17:38:21] chrismcmahon: yes will not play until Ogg transcode is ready [17:38:36] display looks nice, though [17:38:39] Hmm, I guess the TMH table needs adding to addWiki, /me makes a TODO [17:39:01] Reedy: what does that mean? [17:39:13] so it creates the transcode table on new wikis [17:41:32] can we add me to a group that can see: http://en.wikipedia.org/wiki/Special:TimedMediaHandler ... user "mdale" [17:41:59] and for http://commons.wikimedia.org/wiki/Special:TimedMediaHandler ;) [17:42:02] Why is it empty for anons? ;) [17:42:26] <^demon> j^: How does http://p.defau.lt/?0sDneLyhr_iOCCa6BdSFqg look? [17:42:33] New review: Andrew Bogott; "This is just the design that I was aiming for -- good work!" 
[operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/30593 [17:43:19] !log racking ms-be6 (720xd) [17:43:24] Logged the message, Master [17:43:26] 179 videos [17:43:26] 179 Ogg videos [17:43:33] Wow, what an awesome pag ;) [17:44:26] mdale: Only admins have that on enwiki... [17:44:45] <^demon> We already discussed this :p [17:44:45] ;) [17:45:56] Should we create a group then? [17:46:00] Potentially annoying enwiki? [17:46:00] New patchset: Dzahn; "add wildcard SSL cert for wikivoyage.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32240 [17:46:08] ^demon: with those changes you think its safe for anonymous users? [17:46:26] <^demon> All we've done is slap an index on it. [17:46:37] <^demon> We still need to clean up some of these queries-in-foreaches. [17:46:55] New review: Dzahn; "RT-3696 (includes .m.)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/32240 [17:46:55] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32240 [17:47:20] And maybe add some wfProfile calls [17:47:34] <^demon> mdale: If you can test out http://p.defau.lt/?0sDneLyhr_iOCCa6BdSFqg on your local install, that would reduce the number of queries by count($wgEnabledTranscodeSet) times. [17:48:01] <^demon> (So on enwiki, 4x less queries in getStats() :)) [17:49:52] ^demon: Hunk #1 FAILED at 132. [17:50:01] thats against master? [17:50:33] <^demon> It wasn't but easily rebases. [17:50:33] <^demon> I'll put a new patch [17:50:54] <^demon> Against master: http://p.defau.lt/?TFvE4xxm0Sg1Wx5_TBfBEw [17:56:56] <^demon> j^: I've got to prep for a meeting. If that works on your install, we'll get it merged & deployed soon. [17:58:31] ^demon: ah that pastebin adds html to that page impossible to just patch can you just git review it to gerrit? 
[17:59:49] <^demon> j^: https://gerrit.wikimedia.org/r/#/c/32243/ [18:01:00] New patchset: Dzahn; "add SSL proxy config for wikivoyage" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32244 [18:03:40] New patchset: Dzahn; "add SSL proxy config for wikivoyage" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32244 [18:05:46] RECOVERY - Host arsenic is UP: PING OK - Packet loss = 0%, RTA = 26.52 ms [18:05:53] ^demon: thanks for your help today! [18:06:04] RECOVERY - Varnish HTTP bits on arsenic is OK: HTTP OK HTTP/1.1 200 OK - 633 bytes in 0.056 seconds [18:06:06] <^demon> No problem :) [18:06:15] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32244 [18:07:09] robla: i ended up starting to write a script to do extension updates to master in wmf branches... [18:08:26] just need to make it submit them too... [18:14:57] ^demon: your patch lost keys will rework it a bit [18:15:26] <^demon> Yeah, I wasn't 100% sure if I had it. Got the general idea of what I was getting at though? [18:17:57] the first part does not work like that, the second one with group by is k [18:18:39] * robla disappears into a meeting for a bit. [18:19:52] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours [18:19:52] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours [18:19:52] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [18:22:16] PROBLEM - HTTPS on ssl1 is CRITICAL: Connection refused [18:22:21] dang, there is an issue with the nginx config on ssl1, looking [18:30:36] RECOVERY - HTTPS on ssl1 is OK: OK - Certificate will expire on 08/22/2015 22:23. [18:42:20] !log racking new ms-be6 to row B sdtpa [18:42:28] Logged the message, Master [18:43:11] New review: Dzahn; "role class ++, installing openssl - don't see a problem. Hashar, if this makes sense to you on integ..." 
[operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/32192 [18:45:12] New review: Ottomata; "Naw, I talked with Stefan this morning." [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/32192 [18:46:42] New review: Dzahn; "ah,ok, i also talked to him a bit last night and actually recommended to do a role class. It sounded..." [operations/puppet] (production); V: 1 C: 1; - https://gerrit.wikimedia.org/r/32192 [18:49:15] New review: Dzahn; "[[RT:3879]]" [operations/puppet] (production); V: 1 C: 1; - https://gerrit.wikimedia.org/r/32192 [18:50:47] heisenbugs. test/test2 recently produced js errors on nearly every page for IE7. now that I'm looking for them, they're not there. found one, but it looks exceptional. http://test.wikipedia.org/wiki/Foobar [18:54:07] New review: Dzahn; "we already merged the redirect, but looks like it is missing a ServerAlias" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/31302 [18:57:20] New patchset: Dzahn; "add missing ServerAlias for wlm.wikimedia.org redirect to toolserver" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/32250 [18:58:45] New review: Dzahn; "needed for change 31303 to work. and that is what 31302 depends on" [operations/apache-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/32250 [18:58:45] Change merged: Dzahn; [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/32250 [19:00:06] dzahn is doing a graceful restart of all apaches [19:00:26] !log dzahn gracefulled all apaches [19:00:34] Logged the message, Master [19:02:18] New review: Dzahn; "needed a ServerAlias: change 32250" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/31303 [19:03:49] New review: Dzahn; "should work now soon" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/31302 [19:06:45] mutante: how much more are you doing? 
it's time for us to finish off the 1.20wmf3 deployment [19:08:09] Reedy: assuming we get the all clear here, are you ready to switch everything else to wmf3? [19:14:13] robla: go ahead, stopping for now [19:19:32] 07 18:51:30 < rfaulkner> Amit is asking if anybody here might know a wikipedia zero feature enabled today? [19:20:36] Yes. Wondering who merged the WP zero changes earlier? Who I need to contact when we're ready to shut off the test? [19:20:52] did you check gerrit and the SAL? ;) [19:21:08] do you know yet what time they should be turned off? [19:23:22] akapoor: i see nothing relevant in the SAL [19:23:35] how did you get it deployed to begin with? [19:23:44] !sal | akapoor [19:23:44] akapoor: https://labsconsole.wikimedia.org/wiki/Server_Admin_Log see it and you will know all you need [19:25:48] !log re-adding arsenic to bits cache pool [19:25:55] Logged the message, notpeter [19:37:08] anyone seen Reedy? [19:38:08] I'm here [19:38:24] I'd done 50% of the work already ;) [19:38:51] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32232 [19:39:06] Woah [19:39:40] j^: I've got 2 TMH related php warnings and 1 fatal (though 243 occurrences in the last 1000 log lines) [19:39:48] Make that 2 fatals [19:41:48] akapoor: ? [19:42:06] paravoid: what's with copper/zinc swift? [19:42:09] seems to be down [19:42:43] hm? [19:43:37] I've never touched that cluster, does it even work? [19:44:18] not anymore, I think there was some hardware problem or something [19:45:56] zinc seems down [19:46:11] do you actively use that? does it make sense to put effort into reviving that cluster? [19:46:48] !log point wlm.wikimedia.org to the cluster rather than stand-alone host yttrium [19:46:51] Logged the message, Master [19:48:25] Reedy: can you provide more info?
[19:48:42] j^: https://bugzilla.wikimedia.org/show_bug.cgi?id=41860 [19:48:53] The 2 fatals have stack traces [19:49:06] The 2 warnings should be simple enough, as they're missing constructor parameters [19:50:02] paravoid: I run tests against it [19:50:20] okay [19:50:22] I don't know how much work it is to keep that running [19:50:29] I'll try fixing it [19:50:44] and then I should probably upgrade it at some point [19:50:45] it's also good for latency tests ;) [19:50:51] to match what we have in production [19:51:05] New patchset: Asher; "disable zero for orange congo and botswana" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32252 [19:51:06] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Everything else to 1.21wmf3 [19:51:12] Logged the message, Master [19:51:30] New patchset: Reedy; "Fix abwiki typoo" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32253 [19:52:19] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32252 [19:52:33] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32253 [19:53:18]