[00:01:16] New review: Dzahn; "this is not so much about giving Chad access, i added you reviewers for the puppet bikeshedding part..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/42791 [00:04:30] LeslieCarr: looks ok now ?! [00:06:55] New review: Aaron Schulz; "I recall this being really noisy, though my memory may exaggerate." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/42970 [00:11:30] root@fenari:/# puppet agent --configprint splay [00:11:30] true [00:11:35] which I suppose is the correct test [00:14:22] RECOVERY - NTP on db1036 is OK: NTP OK: Offset -1.430511475e-06 secs [00:18:06] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [00:18:07] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 10 hours [00:18:07] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: Puppet has not run in the last 10 hours [00:18:07] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [00:18:07] PROBLEM - Puppet freshness on sq48 is CRITICAL: Puppet has not run in the last 10 hours [00:26:22] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:29:20] New patchset: Tim Starling; "Fix mw-deployment-vars.erb" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43109 [00:29:29] New patchset: Pyoungmeister; "coredb: testing: ability to mark hosts as mariadb per node" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43110 [00:29:43] mutante: right now it is (post snmptt restart) [00:29:56] mutante: i also enabled debugging to see if it will core dump or something [00:30:25] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43109 [00:42:24] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.025 seconds [00:48:47] Change merged: Pyoungmeister; 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/43110 [00:50:22] LeslieCarr: i see.. so is that "snmptt on neon appears to freeze when logrotate runs" ? [00:50:43] yes/no ? [00:50:50] it looks like it did last night when logrotate ran [00:50:52] LeslieCarr: unrelated.. looks like we also need to look at that "check_ganglios_generic_value" [00:51:02] but it also froze at 17:46:53 and nothing exciting happened [00:51:09] hrmm..ok [00:51:15] tstarling@tin:/srv/deployment/mediawiki/common$ mwscript eval.php --wiki=enwiki [00:51:15] PHP Warning: dba_open(/srv/deployment/mediawiki/common/wikiversions.cdb): failed to open stream: No such file or directory in /srv/deployment/mediawiki/common/multiversion/MWMultiVersion.php on line 260 [00:51:20] that's progress, I guess [00:52:01] Ryan_Lane/Reedy: is that file moving somewhere? [00:52:21] I don't think so [00:52:23] does it need to? [00:52:29] ditto [00:52:38] Just built a wikiversions.cdb [00:53:05] mutante: i think that's because those classes are commented out on neon -- i think they should both be uncommented and probably moved into the icinga class [00:53:05] we're also missing a LocalSettings.php [00:53:21] I was going to get git-deploy going in labs. is that something I should prioritize, or are you guys working on tin, now? [00:53:30] (in deployment-prep) [00:54:50] well, anomie was complaining about not being able to run maintenance scripts on tin [00:54:55] the work is probably much the same either way [00:55:06] which would you guys prefer? [00:55:43] I think things are mostly in a working state on tin right now for the rest of the work to continue [00:55:48] LeslieCarr: hrmm.yea. not a single line NOT containing "Normal" in snmptt.log ..will take a look again tomorrow.. 
[00:55:52] I would like both to be working with the same configuration [00:55:57] * Ryan_Lane nods [00:55:59] ok [00:56:09] I'll get deployment-prep going, then [00:56:55] mutante: if you see it stop processing check out snmptt.debug as well (just turned that on though) [00:57:20] btw, I changed the sync scripts slightly. when you run git deploy sync, it'll pause after the fetch state and provide a prompt where you can check the fetch status before continuing [00:57:23] yep, i saw it, not much in it yet though [00:57:52] so you can wait till all systems are finished with the fetch before doing a checkout [00:58:00] Ryan_Lane: ok [00:59:55] New patchset: Pyoungmeister; "coredb: need to pass mariadb to common" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43114 [01:01:01] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43114 [01:04:42] New review: Dereckson; "Already done. Ib1bb0040556c13e5d05ea610155e3cd7b7cce02c." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/40561 [01:05:33] Is it work pushing these commits to a branch on gerrit? 
[01:13:07] Reedy: shouldn't be [01:13:15] make a remote branch and push it in [01:13:24] you can push to a branch using git review [01:13:25] That's what I meant [01:13:27] ignore the gerrit ;) [01:13:55] notice that I moved the private stuff into a private repo [01:14:21] which does suck a little [01:14:21] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:14:34] we could treat each repo as dependent on each other [01:14:47] where when you start a deploy in one it also locks and deploys the other [01:15:31] I'm going to have to put in some logic to avoid infinite recursion there, though [01:15:40] require(/srv/deployment/mediawiki/common/1.21wmf7/../wmf-config/wgConf.php): failed to open stream: No such file or directory in /srv/deployment/mediawiki/common/wmf-config/CommonSettings.php on line 148 [01:15:55] needs to currently be /srv/deployment/mediawiki/common/1.21wmf7/../common/wmf-config/wgConf.php [01:15:57] :| [01:16:38] ah. yeah [01:17:00] hmm, there's gonna be a load of those [01:17:01] why not have a config variable for the common repo location? [01:17:37] is there a var for that? [01:17:55] if there isn't, that's what I'm suggesting :) [01:18:11] I've got a feeling I've seen one somewhere [01:18:46] we have 9 usages of $IP/.. in CommonSettings [01:19:08] maybe I was thinking of $wmfConfigDir [01:27:43] New patchset: Pyoungmeister; "testing: set db1036 back to noon-module for retest" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43119 [01:28:54] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43119 [01:29:27] [01:05:33] Is it work pushing these commits to a branch on gerrit?s [01:29:31] s/work/worth/ [01:30:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.030 seconds [01:32:03] TimStarling: any suggestions for a sensible way to determine where common is? 
[01:32:22] PROBLEM - Host db1036 is DOWN: PING CRITICAL - Packet loss = 100% [01:38:14] RECOVERY - Host db1036 is UP: PING OK - Packet loss = 0%, RTA = 26.49 ms [01:42:24] PROBLEM - MySQL Slave Delay on db1036 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:42:25] PROBLEM - MySQL Idle Transactions on db1036 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:42:25] PROBLEM - MySQL Slave Running on db1036 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:42:52] PROBLEM - MySQL Recent Restart on db1036 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:43:01] PROBLEM - MySQL disk space on db1036 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:43:11] PROBLEM - MySQL Replication Heartbeat on db1036 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:43:26] Reedy: how about __DIR__.'/..' ? [01:43:28] PROBLEM - Full LVS Snapshot on db1036 is CRITICAL: Connection refused by host [01:43:52] I guess my question was more "do we have to still use $IP?" 
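A minimal sketch of the path question under discussion, written in Python rather than the PHP the config uses. It compares resolving a shared file through the parent of the per-version install path (the `$IP/..` pattern) with resolving it from one fixed root. The helper names and the `/deploy/...` paths are hypothetical, and the resolution is purely lexical: it does not model symlinks or the real directory layout on tin.

```python
import posixpath


def via_install_path(install_path: str, rel: str) -> str:
    """Resolve rel against the parent of the install path (the `$IP/..` pattern)."""
    return posixpath.normpath(posixpath.join(install_path, "..", rel))


def via_common_root(common_root: str, rel: str) -> str:
    """Resolve rel against one fixed root, independent of the install path."""
    return posixpath.normpath(posixpath.join(common_root, rel))


REL = "wmf-config/wgConf.php"
COMMON = "/deploy/common"  # hypothetical location of the shared config repo

# Hypothetical layout 1: the version directory is nested inside the common repo.
nested = via_install_path("/deploy/common/1.21wmf7", REL)

# Hypothetical layout 2: the version checkout sits beside the common repo,
# so `..` now points at a different parent.
beside = via_install_path("/deploy/slot0", REL)

# A fixed root gives the same answer whichever layout is in use.
fixed = via_common_root(COMMON, REL)

print(nested)
print(beside)
print(fixed)
```

Only the fixed-root variant is stable when the version directories move, which is the argument for a single constant or config variable naming the common location.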
[01:43:53] actually there is already MULTIVER_COMMON [01:44:11] Ah [01:44:12] I thought I'd seen something like that [01:44:25] from multiversion/defines.php [01:44:31] so yeah, you can just use that [01:44:51] * AaronSchulz snickers in the background [01:49:01] reedy@tin:/srv/deployment/mediawiki/common$ mwscript eval.php --wiki=enwiki [01:49:01] > echo $wgVersion; [01:49:01] 1.21wmf7 [01:49:21] one step at a time [01:49:46] I think I might've broken getExtendedVersionNumber though [01:49:46] RECOVERY - MySQL Recent Restart on db1036 is OK: OK seconds since restart [01:49:55] RECOVERY - MySQL disk space on db1036 is OK: DISK OK [01:49:55] RECOVERY - MySQL Replication Heartbeat on db1036 is OK: OK replication delay seconds [01:50:39] RECOVERY - Full LVS Snapshot on db1036 is OK: OK no full LVM snapshot volumes [01:50:58] RECOVERY - MySQL Slave Delay on db1036 is OK: OK replication delay seconds [01:50:59] RECOVERY - MySQL Idle Transactions on db1036 is OK: OK longest blocking idle transaction sleeps for seconds [01:50:59] RECOVERY - MySQL Slave Running on db1036 is OK: OK replication [01:54:56] re puppet splay fail: puppet only sleeps a random amount of time once, immediately after it starts up [01:55:17] then it sleeps for $runinterval *after* each run [01:55:32] so the interval is actually the amount of time it takes to run, plus $runinterval [01:57:46] so if the server stalls for 10 minutes and then allows a whole lot of clients to finish their runs simultaneously, they will stay together for the next run [02:01:25] Think we're going to have to chase a few files around to get them into version control [02:01:41] like LocalSettings.php as part of making the branch, rather than from checkoutMediaWiki [02:01:46] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:02:30] as much as I'd like to avoid it, it's possible to do tasks at the end of a checkout run [02:02:35] we do it in parsoid, for instance [02:02:51] in 
parsoid, we link to the configuration file in the config repo [02:02:56] Can't see any reason why it can't be in the wmf branches [02:02:59] ok [02:03:11] that's easier and less prone to failure [02:03:11] < 5 lines of code ;D [02:03:25] even copy paste most of it [02:03:51] this is going to cause me issues on other sites where I use the wmf branches, but I'll just deal with that :D [02:04:27] PROBLEM - NTP on db1036 is CRITICAL: NTP CRITICAL: Offset unknown [02:05:49] TimStarling: That's most of it done... php- prefixes stripped, dblists moved [02:07:54] RECOVERY - NTP on db1036 is OK: NTP OK: Offset -0.006457686424 secs [02:08:38] who the fuck delete my mariadb packages from the wikimedia repo and didn't email or otherwise tell me? [02:09:23] larry ellison? [02:09:40] someone deleted them? [02:09:54] who deletes packages at all? [02:09:57] reprepro changes: [02:09:57] remove precise-wikimedia deb universe i386 mysql-common 5.5.28-mariadb-wmf201212041~precise -- pool/universe/m/mariadb-5.5/mysql-common_5.5.28-mariadb-wmf201212041~precise_all.deb [02:10:02] someone who should be fired [02:10:04] wtf [02:10:30] someone who, if they're remote, that won't protect them [02:16:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.365 seconds [02:27:16] !log LocalisationUpdate completed (1.21wmf7) at Thu Jan 10 02:27:16 UTC 2013 [02:27:28] Logged the message, Master [02:32:58] PROBLEM - MySQL Slave Delay on db1020 is CRITICAL: CRIT replication delay 218 seconds [02:33:42] PROBLEM - MySQL Replication Heartbeat on db33 is CRITICAL: CRIT replication delay 237 seconds [02:33:43] PROBLEM - MySQL Replication Heartbeat on db1020 is CRITICAL: CRIT replication delay 247 seconds [02:33:51] PROBLEM - MySQL Slave Delay on db33 is CRITICAL: CRIT replication delay 242 seconds [02:34:50] New patchset: Asher; "changing log level to verbose in order to log key fingerprints for all logins" [operations/puppet] (production) - 
https://gerrit.wikimedia.org/r/43125 [02:35:11] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43125 [02:38:24] RECOVERY - Puppet freshness on sq86 is OK: puppet ran at Thu Jan 10 02:38:17 UTC 2013 [02:41:03] RECOVERY - MySQL Slave Delay on db33 is OK: OK replication delay 0 seconds [02:41:21] RECOVERY - Puppet freshness on sq72 is OK: puppet ran at Thu Jan 10 02:41:05 UTC 2013 [02:41:22] RECOVERY - Puppet freshness on search25 is OK: puppet ran at Thu Jan 10 02:41:06 UTC 2013 [02:41:58] RECOVERY - MySQL Slave Delay on db1020 is OK: OK replication delay 0 seconds [02:42:25] RECOVERY - MySQL Replication Heartbeat on db33 is OK: OK replication delay 0 seconds [02:42:25] RECOVERY - MySQL Replication Heartbeat on db1020 is OK: OK replication delay 0 seconds [02:43:55] RECOVERY - Puppet freshness on search32 is OK: puppet ran at Thu Jan 10 02:43:48 UTC 2013 [02:45:15] RECOVERY - Puppet freshness on mc16 is OK: puppet ran at Thu Jan 10 02:44:51 UTC 2013 [02:46:09] RECOVERY - Puppet freshness on mc5 is OK: puppet ran at Thu Jan 10 02:45:53 UTC 2013 [02:46:19] RECOVERY - Puppet freshness on search36 is OK: puppet ran at Thu Jan 10 02:46:03 UTC 2013 [02:48:24] RECOVERY - Puppet freshness on solr1001 is OK: puppet ran at Thu Jan 10 02:47:58 UTC 2013 [02:48:25] RECOVERY - Puppet freshness on kaulen is OK: puppet ran at Thu Jan 10 02:47:59 UTC 2013 [02:48:25] RECOVERY - Puppet freshness on mw1121 is OK: puppet ran at Thu Jan 10 02:48:00 UTC 2013 [02:49:55] RECOVERY - Puppet freshness on mw1129 is OK: puppet ran at Thu Jan 10 02:49:34 UTC 2013 [02:51:36] !log LocalisationUpdate completed (1.21wmf6) at Thu Jan 10 02:51:34 UTC 2013 [02:51:46] Logged the message, Master [02:55:18] RECOVERY - Puppet freshness on virt1000 is OK: puppet ran at Thu Jan 10 02:54:55 UTC 2013 [02:56:49] RECOVERY - Puppet freshness on sq65 is OK: puppet ran at Thu Jan 10 02:56:37 UTC 2013 [02:57:25] RECOVERY - Puppet freshness on kuo is OK: puppet ran at Thu Jan 10 
02:57:08 UTC 2013 [02:57:25] RECOVERY - Puppet freshness on hooper is OK: puppet ran at Thu Jan 10 02:57:12 UTC 2013 [02:57:52] RECOVERY - Puppet freshness on solr2 is OK: puppet ran at Thu Jan 10 02:57:34 UTC 2013 [03:00:16] RECOVERY - Puppet freshness on wtp1 is OK: puppet ran at Thu Jan 10 02:59:50 UTC 2013 [03:01:19] RECOVERY - Puppet freshness on search35 is OK: puppet ran at Thu Jan 10 03:01:15 UTC 2013 [03:04:19] RECOVERY - Puppet freshness on search29 is OK: puppet ran at Thu Jan 10 03:03:54 UTC 2013 [03:05:22] RECOVERY - Puppet freshness on mc13 is OK: puppet ran at Thu Jan 10 03:04:52 UTC 2013 [03:05:22] RECOVERY - Puppet freshness on search1024 is OK: puppet ran at Thu Jan 10 03:05:04 UTC 2013 [03:06:52] RECOVERY - Puppet freshness on sq79 is OK: puppet ran at Thu Jan 10 03:06:42 UTC 2013 [03:14:21] RECOVERY - Puppet freshness on knsq17 is OK: puppet ran at Thu Jan 10 03:13:57 UTC 2013 [03:17:13] PROBLEM - Puppet freshness on ms-be1003 is CRITICAL: Puppet has not run in the last 10 hours [03:19:10] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [03:21:07] New patchset: Reedy; "Add .deploy to gitignore" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43127 [03:21:35] New patchset: Reedy; "Remove php- and ExtensionMessages from .gitignore" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43128 [03:22:16] New patchset: Reedy; "Updated all docroot symlinks" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43129 [03:22:46] New patchset: Reedy; "Move dblists to their own folder (dblists)" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43130 [03:23:13] New patchset: Reedy; "Update noc related symlinks (non http config)" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43131 [03:23:37] New patchset: Reedy; "Drop the php- prefix" [operations/mediawiki-config] (newdeploy) - 
https://gerrit.wikimedia.org/r/43132 [03:24:22] New patchset: Reedy; "Various updates and changes to make it all work..." [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43133 [03:24:34] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43127 [03:24:42] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43128 [03:25:19] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43129 [03:25:55] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43130 [03:26:20] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43131 [03:26:26] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43132 [03:26:38] New patchset: Ryan Lane; "(bug 43339) deployment roles for beta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42549 [03:26:39] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43133 [03:28:11] New patchset: Ryan Lane; "(bug 43339) deployment roles for beta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42549 [03:29:24] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42549 [03:32:05] New patchset: Ryan Lane; "Remove trailing slash on hostname for common repo" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43134 [03:32:21] PROBLEM - Host srv266 is DOWN: PING CRITICAL - Packet loss = 100% [03:32:47] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43134 [03:35:40] RECOVERY - Puppet freshness on lvs1004 is OK: puppet ran at Thu Jan 10 03:35:26 UTC 2013 [03:36:14] New patchset: Reedy; "Fixup live-1.5 symlinks" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43135 
[03:37:03] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43135 [03:39:46] New patchset: Reedy; "Attempt at fixing phpunit tests for wmf-config.." [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43136 [03:42:33] New patchset: Reedy; "Attempt at fixing phpunit tests for wmf-config.." [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43136 [03:47:07] New patchset: Reedy; "Attempt at fixing phpunit tests for wmf-config.." [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43136 [03:47:58] New patchset: Ryan Lane; "Include packages conditionally" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43137 [03:50:13] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43137 [03:51:22] New patchset: Reedy; "Attempt at fixing phpunit tests for wmf-config.." [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43136 [03:55:26] New patchset: Reedy; "Attempt at fixing phpunit tests for wmf-config.." [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43136 [03:56:09] New patchset: Reedy; "Attempt at fixing phpunit tests for wmf-config.." [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43136 [03:57:03] New patchset: Reedy; "Attempt at fixing phpunit tests for wmf-config.." 
[operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43136 [03:57:18] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43136 [04:05:13] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [04:12:45] New patchset: Reedy; "Ignore "SHA-1 metadata" in fatalmonitor" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/40766 [04:12:50] New patchset: Reedy; "Log sync-docroot calls" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/40311 [04:13:09] New review: Reedy; "Needs rebasing" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/37968 [04:13:15] New review: Reedy; "Needs rebasing" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/42133 [04:22:22] New patchset: Ryan Lane; "Disable management of the pillars top file" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43138 [04:22:30] err: Failed to apply catalog: Could not find dependency File[/usr/local/bin/mw-update-l10n] for File[/usr/local/bin/wmf-beta-autoupdate] at /etc/puppet/manifests/misc/beta.pp:30 [04:22:31] * Ryan_Lane sighs [04:23:03] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43138 [04:31:52] New patchset: Tim Starling; "Restart puppet every 4 hours instead of every 24" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43139 [04:33:54] New patchset: Tim Starling; "Restart puppet every 4 hours instead of every 24" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43139 [04:34:53] any objections to me merging that? 
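The splay behaviour described earlier in the log (one random sleep at startup, then $runinterval after each run) can be sketched as a small simulation. This is a simplified model, not Puppet's code: the function names, the 60-second run time and the 1800-second interval are assumptions for illustration, and a stalled master is modelled as releasing every in-flight run at the moment the stall ends.

```python
from typing import List, Optional, Tuple


def simulate(
    start_offsets: List[int],
    run_seconds: int,
    runinterval: int,
    cycles: int,
    stall: Optional[Tuple[int, int]] = None,
) -> List[List[int]]:
    """Return the agents' run start times for the initial run and each later cycle."""
    starts = list(start_offsets)  # the one-off splay applied at agent startup
    history = [list(starts)]
    for _ in range(cycles):
        finishes = []
        for s in starts:
            f = s + run_seconds
            if stall is not None:
                begin, end = stall
                # A run overlapping the stall cannot finish before the master recovers.
                if s < end and f > begin:
                    f = max(f, end)
            finishes.append(f)
        # The agent sleeps runinterval *after* each run, with no new random offset.
        starts = [f + runinterval for f in finishes]
        history.append(list(starts))
    return history


def spread(times: List[int]) -> int:
    """Seconds between the earliest and latest start in one cycle."""
    return max(times) - min(times)


offsets = [0, 120, 300, 480]  # assumed splay offsets, in seconds
healthy = simulate(offsets, run_seconds=60, runinterval=1800, cycles=3)
stalled = simulate(offsets, run_seconds=60, runinterval=1800, cycles=3, stall=(0, 600))

print("healthy spread per cycle:", [spread(c) for c in healthy])
print("stalled spread per cycle:", [spread(c) for c in stalled])
```

In this model the spread between agents never recovers after a 10-minute stall, because nothing reintroduces a random offset until an agent is restarted. That is consistent with restarting the agents more often, as the patchset above proposes.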
[04:35:06] nope [04:35:39] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43139 [05:12:09] Change abandoned: Tim Starling; "Per Ryan's comment" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/13293 [05:12:52] New patchset: Tim Starling; "(bug 20409) Use NE flag for rewrites that probably need to deal with special chars." [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/34113 [05:13:16] Change merged: Tim Starling; [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/34113 [05:20:51] Sweet. :-) [05:29:30] ah. forgot to puppetize redis [05:39:02] New patchset: Ryan Lane; "Add redis to deployment hosts" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43142 [05:44:08] New patchset: Ryan Lane; "Add sudo policy to deployment hosts" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43144 [05:44:44] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43142 [05:45:12] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43144 [05:48:04] New patchset: Ryan Lane; "Remove duplicate def" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43145 [05:48:37] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43145 [05:49:30] New patchset: Tim Starling; "Don't proxy secure.wikimedia.org to the main apache cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43146 [05:50:04] TimStarling: \o/ [05:50:23] die secure die. 
heh [05:59:42] well, obviously repo dependencies are working [06:00:12] I purged redis's data when moving the database location, and force sync'd slot0 [06:00:35] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43146 [06:00:35] data showed up in deploy-info for both slot1 and l10n-slot0 [06:00:53] err [06:00:58] slot0 and l10n-slot0 [06:01:13] good [06:06:07] that last puppet change of mine didn't appear to do anything [06:06:18] lemme see [06:06:39] the secure.wikimedia.org file on singer never matched the one in puppet [06:06:50] o.O [06:06:59] maybe it's always been broken in puppet? [06:07:11] yes, probably [06:07:23] let me look at what apache_site does [06:07:51] there's a vim backup file on singer that suggests faidon just edited the file directly when he made his changes [06:08:14] ugh [06:08:29] apache_site only adds the link [06:08:48] the file isn't being included anywhere [06:08:51] let me fix that [06:08:59] just looking at planet.pp for comparison [06:09:05] it has a file {} for actually creating the file [06:09:12] yep [06:11:57] PROBLEM - Host bits-lb.esams.wikimedia.org_ipv6_https is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::a [06:13:04] ipv6 down for just one single -lb? 
[06:14:25] nagios-wm: you're full of shit [06:15:12] hrm [06:15:15] i'm looking [06:15:47] New patchset: Ryan Lane; "Make secure actually pull its config from puppet" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43147 [06:16:15] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43147 [06:17:49] RECOVERY - Host bits-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 112.27 ms [06:19:12] New patchset: Tim Starling; "Changes for git-deploy" [operations/apache-config] (newdeploy) - https://gerrit.wikimedia.org/r/43148 [06:19:23] i am getting some ping loss when pinging directly to cr2-eqiad though not through [06:19:33] at this utilization level i shouldn't be getting any ping loss though [06:20:04] TimStarling: ok, it's pulling from puppet now [06:20:07] thanks [06:20:16] yw [06:26:37] New patchset: Tim Starling; "Fix secure.wikimedia.org VirtualHost" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43149 [06:27:02] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43149 [06:30:03] so it looks like switching over to git-deploy will be as simple as "mv common common-old && ln -s /srv/deployment/mediawiki/common common" [06:30:17] oh? [06:30:27] localization stuff still needs to happen... [06:31:07] I mean the particular part of the deployment process where you make apache use the new directory [06:31:17] ah [06:31:17] cool [06:31:37] actually it is: mv common-local common-old && ln -s /srv/deployment/mediawiki/common common-local [06:31:52] since there were one or two references to common-local [06:32:54] what still needs to be done on the localisation stuff? [06:33:13] did anomie finish the script that writes into the repos? [06:33:38] the repos are there, and repo dependencies work, but there's no data in them [06:33:50] also, are the references to the files good? 
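The switch-over step quoted above (rename the old directory, then symlink the git-deploy checkout in its place) can be sketched in Python as follows. This is an illustration under assumptions, not the deployment tooling: `switch_directory` is a hypothetical helper, and the demonstration runs in a scratch directory with a stand-in target instead of /srv/deployment/mediawiki/common.

```python
import os
import tempfile


def switch_directory(parent: str, name: str, backup_name: str, target: str) -> None:
    """Rename parent/name to parent/backup_name, then symlink parent/name to target."""
    live = os.path.join(parent, name)
    backup = os.path.join(parent, backup_name)
    if os.path.lexists(backup):
        raise FileExistsError(f"{backup} already exists; refusing to overwrite")
    os.rename(live, backup)   # equivalent of: mv common-local common-old
    os.symlink(target, live)  # equivalent of: ln -s <target> common-local


# Demonstration in a scratch directory rather than on a real host.
with tempfile.TemporaryDirectory() as scratch:
    os.mkdir(os.path.join(scratch, "common-local"))
    new_checkout = os.path.join(scratch, "new-common")  # stand-in for the new checkout
    os.mkdir(new_checkout)
    switch_directory(scratch, "common-local", "common-old", new_checkout)
    print(os.path.islink(os.path.join(scratch, "common-local")))
    print(os.path.isdir(os.path.join(scratch, "common-old")))
```

Like the shell one-liner, this leaves a brief window between the rename and the symlink during which the live name does not exist, and the old directory is kept so the change can be reverted by hand.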
[06:34:17] are we going to make a link from common to the repo? [06:34:42] I don't see any relevant commits from him [06:35:02] well, I'll work this out with him tomorrow, I guess [06:35:17] I'm finishing up adding deployment to beta [06:55:27] New patchset: Ryan Lane; "Use localhost for deploy-info util" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43152 [06:56:47] hm. I guess I should pull the pillar data for that hostname. [06:58:06] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43152 [07:26:04] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [07:30:22] New patchset: Ryan Lane; "Add private repo and fix reference to common repo" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43155 [07:33:15] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43155 [07:33:38] TimStarling: have the references to the private files also been updated? 
[08:05:43] New patchset: Adamw; "Move default config into a file" [operations/dumps] (master) - https://gerrit.wikimedia.org/r/43156 [08:05:55] New patchset: Hashar; "beta: fix autoupdater class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43157 [08:07:25] New patchset: Hashar; "(bug 43811) beta: fix autoupdater class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43157 [08:10:54] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43157 [08:20:08] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 221 seconds [08:20:26] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 231 seconds [08:25:51] PROBLEM - Memcached on virt0 is CRITICAL: Connection refused [08:29:18] RECOVERY - Memcached on virt0 is OK: TCP OK - 0.020 second response time on port 11000 [08:32:36] New patchset: Hashar; "attempt to get mediawiki master under git deploy" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43158 [08:38:42] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43158 [08:53:03] New patchset: Hashar; "mw master branch to slotmaster" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43160 [08:53:32] Change merged: Hashar; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43160 [08:56:42] New patchset: Ryan Lane; "wikidev group doesn't exist in labs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43162 [08:57:29] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [08:57:30] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [08:57:36] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43162 [08:59:20] ok [08:59:22] perms changed [09:17:41] New patchset: MaxSem; "Try disabling noindex on some enwiki pages 
per discussion with Google" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43163 [09:25:41] New patchset: Hashar; "git-deploy: rename slotmaster to slotbeta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43164 [09:27:17] New patchset: Hashar; "git-deploy: rename slotmaster to beta0" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43164 [09:27:56] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43164 [09:52:38] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds [09:53:53] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds [10:00:08] Change merged: MaxSem; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43163 [10:11:23] !log maxsem synchronized wmf-config/mobile.php 'https://gerrit.wikimedia.org/r/#/c/43163/ - Try disabling noindex on some enwiki pages per discussion with Google' [10:11:33] Logged the message, Master [10:18:56] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 10 hours [10:18:57] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: Puppet has not run in the last 10 hours [10:18:57] PROBLEM - Puppet freshness on sq48 is CRITICAL: Puppet has not run in the last 10 hours [10:18:57] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [10:18:57] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [10:29:35] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:34:50] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.035 seconds [10:56:13] New patchset: Hashar; "contint: "luajit", "libluajit-5.1-dev" packages" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43182 [10:56:31] New review: Hashar; "Already in production, so feel free to merge whenever 
you want." [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/43182 [11:07:29] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:21:43] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.107 seconds [11:47:39] New patchset: Hashar; "contint: luajit, libluajit-5.1-dev and g++ packages" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43182 [11:48:08] !log gallium: manually installed luajit, libluajit-5.1-dev and g++ packages . Pending gerrit change is {{gerrit|43182}} [11:48:17] Logged the message, Master [11:48:46] New review: Hashar; "PS2 adds g++" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/43182 [11:50:27] New patchset: Hashar; "contint: packages needed to build WMF PHP extensions" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43182 [11:50:54] New review: Hashar; "PS3 adds libthai-dev needed by wikidiff2" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/43182 [11:54:53] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:55:37] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43182 [11:56:14] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 183 seconds [11:57:53] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 227 seconds [11:57:55] New review: Hashar; "it would list the list of all languages for each branch. Maybe I could hack something to have it un..." 
[operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/42970 [12:02:59] RECOVERY - MySQL Slave Delay on db1007 is OK: OK replication delay 0 seconds [12:03:08] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds [12:07:19] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 4.034 seconds [12:47:40] New patchset: MaxSem; "Rm GeoData exceptions" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43202 [12:48:22] guillom: [fr] sorry, I closed the wrong window [12:48:25] guillom: [fr] the most important thing for December is the deployment of Zuul and of the new workflow https://www.mediawiki.org/wiki/Continuous_integration/Workflow [12:48:36] [fr] no worries [12:49:01] [fr] and indeed the tests are no longer run for everyone. [12:49:25] [fr] on the other hand we still run a few simple checks (e.g. that the PHP is valid, but without executing it) [12:49:30] [fr] but "trusted users" is still better than "employees only" =) [12:49:34] [fr] I've edited it [12:49:38] nice [12:49:47] omg French cabal:P [12:50:11] :P [12:50:26] New patchset: Hashar; "Contint requires a 512M tmpfs file fs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35159 [12:50:46] MaxSem: feel free to join in :-] [12:51:07] guillom: [fr] not much to say about the "beta" project for December [12:51:08] по французски не бум-бум [12:51:24] hashar: [fr] no worries :) [12:51:53] guillom: though we had a small meeting with MaxSem + Arthur + Michelle to kick off Mobile support on the beta cluster [12:52:29] [fr] cigarette time :-D [12:52:45] MaxSem: Google Translate says "French for no boom-boom" [12:52:53] yes [12:53:24] "I don't speak French" to be precise [12:53:40] Ah. I would definitely not have guessed :) [12:54:37] (the meaning) [13:02:12] MaxSem: your beer drinking skill is good enough, that is the only thing that matters [13:02:44] I thought you French prefer wine? 
[13:03:29] احلاهاشر [13:03:31] :D [13:03:36] Oo, [13:03:45] MaxSem: I do both. Indeed prefer red wine [13:03:52] hello hashar [13:04:04] hiii [13:05:47] i am curious when we do the data center migration, if we changing anything like version of php we use? [13:05:58] we are changing [13:06:00] no [13:06:04] MaxSem: btw, when are we supposed to start working on bringing up Mobile support on beta? I got an account on mingle but that is about it for now [13:06:04] phew :) [13:06:08] both are running precise [13:06:36] though you should really write code agnostic to PHP version [13:06:39] aude: I think we do it January 21st till 23rd [13:06:49] * aude running my development environment on same version as wikimedia uses [13:06:52] MaxSem: indeed [13:06:53] aude: don't quote me on this though, just hearsays. I guess there will be a proper announcement. [13:07:14] * aude just wants to do extra testing on same version wikimedia uses [13:07:40] hashar, next week [13:07:47] MaxSem: good [13:07:53] (mobile on beta) [13:16:56] New review: Faidon; "Shouldn't it be 'requires => [ User['jenkins'], Group['jenkins'] ]." [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/35159 [13:18:53] PROBLEM - Puppet freshness on ms-be1003 is CRITICAL: Puppet has not run in the last 10 hours [13:20:50] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [13:22:56] Change merged: MaxSem; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43202 [13:24:52] New patchset: Hashar; "Contint requires a 512M tmpfs file fs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35159 [13:25:23] New review: Hashar; "PS5:" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/35159 [13:25:43] !log maxsem synchronized wmf-config/InitialiseSettings.php 'Rm GeoData exceptions' [13:25:52] Logged the message, Master [13:26:33] PS5? 
I thought that the latest PlayStation is 3 [13:27:40] Patchset :-) [13:27:48] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35159 [13:29:57] and since I am growing fed up by the manifests/misc/contint.pp I think I am going to make it a module ;-D [13:30:17] what will be the difference? [13:30:41] I will have multiple manifests instead of a huuuuge one [13:31:09] hm, I'm wondering if we should create a module per each of those or a submodule [13:31:23] like wikimedia/contint/foo.pp [13:31:37] mount point /var/lib/jenkins/tmpfs does not exist gr [13:31:38] I thought puppet was creating it for us [13:35:04] paravoid: wikimedia sounds good [13:35:06] New patchset: Hashar; "contint: mount dir must exist /var/lib/jenkins/tmpfs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43208 [13:35:33] I just thought of that, I don't have a strong opinion or have thought it much [13:35:34] paravoid: https://gerrit.wikimedia.org/r/43208 creates the directory required for the mount command to work :-D [13:35:52] but then our modules are mostly wikimedia stuff only aren't they ? [13:36:57] I think coredb could be moved under wikimedia, while redis could not [13:37:07] but again, not sure [13:37:38] but my contint stuff could ? ;) [13:40:08] back [13:51:28] paravoid: could you get my mount point creation merged in please ? ;https://gerrit.wikimedia.org/r/#/c/43208/ [13:52:03] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43208 [13:52:08] danke [13:52:38] if that works, I am no more interrupting ya for the rest of the day ;-] [13:57:27] !log gallium: manually reinstalled Zuul and restarted it. 
[13:57:37] Logged the message, Master [14:04:38] notice: /Stage[main]/Misc::Contint::Test::Jenkins/Mount[/var/lib/jenkins/tmpfs]/ensure: ensure changed 'unmounted' to 'mounted' [14:04:40] success [14:06:53] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [14:07:20] PROBLEM - MySQL Replication Heartbeat on db33 is CRITICAL: CRIT replication delay 181 seconds [14:07:20] PROBLEM - MySQL Slave Delay on db1020 is CRITICAL: CRIT replication delay 183 seconds [14:08:32] PROBLEM - MySQL Replication Heartbeat on db1020 is CRITICAL: CRIT replication delay 219 seconds [14:08:59] PROBLEM - MySQL Slave Delay on db33 is CRITICAL: CRIT replication delay 229 seconds [14:12:16] RECOVERY - MySQL Replication Heartbeat on db1020 is OK: OK replication delay 0 seconds [14:12:25] RECOVERY - MySQL Slave Delay on db33 is OK: OK replication delay 0 seconds [14:12:35] RECOVERY - MySQL Replication Heartbeat on db33 is OK: OK replication delay 1 seconds [14:12:35] RECOVERY - MySQL Slave Delay on db1020 is OK: OK replication delay 0 seconds [14:15:14] !log cleaned up all content in /mnt/jenkins-tmp been replaced by /var/lib/jenkins/tmpfs [14:15:24] Logged the message, Master [14:18:02] New review: Anomie; "Check /etc/wikimedia-realm. Or use multiversion/MWRealm.sh to get $WMF_REALM, to avoid duplicating t..." [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/42970 [14:22:14] anomie: indeed [14:22:26] anomie: now I start to understand the usefulness of MWRealm.sh :-] [14:38:51] !log Updating Solr index on all wikis [14:39:01] Logged the message, Master [14:57:53] Anyone around know about git-deploy? Or do I have to wait for Ryan? [15:04:39] anomie: Ryan :-] though I received a training this morning [15:04:46] so I might be able to help [15:05:40] hashar- So it looks like things are working as far as running maintenance scripts on beta labs and on tin? 
[15:06:13] anomie: so Ryan deployed git-deploy on beta yesterday [15:06:25] he guided me to add a beta slot in git-deploy so we can get master deployed [15:06:26] But it looks like mwscript and such in beta will still point to the non-git-deploy install [15:06:34] hmm [15:06:53] the mediawiki-config should be using the new branch "newdeploy" [15:07:03] maybe it does not yet points to /srv/deployment/mediawiki/* [15:07:27] I wanted to poke Sam about it but he is probably still asleep / out [15:08:04] hmm [15:08:13] mediawiki-config runs out of master :/ [15:08:40] And it looks like /var/lib/git-deploy/dependencies/l10n is the script that is going to get run to update the l10n when one of the git-deploy slots is deployed. [15:08:53] yeah l10n does not work yet [15:09:10] so Ryan told me he needs the list of commands that need to be run [15:09:29] then I think /usr/local/bin/mw-update-l10n should handle that nicely [15:09:36] That's what I'm going to work on, if I can figure out enough to be able to do so [15:10:55] looking at it [15:10:55] /usr/local/bin/mw-update-l10n isn't even deployed right now for git-deploy, see gerrit change 42871. It's on beta because beta also has the scripts for scap. [15:11:11] !g 42871 [15:11:11] https://gerrit.wikimedia.org/r/#q,42871,n,z [15:13:11] seems we can point mw-update-l10n to /srv/deployment/mediawiki/common [15:13:14] (on beta) [15:16:02] anomie: could you review https://gerrit.wikimedia.org/r/#/c/43165/ adds /.deploy to gitignore for mediawiki/core [15:16:15] that is used by git-deploy and we don't need to track it down [15:17:18] Note that --outdir in the rebuildLocalisationCache.php call would need to be changed for mw-update-l10n, to point to the appropriate l10n-slot#. Also probably the --output for the mergeMessageFileList.php call, so we can avoid dumping random untracked files into the config directory (or does that not matter?) [15:17:59] Reviewed. 
[15:21:03] backporting to wmf branches [15:30:39] 2013-01-10 15:28:40,539 ERROR zuul.IndependentPipelineManager: Reporting change received: error: (change 43213) not permitted to submit change [15:30:42] pfff [15:31:53] New patchset: MaxSem; "Cronjobs for GeoData" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43218 [15:56:08] PROBLEM - Host ms-be1007 is DOWN: PING CRITICAL - Packet loss = 100% [17:27:54] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [17:28:02] New review: Aaron Schulz; "Maybe this and misc::maintenance::parsercachepurging can still write to stderr as long as that is se..." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/42600 [17:45:12] New patchset: MaxSem; "Postgres module for OSM" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36155 [17:45:12] New patchset: MaxSem; "WIP: OSM module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36222 [17:45:37] New review: MaxSem; "Rebased." 
[operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/36222 [17:57:18] New patchset: Reedy; "Remove old ExtensionDistributor files from wmf-config" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43231 [17:58:30] Change merged: Demon; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43231 [18:00:27] !log demon synchronized php-1.21wmf7/extensions/ExtensionDistributor 'New version of ExtensionDistributor' [18:00:38] Logged the message, Master [18:17:06] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 269 seconds [18:19:03] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 0 seconds [18:19:43] !log reedy synchronized wmf-config/ [18:19:52] Logged the message, Master [18:25:09] New patchset: Reedy; "Remove more old extensiondistributor config" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43236 [18:27:43] Change merged: Demon; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43236 [18:27:54] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 250 seconds [18:28:21] PROBLEM - MySQL Slave Delay on db1013 is CRITICAL: CRIT replication delay 277 seconds [18:29:12] !log reedy synchronized wmf-config/CommonSettings.php [18:29:22] Logged the message, Master [18:32:08] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds [18:32:34] RECOVERY - MySQL Slave Delay on db1013 is OK: OK replication delay 28 seconds [18:35:48] New review: Reedy; "The extdist user needs deleting from fenari. By the looks of it it isn't in puppet, so just needs do..." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/41976 [18:37:35] New review: Demon; "That's what systemuser is for, and why I'm doing ensure => absent." 
[operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/41976 [18:39:11] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 304 seconds [18:39:28] PROBLEM - MySQL Slave Delay on db1013 is CRITICAL: CRIT replication delay 323 seconds [18:41:16] RECOVERY - MySQL Slave Delay on db1013 is OK: OK replication delay 4 seconds [18:41:52] Hi all, what's the right way to get a new php pear package added? I'm happy to put up a proposed puppet manifest change if cutting work makes it more likely to get done quickly, but I don't know if it should exec a command or roll a new .deb package with the files in it. I want Mail_Mime, and Mail which I assumed was there but I can't see in the manifests [18:42:13] New patchset: Demon; "Delete remote branches when they're deleted in Gerrit" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43239 [18:42:38] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 27 seconds [18:48:21] ^demon, what's the right way to get a new php pear package added? I'm happy to put up a proposed puppet manifest change if cutting work makes it more likely to get done quickly, but I don't know if it should exec a command or roll a new .deb package with the files in it. I want Mail_Mime, and Mail which I assumed was there but I can't see in the manifests [18:48:56] <^demon> That's an ops question. I know we generally frown on things like gems, pear, etc. [18:49:26] <^demon> (Not the code necessarily, but using them for package management) [18:51:22] Ryan_Lane - I see we have mwscript working on tin now, and /srv/deployment filled on beta labs. What do I need to do to get git-deploy doing the l10n? [18:51:46] thanks. That's the sort of sanity check I was looking for. 
The hardest things for a newbie here are the changes that are 1% code and 99% knowing who to ask and which of the possible options to ask for [18:52:15] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43239 [18:52:16] <^demon> Yeah--puppet stuff is all ops (which means you're in the right place, just asking the wrong guy :)) [18:52:30] anomie: it was a late night last night ;) [18:52:32] anomie: you just need to have it write into the l10n repos [18:52:38] anomie: I can handle the rest [18:54:04] Ryan_Lane- I see we have symlinks now from common to the core slots. Any advice on finding the corresponding l10n slot? More symlinks? [18:54:18] likely a good idea [18:54:32] lololol [18:54:40] soiherdulieksymlinks [18:54:57] ;) [18:55:09] Reedy: suggest better ways to set up l10n [18:55:16] I honestly don't know that system very well [18:55:20] touche [18:55:20] :D [18:55:37] At least they are all consistent now etc [18:55:49] it may be that we don't even need multiple slots for it? [18:56:01] Should I work on tin or beta? Everyone's been talking about beta, but the scripts on beta are still using the old paths [18:56:12] anomie: Add a constant to multiversion/defines.php [18:56:15] Swap $wgLocalisationCacheConf['storeDirectory'] = "$IP/cache/l10n"; [18:56:34] beta needs more work, I found out last night [18:56:44] because master doesn't have any submodules [18:57:19] to $wgLocalisationCacheConf['storeDirectory'] = MULTIVER_L10N . '/l10n'; [18:57:23] can someone review https://gerrit.wikimedia.org/r/#/c/43218/ please? [18:57:34] Reedy: wouldn't that put it in /src/deployment/mediawiki/common/slot0/cache/l10n? 
[18:57:43] ah [18:58:05] well, not slot0., but 1.20wmf6 [18:58:22] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [18:58:23] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [18:58:50] Bleh [18:59:03] Yeah, I was using it like it was a "common" target to use [18:59:29] in the old system, both use the same set of l10n, right? [18:59:35] nope [18:59:43] ah [18:59:43] they have their own cache folder [18:59:52] yeah. multiple slots it is [18:59:55] In the old system, it's dumped into the cache/l10n directory in each branch's checkout [19:00:00] (not tracked in git) [19:00:03] * Ryan_Lane nods [19:00:20] we could symlink cache/l10n to ../l10n-slotN [19:00:38] but then that seems overly dirty [19:01:44] cache/l10n-1.20wmf6 -> ../../l10n-slotN [19:02:07] or [19:02:24] cache/l10n/1.20wmf6 ../../../l10n-slotN [19:02:52] this is in the common repo, right? [19:03:05] or is it in core's wmf branch [19:03:06] ? [19:03:25] they're currently at php-1.21wmf6/cache/l10n [19:03:50] ah [19:03:51] ok [19:03:59] why is a symlink dirty, then? [19:04:05] MULTIVER_COMMON . '/' . $wgVersion ? [19:04:40] $wgLocalisationCacheConf['storeDirectory'] = MULTIVER_COMMON . "/l10n-$wgVersion" ? [19:04:40] I guess it makes merges a pain in the ass [19:04:41] DONE [19:04:47] yeah [19:04:50] that works [19:05:02] then a symlink there [19:05:12] which anomie has already created on tin I can see [19:05:15] heh [19:05:16] cool [19:07:36] interwiki.cdb and trusted-xff.cdb need moving [19:08:09] into common (or common/wmf-config ?) and symlinking back in [19:10:12] symlinks from cache into common? 
[19:10:45] hmm [19:10:48] Maybe not even bother [19:10:48] $wgLocalisationCacheConf['storeDirectory'] = [19:10:51] fail copy [19:10:53] feel free to add these things into the repos on tin :) [19:10:57] if ( function_exists( 'dba_open' ) && file_exists( "$IP/cache/interwiki.cdb" ) ) { [19:10:57] $wgInterwikiCache = "$IP/cache/interwiki.cdb"; [19:10:57] } [19:11:02] * Ryan_Lane nods [19:11:04] just move it completely [19:11:07] indeed [19:11:11] same for the other [19:16:08] New patchset: Reedy; "Move interwiki and trusted-xff cdbs into common repo" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43243 [19:16:47] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43243 [19:18:18] New patchset: Reedy; "Update localisation cache dir to MULTIVER_COMMON . "l10n-$wmfVersionNumber"" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43244 [19:19:35] New patchset: Reedy; "Update localisation cache dir to MULTIVER_COMMON . "l10n-$wmfVersionNumber"" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43244 [19:19:56] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43244 [19:21:01] Hmm. I wonder if this branch needs rebasing onto master again.. [19:21:53] Might be a nice idea to expose those 2 via noc for other people to grab. Not exactly end user readable, but might be helpful [19:24:30] we can have noc as a deployment target [19:24:31] Ryan_Lane, Reedy - will it cause any trouble if I test out the l10n updating on tin? I'd run git deploy start, make changes to l10n-slotX, commit, then git deploy sync, is that about right? [19:25:06] anomie: that's fine [19:25:20] there's one difference in the way you run sync, though [19:25:35] you need to use git deploy--force sync [19:26:24] automated pushes of dependency repos needs some work. 
it's likely to fail [19:26:40] that's something I'm going to fix today [19:26:44] New patchset: Reedy; "Expose trusted-xff.cdb and interwiki.cdb on noc" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43245 [19:27:25] New patchset: Reedy; "Expose trusted-xff.cdb and interwiki.cdb on noc" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43245 [19:27:34] We seem to be getting pretty close now :) [19:27:37] yep [19:27:48] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43245 [19:27:53] there's only a couple things I need to do in the system itself [19:28:25] Tim has updated the apache configs in gerrit... [19:28:36] we need to push /srv out to the apaches, and make a faux link farm [19:29:00] I might do the latter now in common2 or something under /usr/local/apache [19:29:09] I think we're going to use a single deployment host initially [19:29:31] tin, in eqiad [19:29:44] unless I get multi deployment host support working by tomorrow [19:30:21] Not like it's much of an issue if we do [19:30:28] so, to deploy to the apaches in pmtpa I just need to modify the config to add them to the regex [19:30:35] and point pmtpa to tin [19:31:21] Ahh.. No one has moved "uncommon" so far... [19:31:26] I'd say it will slow down deployment in pmtpa, but I'd probably be joking myself :) [19:31:31] uncommon? [19:31:45] texvc binary [19:31:49] ah [19:31:51] right [19:31:57] should that be another repo? 
[19:32:09] Not sure [19:32:16] It's currently built locally on each apache [19:32:24] * Ryan_Lane nods [19:32:37] See /usr/bin/scap-recompile [19:33:29] New patchset: Reedy; "Remove duplicate definition of $wgTexvc" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43247 [19:33:40] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43247 [19:34:09] !log reedy synchronized wmf-config/CommonSettings.php [19:34:18] Logged the message, Master [19:34:27] For most of the hosts, using the same file should actually be ok (same ubuntu, same arch etc) [19:34:34] file/binary [19:34:50] yeah [19:35:08] so, another repo, then? [19:35:59] then we can have a script on the deployment host that compiles and pushes it [19:36:09] binasher, is it deliberate that redirector changes don't trigger a squid restart in squid.pp? [19:36:09] and we can make it a dependency to the slots [19:38:17] reedy@fenari:/home/wikipedia/common$ ls -al /usr/local/apache/uncommon/1.21wmf7/bin/texvc [19:38:17] -rwxr-xr-x 1 mwdeploy mwdeploy 725237 Jan 2 20:06 /usr/local/apache/uncommon/1.21wmf7/bin/texvc [19:38:21] MaxSem: there's a squid.pp? [19:38:27] who knew [19:39:53] Ryan_Lane: makes sense I guess [19:40:05] ok. I'll add that [19:40:17] Reedy: have you guys modified references to the private files? [19:40:19] It's not like we're planning on adding solaris hosts or anything [19:40:27] Reedy: (notice I've made a "private" repo) [19:40:37] I've done some of them that were explicitly needed [19:40:42] PrivateSettings.php and alike [19:40:44] ok [19:41:02] ah [19:41:08] the lib directory in private can be nuked [19:41:19] or its contents can at least [19:41:19] (be bold ;) ) [19:41:48] <^demon> The people who designed "git submodule foreach" really really did not intend it to be used on 300+ submodules. [19:41:55] done then [19:42:06] <^demon> It's only marginally faster doing it by hand. [19:42:13] ^demon: yeah. 
it's just a for in bash [19:42:27] I want it to be able to run a certain number of submodules in parallel [19:42:47] ^demon: really foreach is just a really shitty wrapper [19:42:55] we should rewrite it. in python [19:43:02] and make it parallel [19:43:09] with a fanout [19:43:47] New review: Adamw; "apergos: I see that ariel:xmldumps-backup/WikiDump.py still has defaults specified as a flat array. ..." [operations/dumps] (master) C: -1; - https://gerrit.wikimedia.org/r/43156 [19:44:29] <^demon> Ryan_Lane: I would love that. I've been watching this thing iterate all day. [19:44:30] New patchset: Reedy; "Simplify line to checkers.php" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43248 [19:44:32] <^demon> It's a huge waste of time. [19:44:44] ^demon: it also makes the "fetch" stage in git-deploy much slower [19:44:47] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43248 [19:45:34] Reedy- Hmm. When I'm trying to test this, it keeps telling me "No localisation cache found for English. Please run maintenance/rebuildLocalisationCache.php". Even when I'm trying to do just that. What's the magic to fix it? 
[19:46:04] New patchset: Reedy; "Remove php- prefix if it exists" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43249 [19:46:16] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43249 [19:46:21] anomie: I think you need to run update first [19:58:17] New patchset: Reedy; "Increase abusefilter emergency shutdown threshold for feedback to 20%" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43251 [19:58:17] New patchset: Reedy; "(bug 43630) set autoconfirm count to 10 @ fawiki, per consensus" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43252 [19:58:17] New patchset: Reedy; "AFTv5: skip rollbacker group on wikis without that" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43253 [19:58:18] New patchset: Reedy; "(bug 43565) Update Wikivoyage logo and favicon" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43254 [19:58:18] New patchset: Reedy; "Set WMF default of $wgUnwatchedPageThreshold" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43255 [19:58:18] New patchset: Reedy; "AFT test group permissions have been removed already; these lines no longer make sense" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43256 [19:58:18] New patchset: Reedy; "(bug 43687) Namespace configuration for meta." [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43257 [19:58:18] New patchset: Reedy; "Reset to UTC the es.wikivoyage.org timezone." 
[operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43258 [19:58:19] New patchset: Reedy; "(bug 43617) FlaggedRevs configuration for pl.wikipedia" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43259 [19:58:19] New patchset: Reedy; "Harmonize bugzilla links for $wgAutoConfirmCount" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43260 [19:58:19] New patchset: Reedy; "Enable wgMFForceSecureLogin on testwiki" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43261 [19:58:20] New patchset: Reedy; "$wgCategoryCollation = 'identity' on iswiktionary for bug 30722" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43262 [19:58:20] New patchset: Reedy; "Disable wgMFForceSecureLogin on testwiki" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43263 [19:58:21] New patchset: Reedy; "Disable TorBlock on private wikis" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43264 [19:58:21] New patchset: Reedy; "beta: makes two wikis to use WMF branches" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43265 [19:58:22] New patchset: Reedy; "Workaround for exception preventing translation" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43266 [19:58:22] New patchset: Reedy; "Configure ExtensionDistributor in preparation for new version" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43267 [19:58:23] New patchset: Reedy; "sewikipedia -> sewiki" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43268 [19:58:23] New patchset: Reedy; "add WikipediaMobileFirefoxOS to bits docroot as Submodule" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43269 [19:58:24] New patchset: Reedy; "fix wmfDatacenter / wmfRealm scope in IS.php" [operations/mediawiki-config] (newdeploy) - 
https://gerrit.wikimedia.org/r/43270 [19:58:24] New patchset: Reedy; "Restore $wmfConfigDir global in wmfLoadInitialiseSettings" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43271 [19:58:25] New patchset: Reedy; "Try disabling noindex on some enwiki pages per discussion with Google" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43272 [19:58:25] New patchset: Reedy; "Rm GeoData exceptions" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43273 [19:58:26] New patchset: Reedy; "Remove old ExtensionDistributor files from wmf-config" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43274 [19:58:26] New patchset: Reedy; "Remove more old extensiondistributor config" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43275 [19:58:27] New patchset: Reedy; "Remove duplicate definition of $wgTexvc" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43276 [19:58:32] ... [19:58:43] heh [19:58:48] ty gerrit-wm [20:00:00] Does someone want to silence it while I batch merge those? [20:01:24] * Ryan_Lane shrugs [20:01:28] it doesn't bother me [20:02:00] alright [20:02:08] Since no one seems to be around to silence it... [20:02:12] I forgot this isn't #mediawiki ;) [20:02:26] Jeff_Green: moved ethernet cable to eth0 from eth4 on server 'pappas' [20:02:43] thanks sbernardin! 
[20:02:49] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43251 [20:03:16] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43252 [20:03:19] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43253 [20:03:22] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43254 [20:03:26] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43255 [20:03:29] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43256 [20:03:32] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43257 [20:03:36] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43258 [20:03:39] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43259 [20:03:42] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43260 [20:03:45] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43261 [20:03:49] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43262 [20:03:52] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43263 [20:03:56] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43264 [20:03:59] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43265 [20:04:02] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43266 [20:04:06] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43267 [20:04:09] Change merged: Reedy; 
[operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43268 [20:04:12] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43269 [20:04:16] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43270 [20:04:19] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43271 [20:04:23] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43272 [20:04:26] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43273 [20:04:29] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43274 [20:04:32] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43275 [20:04:36] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43276 [20:04:43] doneded [20:05:33] I'll be quiet for a few minutes ;) [20:07:17] ^demon: seen the quick-update vvv wrote? [20:07:35] "This allows to update all extensions in about a minute (unlike 7 to 15 minutes without it)." [20:07:59] AFK for a few, then uncommon.. [20:09:15] New patchset: MaxSem; "WIP: OSM module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36222 [20:09:19] Ryan_Lane: Do we need a remote "backup" for private? [20:09:22] <^demon> Reedy: I'm not updating :) [20:09:33] Reedy: probably. do we right now? [20:10:14] URL: file:///home/wikipedia/conf-svn/wmf-config/trunk [20:12:00] Put it somewhere like the private puppet one? [20:13:07] we back that up? [20:13:09] heh [20:13:17] kidding. it likely goes into tridge [20:13:35] I think I've figured it out. 
According to http://wikitech.wikimedia.org/view/Heterogeneous_deployment_v2 (which should probably be merged to How_to_deploy_code, BTW), when we first check out a new version of MediaWiki, we first update wikiversions.dat to put the new version for test2wiki only and then scap which rebuilds the l10n. Since test2wiki has $wgLanguageCode = 'en', which doesn't have this problem, the initial l10n build works. I wonder if we should have the l10n maintenance scripts force $wgLanguageCode = 'en' so it works no matter which wiki is used for the initial l10n build. Or am I just lost? [20:14:30] anomie: We currently do more crazy things... [20:14:48] anomie: I created that as essentially my copy paste list for deploying new versions [20:14:49] Reedy: how do you end up sending 20+ changes against operations/mediawiki-config ? :D [20:15:26] hashar: branch chaining, rebase and git review? [20:15:55] MaxSem: could have been a merge commit [20:16:03] but hey I am just ranting [20:16:04] pfft [20:16:09] more is always better [20:16:21] verbosity [20:16:53] so you cherry picked from master and applied to new deploy ? [20:17:09] * hashar rebase to find out [20:17:21] git pull origin master [20:17:52] you could have: git checkout newdeploy && git merge origin/master [20:17:59] that would have created a nice merge commit :-] [20:18:01] can someone review https://gerrit.wikimedia.org/r/#/c/43218/ please? [20:18:10] not a big deal anyway [20:18:19] hashar: we get nice things like https://gerrit.wikimedia.org/r/#/q/I00f8511366caf66bdfea420f60241406a34eae50,n,z [20:18:22] :D [20:18:35] It's only short term [20:18:46] ahh yeah [20:18:52] that looks nice in gerrit indeed [20:18:57] anyway [20:19:08] Ryan_Lane has created git deploy on beta [20:19:15] and gave me a tutorial this morning [20:19:19] looks fine :-] [20:19:27] still need to generate the l10ncache properly [20:20:00] we're getting there!
[20:20:01] and the beta slot in git-deploy needs to get mediawiki/core.git + mediawiki/extensions.git as a submodule and all its submodules [20:20:26] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [20:20:26] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: Puppet has not run in the last 10 hours [20:20:26] PROBLEM - Puppet freshness on sq48 is CRITICAL: Puppet has not run in the last 10 hours [20:20:26] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [20:20:26] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 10 hours [20:20:47] pooor puppet [20:39:21] !log nas1-a thumbs vol ran out of inodes, increased total by 10x [20:39:32] Logged the message, Master [20:51:26] heya LeslieCarr, [20:52:06] is there a way to know if there was a network connectivity problem between the two analytics racks during a particular time period? [20:52:45] New patchset: Reedy; "Remove $IP/lib from include path" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43278 [20:53:11] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43278 [20:54:08] Cane someone please run this on fenari? rm -rf /home/wikipedia/common/lib [20:56:43] !g 43280 |Reedy [20:56:44] Reedy: https://gerrit.wikimedia.org/r/#q,43280,n,z [20:57:19] Reedy: you positive you want me to delete this? [20:57:33] /home/w/common/lib right? [20:57:53] Yup, pleased [20:57:59] s/d// [20:58:06] done [20:58:40] !log reedy synchronized wmf-config/CommonSettings.php [20:58:51] Logged the message, Master [21:00:26] * Damianz watches the world implode [21:01:00] It's not been used since around the time we switched to git... [21:01:28] Ryan_Lane: Should this uncommon (better name needed??) repo just be a local repo in /srv/deployment/mediawiki for now? 
[21:01:54] yes [21:04:45] New patchset: Reedy; "Add MULTIVER_UNCOMMON" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43281 [21:05:03] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43281 [21:05:59] !log olivneh synchronized php-1.21wmf7/extensions/EventLogging [21:06:10] Logged the message, Master [21:08:47] Hmm. Why has tin seemingly got the old versions of deployment scripts? [21:09:07] https://gerrit.wikimedia.org/r/gitweb?p=operations/puppet.git;a=commitdiff;h=0532856984eddca06ca9298e77e41eb6e59a4971 is merged... [21:09:56] New patchset: Ryan Lane; "Improve reporting for the deployment system" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43317 [21:10:02] Ughhhh [21:10:19] mwversionsinuse is in both /usr/bin and /usr/local/bin [21:10:23] hahaha [21:10:43] ^^ that reporting change I just pushed in is going to rock [21:15:07] One comes from wikimedia-task-appserver.deb [21:20:30] reedy@tin:/srv/deployment/mediawiki/uncommon$ ~/scap-recompile [21:20:30] /usr/bin/env: php: No such file or directory [21:20:30] Unable to read wikiversions.dat or it is empty [21:20:38] Yay, we have scripts that call scripts.. [21:20:39] :D [21:20:49] is php not installed on tin? [21:21:06] oh. no. it is [21:21:10] for security reasons! [21:21:23] it's necessary to have php installed on tin [21:21:26] we'd have loads of broken stuff with no php [21:21:43] people would be able to deploy bad code if php were installed! [21:21:49] heh [21:23:48] New patchset: Brion VIBBER; "Add MIME type for .webapp extension: needed for FirefoxOS app" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43344 [21:25:29] checkout_checkin_timestamp [21:25:34] what a stupid variable name [21:25:54] Swapping /usr/bin/env for /usr/bin/php fixes it... 
[21:26:01] I didn't think about it when I named "checking in to the deployment server" checkin [21:26:02] but running /usr/bin/env php works fine standalone [21:29:00] delete the line, readd it, it works [21:29:01] ffs [21:29:18] WFM [21:31:13] :) [21:32:20] New patchset: Brion VIBBER; "Add MIME type for webapp manifest for Firefox OS" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/43346 [21:32:51] Change abandoned: Brion VIBBER; "bits does not run on this configuration, apparently" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43344 [21:34:17] New patchset: Reedy; "Add .gitreview" [operations/debs/wikimedia-task-appserver] (master) - https://gerrit.wikimedia.org/r/43347 [21:34:38] !log olivneh synchronized php-1.21wmf7/extensions/EventLogging [21:34:47] Logged the message, Master [21:38:39] New patchset: Reedy; "Update scap-recompile (guess it needs renaming!) for gitdeploy" [operations/debs/wikimedia-task-appserver] (master) - https://gerrit.wikimedia.org/r/43348 [21:39:39] Ryan_Lane: Presumably all the scap related skips are just going to be nuked from the deb? And I should add the scap-recompile (maybe with rename) to puppet? [21:39:47] yes [21:39:50] scap scripts must die [21:39:52] all must die [21:39:55] * Reedy amends that commit [21:39:57] :) [21:40:11] kill all teh things [21:41:09] !log olivneh synchronized php-1.21wmf7/extensions/E3Experiments [21:41:19] Logged the message, Master [21:43:28] New patchset: Reedy; "Removed scap related scripts" [operations/debs/wikimedia-task-appserver] (master) - https://gerrit.wikimedia.org/r/43349 [21:46:35] Reedy: we'll need to make sure we don't update that deb until we totally finish the switchover [21:46:49] I guess someone doing it accidentally is unlikely [21:46:58] I surely hope so :) [21:47:08] I seem to recall Aaron and I both making changes to it and then realise like 6 months later "hey, we fixed that already..." 
[21:47:28] heh [21:48:36] I just realised moving the interwiki.cdb half makes another commit I've got in gerrit redundant [21:48:43] will just update it again [21:51:43] Ok, that seems to function. /home/anomie/l10nupdate-quick on tin is more or less the equivalent of mw-update-l10n for git-deploy. [21:52:04] My stupid question of the day: If the search on mediawiki.org does not work reliably and does not allow provide search results, is there any way on the client side to provide some useful debug info for a developer to track it further down? [21:52:24] I don't think it's stupid [21:52:29] But I do think the answer is no [21:53:06] notpeter: ^^ [21:53:38] heh. I asked because valeriej (my WOP student) can reproduce https://bugzilla.wikimedia.org/show_bug.cgi?id=42423 - mediawiki.org search not always providing results. [21:53:46] anomie: this is writing into the repo on tin? [21:53:47] New patchset: Ryan Lane; "Update deploy-info to report on fetch tags" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43353 [21:53:52] mmm I think there were more reports about search [21:53:55] Ryan_Lane- yes [21:54:01] ottomata: hrm, on a conference right now - i'd check out the network report on ganglia and we can get you an observium login [21:54:17] MaxSem: we have one report about Commons search sometimes failing, and one about mediawiki.org search. [21:54:24] anomie: awesome [21:54:30] (and one about the Ukrainian chapter website where search doesn't work at all) [21:54:32] let me move this into puppet [21:54:56] oo observium? [21:55:00] network report? 
[21:55:01] hm [21:55:19] https://github.com/openstack-ci/git-review dead link is dead [21:55:40] anomie: sweet, it isn't doing a git deploy —force sync, either [21:55:42] that's good [21:56:09] Ryan_Lane- BTW, if you run it with no args it will update everything returned by mwversionsinuse; if you pass it a version then it will try to only do that version [21:56:21] MaxSem, I'm about to scap as part of the E3 deploy and you have two untracked files in 1.21wmf7/extensions/WikimediaMaintenance , wikisWithSetting.php wikisWithSetting.php.save [21:56:23] * anomie was not aware there was a "--force" option to git deploy [21:56:34] New patchset: Reedy; "Add a replacement to "scap-recompile"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43354 [21:56:41] maxsem I am going to delete them from fenari [21:56:41] yay, manual hook insallation [21:56:54] spagewmf, feel free to [21:56:56] anomie: for instance, this update script won't work in a cron [21:57:05] but it works properly when updating slot0, for instance [21:57:26] Oh? Why not? [21:57:36] New patchset: Reedy; "Add a replacement to "scap-recompile"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43354 [21:57:58] let's say we're doing a deployment for slot0 [21:58:04] git deploy start [21:58:07] [21:58:09] git deploy sync [21:58:18] ^^ that'll run the sync script [21:58:32] the sync script will: [21:58:50] 1. see if it has repo dependencies [21:59:03] 2. if so, will tell the dependencies to update themselves [21:59:38] 3. will tell all the deployment destinations to do a fetch [22:00:03] the deployment destinations will fetch all repo dependencies, then it'll fetch itself [22:00:10] then the sync script will: [22:00:25] 4. 
tell all deployment destinations to checkout [22:00:43] the deployment destinations will checkout all repo dependencies, then checkout itself [22:01:05] doing it that way makes it faster for the checkout stage of all repos [22:01:18] it makes it more atomic [22:01:59] started scap, can it do it in less than 58 minutes? [22:02:07] so, repos that we treat as dependencies won't tell minions to fetch/checkout themselves, it lets the dependent repo handle that [22:02:25] if you want the repo to actually do a fetch/checkout, you need to use --force [22:02:40] New patchset: Reedy; "Add a replacement to "scap-recompile"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43354 [22:03:00] Ryan_Lane- Ok. Makes sense. [22:03:38] I can add in the changes to make that work for this [22:03:53] Change abandoned: Reedy; "(no reason)" [operations/debs/wikimedia-task-appserver] (master) - https://gerrit.wikimedia.org/r/43348 [22:05:30] Ryan_Lane- Speaking of which, next up is the script to run the LocalisationUpdate extension update (the equivalent to l10nupdate on fenari), which will probably need to be using that. Do we have the location for that extension's /var data set up on tin? Remember that bit only needs to be on the master. [22:05:32] New patchset: Reedy; "Add script to update the interwiki cache for all active MediaWiki versions" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42133 [22:06:11] anomie: nope. I guess we should use mwdeploy for that [22:06:18] like it is in the old system [22:07:08] we can probably just use the same path [22:07:13] I don't see a reason to change that [22:08:48] Ryan_Lane will someone modify mediawiki's 1.21 Roadmap and wikitech's How_to_deploy_code with huge banners saying "Coming January 16th... git deploy!" ? 
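[editor's note] The sync ordering Ryan_Lane walks through above (steps 1 to 4, plus the --force remark) can be sketched roughly as follows. This is only an illustrative model, not the git-deploy code: the Repo class, the is_dependency flag, the sync() and stage() helpers, and the "l10n" repo name are all hypothetical, and the real system drives the destinations over salt rather than returning a list of strings.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    # Hypothetical model of a deployable repo; not taken from git-deploy.
    name: str
    dependencies: list = field(default_factory=list)
    is_dependency: bool = False  # assumed flag: repo is driven by a dependent repo


def stage(repo, action):
    # Destinations handle every dependency first, then the repo itself.
    return [f"{action} {dep.name}" for dep in repo.dependencies] + [f"{action} {repo.name}"]


def sync(repo, force=False):
    # A repo treated as a dependency does nothing on the destinations
    # unless forced; the dependent repo's sync covers it.
    if repo.is_dependency and not force:
        return []
    events = [f"update {dep.name}" for dep in repo.dependencies]  # steps 1-2
    events += stage(repo, "fetch")      # step 3: fetch everywhere first
    events += stage(repo, "checkout")   # step 4: then checkout everywhere
    return events


l10n = Repo("l10n", is_dependency=True)
slot0 = Repo("slot0", dependencies=[l10n])
print(sync(slot0))
print(sync(l10n), sync(l10n, force=True))
```

Keeping all fetches ahead of all checkouts is what the log describes as making the checkout stage faster and "more atomic"; the sketch only preserves that ordering and the --force exception.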
[22:09:08] New patchset: Reedy; "Add script to update the interwiki cache" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42133 [22:09:25] New review: Reedy; "Marking -1 per lack of git-deploy ness" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/42133 [22:09:49] spagewmf: I dunno, it won't be me ;) [22:10:30] Ryan_Lane I assume How_to_deploy will completely change [22:10:42] "Don't" [22:10:48] spagewmf: not much, really [22:11:28] ; scap; -> git deploy start; ; git deploy sync [22:11:30] !log spage Started syncing Wikimedia installation... : E3 deployment Extension:GettingStarted and its i18n msgs [22:11:37] same with any other of the "sync" parts [22:11:41] Logged the message, Master [22:11:45] otherwise everything is exactly the same [22:12:10] we're making good progress [22:12:17] at this pace I think friday is going to be a go [22:12:25] I'm excited. and we will be doing this for 1.21wmf8 (Jan 16) [22:12:30] ? [22:12:33] deploy then run... to the pub [22:12:48] if we think we're ready by tomorrow, then yes, 1.21wmf8 will be git-deploy [22:13:38] by the way... Ryan_Lane, will it be possible to touch files with git-deploy? [22:13:43] no [22:13:48] Ryan_Lane- /home/anomie/l10nupdate should be more or less right for the LocalisationUpdate extension updater [22:13:50] but with salt, yes [22:13:50] duh [22:14:12] anomie: cool [22:14:35] anomie: how much longer will you be around today? [22:14:38] cause touching resource files is a standard way to wrestle with RL [22:14:50] MaxSem: yes, that's been brought up [22:14:57] we'll likely have a script to touch files [22:15:01] that will call salt [22:15:05] Ryan_Lane- I have nothing much to do tonight, so I can stick around for a while [22:15:07] cool [22:15:34] anomie: ok. cool. 
cause I want to finish up this reporting stuff really quick, then we'll work on integrating these scripts with deployment [22:21:18] Ryan_Lane: for https://rt.wikimedia.org/Ticket/Display.html?id=4307 I want to add a couple of packages to Brewster. I've done this with custom packages before but not for existing, official ubuntu packages. Is the process any different? [22:26:09] New patchset: Ryan Lane; "Improve reporting for the deployment system" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43317 [22:26:56] andrewbogott: ummm [22:27:18] andrewbogott: official ubuntu packages? is this from a newer ubuntu version? [22:27:32] There's 12.04 packages (those are 10.04) [22:28:24] you'll need to rebuild the package for lucid [22:28:33] then push it in like normal [22:28:55] ah [22:28:59] these are already built? [22:29:15] Let me start over with a more general question :) [22:29:37] Is it sometimes the case that there are packages in the upstream ubuntu repo that are not available via brewster? [22:29:55] Or is everything duplicated and/or passed through by brewster? [22:30:19] if it's available in ubuntu repos, it's available everywhere [22:30:27] reedy@fenari:/home/wikipedia/common$ apt-cache search php-mail [22:30:27] php-mail - PHP PEAR module for sending email [22:30:27] php-mail-mime - PHP PEAR module for creating MIME messages [22:30:29] ok! Then my question doesn't make sense. [22:30:40] New review: Anomie; "You forgot to actually delete sync-common along with the other five." 
[operations/debs/wikimedia-task-appserver] (master) C: -1; - https://gerrit.wikimedia.org/r/43349 [22:30:49] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43317 [22:31:03] New patchset: Ryan Lane; "Update deploy-info to report on fetch tags" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43353 [22:31:28] New patchset: Reedy; "Removed scap related scripts" [operations/debs/wikimedia-task-appserver] (master) - https://gerrit.wikimedia.org/r/43356 [22:31:33] Change abandoned: Reedy; "(no reason)" [operations/debs/wikimedia-task-appserver] (master) - https://gerrit.wikimedia.org/r/43349 [22:31:46] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43353 [22:31:46] Damn lack of change id [22:31:51] <3 rebase button [22:35:36] scap had strange texvc error on spence: "Copying to spence..ok \n No entry for terminal type "unknown"; \n using dumb terminal settings. \n MediaWiki 1.21wmf7: Compiling texvc...failed \n Done ". [22:35:54] New patchset: Anomie; "Adjust configuration for l10n" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43358 [22:36:16] scap-2 does make texvc, so not too surprising [22:38:05] ok. let's see if I broke the deployment system [22:38:13] I'm glad I have this in beta [22:38:24] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43358 [22:39:12] -_- [22:39:39] well, I at least broke the deploy-info tool [22:39:53] It's not like anyone would notice if you broke one of the top 10 sites in the world :P [22:42:37] ugh. 
I'm dumb [22:43:09] New patchset: Ryan Lane; "Add arguments to the proper function" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43360 [22:43:10] to think I checked this like 3 times before pushing it in [22:43:30] I really need to clean up this util script [22:43:36] way too much duplicated code [22:44:27] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43360 [22:45:38] I wish I could tell puppet that I only want to run a subsection of a cataloge [22:45:43] catalog [22:47:16] But that would be sensible [22:57:09] !log spage Finished syncing Wikimedia installation... : E3 deployment Extension:GettingStarted and its i18n msgs [22:57:19] Logged the message, Master [22:58:40] New patchset: Ryan Lane; "Fix a number of errors in deploy-info" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43362 [22:59:17] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43362 [23:01:27] ugh. this minion report is getting long and ugly [23:01:32] it has too much information [23:01:52] hm. if I could make the page refresh in place I could make a lof of it headers [23:02:14] *lot [23:03:51] hm. looks like I broke the returner too [23:07:38] New patchset: awjrichards; "Enable forcing https for mobile login/account creation on testwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43364 [23:07:55] Change merged: awjrichards; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43364 [23:08:29] awjr- I think it's your deployment window right now? Please ping me when you're done, there's a regression in wmf7 that I'd like to backport the fix for. [23:08:50] hi anomie indeed it is, and i will. there is a good chance we'll eat up the whole 2 hours [23:08:57] anomie is it a quick fix? [23:09:32] awjr- Yeah, one line to one file. Changeset 43361. But I can wait. 
[23:10:23] !log awjrichards synchronized wmf-config/InitialiseSettings.php 'enable forced https for mobile login/account creation on testwiki' [23:10:33] Logged the message, Master [23:10:40] anomie we are testing on testwiki right now if you want to just get that change out [23:10:51] awjr- ok, I'll sneak it in [23:11:00] anomie :) lmk when you're done [23:11:05] just pleaes don't scap! [23:12:14] !log anomie synchronized php-1.21wmf7/includes/api/ApiQueryRevisions.php [23:12:19] awjr- done [23:12:24] anomie awesome :) [23:12:25] Logged the message, Master [23:12:56] New patchset: Ryan Lane; "More followups to Ia6e1f2d2" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43365 [23:13:30] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43365 [23:18:53] New patchset: MaxSem; "WIP: OSM module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/36222 [23:19:51] Change merged: Tim Starling; [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/43346 [23:19:54] PROBLEM - Puppet freshness on ms-be1003 is CRITICAL: Puppet has not run in the last 10 hours [23:21:51] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [23:22:58] New patchset: Ryan Lane; "Check the proper tag on fetch" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43368 [23:25:27] RECOVERY - Puppet freshness on ms-be1003 is OK: puppet ran at Thu Jan 10 23:25:11 UTC 2013 [23:27:04] New patchset: Ryan Lane; "Display information specific to fetch/checkout" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43369 [23:27:14] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43368 [23:28:24] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43369 [23:29:42] opsen, the mobile team is doing a semi-emergency deployment right now - just wanted to give a heads up that we'll 
likely need a mobile varnish cache flush in ~30-60mins [23:29:51] ok [23:32:10] New patchset: Ryan Lane; "Typo in variable in full report" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43370 [23:34:29] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43370 [23:35:29] !log awjrichards synchronized php-1.21wmf7/extensions/ZeroRatedMobileAccess/ 'Updating ZeroRatedMobileAccess to head of master' [23:35:40] Logged the message, Master [23:44:26] !log awjrichards synchronized php-1.21wmf6/extensions/ZeroRatedMobileAccess/ 'Updating ZeroRatedMobileAccess to master on php-1.21wmf6' [23:44:36] Logged the message, Master [23:59:34] New patchset: Ryan Lane; "Report on repo dependencies when reporting on repo" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43375