[00:00:09] mutante, yeah I just replied [00:00:24] thanks! [00:04:05] New patchset: Pyoungmeister; "migrating all pmtpa slaves to coredb-based roles" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43970 [00:07:40] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43970 [00:08:10] merging someone's scap script stuff [00:16:34] New patchset: Pyoungmeister; "migrating pmtpa otrs dbs to coredb-based role classes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43971 [00:18:00] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43971 [00:19:18] New patchset: Asher; "deploy sharded SqlBagOStuff across pc1-3 for parsercache" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43973 [00:19:54] AaronSchulz: can you review ^^ [00:30:55] New patchset: Reedy; "Update wikiversions-labs.dat to use same versions as production" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43974 [00:31:53] Change merged: Reedy; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/43974 [00:32:41] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [00:32:42] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: Puppet has not run in the last 10 hours [00:32:42] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 10 hours [00:32:42] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [00:32:42] New patchset: Pyoungmeister; "some cleanup of db boxes in site.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43975 [00:33:38] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43975 [00:35:24] the old intern proxy stuff is in labs now right? [00:35:32] i have an old ticket about getting it hardware i wanna resolve. [00:35:39] Ryan_Lane: perhaps you know? ^ [00:37:05] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43961 [00:38:45] New patchset: Pyoungmeister; "db67 (researchdb) to coredb-based role class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43976 [00:39:44] PROBLEM - Puppet freshness on gallium is CRITICAL: Puppet has not run in the last 10 hours [00:39:50] New patchset: Ryan Lane; "More threads for l10n cache rebuild" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43977 [00:40:21] RobH: no [00:40:25] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43976 [00:40:26] RobH: we're killing internproxy [00:40:28] it's on stat1 [00:40:35] oh..... [00:40:47] well, i still can kill ticket then [00:40:51] so yay? ;P [00:40:53] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43977 [00:40:54] heh [00:40:54] thx for info =] [00:40:57] yw [00:41:32] yay puppet finally completed on sanger [00:41:36] take that ! 
[00:42:25] Change merged: Asher; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43973 [00:43:45] !log asher synchronized wmf-config/CommonSettings.php 'deploying db parsercache, sharded and replicated' [00:43:55] Logged the message, Master [00:47:36] !log the production persistent parsercache is now on pc[1-3], replicated to pc100[1-3] [00:47:49] Logged the message, Master [00:48:43] New patchset: Pyoungmeister; "moving pmtpa s5 master to coredb-based role class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43978 [00:49:30] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43978 [00:49:55] merging some l10n stuff [00:54:37] New patchset: Pyoungmeister; "moving all other pmtpa masters to coredb-based role classes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43980 [00:56:51] New patchset: Asher; "helper script for mha to ensure write location consistency during an online master switch." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43981 [00:57:18] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43980 [00:57:41] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43981 [01:05:03] New patchset: Pyoungmeister; "migrating pmtpa es2 shard to coredb-based role class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43983 [01:08:00] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43983 [01:09:10] !log deleting atop.log.1* in /var/log on neon to free disk space [01:09:28] Logged the message, Master [01:09:53] RECOVERY - MySQL disk space on neon is OK: DISK OK [01:10:01] LeslieCarr: <-- neon ran out of disk .. atop logs use quite a bit of it [01:10:52] !log neon - changing logrotate config for atop to rotate 7 instead of 14 [01:11:20] cool [01:11:21] wow [01:11:40] oh well shit, the main partition is only 9G ? [01:11:49] why don't we resize that [01:12:36] Logged the message, Master [01:13:57] New patchset: Pyoungmeister; "migrating pmtpa es3 shard to coredb-based role class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43984 [01:14:41] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 326 seconds [01:16:00] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43984 [01:16:20] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 14 seconds [01:17:17] Ryan_Lane: (or whoever knows) are we using the Perl-based git-deploy or sartoris on Wednesday? [01:17:28] perl [01:17:56] Ryan_Lane: is it going to deploy to all eqiad api/app/job servers on weds? [01:18:00] yes [01:18:06] woo! [01:18:09] and tampa [01:18:09] right? [01:18:09] mutante: hrm, can you help me out ? i haven't done lvm extending - so i am on neon but lvdisplay seems to only show the swap partition as a volume ? [01:18:15] or just eqiad? [01:18:26] right [01:18:28] it's already deployed in eqiad [01:18:28] and i know that can't be right [01:19:06] it's probably testable in eqiad right now, if anyone knows how to do so :) [01:19:13] Ryan_Lane: will there be a way to deploy livehacks in case of uber-emergency? if no, can we make one? [01:19:17] yes [01:19:22] just check in locally [01:19:44] directly on the deploy host? (tin?) 
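The LVM extension that LeslieCarr works through with Ryan_Lane and binasher in the exchange that follows (vgdisplay to check free space, lvcreate for a new volume, mount it, copy /var/log over) reduces to roughly this sequence. A sketch only: the volume group name neon and the 16G size come from the conversation, while the ext4 filesystem, the temporary mount point and the cp flags are assumptions added here.

    vgdisplay neon                      # check free extents in the volume group
    lvcreate -L 16G -n logs neon        # new logical volume for /var/log, as discussed below
    mkfs.ext4 /dev/neon/logs            # assumed filesystem type; not stated in the log
    mkdir -p /mnt/newlog
    mount /dev/neon/logs /mnt/newlog    # mount it somewhere temporary
    cp -a /var/log/. /mnt/newlog/       # -a preserves owners, groups and permissions
    umount /mnt/newlog
    mount /dev/neon/logs /var/log       # remount where it belongs (and add it to /etc/fstab)
    service rsyslog restart             # restart anything still writing to the old, now-hidden path

Copying without preserving ownership is what later bites exim and rsyslog on neon (the "Permission denied: euid=105" errors further down), which is why the -a matters here.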
[01:19:47] yes [01:20:17] we should probably have a warning for people logging in as root [01:20:22] or anyone else who is knowledgeable about extending lvm [01:20:24] like "What the fuck are you doing?" [01:20:31] LeslieCarr: huh? [01:20:48] LeslieCarr: which drive are you trying to extend? [01:21:06] it's possible that it's not really using lvm [01:21:12] neon's main partition [01:21:16] most systems don't have / set as lvm [01:21:20] man, that would be crazy if only swap was using lvm [01:21:23] it really isn't using lvm [01:21:26] New patchset: Aaron Schulz; "Set sqlbagostuff log." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43986 [01:21:26] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:21:36] noticed that / was only 10G [01:21:53] see what space is left on the vg [01:21:55] vgdisplay [01:21:59] make another volume [01:22:06] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/43986 [01:22:09] mount whatever is eating a lot of space under the new one [01:22:10] and mount it somewhere [01:22:32] well, yeah, as binasher says, mount it somewhere, then move the data [01:22:39] then mount it where you want it to go [01:23:48] !log aaron synchronized wmf-config/InitialiseSettings.php 'Set sqlbagostuff log' [01:23:58] Logged the message, Master [01:24:35] so i'd do "lvcreate -L 16G -n logs neon " to create the volume (it's basically /var/logs/ taking up the space in the tiny tiny partition) [01:24:53] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.371 seconds [01:25:19] yep [01:25:25] wait [01:25:30] -n logs neon? [01:25:43] is neon the vg? [01:25:51] yeah, neon is the vg [01:25:57] ah ok. yes, then [01:27:45] back later [01:39:15] !log restarting neon [01:39:24] Logged the message, Mistress of the network gear. [01:42:27] binasher - are you still around ? [01:42:35] and would be able to help :) [01:43:04] so i remounted /var/log on a partition [01:43:10] however, it'snot having logs written to it [01:45:52] you need to restart rsyslog [01:46:00] thanks [01:46:01] and whatever else writes to /var/log, like apache [01:46:50] hrm, that doesn't seem to be doing it [01:47:12] is it possible processes could still be writing to the old /var/log path (which is now sort of "hidden") ? [01:47:34] lsof | grep var/log [01:47:42] not if they've been restarted [01:48:44] LeslieCarr: apache is writing to /var/log/apache2 on neon.. looks ok [01:49:53] hrm, though rsyslog has been restarted [01:50:11] ah [01:50:13] nm [01:50:22] icinga isn't running which should be spamming out the logs [01:51:30] yay [01:51:31] :) [01:51:39] LeslieCarr: 2013-01-15 01:50:57 1Tuvg1-00014M-Uv Cannot open main log file "/var/log/exim4/mainlog": Permission denied: euid=105 egid=109 [01:52:01] some owner/permissions may not have been preserved [01:52:13] or directories might have not been created [01:52:43] cool [01:53:10] in this case /var/log/exim4/mainlog is there but incorrect user, group, and perms [01:53:34] i'll fix that up [01:54:41] rsyslog can't write to most of the log files either [01:57:08] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:58:52] i'm going through and slowly fixing it [02:02:21] !log created a new lvm for /var/log on neon, copied files over, and restarted processes. 
some file permission issues and incorrect file owners may remain [02:02:32] Logged the message, Mistress of the network gear. [02:09:35] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.032 seconds [02:16:05] !log rebooting neon [02:16:15] Logged the message, Mistress of the network gear. [02:26:38] !log LocalisationUpdate completed (1.21wmf7) at Tue Jan 15 02:26:37 UTC 2013 [02:26:41] PROBLEM - MySQL disk space on db78 is CRITICAL: DISK CRITICAL - free space: /a 117078 MB (3% inode=99%): [02:26:47] Logged the message, Master [02:42:53] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:53:33] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.907 seconds [03:27:23] RECOVERY - MySQL disk space on db78 is OK: DISK OK [03:29:38] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:34:26] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [03:43:54] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.063 seconds [04:17:29] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:20:29] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [04:28:08] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.711 seconds [04:50:49] !log set pc[1-3] to replicate from pc100[1-3], full master/master all the way across the sky [04:50:59] Logged the message, Master [05:05:24] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 196 seconds [05:06:00] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 193 seconds [05:07:12] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds [05:07:39] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds [05:30:11] New patchset: Asher; "adding acct pkg to base::standard-packages" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43991 [05:30:48] any opposition to adding acct to base::standard-packages? https://gerrit.wikimedia.org/r/#/c/43991/ [05:40:21] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 184 seconds [05:42:09] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds [06:46:30] hey the Wikidata interwiki prefix is broken [07:19:37] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [07:41:39] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [08:33:39] !log gallium : installing liblua5.1-0-dev package. [08:33:50] Logged the message, Master [08:38:09] New patchset: Hashar; "(bug 43819) liblua5.1-0-dev on gallium" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43999 [08:40:12] New review: Hashar; "I have installed the package manually in production. That did fix the related bug." 
[operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/43999 [08:59:52] New review: Hashar; "Per faidon :" [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/43420 [09:10:45] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [09:10:45] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [09:16:11] New patchset: Nikerabbit; "ULS config update" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44004 [09:20:48] Change merged: Nikerabbit; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44004 [09:22:35] New patchset: Nikerabbit; "Oops, add missing comma" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44007 [09:22:47] Change merged: Nikerabbit; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44007 [09:27:18] New patchset: J; "install libjpeg-turbo-progs for rotate api" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44008 [09:29:39] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 207 seconds [09:29:48] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 213 seconds [09:31:30] New patchset: Nikerabbit; "Enable ULS and disable WebFonts/Narayam on Translate wikis" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44009 [09:34:06] New review: Siebrand; "Disable Narayam for meta, too." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/44009 [09:34:18] RECOVERY - Puppet freshness on gallium is OK: puppet ran at Tue Jan 15 09:34:01 UTC 2013 [09:35:32] New patchset: Nikerabbit; "Enable ULS and disable WebFonts/Narayam on Translate wikis" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44009 [09:38:21] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds [09:38:39] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds [09:42:34] !log nikerabbit synchronized php-1.21wmf7/extensions/UniversalLanguageSelector/ 'ULS to master' [09:42:44] Logged the message, Master [10:00:50] Change merged: Nikerabbit; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44009 [10:03:02] !log nikerabbit synchronized wmf-config/CommonSettings.php 'Updated ULS configuration' [10:03:11] Logged the message, Master [10:04:06] !log nikerabbit synchronized wmf-config/InitialiseSettings.php 'Enable ULS and disable WebFonts/Narayam on Translate wikis' [10:04:21] Logged the message, Master [10:04:45] PROBLEM - Puppet freshness on db1048 is CRITICAL: Puppet has not run in the last 10 hours [10:05:39] PROBLEM - Puppet freshness on db1007 is CRITICAL: Puppet has not run in the last 10 hours [10:05:39] PROBLEM - Puppet freshness on db1028 is CRITICAL: Puppet has not run in the last 10 hours [10:05:40] PROBLEM - Puppet freshness on db1041 is CRITICAL: Puppet has not run in the last 10 hours [10:05:40] PROBLEM - Puppet freshness on db1043 is CRITICAL: Puppet has not run in the last 10 hours [10:06:42] PROBLEM - Puppet freshness on db1024 is CRITICAL: Puppet has not run in the last 10 hours [10:06:58] Change restored: Hashar; "(no reason)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/36552 [10:07:00] New patchset: Hashar; "validating new Jenkns job (do not submit)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/36552 [10:07:45] PROBLEM - Puppet 
freshness on db1006 is CRITICAL: Puppet has not run in the last 10 hours [10:07:45] PROBLEM - Puppet freshness on db1038 is CRITICAL: Puppet has not run in the last 10 hours [10:07:45] PROBLEM - Puppet freshness on db1049 is CRITICAL: Puppet has not run in the last 10 hours [10:08:02] poor puppet [10:11:39] PROBLEM - Puppet freshness on db1001 is CRITICAL: Puppet has not run in the last 10 hours [10:11:40] PROBLEM - Puppet freshness on db1034 is CRITICAL: Puppet has not run in the last 10 hours [10:11:40] PROBLEM - Puppet freshness on db1005 is CRITICAL: Puppet has not run in the last 10 hours [10:15:42] PROBLEM - Puppet freshness on db1042 is CRITICAL: Puppet has not run in the last 10 hours [10:15:42] PROBLEM - Puppet freshness on db1018 is CRITICAL: Puppet has not run in the last 10 hours [10:15:43] PROBLEM - Puppet freshness on db1036 is CRITICAL: Puppet has not run in the last 10 hours [10:17:16] !log nikerabbit synchronized php-1.21wmf7/includes/AutoLoader.php [10:17:25] Logged the message, Master [10:17:39] PROBLEM - Puppet freshness on db1033 is CRITICAL: Puppet has not run in the last 10 hours [10:18:02] hmm [10:18:02] !log nikerabbit synchronized php-1.21wmf7/includes/Preferences.php [10:18:12] Logged the message, Master [10:18:28] !log nikerabbit synchronized php-1.21wmf7/includes/HTMLForm.php [10:18:37] Logged the message, Master [10:19:26] git is doing sooo many I/O [10:19:45] PROBLEM - Puppet freshness on db1017 is CRITICAL: Puppet has not run in the last 10 hours [10:21:42] PROBLEM - Puppet freshness on db1020 is CRITICAL: Puppet has not run in the last 10 hours [10:21:42] PROBLEM - Puppet freshness on db1021 is CRITICAL: Puppet has not run in the last 10 hours [10:21:42] PROBLEM - Puppet freshness on db1003 is CRITICAL: Puppet has not run in the last 10 hours [10:24:42] PROBLEM - Puppet freshness on db1027 is CRITICAL: Puppet has not run in the last 10 hours [10:24:42] PROBLEM - Puppet freshness on db1010 is CRITICAL: Puppet has not run in the last 10 hours [10:25:43] PROBLEM - Puppet freshness on db1019 is CRITICAL: Puppet has not run in the last 10 hours [10:25:43] PROBLEM - Puppet freshness on db1046 is CRITICAL: Puppet has not run in the last 10 hours [10:26:06] !log nikerabbit synchronized php-1.21wmf7/extensions/Translate 'Translate to master' [10:26:16] Logged the message, Master [10:27:13] PROBLEM - Puppet freshness on db1050 is CRITICAL: Puppet has not run in the last 10 hours [10:27:14] PROBLEM - Puppet freshness on db1022 is CRITICAL: Puppet has not run in the last 10 hours [10:28:25] PROBLEM - Puppet freshness on db1011 is CRITICAL: Puppet has not run in the last 10 hours [10:28:26] PROBLEM - Puppet freshness on db1035 is CRITICAL: Puppet has not run in the last 10 hours [10:29:28] PROBLEM - Puppet freshness on db1002 is CRITICAL: Puppet has not run in the last 10 hours [10:29:28] PROBLEM - Puppet freshness on db1009 is CRITICAL: Puppet has not run in the last 10 hours [10:31:25] PROBLEM - Puppet freshness on db1039 is CRITICAL: Puppet has not run in the last 10 hours [10:31:25] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours [10:33:31] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [10:33:31] PROBLEM - Puppet freshness on db1026 is CRITICAL: Puppet has not run in the last 10 hours [10:33:31] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: Puppet has not run in the last 10 hours [10:33:31] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 
10 hours [10:33:31] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [10:34:25] PROBLEM - Puppet freshness on db1004 is CRITICAL: Puppet has not run in the last 10 hours [10:40:43] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused [10:52:54] Swift is creating quite a lot of noise in the logs [10:53:07] hm? [10:53:10] what kind of noise? [10:54:59] Reedy: ^ [10:56:30] 401s [10:56:34] Invalid responses [10:56:42] can you copy one? [10:57:04] where is this? fluorine? [10:57:24] reedy@fenari:~$ tail -n 1000 /home/wikipedia/syslog/apache.log | grep -c -i swift [10:57:24] 294 [10:57:33] thanks. [10:57:40] http://p.defau.lt/?bzAKe7gIatXMyJtc_HMMkA [10:57:44] that's quite a lot indeed [10:57:46] I know there's usually some nearly all the time [10:57:52] Just seems there's more than usual [10:58:16] I think all but 3 of those lines are swift [11:06:49] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.089 second response time [11:09:49] PROBLEM - Apache HTTP on mw42 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:11:28] RECOVERY - Apache HTTP on mw42 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.053 second response time [11:19:06] it got better now [11:19:16] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 182 seconds [11:19:20] and I wasn't able to capture it via tcpdump [11:19:53] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 186 seconds [11:20:45] Aye [11:24:31] RECOVERY - MySQL Slave Delay on db1007 is OK: OK replication delay 0 seconds [11:25:08] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds [11:29:44] paravoid: have you wrote the README file for the puppet wikimedia module ? :-D [11:56:06] New patchset: Nikerabbit; "Fix available Translate tasks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44047 [12:03:23] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44047 [12:09:37] !log nikerabbit synchronized wmf-config/CommonSettings.php 'Translate config fix' [12:09:47] Logged the message, Master [12:38:17] RECOVERY - MySQL Slave Delay on db1047 is OK: OK replication delay 0 seconds [12:55:54] !log nikerabbit synchronized php-1.21wmf7/extensions/Translate/ 'Translate to master again' [12:56:03] Logged the message, Master [13:06:11] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 181 seconds [13:07:59] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds [13:13:23] PROBLEM - Apache HTTP on mw43 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:15:02] RECOVERY - Apache HTTP on mw43 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 2.413 second response time [13:35:53] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [14:22:07] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [16:01:32] !log Made a mistake in Zuul configuration. A slight outage while correcting it :/ [16:01:45] Logged the message, Master [16:02:47] New patchset: Silke Meyer; "Puppet files to install Wikidata repo / client on different labs instances" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42786 [16:04:39] New review: Silke Meyer; "Good point. My files/templates now have puppet disclaimers, too." 
[operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/42786 [16:42:13] New patchset: Cmjohnson; "Changing virt1007 to labsdb1003" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44066 [16:42:45] Hi. Is this the place where I can ask to have my contributions transfered? [16:42:55] Change merged: Cmjohnson; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44066 [16:48:59] PROBLEM - Packetloss_Average on oxygen is CRITICAL: CRITICAL: packet_loss_average is 9.53584724409 (gt 8.0) [16:49:14] XTSTech: From where? To where? [16:58:07] New patchset: Mark Bergsma; "Handle RADOSGW (Swift API) url rewriting for the basic case" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44067 [16:58:08] New patchset: Mark Bergsma; "Implement thumb 404 image scaler handling" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44068 [16:59:28] Reedy: From Wikipedia.. To.. Well, Wikipedia. [16:59:38] Either way, no [17:00:00] XTSTech: See http://en.wikipedia.org/wiki/Wikipedia:Changing_username [17:14:29] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 194 seconds [17:15:14] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 216 seconds [17:20:23] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [17:27:26] RECOVERY - MySQL Slave Delay on db1007 is OK: OK replication delay 0 seconds [17:28:38] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds [17:28:42] New review: Silke Meyer; ":/ Oops. Aha, the xml dump must not start with a puppet comment. Sorry." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/42786 [17:31:31] New patchset: Mark Bergsma; "Add timeline, math, score rewrites" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44072 [17:43:20] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [17:56:24] New patchset: Mark Bergsma; "Remove double slashes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44076 [18:07:20] PROBLEM - Puppet freshness on mw40 is CRITICAL: Puppet has not run in the last 10 hours [18:07:52] New patchset: Mark Bergsma; "Support project/language prefixes for math" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44077 [18:07:53] New patchset: Mark Bergsma; "Set CORS header in vcl_deliver" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44078 [18:26:18] New patchset: Jgreen; "remove fundraising-analytics apache virtualhost from aluminium" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44080 [18:26:46] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44080 [18:27:01] anomie: I'm having issues with l10n on beta [18:27:14] Fatal error: require(): Failed opening required '/srv/deployment/mediawiki/common/l10n-1.21wmf7/ExtensionMessages.php' [18:27:17] Ryan_Lane- In a meeting now, should be done in a few minutes [18:27:20] ok [18:28:56] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 190 seconds [18:29:05] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 195 seconds [18:36:12] !log stopping Apache on sanger [18:36:23] Logged the message, Master [18:37:47] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds [18:38:14] RECOVERY - MySQL Slave Delay on db1007 is OK: OK replication delay 0 seconds 
[18:44:08] Ryan_Lane- Ok. It looks like in the old scap world we just manually copied that file from wmfX to wmfX+1. But it looks like we can easily enough add an if ( file_exists( ... ) ) around the inclusion of that file so it can be generated from scratch; I think the point of it is just to have the extension messages available on all wikis even if the extension itself isn't actually enabled. But let's ask Reedy in case I'm missing something. [18:44:41] the thing I don't understand is that this works fine on tin [18:45:10] When I was messing with tin, I put the file in place and forgot to revisit the issue. [18:45:15] oh [18:45:18] I see [18:45:31] I wish hashar was here [18:45:36] because there's another issue [18:45:40] with mwversionsinuse --withdb [18:46:46] what's the issue with mwversionsinuse? [18:46:59] laner@deployment-bastion:~$ mwversionsinuse --withdb [18:46:59] 1.21wmf7=aawikibooks 1.21wmf6=abwiki [18:47:15] check that vs tin [18:47:35] there's wikiversions-labs.dat and wikiversions.dat [18:47:38] no clue how that works [18:47:45] but it doesn't seem to update the cdb properly [18:48:30] also, the apache configuration isn't pointing to the correct place [18:48:42] I don't really know how hashar is doing that [18:48:51] The command to update the CDB should be checking /etc/wikimedia-realm, and using wikiversions-labs.dat if it contains "labs" and wikiversions.dat otherwise. [18:49:01] it doesn't work properly [18:49:15] I don't know anything about the apache config [18:50:36] when you push in the fix to l10n, please add it to: puppet [18:50:40] err s/:// [18:50:51] puppet/modules/deployment/files/git-deploy/dependencies/l10n [18:52:13] there's already a: if [ ! -f "$MW_COMMON/l10n-$mwVerNum/ExtensionMessages.php" ] [18:52:17] touch $MW_COMMON/l10n-$mwVerNum/ExtensionMessages.php [18:52:22] well, it was -d [18:54:30] Oh. Myabe I did revisit it then. Except that it should be -e instead of -d, stupid copy-paste. Is l10nupdate-quick bombing out before getting to that line? [18:56:23] ryan. need to re-establish ldap replication between sfo-aaa1 and sanger [18:56:37] how do you know it's broken? [18:56:42] 'cause I did it. [18:56:50] how did you do it? [18:56:51] Ryan_Lane: it is broken [18:57:06] delete all replication relationships in sfo-aaa1. [18:57:10] they are setting up new google accounts and they dont get mail [18:57:13] -_- [18:57:19] why would you do such a thing? [18:57:37] * Ryan_Lane sighs [18:57:38] accidently, while trying to add a new replication relationship with sfo-intranet1 in order to provide redundancy for ldap internally. [18:57:52] you don't need redundancy internally :( [18:58:00] that's the point of sanger [18:58:27] this is a *really* bad day for this... [18:58:27] well, ok. but it is necessary to re-create the relationship now. [18:58:31] :( [18:58:33] yes [18:58:59] sanger has a problem with its admin connector as well [18:58:59] good morning ops! [18:59:07] this isn't going to be simple to fix [18:59:13] good morning (evening?) hashar [18:59:21] you may need to wait until after we do the eqiad switchover [18:59:21] LeslieCarr: evening indeed. [18:59:42] LeslieCarr: someone contacted us earlier to get a RIPE record updated. I asked him to mail network at rt.wikimedia.org [18:59:48] no idea if that mail works though [18:59:55] oh [18:59:57] i have no idea [19:00:07] noc@wikimedia.org is a good address [19:00:13] Yossie: if http://www.blacksteel.com/ is your site congrats :-] [19:00:16] what was the relevant info ? [19:00:25] hashar, thanks, it is. 
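Pieced together from the fragments quoted above, the guard anomie describes for l10nupdate-quick (test with -e rather than -d or -f) and the realm check behind the wikiversions selection look roughly like this. A sketch, not the actual scripts: $MW_COMMON and $mwVerNum are the variables quoted in the log, and the grep-based test of /etc/wikimedia-realm is an assumption based on anomie's description.

    # create an empty placeholder so the require() of ExtensionMessages.php
    # doesn't fatal before the real file has been generated
    if [ ! -e "$MW_COMMON/l10n-$mwVerNum/ExtensionMessages.php" ]; then
        touch "$MW_COMMON/l10n-$mwVerNum/ExtensionMessages.php"
    fi

    # pick the wikiversions file per realm, as described above
    if grep -q labs /etc/wikimedia-realm 2>/dev/null; then
        wikiversions=wikiversions-labs.dat
    else
        wikiversions=wikiversions.dat
    fi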
[19:00:26] greping my browser history right now [19:00:39] we tried to temporarily add aliases to mchenry to forward USER: USER@corp.wikimedia.org - shouldn't that have worked? [19:00:41] Yossie: I love the 1990's look'n feel. Goood olddays [19:01:45] i had totally forgotten about all the codes [19:01:52] :) [19:01:58] Yossie: no [19:02:18] Yossie: the point of the replication is so that our email system knows about the ldap entries [19:02:39] either way, you're kind of out of luck until we finish the switchover [19:02:40] LeslieCarr: so the request was to change the maintainer for RT744-RIPE (that is River Tarnell, who used to be a contractor for WMDE iirc) [19:02:52] oh god yes need to update that [19:03:01] LeslieCarr: it is still maintained by Wikimedia. He asked for another maintainer and of course I can't find it :-] [19:03:07] ok, i need to get my ripe updating to not be sucky [19:03:09] ryan: I undestand that. so that it can forward emails to folks in @corp.wikimedia.org to the mx for that domain which is google apps. I would think adding aliases would do that, albeit manually.. [19:03:26] I'm not going to break our email system the day before we switch datacenters [19:03:39] Ryan_Lane: that's next week [19:03:43] is it? [19:03:48] hashar, Leslie: network@rt per mail works [19:03:54] well tomorrow is the deployment system switch [19:04:01] the dc switch is next week [19:04:06] i know because i also have jury duty [19:04:08] should email about that [19:04:09] I was wondering why we were doing both at the same time :D [19:04:15] mutante: nice. Thanks for the confirmation :-] [19:04:25] LeslieCarr: anyway River's nickname was felicity [19:04:34] either way, I'm working on something that's happening tomorrow [19:04:41] Ryan_Lane- Found the problem with mwversionsinuse: Ic64aa75a broke it. I'll commit a fix in a minute. [19:04:53] and this is likely to eat most of my day [19:05:06] after the deployment system switchover I'll help [19:05:46] anomie: thanks :) [19:06:09] ryan: thanks. appreciated.. [19:08:38] Yossie: it's possible if you have all the correct info that you could set it back up yourself [19:08:47] absoluately. [19:08:48] New patchset: Anomie; "Fix mwversionsinuse on labs" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/44082 [19:08:56] to be honest, I don't know it [19:09:01] the password, that is [19:09:16] everything except the hostname and password should be default, though [19:09:23] so, if you know the two of those it should work [19:10:00] !g 44082 |Ryan_Lane [19:10:00] Ryan_Lane: https://gerrit.wikimedia.org/r/#q,44082,n,z [19:10:16] need admin DN / password and, I think cn=admin as well - to set up replication. [19:10:44] they are all the same [19:10:48] Yossie: i think i have that here if you want to come on over ? [19:10:54] assuming fenari is accurate [19:11:07] LeslieCarr: it probably isn't [19:11:18] then nobody knows [19:11:39] which file? [19:12:00] ugh. I surely hope it isn't that, though it could be [19:12:01] :D [19:12:22] yeah [19:12:26] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [19:12:26] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [19:12:45] LeslieCarr: felicity is online :-] [19:12:56] LeslieCarr: he is going to /msg you about his ripe record. [19:13:04] ok [19:13:11] cool [19:13:21] bonjour felicity ! [19:13:26] \o/ [19:13:30] hashar told me to ask here :-) can someone please update RT744-RIPE to be mnt-bt TORCHBOX-MNT? 
(my new employer?) [19:13:35] hi leslie ;-) [19:13:44] it's great to finally meet you [19:13:47] felicity: this is LeslieCarr one of the WMF network engineer [19:14:15] LeslieCarr: this is felicity, who used to want to rewrite mediawiki Haskell (or was it java) ? :-] [19:14:25] anyway, a long time volunteer from the old school era [19:14:27] python! [19:14:37] although i actually hate python, it seemed like the most sensible option [19:14:42] :) [19:14:44] we have more and more python stuff on the cluster nowadays [19:15:23] !log authdns-update [19:15:24] how many servers now? i think when i started we had 7 ;) [19:15:35] Logged the message, RobH [19:16:08] wow, i don't have an exact count but i think something closer to 600 ? [19:17:32] over [19:17:38] 750+. [19:17:50] i think is what we had in last fundraiser banner. [19:18:29] sbernardin: Are you on site? [19:18:35] colby.mgmt isnt functioning. [19:18:40] and i need it online for an install [19:19:15] !g 44082 |hashar, review this please? Ryan seems busy. [19:19:16] hashar, review this please? Ryan seems busy.: https://gerrit.wikimedia.org/r/#q,44082,n,z [19:19:39] Change merged: Ryan Lane; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/44082 [19:19:42] hehe [19:19:44] :) [19:20:14] anomie: good catch! [19:20:19] !g Ic64aa75a [19:20:19] https://gerrit.wikimedia.org/r/#q,Ic64aa75a,n,z [19:20:48] New review: Andrew Bogott; "> the xml dump must not start with a puppet comment" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/42786 [19:21:49] Ryan_Lane- Looks like the l10n git-deploy script should work without any additional fix. And after running that, ExtensionMessages.php will exist. [19:21:50] did someone already pull that change on beta? [19:21:56] I just did, yeah [19:22:00] without git-deploy? :) [19:22:04] oops. [19:22:07] tsk tsk ;) [19:22:40] so. here's how to fix that [19:22:46] * anomie was just about to ask how to fix that [19:22:47] I'm going to do it and I'll walk through the steps [19:22:51] git tag [19:22:58] find the last deployed tag [19:23:24] in this case I had already done git deploy start and found out that it was already pulled [19:23:31] how do you know which tags are deployed? [19:23:33] so, it's not the very last one, but the one before that [19:23:45] in reality anything before it is fine [19:24:02] we just need to have git deploy think we're in another state [19:24:09] git checkout common-20130115-010158 [19:24:29] hm [19:24:38] I say that, but this obviously isn't working [19:27:01] Ryan_Lane: no luck on the passwords [19:27:08] tried the one on fenari and all the ones in the private repo [19:30:41] New patchset: Anomie; "Fix test in l10nupdate-quick" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44084 [19:30:57] hmm, actually this is also probably a good time to ask if someone could change river@wikimedia.org to forward to /dev/null [19:31:02] as it pretty much only gets spam nowadays [19:31:51] LeslieCarr: ok. I think I have it [19:31:55] unless they changed it [19:32:02] ok [19:32:11] yes, we can do that witht he alias [19:32:19] though if you want to come back and do more work…. 
;) [19:32:26] we'd be happy to have you [19:32:32] i've heard good things [19:33:05] i actually wouldn't mind that, but i've been a bit starved for time recently [19:33:44] :) [19:34:13] felicity: let me check that for you..see query [19:36:56] RoanKattouw: So now eqiad has 3 parsoid nodes (temp, as the high performacne ones wont arrive until 22nd) [19:37:06] tampa has 5 but they are horribly underutlizied [19:37:09] RobH: Oh cool [19:37:10] so figured 3 in eqiad is good [19:37:17] Yeah the utilization isn't exactly crushing right now [19:37:18] but i have no caching servers allocated, we need for eqiad as well yes? [19:37:23] Yup [19:37:28] two seems reasonable [19:37:44] Supposedly VisualEditor is gonna be the default editor for Wikipedias in June [19:37:53] so once that happens the Parsoid utilization should go up a bit :) [19:38:05] all of the worker nodes will be replaced with higher performance dual cpu nodes by then [19:38:11] Good [19:38:15] just tyring to ensure we are good for next week swapver [19:38:25] Oh, right, of course [19:38:28] i'll spin up two servers to act as caching proxies today [19:38:39] will ping you if i need help on anything, figured it out well enough yesterday [19:38:58] As long as eqiad has the same setup, it should be fine. I'll have to test it, and configure MW to use it when running out of eqiad [19:39:21] yep, will let you know when OS is up so you can do your thing (well os and initial puppet run) [19:39:31] OK good [19:39:34] PROBLEM - Memcached on virt0 is CRITICAL: Connection refused [19:40:10] anomie: Hey we have a mechanism for setting a config variable to a different value based on whether we're in pmtpa or eqiad, right? How does that work? [19:41:16] RobH: Also, someone said something about paging and I would have to give some information to Leslie? [19:41:19] RoanKattouw- $wmfDatacenter is set to "pmtpa" in pmtpa and "eqiad" in eqiad. It works just like $wmfRealm for doing production/labs differences. [19:41:31] ah yeah [19:41:42] you guys are going to be paged about breakages ? [19:42:00] I should be paged for Parsoid LVS, yeah [19:42:13] open a ticket ( you can put the information in an email or non ticket) - i'll need email, phone, working hours if you're doing a more shifty type of thing [19:42:15] RoanKattouw: LeslieCarr #4318: add Roan to Nagios monitoring for Parsoid boxes [19:42:21] oh [19:42:22] hehe [19:42:38] oh and cell phone service provider [19:42:38] Oh, there's a ticket? [19:42:40] (for the gateway) [19:42:54] I'll put it in [19:42:59] yes, i created it after we talked the other day Roan [19:43:05] OK [19:43:08] Thanks [19:43:10] np [19:44:34] anomie: that did indeed fix it for beta [19:44:36] anomie: thanks [19:44:50] RoanKattouw: is there a simple web based thing we can check on visual editor ? [19:44:55] for watchmouse (external monitoring) [19:44:57] anomie: so, easiest way to fix the git pull issue is to reset to an older change [19:45:04] New patchset: Anomie; "Fix comment on getRealmSpecificFilename" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44085 [19:45:06] anomie: then to do git deploy start [19:45:07] RoanKattouw: yea, we can do Nagios and Watchmouse seperately [19:45:08] LeslieCarr: For Parsoid? Yes, the Nagios checks already do that [19:45:13] then switch back to the newest change [19:45:21] <^demon> How do you disable puppet on a host? 
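The recovery Ryan_Lane and anomie settle on above, for a repository on the deploy host that was updated with a plain git pull instead of through git-deploy, comes down to two commands. A sketch of what was quoted in the conversation; the repository path is only an example of where the hand-pulled checkout might live.

    cd /srv/deployment/mediawiki/common   # example path: whichever repo was pulled by hand
    git deploy start                      # open a deployment
    git deploy --force sync               # force the sync so the deployed state matches the working tree

The longer route mentioned first (git tag to list tags, git checkout an earlier deployed tag such as common-20130115-010158, then git deploy start and switch back) also works, but as noted above it isn't actually necessary.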
[19:45:22] <^demon> (in labs, fwiw) [19:45:27] The Nagios check is literally "send an HTTP request to / on the Parsoid box" [19:45:44] For Watchmouse, I suppose you'd want http://celsus.wikimedia.org/en/Main_Page [19:45:45] ^demon: puppet agent --disable [19:45:47] are those publically ip'ed though ? [19:45:47] ok [19:45:50] Ryan_Lane- Does it matter which previous change, or will HEAD^1 work? [19:45:55] so the not currently available [19:45:55] hehe [19:45:56] (where celsus is the name of the Varnish proxy in the appropriate location) [19:45:58] <^demon> mutante: ty. [19:46:00] anomie: that'll work [19:46:05] And sadly yes, the Parsoid Varnishes are publicly IPed [19:46:21] I would have slightly preferred them not to be, but it's fine [19:46:22] anomie: though that isn't even actually necessary [19:46:36] anomie: you could also do a git deploy --force sync [19:46:37] And once the firehose is unleashed on them, it's moot I suppose [19:47:08] Ryan_Lane- So the easiest fix is "git deploy start; git deploy --force sync"? [19:47:12] yes [19:47:12] ok, we'll make a separate ticket [19:47:33] LeslieCarr: I just put my phone info in the t icket [19:48:51] RoanKattouw: I dunno about the paging stuff sorry [19:48:58] about adding you that is, seems they do though [19:49:36] RobH: Already handled, sorry [19:54:16] RECOVERY - MySQL Slave Delay on db1043 is OK: OK replication delay 0 seconds [19:54:44] !log authdns-update [19:54:54] Logged the message, RobH [19:59:58] what's a "badboy key" :) [20:00:18] related to monitoring somehow, watchmouse would let me add them to user accounts [20:00:56] http://www.badboysoftware.biz/docs/keyinput.htm hmm..ok [20:05:31] RECOVERY - Memcached on virt0 is OK: TCP OK - 0.021 second response time on port 11000 [20:05:40] PROBLEM - Puppet freshness on db1048 is CRITICAL: Puppet has not run in the last 10 hours [20:06:43] PROBLEM - Puppet freshness on db1007 is CRITICAL: Puppet has not run in the last 10 hours [20:06:43] PROBLEM - Puppet freshness on db1041 is CRITICAL: Puppet has not run in the last 10 hours [20:06:43] PROBLEM - Puppet freshness on db1028 is CRITICAL: Puppet has not run in the last 10 hours [20:06:43] PROBLEM - Puppet freshness on db1043 is CRITICAL: Puppet has not run in the last 10 hours [20:07:40] New patchset: Jgreen; "remove faulkner database from fundraisingdb dumps" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44086 [20:07:46] PROBLEM - Puppet freshness on db1024 is CRITICAL: Puppet has not run in the last 10 hours [20:08:23] RECOVERY - Packetloss_Average on oxygen is OK: OK: packet_loss_average is 3.71913785714 [20:08:24] db1043? 
there's certainly more of those than i remember [20:08:40] PROBLEM - Puppet freshness on db1038 is CRITICAL: Puppet has not run in the last 10 hours [20:08:40] PROBLEM - Puppet freshness on db1049 is CRITICAL: Puppet has not run in the last 10 hours [20:08:41] PROBLEM - Puppet freshness on db1006 is CRITICAL: Puppet has not run in the last 10 hours [20:12:43] PROBLEM - Puppet freshness on db1001 is CRITICAL: Puppet has not run in the last 10 hours [20:12:43] PROBLEM - Puppet freshness on db1034 is CRITICAL: Puppet has not run in the last 10 hours [20:12:44] PROBLEM - Puppet freshness on db1005 is CRITICAL: Puppet has not run in the last 10 hours [20:13:14] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44086 [20:15:58] uuuuhhhh [20:15:59] huh [20:16:46] PROBLEM - Puppet freshness on db1036 is CRITICAL: Puppet has not run in the last 10 hours [20:16:46] PROBLEM - Puppet freshness on db1018 is CRITICAL: Puppet has not run in the last 10 hours [20:16:47] PROBLEM - Puppet freshness on db1042 is CRITICAL: Puppet has not run in the last 10 hours [20:18:43] PROBLEM - Puppet freshness on db1033 is CRITICAL: Puppet has not run in the last 10 hours [20:20:40] PROBLEM - Puppet freshness on db1017 is CRITICAL: Puppet has not run in the last 10 hours [20:20:50] !log stopping squid3 on sq48 [20:20:55] (yes, squid3) [20:21:00] Logged the message, Master [20:22:19] PROBLEM - Backend Squid HTTP on sq48 is CRITICAL: Connection refused [20:22:46] PROBLEM - Puppet freshness on db1003 is CRITICAL: Puppet has not run in the last 10 hours [20:22:46] PROBLEM - Puppet freshness on db1020 is CRITICAL: Puppet has not run in the last 10 hours [20:22:47] PROBLEM - Puppet freshness on db1021 is CRITICAL: Puppet has not run in the last 10 hours [20:24:52] PROBLEM - MySQL Slave Delay on db78 is CRITICAL: CRIT replication delay 3586000 seconds [20:25:46] PROBLEM - Puppet freshness on db1010 is CRITICAL: Puppet has not run in the last 10 hours [20:25:47] PROBLEM - Puppet freshness on db1027 is CRITICAL: Puppet has not run in the last 10 hours [20:26:40] PROBLEM - Puppet freshness on db1019 is CRITICAL: Puppet has not run in the last 10 hours [20:26:40] PROBLEM - Puppet freshness on db1046 is CRITICAL: Puppet has not run in the last 10 hours [20:27:38] New patchset: Raimond Spekking; "Change SUL icon for Wikivoyage to the current logo" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44091 [20:28:46] PROBLEM - Puppet freshness on db1022 is CRITICAL: Puppet has not run in the last 10 hours [20:28:47] PROBLEM - Puppet freshness on db1050 is CRITICAL: Puppet has not run in the last 10 hours [20:29:40] PROBLEM - Puppet freshness on db1035 is CRITICAL: Puppet has not run in the last 10 hours [20:29:40] PROBLEM - Puppet freshness on db1011 is CRITICAL: Puppet has not run in the last 10 hours [20:30:43] PROBLEM - Puppet freshness on db1002 is CRITICAL: Puppet has not run in the last 10 hours [20:30:43] PROBLEM - Puppet freshness on db1009 is CRITICAL: Puppet has not run in the last 10 hours [20:32:44] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours [20:32:45] PROBLEM - Puppet freshness on db1039 is CRITICAL: Puppet has not run in the last 10 hours [20:34:59] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [20:34:59] PROBLEM - Puppet freshness on db1026 is CRITICAL: Puppet has not run in the last 10 hours [20:35:00] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in 
the last 10 hours [20:35:00] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: Puppet has not run in the last 10 hours [20:35:00] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 10 hours [20:36:02] PROBLEM - Puppet freshness on db1004 is CRITICAL: Puppet has not run in the last 10 hours [20:38:08] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 193 seconds [20:38:35] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 211 seconds [20:48:38] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds [20:49:23] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds [20:51:29] PROBLEM - MySQL Slave Delay on db1020 is CRITICAL: CRIT replication delay 190 seconds [20:51:47] PROBLEM - MySQL Replication Heartbeat on db33 is CRITICAL: CRIT replication delay 195 seconds [20:52:14] PROBLEM - MySQL Slave Delay on db33 is CRITICAL: CRIT replication delay 212 seconds [20:52:23] PROBLEM - MySQL Replication Heartbeat on db1020 is CRITICAL: CRIT replication delay 222 seconds [20:53:02] hm. why does the l10nupdate script work when I run it manually and not when called from the sync script? [20:56:04] Ryan_Lane- Is it being passed something odd for $1? [20:56:29] it should be passing the slot [20:56:51] I modified the script to take slots and turn them into version numbers [20:57:49] ah [20:57:51] slot0 is working [20:57:54] slot1 is failing [20:58:23] A copy of your installation's LocalSettings.php [20:58:23] must exist and be readable in the source directory. [20:58:41] New patchset: Jgreen; "critical=>true for fundraisingdb nagios replication test" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44096 [20:58:58] good reason [20:59:17] RECOVERY - MySQL Replication Heartbeat on db1020 is OK: OK replication delay 0 seconds [20:59:22] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44096 [20:59:32] Anybody knows if the "Eqiad Migration Countdown Meeting" will be on Google Hangout, or SIP only? [20:59:53] andre__- I'd guess hangout unless it's down for SF again. Waiting on someone to post the hangout link, or tell us otherwise. [21:00:00] unless hangouts are down [21:00:09] I heard that might be the ase [21:00:11] RECOVERY - MySQL Slave Delay on db1020 is OK: OK replication delay 16 seconds [21:00:11] *case [21:00:29] RECOVERY - MySQL Replication Heartbeat on db33 is OK: OK replication delay 13 seconds [21:00:56] RECOVERY - MySQL Slave Delay on db33 is OK: OK replication delay 5 seconds [21:00:59] Ryan_Lane- Last I heard (about an hour ago) hangouts were back up [21:01:03] cool [21:01:05] hashar: around? [21:01:16] Ryan_Lane: I am in a conf call sorry [21:01:18] hashar: can you change the apache links in beta to use the deployment locations? [21:01:20] oh that is ryan [21:01:24] :D [21:01:54] screw you hang out [21:02:09] It's taking too long to connect you to this hangout. Try again in a few minutes. 
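A quick way to narrow down the slot1 failure above (the rebuild complaining that LocalSettings.php must exist and be readable in the source directory) would be to compare what the two slots actually contain. Everything here is hypothetical: the slot layout is only inferred from the later remark that the php-1.21wmf6 and php-1.21wmf7 symlinks point at slot1 and slot0, and the paths are illustrative.

    ls -l /data/project/apache/common-local/php-1.21wmf*    # which version symlinks into which slot
    for slot in slot0 slot1; do                              # hypothetical slot directories
        test -e "/srv/deployment/mediawiki/$slot/LocalSettings.php" \
            && echo "$slot: LocalSettings.php present" \
            || echo "$slot: LocalSettings.php missing"
    done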
[21:02:27] yeah [21:02:32] falling back to audio [21:02:33] l10n is working in beta now [21:02:48] I'm doing a deploy of slot1 with l10n rebuilding [21:03:11] I need to add some compute nodes to labs [21:03:49] it's hard to believe we're already close to capacity on 4 nodes [21:13:03] New patchset: Pyoungmeister; "fix regex for dbs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44098 [21:15:56] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44098 [21:17:39] !log cerium & titanium being installed as parsoid caching servers, ignore any alerts for now. [21:17:50] Logged the message, RobH [21:17:54] hashar: so, yeah, I think we're ready to start testing new deployment on beta [21:18:11] RECOVERY - Puppet freshness on db1009 is OK: puppet ran at Tue Jan 15 21:17:45 UTC 2013 [21:18:15] l10n is pushed out and working on deploy [21:19:05] RECOVERY - Puppet freshness on db1001 is OK: puppet ran at Tue Jan 15 21:18:52 UTC 2013 [21:19:24] New patchset: Jgreen; "debug fundraisingdb's vs site.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44100 [21:19:41] RECOVERY - Puppet freshness on db1003 is OK: puppet ran at Tue Jan 15 21:19:21 UTC 2013 [21:19:41] RECOVERY - Puppet freshness on db1049 is OK: puppet ran at Tue Jan 15 21:19:24 UTC 2013 [21:19:47] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44100 [21:19:57] anomie: hm. it seems that l10nupdate-quick is rebuilding all message caches [21:20:05] anomie: this is going to be problematic [21:20:36] RECOVERY - Puppet freshness on db1017 is OK: puppet ran at Tue Jan 15 21:20:10 UTC 2013 [21:20:37] cause every small push that updates messages can cause 750MB of deployment [21:21:11] RECOVERY - Puppet freshness on db1042 is OK: puppet ran at Tue Jan 15 21:20:47 UTC 2013 [21:21:18] it may be fine when we switch to bittorrent, though [21:21:20] RECOVERY - Puppet freshness on db1005 is OK: puppet ran at Tue Jan 15 21:21:07 UTC 2013 [21:21:28] Ryan_Lane- You can take out the '! -z "$1" -a' bit to force it to only work with whatever matches the slot passed in $1 [21:21:39] well, that's the thing [21:21:43] it does match that [21:21:55] Oh, I misunderstood [21:22:01] but any extension update could cause every language to be rebuilt [21:22:10] I think it's only supposed to do english [21:22:27] Ryan_Lane: sorry been busy with some Jenkins job [21:22:37] * Ryan_Lane nods [21:22:41] RECOVERY - Puppet freshness on db1021 is OK: puppet ran at Tue Jan 15 21:22:32 UTC 2013 [21:23:31] It will rebuild anything whose source file changed since the last run. [21:23:49] is this also how it works with scap currently? [21:23:56] yes [21:23:58] ah [21:23:59] ok [21:24:17] ok. I'll put more effort into getting bittorrent working [21:24:54] BTW, I keep mentioning !g 42777 which will make the cron job (if that's even set up yet) regenerate fewer languages. [21:25:21] maybe we're talking about different things [21:25:28] I'm not talking about the cron [21:25:37] I know [21:25:40] ah. 
ok [21:25:41] RECOVERY - Puppet freshness on db1028 is OK: puppet ran at Tue Jan 15 21:25:06 UTC 2013 [21:25:41] RECOVERY - Puppet freshness on db1038 is OK: puppet ran at Tue Jan 15 21:25:34 UTC 2013 [21:26:08] RECOVERY - Puppet freshness on db1020 is OK: puppet ran at Tue Jan 15 21:25:42 UTC 2013 [21:26:08] RECOVERY - Puppet freshness on db1048 is OK: puppet ran at Tue Jan 15 21:25:54 UTC 2013 [21:27:03] New patchset: Jgreen; "Revert "debug fundraisingdb's vs site.pp" to see if notpeter's fix fixed the issue I'm debugging" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44103 [21:27:18] Ryan_Lane: so /data/project/apache/common-local has symbolic links for php-1.21wmf6 to slot1 and wmf7 to slot0 [21:27:26] Ryan_Lane- If you run the script a second time (no changes to the source message files), it *should* make no changes. If it does make changes, I'd have to fix that. [21:27:33] anomie: it doesn't [21:27:34] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44103 [21:27:35] did we ever figure out what to do about mariadb ? the packages are preventing other needed packages from installing (trying to push out neon) [21:27:39] Ryan_Lane: so I guess the wikis that use the wmf branches are already running out of the git deployed code. [21:27:39] libmysqlclient-dev : Depends: libmysqlclient18 (= 5.5.28-0ubuntu0.12.04.3) but 5.5.28-mariadb-wmf201212041~precise is to be installed [21:27:40] E: Unable to correct problems, you have held broken packages. [21:28:02] LeslieCarr: i'll delete the rest [21:28:35] ok [21:28:47] RECOVERY - Puppet freshness on db1018 is OK: puppet ran at Tue Jan 15 21:28:19 UTC 2013 [21:28:49] hashar: they aren't I don't think [21:30:08] RECOVERY - Puppet freshness on db1024 is OK: puppet ran at Tue Jan 15 21:29:46 UTC 2013 [21:30:09] RECOVERY - Puppet freshness on db1034 is OK: puppet ran at Tue Jan 15 21:29:57 UTC 2013 [21:32:20] Ryan_Lane: at least I got blank page on http://en.wiktionary.beta.wmflabs.org :-] [21:32:41] RECOVERY - Puppet freshness on db1043 is OK: puppet ran at Tue Jan 15 21:32:38 UTC 2013 [21:32:46] supposed to run php-1.21wmf6 [21:33:20] ..8........<13>Jan 15 21:32:57 i-0000031a apache2: PHP Warning: require(MULTIVER_COMMON/wmf-config/wgConf.php) [function.require]: failed to open stream: No such file or directory in /srv/deployment/mediawiki/common/wmf-config/CommonSettings.php on line 148 [21:33:36] MULTIVER_COMMON is not set apparently [21:33:44] RECOVERY - Puppet freshness on db1035 is OK: puppet ran at Tue Jan 15 21:33:18 UTC 2013 [21:33:48] New patchset: RobH; "cerium & titanium deploying as parsoid caching" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44132 [21:35:41] RECOVERY - Puppet freshness on db1007 is OK: puppet ran at Tue Jan 15 21:35:27 UTC 2013 [21:35:41] RECOVERY - Puppet freshness on db1002 is OK: puppet ran at Tue Jan 15 21:35:33 UTC 2013 [21:37:20] New review: RobH; "this isn't self review, I'm letting another personality chime in." 
[operations/puppet] (production); V: 2 C: 2; - https://gerrit.wikimedia.org/r/44132 [21:37:21] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44132 [21:37:38] RECOVERY - Puppet freshness on db1019 is OK: puppet ran at Tue Jan 15 21:37:30 UTC 2013 [21:40:11] RECOVERY - Puppet freshness on db1006 is OK: puppet ran at Tue Jan 15 21:39:38 UTC 2013 [21:41:42] RECOVERY - Puppet freshness on db1033 is OK: puppet ran at Tue Jan 15 21:41:19 UTC 2013 [21:41:50] RECOVERY - Puppet freshness on db1027 is OK: puppet ran at Tue Jan 15 21:41:38 UTC 2013 [21:41:50] RECOVERY - Puppet freshness on db1050 is OK: puppet ran at Tue Jan 15 21:41:38 UTC 2013 [21:41:51] Ryan_Lane: Disk space on the Apaches looks good to me [21:42:04] RoanKattouw_away: same [21:42:44] RECOVERY - Puppet freshness on db1046 is OK: puppet ran at Tue Jan 15 21:42:09 UTC 2013 [21:42:44] RECOVERY - Puppet freshness on db1026 is OK: puppet ran at Tue Jan 15 21:42:33 UTC 2013 [21:43:56] RECOVERY - Puppet freshness on db1039 is OK: puppet ran at Tue Jan 15 21:43:40 UTC 2013 [21:44:05] RECOVERY - Puppet freshness on db1022 is OK: puppet ran at Tue Jan 15 21:43:55 UTC 2013 [21:44:14] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Tue Jan 15 21:44:07 UTC 2013 [21:44:45] hashar: http://en.wikipedia.beta.wmflabs.org/wiki/Special:Version doesn't look like it's running against the new deployment location [21:44:58] but I've switched all wikis to use the new system [21:45:02] well to use the slots [21:45:26] that one isn't [21:45:27] $ grep wmf wikiversions-labs.dat [21:45:28] enwiktionary php-1.21wmf6 [21:45:28] enwikibooks php-1.21wmf7 [21:45:35] those do :-] [21:45:38] and blank page :( [21:45:44] wair [21:45:44] RECOVERY - Puppet freshness on db1036 is OK: puppet ran at Tue Jan 15 21:45:37 UTC 2013 [21:45:45] wait [21:45:50] hashar: where are you reading that file? [21:45:54] the reason I only switched two is to keep enwiki to master so feature team could still use it [21:46:01] form /data/project/apache/common-local [21:46:04] we're switching them all [21:46:14] why is it pulling from there? [21:46:27] cause that is the DocumentRoot for the apaches [21:46:29] hashar: for now we're going to switch everything [21:46:38] RECOVERY - Puppet freshness on db1010 is OK: puppet ran at Tue Jan 15 21:46:26 UTC 2013 [21:46:41] but we can just rename /data/project/apache/common-local to something else [21:46:48] yeah [21:46:54] as tim mentioned [21:46:56] and have /data/project/apache/common-local to be a symbolic link to /srv/deployment/whatever [21:47:06] mv it and link it to /srv... [21:47:14] RECOVERY - Puppet freshness on db1004 is OK: puppet ran at Tue Jan 15 21:46:53 UTC 2013 [21:47:23] RECOVERY - Puppet freshness on db1041 is OK: puppet ran at Tue Jan 15 21:47:11 UTC 2013 [21:47:32] we're likely getting a blank page because the old one doesn't know about the other versions [21:47:41] RECOVERY - Puppet freshness on db1011 is OK: puppet ran at Tue Jan 15 21:47:30 UTC 2013 [21:47:54] done [21:47:57] it's only the new system that's pointing at that properly [21:48:02] heh [21:48:11] *now* we're getting an interesting error [21:48:14] Invalid host name (docroot=/usr/local/apache/common/docroot/wikipedia.org), can't determine language. :-] [21:48:40] they are symlinks too [21:52:17] hashar: any ideas on that? 
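The docroot swap hashar performs above amounts to a move plus a symlink on the beta shared storage. A sketch: the /srv/deployment/mediawiki/common target matches the paths in the error messages that follow, but the exact target isn't spelled out in the log.

    mv /data/project/apache/common-local /data/project/apache/common-local.old   # keep the old tree around
    ln -s /srv/deployment/mediawiki/common /data/project/apache/common-local     # point the apaches at the git-deployed copy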
[21:52:17] Ryan_Lane: multi version has a regex on the path [21:52:21] ah [21:52:22] } elseif ( preg_match( '/^(?:\/usr\/local\/apache\/|\/home\/wikipedia\/)(?:htdocs|common\/docroot)\/([a-z]+)\.org/', $docRoot, $matches ) ) { [21:52:36] that's wrong, then :) [21:52:36] multiversion/MWMultiVersion.php of operations/mediawiki-config [21:52:56] fucking stupid multiversion [21:53:00] hm [21:53:03] I should rewrite that in a few lines of python [21:53:03] which line? [21:53:09] I don't see this in newdeploy [21:53:24] } elseif ( preg_match( "/^\/srv\/deployment\/mediawiki\/(?:htdocs|common\/docroot)\/([a-z0-9\-_]*)$/", $docRoot, $matches ) ) { [21:53:41] hmm [21:53:48] ahhh [21:53:48] I know [21:54:02] $docRoot = $_SERVER['DOCUMENT_ROOT'] apparently [21:54:06] there's a link for this in production [21:54:08] on tin [21:54:41] now we're getting a blank [21:54:43] where are the logs? [21:54:46] oh [21:54:47] wait [21:54:57] that link needs to exist on the apaches too, right? [21:55:02] there are no log in beta [21:55:09] no logs? really? :( [21:55:13] cause someone rejected my hack to have syslog installed on deployment-dbdump [21:55:16] :/ [21:55:18] heh [21:55:22] * hashar blame Ryan [21:55:24] :-]]]]]] [21:55:40] anyway [21:55:48] logs are sent to deployment-dbdump still [21:55:51] so just tcpdump it [21:55:52] sudo tcpdump -A -n -s0 udp port 514 [21:55:53] 2846 replace syslog permission handling on labs with root cause fix [21:55:57] hm [21:55:59] been doing that for months :-] [21:55:59] how about that, btw?:) [21:56:07] actually, why's it using /usr/local/apache/common/docroot/wiktionary.org? [21:56:12] hashar: oh [21:56:27] hashar: did you switch everything on all of the apaches? [21:56:36] no [21:56:43] I have updated the link in /data/project [21:56:48] so it happened magically :-] [21:56:49] ah [21:56:51] right [21:56:54] the docroot is on /data/project hehe [21:57:02] easy deployment :] [21:57:21] (which does not scale and has a huge spot, we now the story already) [21:57:32] what's looking for that docroot location? [21:57:58] do the systems themselves have an error log? [21:58:05] the apaches [21:58:08] mutante: on labs it is a different issue. ryslog is installed by default and our main syslog use syslog-ng which conflict with rsyslog. So we can't get syslog-ng on beta and hence have no log :-D [21:58:22] right [21:58:24] Ryan_Lane: the apaches are configured like in production, they sent everything to syslog [21:58:39] and rsyslog locally relay on a central host ( deployment-dbdump for beta ) [21:58:41] ugh [21:58:47] binasher: i saw the removal of mariadb via reprepro, yet despite an apt-get update the packages are still trying to be installed… - any ideas off the top of your head ? [21:59:02] I wonder where that error is being thrown [21:59:03] LeslieCarr: apt-get update [21:59:04] -dbdump does receive all the logs on UDP 514 thus [21:59:23] which is then happily discard by the rsyslog listening there. [21:59:31] "yet despite an apt-get update" [21:59:32] so you have to use tcpdump [21:59:33] :p [21:59:35] hashar: that one refers to me once adding class base::syslogs which makes /var/log/syslogs and /var/log/messages readable for non-roots.. and then i was asked to keep it open to replace that with the root cause fix which would be to have syslog write it with more relaxed permissions in the first place, instead of having puppet change them [21:59:51] LeslieCarr: what box? 
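Since beta has no local syslog (rsyslog on the instances conflicts with the syslog-ng setup used in production), the PHP errors being chased here can only be seen by sniffing the traffic relayed to deployment-dbdump on UDP 514, as hashar describes. A minimal sketch combining the two tcpdump invocations quoted in the discussion; the -l flag and the grep filter are added for convenience and are not shown in the log:

    # run on deployment-dbdump, which receives the relayed apache syslog traffic
    sudo tcpdump -l -A -n -s0 -i eth0 udp port 514 | grep --line-buffered 'PHP'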
[22:00:03] ah, they still appear to be in brewster [22:00:04] mutante: ah I see :-] [22:00:05] on neon [22:00:50] looks like brewster still has some packages - http://pastebin.com/MGP6ULPs [22:01:14] Ryan_Lane: something loaded on http://en.wikipedia.beta.wmflabs.org/wiki/Special:Version :-] [22:01:53] not if you force refresh [22:02:08] that was being pulled from cache [22:02:21] :/ [22:04:52] hashar: i-0000031b apache2: PHP Fatal error: Invalid host name (docroot=/usr/local/apache/common/docroot/wikipedia.org), can't determine language.#012 in /srv/deployment/mediawiki/common/multiversion/MWMultiVersion.php on line 353 [22:04:59] tcpdump -A -s 1514 -i eth0 port 514 ;) [22:05:45] yeah :-] [22:06:05] I have no idea what that would be [22:06:12] the line is the trigger_error() call [22:06:16] which is not really useful [22:06:44] I have a feeling it's line 162 [22:07:02] } elseif ( preg_match( "/^\/srv\/deployment\/mediawiki\/(?:htdocs|co mmon\/docroot)\/([a-z0-9\-_]*)$/", $docRoot, $matches ) ) { [22:07:09] would that not match for some reason? [22:07:31] run it agains the provided docroot /usr/local/apache/common/docroot/wikipedia.org ? [22:07:46] damnit, better but now it doesn't have any mysql-5.1 available - i guess i'll grab that package from ubunut [22:07:54] http://pastebin.com/qT18wHi9 [22:08:04] Ryan_Lane: ^demon: Do you know who set up doc.wikimedia.org at gallium? [22:08:10] nope [22:08:12] I can't find any trace of it in puppet [22:08:16] But it exists [22:08:18] ironic [22:08:28] Ryan_Lane: oh yeah the regex mention /srv/deployment/ [22:08:36] <^demon> Wasn't I. [22:08:41] Especially ironic since the contents of it are auto-generated documetation about.. puppet. [22:08:45] Ryan_Lane: but the Apaches still point to the /usr/local/apache/common dir. [22:08:52] Ryan_Lane: so I guess we want either of it [22:09:18] ah.... [22:09:18] right [22:09:20] New patchset: Pyoungmeister; "a hat trick of bad regex..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44154 [22:09:21] shit [22:09:21] Ryan_Lane: is the common dir using the "new deploy" branch ? [22:09:24] yes [22:09:41] Ryan_Lane: I think Tim wrote another patch for the Apache config to switch their docroot to /srv/deployment/ [22:09:47] he did [22:09:51] binasher: would it be a bad thing to switch the mysql-client-5.1 packages to being mysqlfb-client-5.1 ? [22:09:58] We were going to simlink the shit out of everything [22:09:59] mark: mutante: Maybe it was one of you who set up http://doc.wikimedia.org (auto-generated puppet docs). This week I might add something to that (doxygen for mw-core), but can't find where the current setup is puppetized [22:10:00] Or something [22:10:03] yeah [22:10:05] Ryan_Lane: so to play it safe, I guess that regex should match both path. Cleanup later [22:10:06] e.g. whether it nukes the directory or somethjing [22:10:08] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44154 [22:10:09] yeah [22:10:12] LeslieCarr: what for? [22:10:12] should match both [22:10:23] cmjohnson1: you still @ eqiad? [22:10:36] so i don't have to backport the lucid mysql-client-5.1 from lucid [22:10:41] titanium (misc server) isnt console redirecting for me (though its drac is responsible) [22:10:51] It was done on December 5/6/7 by someone with root access [22:10:54] responding even [22:10:56] =P [22:10:57] LeslieCarr: what needs a 5.1 client on precise? 
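On the docroot mismatch above: the newdeploy regex in MWMultiVersion.php only recognises /srv/deployment/... while the Apaches still hand it the old /usr/local/apache/common/docroot/... path, so neither branch matches. A shell one-liner can sanity-check a combined pattern against both paths; the pattern below is purely illustrative and is not the change that went into https://gerrit.wikimedia.org/r/44158:

    php -r '
    $pat = "@^(?:/usr/local/apache|/srv/deployment/mediawiki)/(?:htdocs|common/docroot)/([a-z0-9\-_]+)\.org@";
    foreach ( array( "/usr/local/apache/common/docroot/wikipedia.org",
                     "/srv/deployment/mediawiki/common/docroot/wikipedia.org" ) as $docRoot ) {
        // both the old and the new location should match and capture "wikipedia"
        var_dump( preg_match( $pat, $docRoot, $m ), $m[1] );
    }'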
nothing should [22:11:15] Ryan_Lane: I guess if you deploy operations/mediawiki-config @master , that might work again :-] [22:11:27] robh: no i can head back though [22:11:31] 5.1 client is the default installed by generic::mysql::packages::client [22:11:33] no [22:11:35] it'll break [22:11:37] badly ;) [22:11:44] hm [22:11:49] which is used by both precise and lucid [22:11:55] i could switch it to a case thing [22:12:00] LeslieCarr: then that class isn't compatible with precise [22:12:03] Krinkle: If it was done by someone without root access.... [22:12:05] if lucid, client-5.1, if precise - client-5.5 [22:12:20] Reedy: It was done as root (or via puppet) [22:12:26] that would do the trick [22:12:33] cmjohnson1: no need to head back today [22:12:43] cmjohnson1: its non-emergency, ill just snag a different one and drop ticket for this one [22:12:57] * hashar waves at Leslie [22:13:02] okay..i will get it in the a.m. [22:13:04] hi hashar [22:13:20] RECOVERY - Host msfe1002 is UP: PING OK - Packet loss = 0%, RTA = 26.51 ms [22:13:32] LeslieCarr: watching you typing on your laptop right now :-] You occupy the whole right part of the meeting cam [22:13:41] ;-] [22:13:46] now i hide a bit more [22:13:47] :) [22:14:33] !log reedy synchronized wmf-config/InitialiseSettings.php 'wgUseRCPatrol on for wikivoyages' [22:14:39] Logged the message, Master [22:14:40] hashar: fixed :) [22:14:49] kind of [22:15:03] change ? :-] [22:15:07] http://en.wiktionary.beta.wmflabs.org/wiki/Special:Version [22:15:12] I made a local hack [22:15:13] !g 81d65d00 [22:15:13] https://gerrit.wikimedia.org/r/#q,81d65d00,n,z [22:15:14] !g 1a6ece73 [22:15:15] https://gerrit.wikimedia.org/r/#q,1a6ece73,n,z [22:15:16] oh man you are doing live hacksè [22:15:19] I'm using both locations [22:15:27] hashar: I'm going to take them and push them in when I'm done [22:15:29] 1.21wmf6 (8ef1a23) !! [22:15:31] congratulations [22:15:51] looks like it's working to me [22:16:09] maybe not [22:16:13] I'm getting errors [22:16:32] at least it pointed to the correct path and show something [22:16:35] hm [22:16:38] maybe I'm not [22:16:44] it looks like it's working [22:16:45] I guess [22:17:07] !g 0cfe7666 [22:17:07] https://gerrit.wikimedia.org/r/#q,0cfe7666,n,z [22:17:14] also note there is a dumb squid cache in front of the apaches which is most probably never purged [22:17:22] [22:17:22] :( [22:17:37] I'm hitting pages that don't cache [22:18:09] hashar: https://gerrit.wikimedia.org/r/#q,0cfe7666,n,z https://gerrit.wikimedia.org/r/#q,81d65d00,n,z https://gerrit.wikimedia.org/r/#q,1a6ece73,n,z [22:18:12] I say that [22:18:17] PROBLEM - SSH on msfe1002 is CRITICAL: Connection refused [22:18:20] but that's obviously not true with this squid config [22:18:21] hashar: That's implementation of doc.wikimedia.org and puppet dox gen [22:18:40] New patchset: Lcarr; "switching mysql.pp to check ubuntu distribution" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44155 [22:18:41] Krinkle: ping andrewbogott about it :-] [22:18:42] enwp for beta still shows an alpha version [22:18:53] hashar: nothing to ping, I just wanted to know how it was done and where, and I found it. [22:19:03] New patchset: Reedy; "Enable wgUseRCPatrol on wikivoyages" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44156 [22:19:07] Ryan_Lane: http://en.wikipedia.beta.wmflabs.org/wiki/Special:NewPagesFeed is a good test page. Loads javascript / css + images from the docroot. [22:19:21] why does enwp for beta show an alpha version? 
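The mysql.pp change being discussed (https://gerrit.wikimedia.org/r/44155) boils down to choosing the client package per Ubuntu release: 5.1 on lucid, 5.5 on precise. The real fix is a Puppet change; the following is only a shell analogue of the same decision, with the package versions taken from the conversation and the exact package names assumed:

    # pick the MySQL client package by distribution codename (lucid -> 5.1, precise -> 5.5)
    codename=$(lsb_release -sc)
    case "$codename" in
        lucid)   pkg="mysql-client-5.1" ;;
        precise) pkg="mysql-client-5.5" ;;
        *)       echo "unhandled distribution: $codename" >&2; exit 1 ;;
    esac
    sudo apt-get install -y "$pkg"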
[22:19:25] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44156 [22:19:29] msfe1002 just stopped responding .. anyone working on it? [22:19:32] !log authdns-update [22:19:35] or im gonna powercycle [22:19:42] does apache need to be restarted on those hosts? [22:19:42] Logged the message, RobH [22:20:02] Krinkle: oh sorry I thought you were giving me patches to review ;-) [22:20:15] yep [22:20:21] apc needed to be cleared I guess [22:20:24] now it's working [22:20:34] robla: http://en.wikipedia.beta.wmflabs.org/wiki/Special:Version [22:20:35] :] [22:20:37] achievement [22:20:41] congrats Ryan! [22:20:43] !log powercycling msfe1002 [22:20:46] ty [22:20:52] seems that's the only change we need for now [22:20:54] Logged the message, Master [22:20:57] TimStarling: http://en.wikipedia.beta.wmflabs.org/wiki/Special:Version [22:21:35] binasher - can you check out https://gerrit.wikimedia.org/r/44155 ? [22:21:37] Ryan_Lane: so you had to restart apache on the box ? [22:21:42] I did [22:21:43] yes [22:21:49] Ryan_Lane: what was the issue with the APC cache? [22:21:53] RECOVERY - SSH on msfe1002 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [22:22:06] apc likely had the old common stuff in memory [22:22:18] binasher - can you check out https://gerrit.wikimedia.org/r/44155 ? (dunno if you saw that due to disconnecting and reconnecting) [22:22:19] ahh [22:22:28] hashar: btw, is there a way to not duplicate everything twice? See https://gerrit.wikimedia.org/r/gitweb?p=operations/puppet.git;a=blob;f=files/apache/sites/integration.mediawiki.org;h=afb69811fb315bcde3ace0ad984b5023cad6d152;hb=HEAD [22:22:35] Ryan_Lane: smart move :-] [22:22:39] Ryan_Lane: it is fast again! nice! [22:22:55] it's faster because it's not hitting php from gluster [22:22:56] Krinkle: yup there is :-] [22:23:05] Krinkle: Apache support including files IIRC. [22:23:10] it's surprising how slow gluster is [22:23:19] <^demon> Krinkle, hashar: Do like we do on gerrit, force redirect to https :p [22:23:22] Ryan_Lane: chrismcmahon is going to be very happy about that. [22:23:36] well, he'll be happy when we're actually using it for master ;) [22:23:51] Ryan_Lane: It's not /that/ surpristing [22:23:59] Damianz: well, true [22:24:01] so I will call it an end [22:24:04] I guess it's added latency for the filesystem [22:24:19] I need to turn off atime on glustr [22:24:21] gluster [22:24:25] LeslieCarr: that kinda kills use of $version parameter in the class, so might want to get rid of it [22:24:36] New patchset: Krinkle; "integration.mediawiki.org: Remove old testswarm routing." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44157 [22:24:37] for the underlying filesystem [22:24:40] still have to do some laundry there and my wife just came back from job [22:24:53] Ryan_Lane: well done [22:24:56] there's some other performance tweaks I need to make for it too. it'll never be fast, though [22:24:59] Ryan_Lane: and /data/project on deployment-prep is corrupted too I think [22:25:06] good point [22:25:08] hashar: yes, somewhat [22:25:31] Ryan_Lane: going to disconnect. Congrats again ryan! [22:25:31] so, let me push this change into newdeploy [22:25:38] then I'll deploy to tampa [22:25:43] and we can try out test [22:25:50] !! 
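The fix that just made Special:Version load on beta amounts to two steps: repoint the shared docroot at the git-deployed tree, then restart Apache so APC stops serving the old files from memory. A minimal sketch; the link target is taken from the error paths quoted earlier, while the .old suffix and the restart command are assumptions:

    # repoint the shared docroot (paths as seen in the error messages above)
    mv /data/project/apache/common-local /data/project/apache/common-local.old
    ln -s /srv/deployment/mediawiki/common /data/project/apache/common-local
    # APC keeps the old bytecode cached, so restart Apache on each web host to clear it
    sudo service apache2 restart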
[22:26:22] then we'll know if we're ready for tomorrow :) [22:27:10] New patchset: Lcarr; "switching mysql.pp to check ubuntu distribution" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44155 [22:27:14] binasher: ^^ [22:27:59] Ryan_Lane: will figure out from the SAL :-] [22:28:04] Ryan_Lane: I should be around for most of the next X hours.. If you need a hand with hashar leaving [22:28:11] hashar: figure out what from SAL? [22:28:19] Ryan_Lane: I am heading bed, will not be there tomorrow morning, I need a long nap. [22:28:20] Reedy: awesome. thanks. [22:28:28] ah [22:28:28] ok [22:28:31] hashar: night! [22:28:40] barely slept for the last 3 days or so :/ [22:29:00] I got a very efficient alarm clock that kick in at t7am every day :-] [22:29:02] LeslieCarr: now you need to update invocations of both of those classes that are passing in a version [22:29:14] there's one of each, also in mysql.pp [22:29:49] Reedy: and thanks for proposing to review stuff :-] [22:30:13] New patchset: Ryan Lane; "Use old and new path match" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/44158 [22:30:33] ^^ [22:30:50] RobH: When you can, could you give me the IP address of the Parsoid Varnish in eqiad? Even if it's not up yet. I need the IP to put in the config (which is also the first test of the datacenter-dependent config mechanism) [22:31:42] we really need test coverage [22:32:23] cmjohnson1: did you send out the card on asw-c-eqiad ? [22:32:29] wondering when we'll get the fixed part [22:33:25] lesliecarr: came in late this afternoon (according to portal). [22:33:38] New review: Hashar; "That would work though that might does two preg_match() calls. The master branch has a single preg_..." [operations/mediawiki-config] (newdeploy); V: 0 C: 0; - https://gerrit.wikimedia.org/r/44158 [22:33:39] so tomorrow [22:33:48] Ryan_Lane: ^^^^ [22:33:48] now I am sleeping :) [22:33:52] wave! [22:34:07] see ya :) [22:34:40] RoanKattouw: its going to be cerium and praseodymium which is 208.80.154.147 and 148 respectively [22:36:10] KO [22:36:15] I'm gonna use cerium for now [22:36:21] I don't have LVS groups for Parsoid Varnish set up yet [22:36:31] So I'll use 208.80.154.147 [22:36:31] Thanks man [22:37:56] New patchset: Catrope; "Vary the Parsoid IP by datacenter" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44160 [22:39:04] New patchset: Faidon; "autoinstall: switch all squid boxes to lucid" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44161 [22:39:05] New patchset: Ryan Lane; "Use old and new path match" [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/44158 [22:39:34] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44161 [22:39:35] Change merged: Ryan Lane; [operations/mediawiki-config] (newdeploy) - https://gerrit.wikimedia.org/r/44158 [22:41:12] ok. now to configure all tampa hosts [22:41:21] anyone have an idea of a good regex for that [22:41:22] ? [22:41:53] haha [22:42:01] * robla pokes around beta [22:42:01] errm [22:42:02] eqiad is: ^(mw).*eqiad.* [22:42:04] isn't there a site grain? 
[22:42:20] TimStarling: that still requires an idea of which systems we needto use [22:42:22] *to [22:42:27] We've mw* and srv* [22:42:40] because we'd need to include a class for them in puppet [22:42:48] and we don't have anything like that in puppet right now either [22:42:51] + snapshot and tmh, couple of search indexes, fenari, hume and spence [22:44:13] what does spence use it for? [22:44:18] memcache check? [22:44:41] PROBLEM - Host msfe1002 is DOWN: PING CRITICAL - Packet loss = 100% [22:44:43] yes [22:44:44] And job possibly job queue metircs [22:44:49] * Ryan_Lane nods [22:45:01] puppet regexes shit me [22:45:05] node /^cp10(2[1-9]|3[0-6])\.eqiad\.wmnet$/ { [22:45:06] etc. [22:45:41] it should have a range feature, like node [cp1021-1036].eqiad.wmnet [22:45:57] That'd be very useful for this nature of things [22:46:04] yep [22:46:35] /^cp(1021|1022|1023|1024|1025|1026|1027|1028|1029|1030|1031|1032|1033|1034|1035|1036).eqiad.wmnet$/ [22:46:38] simple [22:46:53] hi felicity! [22:47:00] adding grains via puppet isn't amazingly straightforward, either, thanks to the lack of iteration [22:47:02] hi tim [22:47:16] how's life? [22:47:31] shitty as always, you? [22:47:39] /^cp(102[1-9|1022|1023|1024|1025|1026|1027|1028|1029|1030|1031|1032|1033|1034|1035|1036).eqiad.wmnet$/ [22:47:45] ragrgh [22:47:54] same old [22:47:54] thankfully we don't need a regex for the cp systems ;) [22:47:57] missing ] [22:48:02] Hence the noise [22:48:15] /^cp(102[1-9]|103[0-6]).eqiad.wmnet$/ [22:49:09] mw1-mw74 skipping mw50 and mw23 [22:49:09] New patchset: Krinkle; "integration.mediawiki.org: Configure localhost:9412 for QUnit." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44162 [22:49:23] New patchset: RobH; "replacing titanium role with praseodymium" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44163 [22:49:28] Reedy: your works appears to be based on mine, and i did not give you permission to distribute it [22:49:43] Did you tell me that I couldn't? [22:49:59] i'm pretty sure that's not how copyright works [22:50:22] not for long [22:50:33] New review: RobH; "pay no attention to the man doing self review behind the curtain" [operations/puppet] (production); V: 2 C: 2; - https://gerrit.wikimedia.org/r/44163 [22:50:35] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44163 [22:50:38] however i'll remember that if i ever murder someone [22:50:41] "you didn't tell me i couldn't" [22:50:42] (^(srv|mw|snapshot).(eqiad|pmtpa).wmnet$)|(^(hume|spence|fenari).wikimedia.org$)) [22:50:47] at least if you're in the UK, there is that orphan works law right? [22:50:48] ^^ that look correct? [22:50:54] New patchset: Krinkle; "integration.mediawiki.org: Configure localhost:9412 for QUnit." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44164 [22:51:08] Ryan_Lane: No numbers? [22:51:12] heh [22:51:14] whoops [22:51:33] \d{4} should be encompassing enough.. [22:51:38] .+\.wmnet # done [22:51:59] (^(srv|mw|snapshot).*.(eqiad|pmtpa).wmnet$)|(^(hume|spence|fenari).wikimedia.org$)) [22:52:15] I guess it's not going to bring any false positives [22:52:22] doubtful [22:52:24] we have no srvlol for example ;) [22:52:35] New patchset: Krinkle; "integration.mediawiki.org: Remove old testswarm routing." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/44157 [22:52:49] * Ryan_Lane tests with: salt -E '(^(srv|mw|snapshot).*.(eqiad|pmtpa).wmnet$)|(^(hume|spence|fenari).wikimedia.org$))' test.ping [22:53:00] Ryan_Lane: Nope [22:53:03] searchidx and tmh too [22:53:18] Though, it's only some searchidx hosts.. [22:53:53] searchidx2 and searchidx1001 [22:54:19] * Ryan_Lane groans [22:54:24] (^(srv|mw|snapshot|tmh|searchidx(2|1001)).*.(eqiad|pmtpa).wmnet$)|(^(hume|spence|fenari).wikimedia.org$)) [22:54:54] that'll match 2nnnn.. [22:55:34] Where's 2? esams? [22:55:42] (2XXX) [22:56:55] we better not be deploying to esams :D [22:57:27] It was more thinking we're not going to be having search indexers there anytime soon (if it at all ;)) [22:58:22] root@sockpuppet:~# salt -E '^(srv|mw|snapshot|tmh)|(searchidx2|searchidx1001).*.(eqiad|pmtpa).wmnet$|^(hume|spence|fenari).wikimedia.org$' test.ping | grep search [22:58:29] searchidx2.pmtpa.wmnet: True [22:58:29] searchidx1001.eqiad.wmnet: True [23:02:32] RECOVERY - Host msfe1002 is UP: PING OK - Packet loss = 0%, RTA = 26.63 ms [23:05:28] New patchset: Dzahn; "remove empy comment lines, for some weird reason these end up in generated HTML output and mess up the index page, while the other lines starting with # that actually have text do not" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44168 [23:06:08] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44168 [23:11:09] New patchset: RobH; "praseodymium mac address update" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44169 [23:12:16] New review: RobH; "im out of witty self review comments for today" [operations/puppet] (production); V: 2 C: 2; - https://gerrit.wikimedia.org/r/44169 [23:12:17] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44169 [23:13:40] a soft, silvery, malleable and ductile metal in the lanthanide group [23:14:07] root@sockpuppet:~# salt -E '^(srv|mw|snapshot|tmh)|(searchidx2|searchidx1001).*.(eqiad|pmtpa).wmnet$|^(hume|spence|fenari).wikimedia.org$' test.ping | wc [23:14:07] 338 676 8344 [23:14:27] I think I'll fan out their initialization :D [23:16:30] New patchset: Ryan Lane; "Switch all mw hosts to use tin" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44172 [23:18:44] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44172 [23:20:37] RoanKattouw_away: caesium is online (one of your two parsoid vanish hosts) [23:21:23] Awesome [23:21:46] had issues with the other host, so had to move service to a different server, it'll be online shortly [23:21:54] !log reedy synchronized wmf-config/InitialiseSettings.php 'Bug 44015 - Add en.wikivoyage autopatrolled group' [23:22:04] Logged the message, Master [23:22:36] RobH: Wait, caesium? [23:22:40] I thought it was cerium? [23:22:43] doh [23:22:44] sorry [23:22:46] cerium [23:22:52] im doing a differnt, unrelated work on caesium [23:22:54] OK [23:22:58] Is https://gerrit.wikimedia.org/r/#/c/44160/ correct? 
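Back on the host-targeting question from earlier: the regex form below is the one actually tested on sockpuppet; the grain form is what the same targeting would look like if a site grain existed on the minions, which the discussion notes it does not yet:

    # regex targeting as tested above
    salt -E '^(srv|mw|snapshot|tmh)|(searchidx2|searchidx1001).*.(eqiad|pmtpa).wmnet$|^(hume|spence|fenari).wikimedia.org$' test.ping
    # hypothetical: with a custom "site" grain defined on every minion, this would collapse to
    salt -G 'site:pmtpa' test.ping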
[23:22:59] and losing my mind oO [23:23:20] (the IPs and hostnames in that commit that is) [23:23:22] New patchset: Reedy; "Bug 44015 - Add en.wikivoyage autopatrolled group" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44173 [23:23:34] RoanKattouw: yep, looks good to me [23:23:40] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44173 [23:23:50] Cool thanks [23:23:56] RoanKattouw: want someone to review it or you got it? [23:24:17] PROBLEM - SSH on msfe1002 is CRITICAL: Connection refused [23:24:25] RobH: I need 44160 to be reviewed but not by ops [23:24:34] It's the first commit that uses the multi-datacenter-config thingy [23:24:36] no worries then [23:24:51] praseodymium will be ready soon enough [23:25:39] Ryan_Lane: So its deploying ALL the files to ALL the servers? ;) [23:25:45] yep [23:26:00] all repos [23:26:00] wheee [23:26:01] it isn't just yet [23:26:01] but will be soon [23:26:11] Are "we" planning on replacing /usr/local/apache stuff with symlinks today? [23:26:18] ie to /srv/.. [23:26:59] New patchset: RobH; "msfe1001-1002 to decom, cleaning up decom" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44174 [23:27:08] ACKNOWLEDGEMENT - SSH on msfe1002 is CRITICAL: Connection refused daniel_zahn will be renamed - this is not ms-fe1002 [23:27:31] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44174 [23:28:02] PROBLEM - NTP on msfe1002 is CRITICAL: NTP CRITICAL: No response from NTP server [23:28:24] Reedy: no [23:28:27] AaronSchulz: around? [23:28:32] !log ignore msfe1001/msfe1002 errors (the names without the dash) as they are turned off and renamed [23:28:38] Reedy: when we do that we've actually switched over [23:28:42] Logged the message, RobH [23:28:48] we could actually deploy wmf8 today [23:28:54] * AaronSchulz is just debugging some code [23:28:55] Aha [23:28:57] So just staging it [23:28:59] k [23:29:02] and then switch it tomorrow [23:29:15] paravoid: what it is? [23:29:20] for sure we should try test first :) [23:29:30] the cronjob [23:29:35] I'm re-reading it [23:29:40] oh, that [23:29:44] I'm deploying the deployment system right now [23:29:45] $tempRepo = $repo->getTempRepo(); [23:29:45] $dir = $tempRepo->getZonePath( 'thumb' ); [23:29:45] $iterator = $tempRepo->getBackend()->getFileList( array( 'dir' => $dir ) ); [23:29:48] $this->output( "Deleting old thumbnails...\n" ); [23:29:49] then I'll initialize all of tampa [23:29:52] that's temp I guess [23:29:57] so I guess I was wrong [23:29:58] then we can try test [23:30:04] yeah a thumb subdir of the temp container [23:30:10] right [23:30:38] New patchset: Dzahn; "remove global index.html.tmpl, they are created for every language" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44175 [23:32:00] ok, reviewed all the misc servers in eqiad (i missed msfe since i forgot they werent used now) [23:32:09] and now have 10 spare misc servers. [23:32:11] \o/ [23:32:20] (3 of which I have already marked for use) [23:32:21] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44175 [23:35:06] !log testing deployment destination initialization on fenari [23:35:17] Logged the message, Master [23:36:05] well, that failed [23:37:49] What did you try? 
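Before retrying the failed destination initialization on fenari, it is worth confirming that the minion answers the master and seeing what the failed run left behind. A small sketch run from the salt master; the directory to inspect is an assumption based on the /srv/deployment paths seen earlier:

    # confirm the target minion responds at all
    salt 'fenari.wikimedia.org' test.ping
    # inspect whatever the failed initialization left under the deployment root
    salt 'fenari.wikimedia.org' cmd.run 'ls -l /srv/deployment'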
[23:38:21] New patchset: RobH; "msfe1001/2 never in puppet, so no need to list here, opps" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44177 [23:38:26] it's a problem with a change I made to the deployment module [23:38:29] I'm fixing it now [23:39:03] New patchset: Ryan Lane; "Don't reference repo before checking if we use it" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44178 [23:39:07] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44177 [23:40:07] New patchset: Lcarr; "switching mysql.pp to check ubuntu distribution" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44155 [23:41:16] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44178 [23:41:26] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44155 [23:41:30] ah. no that won't fix it either [23:41:36] New patchset: Ryan Lane; "Use proper reference to Parsoid" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44179 [23:41:38] but that will ^^ [23:42:00] Ryan_Lane: is your change safe to merge ? [23:42:04] yep [23:42:12] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44179 [23:42:44] RoanKattouw: ok, praseodymium is installed as parsoid varnish and puppet has run [23:42:46] I just merged everything [23:42:47] so its all yours [23:42:50] yay [23:49:22] RoanKattouw: you broke nagios [23:49:27] ? [23:49:31] please define @monitor_group { "parsoidcache_eqiad [23:49:33] I know the Parsoid Nagios checks are broken [23:49:35] ... [23:49:36] Oh [23:49:42] Ahm... [23:50:03] well, i broke nagios about 15 minutes ago [23:50:05] actually, I can do it [23:50:05] so its RoanKattouw's turn. [23:50:19] well, shit [23:50:22] Oh I see [23:50:23] fanout is going to be difficult [23:50:30] since I'm using a returner [23:50:35] Yeah that one is about 30% Rob's fault and 70% my fault [23:51:03] New patchset: Pyoungmeister; "eqiad parsoid caches need nagios group" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44180 [23:51:32] Oh looks like you beat me to it [23:51:32] no worries, I more mention so that you're aware and can not forget next time :) [23:51:47] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44180 [23:51:54] Yeah thanks, I'd forgotten to define the eqiad group ahead of time [23:51:58] Should've just done it right away [23:52:05] no probs [23:52:57] I mean, trust me, the rest of us want nagios to die more than you do... but this isn't the right way to do it [23:52:58] ;) [23:53:21] hahaha [23:54:46] : !log copying ganglios to precise-wikimedia repository [23:55:10] hrm, is morebots borked ? [23:55:17] always [23:55:24] damnit [23:55:52] LeslieCarr: http://www.vidarholen.net/contents/wordcount/ [23:56:04] hehehe [23:56:04] nice [23:59:07] let's try this again [23:59:09] !log restarted morebots [23:59:19] Logged the message, Mistress of the network gear. [23:59:40] !log copied ganglios from lucid-wikimedia to precise-wikimedia [23:59:50] Logged the message, Mistress of the network gear.
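The closing log entries record copying the ganglios package from the lucid-wikimedia distribution to precise-wikimedia on the apt host. The exact command is not shown in the log; a plausible reconstruction using reprepro's copy subcommand, run from the repository's base directory, would be:

    # copy ganglios between distributions within the same reprepro repository
    reprepro -v copy precise-wikimedia lucid-wikimedia ganglios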