[00:00:10] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:00:29] New patchset: Ryan Lane; "Ensure default site is gone." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2872
[00:01:02] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2872
[00:01:03] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2872
[00:02:07] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:02:25] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Wed Feb 29 00:02:15 UTC 2012
[00:04:04] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:06:11] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:08:07] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:08:16] RECOVERY - RAID on db40 is OK: OK: 1 logical device(s) checked
[00:09:38] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:11:25] New patchset: Ryan Lane; "Add manganese to ldap firewall rules on virt0" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2874
[00:11:35] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:13:32] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:13:48] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2874
[00:13:48] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2874
[00:15:29] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:17:26] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:19:23] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:21:20] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:23:17] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:25:14] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:27:11] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:29:08] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:31:05] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:32:53] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Wed Feb 29 00:32:42 UTC 2012
[00:33:02] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:34:59] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:36:29] PROBLEM - Packetloss_Average on emery is CRITICAL: CRITICAL: packet_loss_average is 8.46059248 (gt 8.0)
[00:36:56] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:37:14] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 8.60164352 (gt 8.0)
[00:38:53] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:39:11] PROBLEM - Puppet freshness on mw1010 is CRITICAL: Puppet has not run in the last 10 hours
[00:40:50] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:42:47] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:44:44] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:46:14] PROBLEM - Puppet freshness on mw1020 is CRITICAL: Puppet has not run in the last 10 hours
[00:46:14] PROBLEM - Puppet freshness on mw1110 is CRITICAL: Puppet has not run in the last 10 hours
[00:46:41] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:48:56] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:50:53] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:50:56] so what is up with db1040 ?
[00:51:24] it's doing puppet checks
[00:52:04] i blame nagios
[00:52:05] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Wed Feb 29 00:51:36 UTC 2012
[00:52:23] RECOVERY - MySQL Slave Running on db24 is OK: OK replication
[00:52:50] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:53:12] yeah
[00:53:13] heh
[00:54:47] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:56:44] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[00:57:56] PROBLEM - MySQL Slave Delay on db24 is CRITICAL: CRIT replication delay 11499 seconds
[00:58:41] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:00:25] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Wed Feb 29 01:00:14 UTC 2012
[01:00:34] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:02:22] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Wed Feb 29 01:01:49 UTC 2012
[01:02:49] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:03:25] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Wed Feb 29 01:03:11 UTC 2012
[01:04:55] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:06:52] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:07:25] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2514
[01:07:26] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2514
[01:07:55] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 8.23574895161 (gt 8.0)
[01:08:58] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:09:45] New patchset: Ryan Lane; "Adding sumanah back onto manganese" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2876
[01:11:04] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:12:09] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2876
[01:12:10] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2876
[01:12:16] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Wed Feb 29 01:11:59 UTC 2012
[01:13:01] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:14:58] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:16:55] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:19:10] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:21:07] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:21:52] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Wed Feb 29 01:21:32 UTC 2012
[01:23:04] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:25:01] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:26:58] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:28:55] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:30:37] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:30:37] RECOVERY - MySQL Slave Delay on db24 is OK: OK replication delay 27 seconds
[01:31:22] RECOVERY - MySQL Replication Heartbeat on db24 is OK: OK replication delay 0 seconds
[01:32:34] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:34:04] RECOVERY - Puppet freshness on db1040 is OK: puppet ran at Wed Feb 29 01:33:38 UTC 2012
[01:34:31] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:36:28] PROBLEM - Puppet freshness on db1040 is CRITICAL: Puppet has not run in the last 10 hours
[01:51:22] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 3.14903610169
[02:02:31] !log manually set large rmem_max and rmem_default on locke and restarted udp2log to stem packet loss, opened an rt ticket to fix the (lost) fix
[02:02:36] Logged the message, Master
[02:17:46] PROBLEM - Puppet freshness on db1004 is CRITICAL: Puppet has not run in the last 10 hours
[02:18:04] RECOVERY - Packetloss_Average on emery is OK: OK: packet_loss_average is 3.20444134454
[02:27:49] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours
[02:36:49] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours
[02:36:49] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours
[02:53:24] New patchset: Bhartshorne; "added first iteration of the swift cleaner" [operations/software] (master) - https://gerrit.wikimedia.org/r/2877
[02:53:25] New review: gerrit2; "Lint check passed." [operations/software] (master); V: 1 - https://gerrit.wikimedia.org/r/2877
[02:54:47] New review: Bhartshorne; "(no comment)" [operations/software] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2877
[02:54:47] Change merged: Bhartshorne; [operations/software] (master) - https://gerrit.wikimedia.org/r/2877
[03:21:48] TimStarling: do you remember the magic word to deploy the squid configs to only upload squids?
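The 02:02 `!log` entry above raises the kernel's UDP receive buffers (`net.core.rmem_max` / `net.core.rmem_default`) so udp2log can absorb traffic bursts without dropping packets. A minimal sketch of generating the corresponding sysctl.conf lines — the 32 MB value is an illustrative assumption, not the value actually set on locke:

```python
# Hedged sketch: render sysctl.conf lines for larger UDP receive buffers,
# as in the 02:02 !log entry above. The 32 MB figure is illustrative only.
def udp_buffer_sysctls(bytes_=32 * 1024 * 1024):
    """Return sysctl.conf lines raising the default and maximum receive buffer."""
    return [
        f"net.core.rmem_default = {bytes_}",  # buffer newly created sockets get
        f"net.core.rmem_max = {bytes_}",      # ceiling for SO_RCVBUF requests
    ]

for line in udp_buffer_sysctls():
    print(line)
```

Putting the values in sysctl.conf (rather than setting them by hand, as the log entry admits to doing) is what keeps the fix from being "lost" across reboots.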
[03:21:56] just 'upload'?
[03:23:28] PROBLEM - Disk space on db1025 is CRITICAL: DISK CRITICAL - free space: / 284 MB (3% inode=80%): /var/lib/ureadahead/debugfs 284 MB (3% inode=80%):
[03:24:26] I think you would have to specify each cluster name as it appears in generated/clusters
[03:24:31] PROBLEM - MySQL disk space on db1025 is CRITICAL: DISK CRITICAL - free space: / 284 MB (3% inode=80%): /var/lib/ureadahead/debugfs 284 MB (3% inode=80%):
[03:24:45] ah, that looks right.
[03:24:52] I was trying to read the perl and my eyes burned.
[03:25:13] thank you
[03:25:27] !log took swift out of rotation - thumbnails now served by ms5
[03:25:30] :(
[03:25:31] Logged the message, Master
[03:25:54] oh, and TimStarling, you were right. My heuristic for assessing bad thumbnails sucked ass.
[03:31:25] RECOVERY - Disk space on db1025 is OK: DISK OK
[03:32:28] RECOVERY - MySQL disk space on db1025 is OK: DISK OK
[06:21:55] PROBLEM - Puppet freshness on db1022 is CRITICAL: Puppet has not run in the last 10 hours
[07:58:55] New patchset: ArielGlenn; "move rsync to external mirrors off to download mirror host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2879
[07:59:25] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2879
[08:03:10] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2879
[08:03:10] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2879
[08:50:41] New patchset: ArielGlenn; "download host kernel settings for eth buffer allocs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2880
[08:51:08] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2880
[08:52:47] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2880
[08:52:48] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2880
[09:06:46] PROBLEM - Puppet freshness on hooper is CRITICAL: Puppet has not run in the last 10 hours
[09:07:49] PROBLEM - Puppet freshness on cadmium is CRITICAL: Puppet has not run in the last 10 hours
[09:08:43] PROBLEM - Puppet freshness on formey is CRITICAL: Puppet has not run in the last 10 hours
[10:17:21] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours
[10:30:01] PROBLEM - Puppet freshness on mw70 is CRITICAL: Puppet has not run in the last 10 hours
[10:30:01] PROBLEM - Puppet freshness on mw1098 is CRITICAL: Puppet has not run in the last 10 hours
[10:40:58] PROBLEM - Puppet freshness on mw1010 is CRITICAL: Puppet has not run in the last 10 hours
[10:48:01] PROBLEM - Puppet freshness on mw1020 is CRITICAL: Puppet has not run in the last 10 hours
[10:48:01] PROBLEM - Puppet freshness on mw1110 is CRITICAL: Puppet has not run in the last 10 hours
[10:55:04] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours
[11:00:01] PROBLEM - Puppet freshness on srv278 is CRITICAL: Puppet has not run in the last 10 hours
[11:13:04] PROBLEM - Puppet freshness on db25 is CRITICAL: Puppet has not run in the last 10 hours
[11:13:58] PROBLEM - Puppet freshness on amssq42 is CRITICAL: Puppet has not run in the last 10 hours
[11:13:58] PROBLEM - Puppet freshness on sq76 is CRITICAL: Puppet has not run in the last 10 hours
[11:13:58] PROBLEM - Puppet freshness on mw71 is CRITICAL: Puppet has not run in the last 10 hours
[11:15:01] PROBLEM - Puppet freshness on amslvs2 is CRITICAL: Puppet has not run in the last 10 hours
[11:15:01] PROBLEM - Puppet freshness on cp1002 is CRITICAL: Puppet has not run in the last 10 hours
[11:15:01] PROBLEM - Puppet freshness on db1019 is CRITICAL: Puppet has not run in the last 10 hours
[11:15:01] PROBLEM - Puppet freshness on db10 is CRITICAL: Puppet has not run in the last 10 hours
[11:15:01] PROBLEM - Puppet freshness on dataset1001 is CRITICAL: Puppet has not run in the last 10 hours
[11:15:01] PROBLEM - Puppet freshness on es1 is CRITICAL: Puppet has not run in the last 10 hours
[11:15:01] PROBLEM - Puppet freshness on db1030 is CRITICAL: Puppet has not run in the last 10 hours
[12:12:10] New patchset: Mark Bergsma; "Cleanup with hierarchy" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2881
[12:12:43] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2881
[12:12:44] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2881
[12:19:24] New patchset: Mark Bergsma; "Move misc::install-server into a separate misc/ file" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2882
[12:19:52] New patchset: Mark Bergsma; "Fix mode of /srv/autoinstall" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2883
[12:20:21] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2882
[12:20:21] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2882
[12:20:21] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2882
[12:20:22] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2883
[12:20:22] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2883
[12:20:31] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2883
[12:20:32] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2883
[12:41:12] New patchset: Mark Bergsma; "Add new, simple partman recipe for LVM on hw raid, root/swap LVs only" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2884
[12:41:39] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2884
[12:41:50] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2884
[12:41:51] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2884
[12:50:59] !log upgraded mwlib to 0.13.5 on pdf cluster
[12:51:02] Logged the message, Master
[13:05:17] I wanna update twinkle at hi-wp. I'm just worried that if I mess it up and someone messes an MW update up, I wont be able to tell the difference if it was me who messed up. So when will it be a good time to update twinkle sometime in the next week?
[13:14:53] anyone?
[13:23:26] New patchset: Mark Bergsma; "Make lvm.cfg recipe fully automatic" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2885
[13:23:54] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2885
[13:24:09] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2885
[13:24:10] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2885
[13:27:07] New patchset: Hashar; "rt: force HTTPS protocol" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2446
[13:42:38] New patchset: Mark Bergsma; "Make partman recipes fully automatic" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2886
[13:43:45] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2886
[13:43:46] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2886
[13:46:02] mark: do you happen to know how I we can get gerrit to reverify a change? https://gerrit.wikimedia.org/r/#change,2682
[13:46:34] should I just resubmit a dummy patchset ?
[13:46:49] that doesn't work I believe
[13:46:52] I don't think it's possible
[13:53:28] New patchset: Hashar; "git-setup script no more use "git config --global"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2682
[13:53:58] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2682
[13:54:11] that would mean we only lint one time
[13:55:07] !log Reinstalled strontium and palladium with hw raid1 and fully automatic lvm based partman recipe
[13:55:12] Logged the message, Master
[14:02:36] New patchset: Demon; "Push script for extensions" [operations/software] (master) - https://gerrit.wikimedia.org/r/2887
[14:02:37] New review: gerrit2; "Lint check passed." [operations/software] (master); V: 1 - https://gerrit.wikimedia.org/r/2887
[14:03:08] New review: Demon; "(no comment)" [operations/software] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2887
[14:03:09] Change merged: Demon; [operations/software] (master) - https://gerrit.wikimedia.org/r/2887
[14:38:03] domas: any suggestions how to do a find and replace on an unindexed column on a table with over 116 million rows?
[14:38:20] Wanting to replace " " with "_" as someone made an issue in MW...
[14:42:54] depends on how scattered rows are
[14:43:38] No idea unfortunately
[14:43:48] Relatively it's not going to be not that many rows
[14:43:56] 1%? 0.1%?
[14:44:24] Any page title (on any wiki) that has a space in it, a picture from commons, and the page has been moved
[14:44:29] you can write a script that reads all rows, then issues replace statements =)
[14:44:46] I was just thinking use one of the eqiad slaves
[14:44:49] they're sat there doing nothing
[14:44:56] you sure can do a batch query before
[14:44:59] so I can run a long slow query, build a list of PKs and replace on the master from there?
[14:45:03] right
[14:46:16] I'll do that then, thanks :)
[14:51:54] New patchset: Hashar; "Bug 28469 - Make SVN Documentation be indexed" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2888
[15:19:25] New patchset: Demon; "Add rule for RL2 gadgets branch" [operations/software] (master) - https://gerrit.wikimedia.org/r/2889
[15:19:26] New review: gerrit2; "Lint check passed." [operations/software] (master); V: 1 - https://gerrit.wikimedia.org/r/2889
[15:38:45] domas: just over 4 million rows
[15:38:56] 3.5% or s
[15:38:58] so
[15:42:01] heh
[15:42:09] if new rows are not being written, just prepare a batch and run a script :)
[15:44:32] Yup, that's what I was going to do now
[15:48:00] just do the regular waitforslaves every X mutations
[15:48:09] you can use REPLACE() and LIMIT 100 or something
[15:50:39] New review: Demon; "(no comment)" [operations/software] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2889
[15:50:39] Change merged: Demon; [operations/software] (master) - https://gerrit.wikimedia.org/r/2889
[16:35:44] robh: cisco arrived please update rt 2499
[16:35:59] ok, will in a moment, on dell rma call at the moment =]
[16:36:08] that will make ryan lane really happy
[16:45:18] cmjohnson1: ok, finding a place for it.
[16:47:34] i hate where cr2pmtpa is
[16:47:42] its low neough in the rack that now the rack has a lot of power and no space
[16:47:51] hrmm
[16:48:30] ok, found a place.
[16:49:50] cmjohnson1: https://rt.wikimedia.org/Ticket/Display.html?id=2499 updated
[16:50:36] cool
[16:55:50] So, it appears puppet "magically" restarted itself in the middle of the night and undid all my work on reportcard1.
[16:55:58] This is frowny-face.
[16:56:00] :(
[17:08:23] dschoon: it does do that. root's crontab.
[17:08:43] maplebed: ah! yes, that makes a great deal of sense.
[17:08:48] ty. will fix.
[17:09:00] I think it's because puppet was crashing randomly so this "fixed" it.
[17:09:15] *nod* just don't have time to puppetize changes atm.
[17:09:26] next week. need box in a certain state now tho.
[17:12:28] oh Lucene, why are you evil? Why oh why.
[17:12:34] morning AaronSchulz - any chance you could give me a code review on the swift cleaning stuff?
[17:12:53] Jeff_Green: it secretly wants to be a dairy product.
[17:13:02] hehehe.
[17:13:45] when searches start coming up "Have you seen me?" I will really worry
[17:14:42] maplebed: sure
[17:14:51] thanks!
[17:22:16] New patchset: Bhartshorne; "explaining why I don't use the normal URL to HEAD objects in swift" [operations/software] (master) - https://gerrit.wikimedia.org/r/2891
[17:22:33] New review: Bhartshorne; "(no comment)" [operations/software] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2891
[17:22:34] Change merged: Bhartshorne; [operations/software] (master) - https://gerrit.wikimedia.org/r/2891
[17:51:39] New review: Danakim; "Looks fine to me. Will this be merged in?" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/2682
[18:22:26] New patchset: Bhartshorne; "grumble." [operations/software] (master) - https://gerrit.wikimedia.org/r/2892
[18:22:46] New review: Bhartshorne; "(no comment)" [operations/software] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2892
[18:22:46] Change merged: Bhartshorne; [operations/software] (master) - https://gerrit.wikimedia.org/r/2892
[18:39:19] Is there a way to make [[Special:Version]] update right after a deploy? https://bugzilla.wikimedia.org/34794
[18:39:33] oops, wrong bug
[18:40:21] special:version caching: https://bugzilla.wikimedia.org/34796
[18:44:51] <^demon> Special:Version isn't cached?
[18:47:38] hexmode: cant they just put ?action=purge
[18:47:40] ;p
[18:48:12] <^demon> On a special page? :p
[18:48:16] i dunno
[18:48:20] RobH: I've no clue :{
[18:48:22] im asking
[18:48:38] i assumed purging would work on any cached page
[18:48:39] special or no
[18:48:44] but its an assumption
[18:48:49] ^demon: do they not?
[18:48:56] <^demon> The special pages that are cached are recached manually.
[18:49:03] <^demon> Most special pages aren't cached at all.
[18:49:18] so is this guy who reported the bug crazy?
[18:49:23] <^demon> Possibly.
[18:49:36] its always possible, i dunno them
[18:49:38] heh
[18:50:06] hexmode: may wanna make sure he knows about action purge so if he sees it again, he can try that.
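The 14:44–15:48 exchange above sketches a standard bulk-update pattern: run the slow scan on an idle slave to collect primary keys, then apply small `REPLACE()` batches on the master, waiting for slaves to catch up between batches. A hedged sketch of the statement generation only — the table, column, and key names (`page`, `page_title`, `page_id`) are illustrative assumptions, not necessarily the table being discussed:

```python
# Hedged sketch of the batched find-and-replace discussed above.
# Table/column names are assumptions; the caller supplies the PK list
# gathered from a slow SELECT on a slave, and would wait for slave lag
# to settle between batches ("the regular waitforslaves").
def batched_updates(pks, table="page", col="page_title", key="page_id", batch=100):
    """Yield UPDATE statements replacing spaces with underscores, `batch` PKs at a time."""
    for i in range(0, len(pks), batch):
        chunk = ",".join(str(pk) for pk in pks[i:i + batch])
        yield (f"UPDATE {table} SET {col} = REPLACE({col}, ' ', '_') "
               f"WHERE {key} IN ({chunk})")
```

Driving the updates by primary key keeps each statement cheap even though the column itself is unindexed; only the one-off scan on the slave pays the full-table cost.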
[18:50:12] but sounds like it shouldnt be cached form what chad says
[18:50:33] oh wait
[18:50:38] he states he was logged in
[18:50:43] so he gets no cache, so this cannot be right
[18:50:46] hexmode: ^
[18:50:57] <^demon> And the svn data's not cached in memc or anything.
[18:51:33] browser cache?
[18:52:10] <^demon> Possibly. I think it's confusion most likely.
[18:53:27] ^demon: maplebed: RobH: Ask saper in #mediawiki
[18:53:40] hexmode: doesnt matter, its invalid
[18:53:43] he says he was logged in
[18:53:51] logged in users skip caching.
[18:54:34] * ^demon goes back to his work
[19:06:08] robh: any mgmt info for virt5?
[19:06:23] lemme make some right now
[19:08:11] cmjohnson1: +virt5 1H IN A 10.1.8.78
[19:08:23] thx
[19:08:28] !log dns update for virt5 mgmt
[19:08:30] Logged the message, RobH
[19:18:33] oooooooo
[19:18:37] it's in and racked!?
[19:21:04] New patchset: Bhartshorne; "changing the user for swift's rewrite stuff so I can use a new password" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2893
[19:21:34] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2893
[19:21:35] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2893
[19:23:47] ryan_lane: yes it is....do you want more than one nic connected?
[19:24:02] yes, please
[19:24:05] I need two connected
[19:24:19] \o/
[19:24:24] okay
[19:26:29] Ryan_Lane: i said that today would be your favorite day this week
[19:26:32] because of that server
[19:28:16] today is :)
[19:28:26] I'm very happy about this
[19:28:39] I'm also moving over to the new gerrit server today! :)
[19:45:59] maplebed: the cleaner scripts seems to look OK so far otherwise
[19:46:15] excellent.
[19:46:20] I already found one bug... ;)
[19:46:39] the try/except catching keyboard interrupts wouldn't have stopped execution when multiple files are passed in.
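The swiftcleaner bug described at 19:46 — a KeyboardInterrupt handler in the per-file worker, later moved up to main() — matters because a handler inside the loop body swallows Ctrl-C and simply proceeds to the next file. A minimal illustration of the fixed shape; `delegate` and `main` are stand-in names echoing the discussion, not the actual swiftcleaner code:

```python
def delegate(item, log):
    """Per-item worker; deliberately does NOT catch KeyboardInterrupt.

    If the except were here, Ctrl-C would abort only the current item
    and the loop in main() would continue with the next one.
    """
    log.append(item)

def main(items):
    """Catching the interrupt at the loop level stops the whole run."""
    processed = []
    try:
        for item in items:
            delegate(item, processed)
    except KeyboardInterrupt:
        pass  # stop cleanly; work completed so far is preserved
    return processed
```

The general rule: handle an interrupt at the level you actually want to abort, which for a batch tool is usually the outermost loop.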
[19:46:45] I moved it from delegate() to main()
[19:47:08] I appreciate you reading over it for me. Thanks!
[19:48:27] robh: do we want to keep nic redundancy active-active or none?
[19:48:43] where is this setting?
[19:48:44] or active-standby
[19:48:53] in the cimc configuration
[19:49:09] hrmm, i have to recall wtf i did in eqiad, lemme pull on eup
[19:49:47] if none the eth ports operate independently
[19:50:03] active-active used simultaneously
[19:50:26] active standby
[19:50:31] okay
[19:50:46] these have two mgmt connections is why if i recall correctly
[19:50:57] but we dont care about that, we only wire one on servers.
[19:51:04] only core routers get dual mgmt connections
[19:51:33] cmjohnson1: when you go to test, on these its not root, but admin
[19:51:41] and it wont let you change it, so better to just use admin
[19:51:52] just figured that out
[19:51:55] we could add a root, but that leaves it open to errors, since its an added account
[19:51:59] wondering how to do that
[19:52:01] best to just adapt to that
[19:52:08] nah, just leave admin, dont add root
[19:52:23] if we decide to do that in the future, we will do it in a mass update on all the cisco servers
[19:52:38] for now i prefer they all operate on the defaults like that
[19:52:45] ok..sounds good
[19:52:53] all set up...sending network ticket now
[19:53:02] Ryan_Lane: ^
[19:53:07] since you took juniper classes
[19:53:09] heh
[19:53:13] you may wanna snag that ticket if you dont wanna wait ;]
[19:53:18] * Ryan_Lane nods
[19:53:24] my classes were canceled, i now have them next month =P
[19:53:29] awww
[19:53:31] took online version this time
[19:56:50] you got to take one?
[19:56:58] oh. the ones next month will be online. got it
[19:57:26] yea, i dont wanna have another cancellation due to class size
[19:57:32] online courses are a lot less prone to that issue
[19:57:48] ok
[19:58:01] I am going to have to update my parallels windows copy
[19:58:07] i have not run it in months
[19:58:17] and even then its for one specific windows task, i never run updates =P
[19:58:29] the only thing i didn't care for in the online course was having to use windows and putty for everything.
[19:58:35] seemed like that caused more issues
[19:58:50] well, luckily i have my air running os x
[19:58:57] i have parallels copies of linux and windows
[19:59:07] and i plan to load my old laptop up with bootcamp before the course
[19:59:15] so will have a nice mix of options to do the course.
[19:59:34] that should help
[20:06:40] New patchset: Demon; "Changing permissions" [operations/software] (master) - https://gerrit.wikimedia.org/r/2894
[20:07:27] New review: Demon; "(no comment)" [operations/software] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2894
[20:07:27] Change merged: Demon; [operations/software] (master) - https://gerrit.wikimedia.org/r/2894
[20:32:58] New review: Reedy; "This change should also be made to the api appserver config too..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2578
[20:34:18] Do we have a way of adding a Vary: Accept-Encoding header to svg files?
[20:40:18] a user tells me new mailing lists can't be created? really?
[20:58:17] !log restarting nova-compute on virt4
[20:58:21] Logged the message, Master
[20:58:40] iptables restore was failing
[21:01:12] maplebed: are we using swauth for swift?
[21:01:23] yes
[21:07:41] maplebed: I looked at the swiftcleaner code earlier, it looked like it should work as advertised. I did not run the numbers
[21:07:55] apergos: awesome. thanks for the review.
[21:08:00] sure ething
[21:15:14] ryan_lane: I did not find out anything wrong with the test of labstore 1 but had Dell send me a new hdd anyway. rt 2441
[21:18:04] cmjohnson1: hrmmm
[21:18:17] tested ok in the backplane slot it was in?
[21:18:22] or tested ok removed from that slot?
[21:18:28] (the former is better than the latter in this case)
[21:18:46] i just wanna be sure it doesnt sound like backplane issue
[21:19:47] between the dell rep and myself we don't think it is the backplane at this time. the easiest thing for us to do is swap the drive. if that doesn't work than it may be something more serious like the backplane.
[21:21:11] * Ryan_Lane nods
[21:21:25] cmjohnson1: have you checked out the raidset since the drive switch?
[21:21:55] I have not switched the drive yet? it will arrive tomorrow
[21:22:24] oh ok
[21:22:56] yeah, are you still having issue with disk 2?
[21:23:18] yes
[21:25:32] ok...i will replace it and we'll go from there.
[21:35:53] Ryan_Lane: not sure he can really do that
[21:36:02] do what?
[21:36:03] all he can tell is the mgmt feedback, which is lacking compared to raid utils
[21:36:12] checking out raidsets
[21:36:17] the raid controller showed the drive as being bad
[21:36:24] in fact, it would go between missing and bad
[21:36:28] right but how is he going to read that?
[21:36:29] well, missing a rebuilding
[21:36:33] *and
[21:36:38] he can see if the disk is good or bad in the drac lom most of the time
[21:36:40] boot into the raid controller?
[21:36:46] but he cannot determine raid health outside of looking at nagios
[21:36:50] sure if its down
[21:36:55] sorry, didnt realize it was fully down
[21:36:57] it's not doing anything right now
[21:37:00] it isn't
[21:37:09] I can shut it down or reboot it
[21:37:15] this is a dell right?
[21:37:19] yeah
[21:37:23] just recalling what the raid bios shows
[21:37:36] * Ryan_Lane nods
[21:37:39] it shows if its healthy or not
[21:37:50] but i dont really recall if it shows rebuiliding, cuz i always use command line tools, heh
[21:38:07] so cmjohnson1 if it doesnt show that feel free to ping a root
[21:38:22] but would be nice to know if it does.
[21:38:48] if the server is up and online, then chris cannot confirm raidset, so was just making sure everuyone was on same page
[21:38:49] sorry ;]
[21:39:21] maplebed: meh, we should probably just make mw:mediawiki an admin
[21:40:32] * AaronSchulz looks at bug 34814
[21:41:35] ^demon: in 30 minutes I'm going to switch gerrit
[21:49:40] <^demon> Ok, I'm done with the stuff I'm doing. Fire away.
[21:50:19] can somebody review my proposed apache conf changes for RT #2488? on fenari diff /root/redirects.conf /home/w/conf/httpd/redirects.conf
[21:54:50] meanwhile . . . back in a few minutes, gonna make a quick snow shovelling pass before it gets too heavy
[22:01:21] ^demon: ok
[22:02:35] Gerrit is now down. I'm moving to the new server
[22:02:50] !log stopped gerrit service on formey, moving to manganese
[22:02:53] Logged the message, Master
[22:03:11] TimStarling: what's with the pipe stuff in populateImageSha1.php and how are old/archived files handled?
[22:04:27] that's old isn't it?
[22:05:07] one could almost just call upgradeRow() on everything (though that redoes MIME)
[22:06:50] crap. different ssh host key....
[22:06:59] mmm, r25134
[22:08:41] damn it, where does gerrit store its key?
[22:09:27] AaronSchulz: it allows the SHA-1 operation to be done in parallel with the DB write
[22:09:27] !log stopping gerrit on manganese
[22:09:30] Logged the message, Master
[22:09:43] !log replacing ssh_host_key on manganese for gerrit with the same one on formey
[22:09:46] Logged the message, Master
[22:10:08] in the original version without the pipe, it was spending about half of its time in each, so doing the pipe improved performance by a factor of 2 or so
[22:10:51] what about the other two file tables?
[22:11:46] ok. gerrit's back up
[22:12:03] !log gerrit moved to manganese.
[22:12:06] Logged the message, Master
[22:12:34] !log reversing gerrit replication from formey -> manganese to manganese -> formey
[22:12:37] Logged the message, Master
[22:13:25] the version in r25134 had an unused variable $oldimageTable, obviously I intended to initialise it too but didn't get around to it
[22:13:56] filearchive always had a sha1 field
[22:14:08] so it didn't need populating
[22:14:09] ah, yes
[22:18:00] !log installed python-paramiko on manganese. needs to be puppetized
[22:18:03] Logged the message, Master
[22:18:39] !log stopped ircecho on formey and started it on manganese
[22:18:42] Logged the message, Master
[22:18:51] pity about brion's comment from r54410
[22:19:13] Ryan_Lane: doesn't formey still ircecho svn commits?
[22:19:19] no
[22:19:22] I thought I explained the reason for it to him at the time, but the comment was written two years on
[22:19:33] Ryan_Lane: ok.
[22:19:36] that happens elsewhere
[22:21:23] TimStarling: maybe it can call getHistory() and hit all the old versions too. It also needs a '-force' param
[22:21:45] am I going to cause a conflict for you if I edit that comment?
[22:21:53] lol
[22:22:00] probably not
[22:22:51] gerrit-wm: hmm
[22:24:18] New review: Ryan Lane; "Missing lint check..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2899
[22:24:21] better
[22:28:10] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2899
[22:28:14] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2899
[22:30:29] !log restarting gerrit on manganese to enable replication
[22:30:32] Logged the message, Master
[22:32:08] hm. replication isn't working
[22:32:18] Failed replicate of refs/notes/review to gerrit2@formey.wikimedia.org:/var/lib/gerrit2/review_site/git/operations/puppet.git, reason: failed to lock
[22:32:32] maybe gerrit needs to be started on it....
[22:32:40] crap. short read of block.
[22:37:52] bah, I started this saying it was late and now here it is after midnight again and I am caught in the middle of network testing with a peer
[22:37:53] hate
[22:41:30] we're deploying to en.wp shortly, aren't we?
[22:41:38] I believe so
[22:46:19] looks like brion beat me to OKing tim's commit
[22:48:26] \o/ i win
[22:55:06] New patchset: Bhartshorne; "added delete to swiftcleaner" [operations/software] (master) - https://gerrit.wikimedia.org/r/2900
[22:59:54] AaronSchulz: dunno if you're up for it, but I added the delete and purge stuff to the swiftcleaner and I'd love another review.
[23:00:01] patch set 2900 up there.
[23:00:24] oh, you're probably busy with 1.19 deploy.
[23:00:25] nevermind.
[23:00:59] * AaronSchulz was looking
[23:01:15] sweet!
[23:01:34] it isn't showing up in git log though
[23:03:05] maybe because I didn't merge it yet?
[23:08:20] ;)
[23:08:36] I like how fast viewing logs is on git (hot the disk)
[23:10:02] I tested it with a list of 500 images, but I'm hesitant to test it with the full a2 container just when enwiki is getting deployed.
[23:10:08] I think I might go for a run then start it up.
[23:10:24] but the run on 500 images worked nicely.
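[Editor's note: Tim's explanation above — the pipe in populateImageSha1.php lets the SHA-1 hashing run in parallel with the DB writes, roughly doubling throughput. The actual script's mechanism is not shown in this log; the following is a hypothetical Python sketch of the same producer/consumer idea, with a stand-in `write_row` callback instead of a real DB update.]

```python
import hashlib
import queue
import threading

def pipelined_sha1(files, write_row):
    """Hash file contents in one thread while the main thread consumes
    the results and performs the (slow) row writes, so the two stages
    overlap instead of running strictly one after the other."""
    q = queue.Queue(maxsize=100)

    def hasher():
        for name, data in files:
            q.put((name, hashlib.sha1(data).hexdigest()))
        q.put(None)  # sentinel: no more work

    t = threading.Thread(target=hasher)
    t.start()
    while True:
        item = q.get()
        if item is None:
            break
        write_row(*item)  # e.g. UPDATE image SET img_sha1 = ... in the real script
    t.join()

# toy usage: collect rows in a list instead of writing to a DB
rows = []
pipelined_sha1([("a.png", b"hello"), ("b.png", b"world")],
               lambda name, sha1: rows.append((name, sha1)))
```

(In-process threads only approximate the benefit because of the GIL; two separate processes connected by a pipe, as the chat implies, overlap the CPU work fully.)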
[23:11:06] I ran it twice; the first time it registered a range of errors (filename and ms5 mostly), the second time every file that had errored out the first time errored with the 'does not exist on swift' check, indicating successful deletion!
[23:11:07] \o/
[23:11:56] TimStarling: I'd love it if you review it once the deploy is over, as I want to leave it running even after we clear out old chaff to catch new errors.
[23:12:19] I promise it won't make you curse as much as delete-stuff did with its byts and pixs.
[23:13:16] https://gerrit.wikimedia.org/r/gitweb?p=operations/software.git;a=blob;f=swiftcleaner/swiftcleaner;hb=HEAD
[23:13:28] Tim cursed?
[23:13:53] I could see the lasers coming out of his eyes all the way across the ocean.
[23:13:55] * AaronSchulz was under the impression that we just thought the swears in his head and didn't say anything
[23:14:03] *he just
[23:15:27] ok, I'll look at it
[23:15:33] thanks.
[23:16:13] New review: Bhartshorne; "(no comment)" [operations/software] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2900
[23:16:16] Change merged: Bhartshorne; [operations/software] (master) - https://gerrit.wikimedia.org/r/2900
[23:17:20] New patchset: Bhartshorne; "whitespace only change to test gerrit." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2901
[23:17:33] Ryan_Lane: ^^^
[23:17:40] ok. cool
[23:17:48] probably just my own locally screwed up repo
[23:17:56] would you like me to either abandon or merge it to test other bits?
[23:18:02] New patchset: Ryan Lane; "Ensure replica is readonly" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2902
[23:18:07] heh. there we go
[23:18:16] nah. I had already done so once
[23:18:19] I'll test again with this one
[23:18:27] ok, barring anything else I'll abandon the change.
[23:18:45] ok
[23:18:49] replication is failing to formey
[23:18:56] I probably need to rsync the data across
[23:19:04] Change abandoned: Bhartshorne; "just testing stuff." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2901
[23:19:19] I think it screwed up earlier because a file was locked
[23:19:28] and now it's broken :(
[23:19:38] oh well. easily solved
[23:19:53] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2902
[23:19:57] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2902
[23:20:46] Ryan_Lane, bits is running nginx right? can we configure it to do transparent gzip compression on .svg files? WebFonts SVG fonts specifically could benefit from this for now https://bugzilla.wikimedia.org/show_bug.cgi?id=34810
[23:21:01] bits is running varnish
[23:21:10] hmmmm
[23:21:23] backend is apache
[23:21:32] ah then apache we can def configure it that way
[23:21:43] and varnish should pass it through fine
[23:22:21] Reedy, that sound about right for that webfonts svg thing?
[23:22:50] That'd be fine, yup
[23:23:03] excellent, i'll add a note on it
[23:23:30] Google says something about adding Vary: Accept-Encoding too
[23:24:12] *nod*
[23:25:12] It's actually got some reasonable suggestions
[23:25:30] Though, moving images from wikipages to embedded CSS seems a little extreme ;)
[23:26:04] let's just embed everything into a single data uri ;)
[23:31:12] brion: we wouldn't need image squids then! :p
[23:32:08] hehe
[23:35:36] !log trying a run of swiftcleaner against the commons a2 shard on swift.
[23:35:40] Logged the message, Master
[23:55:36] New patchset: Ryan Lane; "No need for gid 550 on manganese, or wikidev" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2903
[23:57:06] woosters: I added the ability to delete stuff to the swiftcleaner script. Running with 20 threads it looks like it's doing about 120 checks per second, putting total running time for one shard estimated at about 20 minutes (I'll find out soon enough if that's accurate).
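[Editor's note: the gzip-for-SVG idea discussed above is easy to demonstrate — SVG is verbose, repetitive XML, so it compresses extremely well. A quick sketch (the sample markup is made up, not a real SVG font; Python's stdlib gzip stands in for Apache's mod_deflate):]

```python
import gzip

# A repetitive SVG-font-like snippet; real SVG fonts are much larger
# and at least as repetitive, so they compress at least this well.
svg = ('<svg xmlns="http://www.w3.org/2000/svg">'
       + '<glyph unicode="a" d="M0 0 L10 10 Z"/>' * 200
       + '</svg>').encode('utf-8')

compressed = gzip.compress(svg)
ratio = len(compressed) / len(svg)
print(f"{len(svg)} bytes -> {len(compressed)} bytes ({ratio:.1%})")
```

(The `Vary: Accept-Encoding` header Reedy mentions matters here because a cache such as varnish sits in front of apache: without it, a cache could serve the gzipped body to a client that never sent `Accept-Encoding: gzip`.)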
[23:57:24] It looks like that added between 10 and 15% to the io load on ms5
[23:57:36] so I think I can probably run it at double that and still have ms5 be happy.
[23:57:46] triple and ms5 would be working but probably still responsive.
[23:57:48] that is encouraging news indeed
[23:58:46] ..aaaand it's done. 17 minutes to do the a2 shard.
[23:59:07] at that rate, 72hrs for all of commons.
[23:59:12] rob la's team is deploying 1.19, which should not have an impact on that, but better let him know
[23:59:13] half that if I double the rate, etc.
[23:59:40] yeah, I know. I won't start the real thing until they're done.
[23:59:48] I figured a 5-20 minute test would be ok.
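[Editor's note: maplebed's estimate checks out if commons is split into 256 shards named by two hex characters like "a2" — the shard count is my assumption, not stated in the log:]

```python
# Sanity-check the back-of-the-envelope estimate above.
minutes_per_shard = 17        # measured time for the a2 shard
shards = 256                  # assumed: two-hex-char shards, 00..ff
total_hours = minutes_per_shard * shards / 60
print(f"{total_hours:.1f} hours")        # ~72.5, matching "72hrs for all of commons"
print(f"{total_hours / 2:.1f} at 2x")    # "half that if I double the rate"
```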