[00:00:04] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [00:00:04] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [00:07:43] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 00:07:35 UTC 2013 [00:07:53] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 00:07:44 UTC 2013 [00:08:04] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [00:08:04] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [00:08:13] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 00:08:02 UTC 2013 [00:08:23] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 00:08:14 UTC 2013 [00:09:03] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [00:09:04] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [00:09:13] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 00:09:03 UTC 2013 [00:09:13] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 00:09:10 UTC 2013 [00:10:09] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [00:10:09] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [00:11:01] !log DNS update - push reverse IPv6 for zirconium [00:11:11] Logged the message, Master [00:12:53] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 00:12:52 UTC 2013 [00:13:03] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [00:17:33] PROBLEM - MySQL Slave Delay on db78 is CRITICAL: CRIT replication delay 207 seconds [00:18:13] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 239 seconds [00:18:20] New patchset: Lcarr; "ms1-4 are decommissioned" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45692 [00:18:57] Change 
merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45692 [00:19:20] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 308 seconds [00:19:29] PROBLEM - MySQL Slave Delay on db78 is CRITICAL: CRIT replication delay 316 seconds [00:20:13] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 2 seconds [00:20:33] RECOVERY - MySQL Slave Delay on db78 is OK: OK replication delay 0 seconds [00:20:40] the icinga freshness checks flapping above are because the machines are both in decommissioning and still have classes on them, so they are unhappy - removed the classes from ms1 and ms2 [00:21:08] RECOVERY - MySQL Slave Delay on db78 is OK: OK replication delay 0 seconds [00:21:17] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 4 seconds [00:25:09] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45683 [00:26:45] New patchset: Dzahn; "add interface_add_ip6_mapped for zirconium" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45695 [00:27:29] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45664 [00:27:33] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45695 [00:27:41] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45666 [00:35:05] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [00:38:23] !log DNS update - set non-autoconfig IPv6 and reverse for zirconium [00:38:34] Logged the message, Master [00:46:26] New review: preilly; "nice!" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45688 [01:00:02] New patchset: Pyoungmeister; "mariadb pacakges. 
but only for those who really want them" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45698 [01:02:43] rfaulkner: can you fix sartoris.py:135:80: E501 line too long (95 characters) [01:03:07] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45698 [01:03:58] notpeter: you do know that we can hear you right? [01:04:21] what am I saying? [01:04:24] also, i don't care [01:04:27] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [01:06:30] preilly: http://riemann.io/ [01:07:05] !log planet is now an IPv6-enabled service [01:07:15] Logged the message, Master [01:08:00] binasher: we have mariadb packages again [01:08:21] mutante: \o/ [01:08:22] and only for boxes that actually want them. [01:12:17] mutante: nice work [01:13:27] thanks. now i just need the wildcard SSL cert [01:13:44] preilly: https://gerrit.wikimedia.org/r/45702 [01:14:57] notpeter: i saw you working on db1047.. and found an oooold ticket.. i guess it can be closed..it was "Cannot reassign variable read_only at /etc/puppet/manifests/mysql.pp:149 on node db1047.eqiad.wmnet" ...in 2011 though [01:15:27] !log moved s[2-7] analytics slave cnames to pmtpa dbs [01:15:39] Logged the message, Master [01:15:57] notpeter: linked it to the upgrade ticket [01:15:59] mutante: link? 
[01:16:15] https://rt.wikimedia.org/Ticket/Display.html?id=1661 [01:16:43] mutante: I'm assuming that that was a "puppet is broken" ticket [01:16:47] it is no longer broken [01:16:52] thus, can be closed [01:17:23] thanks, done:) [01:17:56] rfaulkner: New patchset: preilly; "Conform to PEP 8" [sartoris] (master) - https://gerrit.wikimedia.org/r/45703 [01:19:39] Ryan_Lane: you need to rebase https://gerrit.wikimedia.org/r/#/c/45701/ [01:19:48] preilly: cool, pulled the latest [01:20:20] rfaulkner: thanks [01:20:44] binasher: https://gerrit.wikimedia.org/r/#/q/owner:+%2522preilly%2522,n,z [01:20:55] preilly: rebased [01:24:20] Ryan_Lane: https://gerrit.wikimedia.org/r/#/c/45705/1/sartoris/sartoris.py [01:27:32] preilly: small fix, https://gerrit.wikimedia.org/r/45706 [01:27:52] New patchset: Asher; "fix logging and clarified reporting per-shard" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45707 [01:36:16] dschoon: https://gerrit.wikimedia.org/r/gitweb?p=sartoris.git;a=tree; [01:36:36] dschoon: or https://github.com/wikimedia/sartoris [01:42:33] New patchset: Asher; "cleanup for greater pep8 compliance" [operations/software] (master) - https://gerrit.wikimedia.org/r/45710 [01:44:43] hello guys [01:44:47] I'm from WMF analytics [01:44:50] I need access to analytics1001.wikimedia.org [01:45:20] Change merged: Asher; [operations/software] (master) - https://gerrit.wikimedia.org/r/45710 [01:45:27] average_drifter: you need to talk to andrew :) [01:45:35] he's out for the day, though [01:45:47] if there's something you need, i can probably help [01:47:52] dschoon: just access to an01 [01:58:48] Ryan_Lane: I was talking about pathogen [01:59:09] ah. 
I'll have to try that [02:00:07] Ryan_Lane: set foldmethod=indent [02:00:07] set foldlevel=99 [02:01:15] Ryan_Lane: take a look at: http://sontek.net/blog/detail/turning-vim-into-a-modern-python-ide [02:07:01] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [02:11:16] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [02:11:45] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [02:12:09] dschoon: I just made a simple file: https://github.com/wikimedia/sartoris/blob/master/setup.py [02:12:14] dschoon: you can totally fix it [02:12:24] yeep. [02:13:45] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [02:14:34] dschoon: that's just a hack for Python 3 [02:14:44] isn't it always false? [02:14:48] if False and ...: [02:15:00] dschoon: try it and you'll see [02:15:12] mk. [02:21:10] AaronSchulz, dschoon: http://etherpad.wmflabs.org/pad/p/weird-python [02:27:21] !log LocalisationUpdate completed (1.21wmf8) at Fri Jan 25 02:27:20 UTC 2013 [02:27:33] Logged the message, Master [02:28:15] Ryan_Lane: huh, yeah. you were right. i had it backward. https://gist.github.com/4631211 [02:28:21] AaronSchulz ^^ [02:28:54] weird construct, right? :) [02:29:08] the process killing is the only use case I've found so far [02:30:25] Hah, you can do while...else [02:30:43] ? [02:33:20] RoanKattouw: yeah. it's interesting, right? [02:47:23] preilly: https://gerrit.wikimedia.org/r/45717 [02:47:32] dschoon: http://pastebin.mozilla.org/2084375 [02:47:45] yeah, i didn't run it through pep8. [02:48:51] dschoon: can you fix it? 
[02:49:09] yeah; i'll make a note to get to it [02:49:22] dschoon: I'll just fix it [02:49:34] sorry, spent enough time on it already :) [02:53:01] !log LocalisationUpdate completed (1.21wmf7) at Fri Jan 25 02:53:00 UTC 2013 [02:53:11] Logged the message, Master [03:20:02] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [03:20:32] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [04:07:42] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 04:07:37 UTC 2013 [04:07:52] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 04:07:49 UTC 2013 [04:07:53] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [04:08:03] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [04:08:12] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 04:08:06 UTC 2013 [04:08:13] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 04:08:07 UTC 2013 [04:08:52] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [04:09:02] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 04:08:52 UTC 2013 [04:09:03] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [04:09:13] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 04:09:08 UTC 2013 [04:09:52] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [04:10:02] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [04:17:32] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 04:17:22 UTC 2013 [04:17:53] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [04:21:43] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 04:21:40 UTC 2013 [04:22:02] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the 
last 10 hours [04:55:48] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours [05:10:51] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [05:22:30] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [05:22:31] PROBLEM - Puppet freshness on msfe1002 is CRITICAL: Puppet has not run in the last 10 hours [05:22:31] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [05:22:31] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [05:22:31] PROBLEM - Puppet freshness on ms-be1008 is CRITICAL: Puppet has not run in the last 10 hours [05:22:31] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [05:22:31] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [05:22:32] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [06:19:23] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [06:26:23] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 06:26:22 UTC 2013 [06:26:44] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 06:26:32 UTC 2013 [06:26:54] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [06:27:03] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 06:26:56 UTC 2013 [06:27:23] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [06:27:54] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [06:35:43] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 06:35:34 UTC 2013 [06:35:53] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [06:36:38] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the 
last 10 hours [06:45:43] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 06:45:35 UTC 2013 [06:46:24] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [08:02:22] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [08:02:41] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [08:07:42] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 08:07:37 UTC 2013 [08:07:52] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 08:07:42 UTC 2013 [08:08:21] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [08:08:41] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [08:10:13] PROBLEM - Puppet freshness on db1031 is CRITICAL: Puppet has not run in the last 10 hours [08:12:19] PROBLEM - Puppet freshness on db1037 is CRITICAL: Puppet has not run in the last 10 hours [08:15:10] PROBLEM - Puppet freshness on db1012 is CRITICAL: Puppet has not run in the last 10 hours [08:17:16] PROBLEM - Puppet freshness on db1015 is CRITICAL: Puppet has not run in the last 10 hours [08:17:17] PROBLEM - Puppet freshness on db1014 is CRITICAL: Puppet has not run in the last 10 hours [08:18:10] PROBLEM - Puppet freshness on db1023 is CRITICAL: Puppet has not run in the last 10 hours [08:19:13] PROBLEM - Puppet freshness on db1030 is CRITICAL: Puppet has not run in the last 10 hours [08:19:22] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 08:19:16 UTC 2013 [08:19:41] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [08:29:48] PROBLEM - Puppet freshness on db1029 is CRITICAL: Puppet has not run in the last 10 hours [08:32:48] PROBLEM - Puppet freshness on db1044 is CRITICAL: Puppet has not run in the last 10 hours [08:32:48] PROBLEM - Puppet freshness on db1045 is CRITICAL: Puppet has not run in the last 10 hours 
[08:35:48] PROBLEM - Puppet freshness on db1016 is CRITICAL: Puppet has not run in the last 10 hours [10:02:44] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [10:02:44] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [10:08:13] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [10:37:33] PROBLEM - Memcached on virt0 is CRITICAL: Connection refused [10:39:07] PROBLEM - Memcached on virt0 is CRITICAL: Connection refused [10:58:28] RECOVERY - Memcached on virt0 is OK: TCP OK - 0.032 second response time on port 11000 [10:58:33] RECOVERY - Memcached on virt0 is OK: TCP OK - 0.027 second response time on port 11000 [11:45:37] Does being "on duty" mean that LeslieCarr is online? Anyone else from ops? [11:48:00] New patchset: Hashar; "move PHP linter under `wikimedia` module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/29937 [11:48:00] New patchset: Hashar; "refactor continuous integration manifests" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43429 [11:48:01] New patchset: Hashar; "wikimedia module placeholder" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43420 [11:48:20] (I don't get an "away" indicator on this client) [11:48:41] In any case, what should I tell the people at https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Users_reporting_site_time_issues_and_delay_in_visible_update_of_edits [12:06:07] New review: Hashar; "rebased, fixed a few linting issues" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/29937 [12:07:50] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 12:07:43 UTC 2013 [12:07:50] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 12:07:46 UTC 2013 [12:08:08] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [12:08:39] PROBLEM - Puppet 
freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [12:08:40] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [12:08:50] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 12:08:47 UTC 2013 [12:08:57] New patchset: Hashar; "move PHP linter under `wikimedia` module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/29937 [12:08:58] New patchset: Hashar; "refactor continuous integration manifests" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43429 [12:08:58] New patchset: Hashar; "wikimedia module placeholder" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43420 [12:08:59] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 12:08:49 UTC 2013 [12:09:39] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [12:09:40] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [12:09:43] New review: Hashar; "PS4 re removes the init.pp placeholder" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/43420 [12:10:32] New review: Hashar; "rebased on latest production branch" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/43429 [12:14:30] PROBLEM - Puppet freshness on stat1001 is CRITICAL: Puppet has not run in the last 10 hours [13:05:22] New review: Faidon; "The check is crap, but the change itself is sane." 
[operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/43843 [13:05:23] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43843 [14:56:37] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours [15:12:46] PROBLEM - Host ms-be1008 is DOWN: PING CRITICAL - Packet loss = 100% [15:15:11] PROBLEM - Host ms-be1008 is DOWN: PING CRITICAL - Packet loss = 100% [15:20:28] paravoid: ms-be1008 raid setup....going to installer now [15:21:02] RECOVERY - Host ms-be1008 is UP: PING WARNING - Packet loss = 44%, RTA = 26.69 ms [15:23:16] RECOVERY - Host ms-be1008 is UP: PING OK - Packet loss = 0%, RTA = 0.46 ms [15:23:26] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [15:23:27] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [15:23:27] PROBLEM - Puppet freshness on msfe1002 is CRITICAL: Puppet has not run in the last 10 hours [15:23:27] PROBLEM - Puppet freshness on ms-be1008 is CRITICAL: Puppet has not run in the last 10 hours [15:23:27] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [15:23:27] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [15:23:27] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [15:23:28] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [15:30:56] PROBLEM - Packetloss_Average on oxygen is CRITICAL: CRITICAL: packet_loss_average is 8.22690244094 (gt 8.0) [15:50:29] cmjohnson1: thanks [15:50:49] yw...do you want to check it b4 puppet run? [15:53:13] nah [15:54:42] k [15:58:03] New patchset: Ottomata; "filters.oxygen.erb - disabling x-cs filter until we get the ok to disable the other IP based filters." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/45769 [15:58:17] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45769 [16:04:32] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [16:04:41] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [16:07:43] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 16:07:39 UTC 2013 [16:07:52] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 16:07:42 UTC 2013 [16:08:31] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [16:08:32] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 16:08:29 UTC 2013 [16:08:41] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [16:08:44] RECOVERY - Packetloss_Average on oxygen is OK: OK: packet_loss_average is 0.0 [16:08:51] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 16:08:50 UTC 2013 [16:09:31] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [16:09:42] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [16:12:21] paravoid [16:12:24] Unable to install GRUB in /dev/sda [16:12:24] Executing 'grub-install /dev/sda' failed [16:12:31] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Fri Jan 25 16:12:25 UTC 2013 [16:12:42] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [16:14:42] is the raid configured ok? 
[16:15:14] it is but I think i may know what it is...the new board may have the raid controller disabled [16:16:22] ok [16:21:31] PROBLEM - Host ms-be1008 is DOWN: PING CRITICAL - Packet loss = 100% [16:27:38] PROBLEM - Host ms-be1008 is DOWN: PING CRITICAL - Packet loss = 100% [16:31:02] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Fri Jan 25 16:30:56 UTC 2013 [16:31:32] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [16:32:02] RECOVERY - Host ms-be1008 is UP: PING OK - Packet loss = 0%, RTA = 0.38 ms [16:33:30] RECOVERY - Host ms-be1008 is UP: PING OK - Packet loss = 0%, RTA = 26.51 ms [16:37:23] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [16:49:10] cmjohnson1: works now? [16:51:09] <^demon> !log formey: mediawiki svn repository is now read-only. long live git. [16:51:21] Logged the message, Master [16:52:29] paravoid: still going [16:53:44] ^demon: nice job :) [16:54:54] <^demon> https://gerrit.wikimedia.org/r/#/c/45564/ can now be merged. [17:04:03] hrm paravoid there is something with either naggen or puppet that's making ms1 and ms2 still have the checks generated in the icinga files and those annoying puppet freshness checks -- have any ideas of where i should start looking ? [17:05:21] uhm [17:05:31] they both just contacted the puppetmaster [17:05:42] so their exports were restored [17:05:49] decom will run at some point again and remove them [17:05:56] and then they'll run again and add themselves again [17:06:03] until the end of time :-) [17:06:12] kill puppet on the boxes or the boxes themselves [17:08:26] LeslieCarr: ^ [17:08:42] hehe ok [17:11:02] thanks [17:17:04] New review: Demon; "It's tomorrow :)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/45564 [17:18:24] paravoid: same problem...not installing /sda [17:19:37] I'll have a look [17:19:57] Is anyone physically adjacent to Ryan? [17:20:14] Coren: no. 
I'm still at home [17:20:29] we're talking in 40 minutes, right? [17:20:40] i am out of the console (paravoid) [17:21:10] Ryan_Lane: Err, that's okay with me but it was actually 20 minutes ago. Long live timezone confusion! :-) [17:21:33] Coren: my calendar says 10 PST ;) [17:21:54] Gah! Stupid irc client. ^W is delete word, not close window! [17:22:13] Coren: my calendar says 10 PST ;) [17:22:40] Ryan_Lane: Ah, Ion told me 17:00 UTC. Either works for me anyways. :-) [17:22:59] Ryan_Lane: Talk to you laters, then. ~wave~ [17:23:10] cool. yep [17:30:27] New patchset: Reedy; "Disble QueryPage updates on frwiki like enwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/45783 [17:31:18] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/45783 [17:31:50] !log reedy synchronized wmf-config/InitialiseSettings.php [17:32:01] Logged the message, Master [17:48:20] PROBLEM - Puppet freshness on ms-be1011 is CRITICAL: Puppet has not run in the last 10 hours [18:07:57] New patchset: Reedy; "Add small, medium and large dblists to wikitags" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/45785 [18:08:34] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/45785 [18:08:56] PROBLEM - Puppet freshness on ms-be1011 is CRITICAL: Puppet has not run in the last 10 hours [18:09:50] !log reedy synchronized wmf-config/CommonSettings.php [18:10:01] Logged the message, Master [18:11:17] PROBLEM - Puppet freshness on db1031 is CRITICAL: Puppet has not run in the last 10 hours [18:13:14] PROBLEM - Puppet freshness on db1037 is CRITICAL: Puppet has not run in the last 10 hours [18:15:37] Anybody aware of en.wiki (elsewhere?) main page rendering issues? 
https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Users_reporting_site_time_issues_and_delay_in_visible_update_of_edits [18:15:45] quite a few emails coming to otrs too [18:16:14] PROBLEM - Puppet freshness on db1012 is CRITICAL: Puppet has not run in the last 10 hours [18:16:28] It would sound like it's stuff not being purged from squid in esams [18:18:21] PROBLEM - Puppet freshness on db1015 is CRITICAL: Puppet has not run in the last 10 hours [18:18:21] PROBLEM - Puppet freshness on db1014 is CRITICAL: Puppet has not run in the last 10 hours [18:19:14] PROBLEM - Puppet freshness on db1023 is CRITICAL: Puppet has not run in the last 10 hours [18:20:17] PROBLEM - Puppet freshness on db1030 is CRITICAL: Puppet has not run in the last 10 hours [18:21:26] New review: MZMcBride; "This is related to some bug, isn't it?" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/45785 [18:21:56] RECOVERY - swift-account-reaper on ms-be3 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [18:22:20] New review: Reedy; "https://bugzilla.wikimedia.org/show_bug.cgi?id=43741" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/45785 [18:22:57] New review: MZMcBride; "Right-o. Thanks for working on this!" 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/45785 [18:24:56] PROBLEM - swift-account-reaper on ms-be3 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [18:30:34] New patchset: Reedy; "Bug 44349 - updateSpecialPages is run twice on small wikis" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45786 [18:31:14] PROBLEM - Puppet freshness on db1029 is CRITICAL: Puppet has not run in the last 10 hours [18:32:22] New patchset: Reedy; "(bug 29692) Per-wiki namespace aliases shouldn't override (remove) global ones" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25737 [18:34:14] PROBLEM - Puppet freshness on db1045 is CRITICAL: Puppet has not run in the last 10 hours [18:34:14] PROBLEM - Puppet freshness on db1044 is CRITICAL: Puppet has not run in the last 10 hours [18:35:56] Importing 2013 Barack Obama Inauguration Ceremony (full).ogv... No comment file with extension txt found for /tmp/odder/2013 Barack Obama Inauguration Ceremony (full).ogv, using default comment. failed. (* Could not write file "mwstore://local-swift/local-public/a/a6/2013_Barack_Obama_Inauguration_Ceremony_(full).ogv" because it is larger than {{PLURAL:4294967296|one byte|4294967296 bytes}}. [18:35:59] Saywhat? [18:36:01] AaronSchulz: ^ [18:37:14] PROBLEM - Puppet freshness on db1016 is CRITICAL: Puppet has not run in the last 10 hours [18:38:16] reedy@fenari:/home/wikipedia/common$ du --si /tmp/odder/2013\ Barack\ Obama\ Inauguration\ Ceremony\ \(full\).ogv [18:38:16] 5.1G /tmp/odder/2013 Barack Obama Inauguration Ceremony (full).ogv [18:38:37] O_o [18:39:15] New review: Demon; "One minor nitpick, functionally ok." 
[operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/45786 [18:40:32] New patchset: Reedy; "Bug 44349 - updateSpecialPages is run twice on small wikis" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45786 [18:43:54] New review: Reedy; "Is this still to go out?" [operations/mediawiki-config] (master); V: 0 C: -1; - https://gerrit.wikimedia.org/r/43029 [18:44:38] New patchset: Reedy; "Prevent MediaWiki maintenance scripts from running as privileged users" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44200 [18:45:08] New patchset: Reedy; "(bug 44054) Enable Translate and TranslateNotifications on br.wikimedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44516 [18:45:35] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44516 [18:46:30] New patchset: RobH; "barium & colby to become new dns servers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45788 [18:47:40] !log Created translate tables on brwikimedia [18:47:50] Logged the message, Master [18:49:34] !log reedy synchronized wmf-config/InitialiseSettings.php [18:49:45] Logged the message, Master [18:49:55] robh: look for a ticket shortly for a cable order for eqiad..need 5' green and blue [18:50:14] np [18:50:17] Change abandoned: Reedy; "(no reason)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/45150 [18:50:20] Change abandoned: Reedy; "(no reason)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/45149 [18:51:16] Nemo_bis: Want to try https://gerrit.wikimedia.org/r/#/c/25737/ again? [18:51:35] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/44856 [18:52:11] Reedy: yes! 
[18:52:18] I'll check if pt.wiki and friends break [18:52:36] tell me when you're going to sync, gotta prepare dinner soon [18:52:37] New review: RobH; "-stealth self review-" [operations/puppet] (production); V: 2 C: 2; - https://gerrit.wikimedia.org/r/45788 [18:52:38] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/45788 [18:53:09] I can do it now if you want? [18:53:15] ok [18:53:28] New patchset: Reedy; "(bug 29692) Per-wiki namespace aliases shouldn't override (remove) global ones" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25737 [18:53:33] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25737 [18:54:09] !log reedy synchronized wmf-config/InitialiseSettings.php [18:54:19] Logged the message, Master [18:54:36] as of now, it still works [18:54:39] And the survey says [18:54:45] Good [18:54:55] how long does it take to sync? [18:54:59] It's done [18:55:07] [18:54:09] !log reedy synchronized wmf-config/InitialiseSettings.php [18:55:12] ^ That means it has finished [18:55:18] ah right :) [18:55:31] See if anyone comes in later and complains ;) [19:14:44] * Coren complains. [19:19:33] coren - about what? [19:19:41] [13:55:31] See if anyone comes in later and complains ;) [19:19:44] :-) [19:20:27] :-) [19:31:47] so otrs is getting flooded with emails asking why the main page is out of date…is this something that we can just say was due to the datacenter migration (was it?) and will hopefully be fixed soon? 
[19:32:42] Ops are aware and are/were looking at it [19:32:59] It would seem that the esams squids aren't receiving the purge requests from eqiad [19:33:10] But yes, it's a leftover from the datacentre migration [19:33:21] alright, i'll just say it'll get fixed soon ;) [19:33:50] thanks [19:33:50] if it's for sure related, could link the blog post about that duh [19:33:53] shrug [19:34:00] yeah was planning on it [19:34:25] i'm presuming it's a "network issue" with how the purge packets are sent out [19:34:31] and from eqiad they don't make it to pmtpa [19:34:35] err, esams [19:34:47] one of them metal box thingies had an issue [19:34:54] that's what you reply with, duh [19:34:57] :P [19:36:03] PROBLEM - Host mw1001 is DOWN: PING CRITICAL - Packet loss = 100% [19:37:38] PROBLEM - Host mw1001 is DOWN: PING CRITICAL - Packet loss = 100% [19:39:56] PROBLEM - Puppet freshness on ms-be1012 is CRITICAL: Puppet has not run in the last 10 hours [19:40:58] we had such complaints on it.quote too [19:41:10] they said 16 January even [19:41:18] RECOVERY - Host mw1001 is UP: PING OK - Packet loss = 0%, RTA = 0.29 ms [19:41:26] RECOVERY - Host mw1001 is UP: PING OK - Packet loss = 0%, RTA = 26.70 ms [19:43:19] ah, pretty, en.wiki has a sitenotice even about broken cache [19:43:49] the person who created that forgot to close the tag and messed up a bunch of pages :P [19:44:09] https://en.wikipedia.org/w/index.php?title=MediaWiki%3ASitenotice&diff=534860227&oldid=534859722 [19:44:19] well forgot to close the