[00:04:39] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[00:04:59] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[00:06:59] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 00:06:52 UTC 2014
[00:07:19] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 00:07:16 UTC 2014
[00:07:40] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 12:06:52 AM UTC
[00:07:59] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 12:07:16 AM UTC
[00:28:39] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[00:29:29] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.149 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[01:04:56] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[01:04:57] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[01:06:46] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 01:06:40 UTC 2014
[01:06:56] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:06:40 AM UTC
[01:07:06] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 01:07:00 UTC 2014
[01:07:56] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:07:00 AM UTC
[02:03:47] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[02:03:56] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[02:06:56] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 02:06:53 UTC 2014
[02:07:17] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 02:07:09 UTC 2014
[02:07:47] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:07:09 AM UTC
[02:07:56] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:06:53 AM UTC
[02:26:42] !log LocalisationUpdate completed (1.23wmf8) at Thu Jan 2 02:26:41 UTC 2014
[02:29:36] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[02:30:26] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.149 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[02:33:36] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[02:34:34] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.149 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[02:37:34] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[02:38:24] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.149 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[02:38:41] !log LocalisationUpdate completed (1.23wmf7) at Thu Jan 2 02:38:41 UTC 2014
[03:02:15] !log LocalisationUpdate ResourceLoader cache refresh completed at Thu Jan 2 03:02:15 UTC 2014
[03:21:34] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[03:23:24] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.149 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[03:46:34] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[03:47:24] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.152 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[03:48:48] (03PS1) 10Rschen7754: add templateeditor right for testwiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104912
[03:50:34] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[03:51:24] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.148 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[03:51:56] (03PS2) 10Rschen7754: add templateeditor right for testwiki: bug: 59084 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104912
[04:03:21] (03PS3) 10Rschen7754: add templateeditor right for testwiki: bug: 59084 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104912
[04:04:00] (03PS4) 10Legoktm: Add templateeditor right, group, and restriction [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104912 (owner: 10Rschen7754)
[04:04:47] (03PS5) 10John F. Lewis: Add templateeditor right, group, and restriction [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104912 (owner: 10Rschen7754)
[04:05:12] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[04:05:12] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[04:05:16] (03CR) 10John F. Lewis: [C: 031] "Configuration seems fine. Commit message is fine too now." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104912 (owner: 10Rschen7754)
[04:06:42] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 04:06:36 UTC 2014
[04:07:02] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 04:06:56 UTC 2014
[04:07:12] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 04:06:56 AM UTC
[04:07:12] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 04:06:36 AM UTC
[04:53:32] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[04:55:22] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.150 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[05:04:36] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[05:04:46] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[05:07:07] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 05:07:01 UTC 2014
[05:07:16] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 05:07:11 UTC 2014
[05:07:36] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 05:07:01 AM UTC
[05:07:46] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 05:07:11 AM UTC
[05:21:33] (03CR) 10MZMcBride: "I don't think there's any need for this. The bug is still awaiting a rationale. The commit message attempts to provide one, but it's prett" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104912 (owner: 10Rschen7754)
[05:48:29] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[05:49:19] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.149 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[06:05:38] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[06:05:48] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[06:07:28] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 06:07:23 UTC 2014
[06:07:28] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 06:07:23 UTC 2014
[06:07:38] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 06:07:23 AM UTC
[06:07:48] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 06:07:23 AM UTC
[07:04:35] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[07:04:45] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[07:06:55] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 07:06:45 UTC 2014
[07:06:56] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 07:06:45 UTC 2014
[07:07:35] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 07:06:45 AM UTC
[07:07:45] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 07:06:45 AM UTC
[08:04:52] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[08:04:52] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[08:06:52] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 08:06:44 UTC 2014
[08:07:03] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 08:06:59 UTC 2014
[08:07:52] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 08:06:44 AM UTC
[08:07:52] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 08:06:59 AM UTC
[08:36:24] !log gallium / jenkins upgrading Jenkins from 1.509.4 to 1.532.1
[08:36:55] ...
[08:46:45] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[08:47:35] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 5.435 seconds response time. nagiostest.beta.wmflabs.org returns
[09:05:04] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[09:05:14] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[09:07:24] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 09:07:16 UTC 2014
[09:07:34] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 09:07:31 UTC 2014
[09:07:57] !log jenkins restarted
[09:08:04] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 09:07:31 AM UTC
[09:08:14] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 09:07:16 AM UTC
[09:08:33] (03PS1) 10Hashar: stages.pp puppet lint fixes [operations/puppet] - 10https://gerrit.wikimedia.org/r/104919
[09:16:44] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[09:17:34] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.149 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[09:22:06] (03PS1) 10Hashar: applicationserver: pass puppetlint / retab [operations/puppet] - 10https://gerrit.wikimedia.org/r/104920
[09:31:36] (03PS1) 10Hashar: ganglia_new: fix puppet-lint issues [operations/puppet] - 10https://gerrit.wikimedia.org/r/104921
[09:53:04] (03CR) 10Alexandros Kosiaris: [C: 032 V: 032] Redirect kr.wikimedia [operations/apache-config] - 10https://gerrit.wikimedia.org/r/101220 (owner: 10John F. Lewis)
[10:03:08] (03CR) 10Alexandros Kosiaris: [C: 032] 2 new education redirects [operations/apache-config] - 10https://gerrit.wikimedia.org/r/102753 (owner: 10Jeremyb)
[10:04:42] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[10:04:42] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[10:07:02] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 10:06:54 UTC 2014
[10:07:14] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 10:07:09 UTC 2014
[10:07:42] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:06:54 AM UTC
[10:07:42] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:07:09 AM UTC
[10:08:22] (03PS1) 10Hashar: beta: allow ssh from gallium on parsoid instance [operations/puppet] - 10https://gerrit.wikimedia.org/r/104924
[10:08:46] akosiaris: good morning :) whenever you are done with apache config could you merge in https://gerrit.wikimedia.org/r/104924 please ? :-D that is for beta / ci
[10:15:10] (03CR) 10Alexandros Kosiaris: [C: 032] beta: allow ssh from gallium on parsoid instance [operations/puppet] - 10https://gerrit.wikimedia.org/r/104924 (owner: 10Hashar)
[10:16:06] hashar: Good morning to you too. And happy new year!! :)
[10:16:54] akosiaris1: thanks :-D and yeah happy 0x7DE
[10:18:37] :-)
[10:19:39] akosiaris1: do you have any clue how I could get a debian package uploaded on debian repo ?
[10:19:59] I have updated the python-statsd package by bumping the changelog entry but no clue whom to ask :D
[10:21:14] the official one ? you are supposed to wait for an uploader.... and I 've been advised not to push things on that front.
[10:21:45] aka "patience is a virtue"
[10:23:29] akosiaris1: well I haven't even uploaded the file :D
[10:23:47] maybe I should poke the debian python folks
[10:26:14] your best best I 'd say
[10:46:53] (03PS1) 10Alexandros Kosiaris: Install python{,3}-dev packages on stat1 [operations/puppet] - 10https://gerrit.wikimedia.org/r/104929
[10:47:57] (03CR) 10jenkins-bot: [V: 04-1] Install python{,3}-dev packages on stat1 [operations/puppet] - 10https://gerrit.wikimedia.org/r/104929 (owner: 10Alexandros Kosiaris)
[10:51:52] (03PS2) 10Alexandros Kosiaris: Install python{,3}-dev packages on stat1 [operations/puppet] - 10https://gerrit.wikimedia.org/r/104929
[10:54:42] (03CR) 10Alexandros Kosiaris: [C: 032] Install python{,3}-dev packages on stat1 [operations/puppet] - 10https://gerrit.wikimedia.org/r/104929 (owner: 10Alexandros Kosiaris)
[11:04:32] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[11:04:32] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[11:07:12] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 11:07:08 UTC 2014
[11:07:32] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 11:07:08 AM UTC
[11:07:32] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 11:07:29 UTC 2014
[11:08:32] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 11:07:29 AM UTC
[11:08:45] (03PS1) 10Hashar: beta: parsoid switch to jenkins deployed parsoid [operations/puppet] - 10https://gerrit.wikimedia.org/r/104932
[11:09:25] akosiaris1: and another lame beta/parsoid change https://gerrit.wikimedia.org/r/#/c/104932/ update the parsoid upstart configuration to have it load the NPM module from the proper place
[11:09:34] akosiaris1: no impact on production since prod does not use upstart yet :-]
[11:09:50] (03PS2) 10Hashar: beta: parsoid switch to jenkins deployed parsoid [operations/puppet] - 10https://gerrit.wikimedia.org/r/104932
[11:10:00] amended summary to reflect it has no impact on prod
[11:10:49] don't forget we hope to rip off as much of that for pro as possible though ;-)
[11:10:53] *prod
[11:11:52] yeah got to update that later on with gwicke
[11:11:59] they want to use upstart as well
[11:12:37] that will be very nice indeed
[11:14:37] mind merging that meanwhile ? :-]
[11:14:39] (03CR) 10Alexandros Kosiaris: [C: 04-1] "There is a typo. And a nitpick. Otherwise LGTM" (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104932 (owner: 10Hashar)
[11:14:43] ah
[11:15:06] upstart is nice ?
[11:16:19] upstart is ok ;)
[11:16:26] (03CR) 10Hashar: beta: parsoid switch to jenkins deployed parsoid (033 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104932 (owner: 10Hashar)
[11:16:32] I like upstart
[11:16:41] (03PS3) 10Hashar: beta: parsoid switch to jenkins deployed parsoid [operations/puppet] - 10https://gerrit.wikimedia.org/r/104932
[11:16:56] Ubuntu upstart cookbook is worth a read http://upstart.ubuntu.com/cookbook/
[11:17:00] I dislike upstart :P. even systemd seems better
[11:17:06] no systemd is not better
[11:17:24] oh please, lets not bring the Debian "systemd vs upstart" drama here :-D
[11:17:32] why not?
[11:17:33] done brought it
[11:17:39] that is not a debian drama
[11:18:11] I have never used systemd myself nor do I have any clue what are the difference between the two
[11:18:39] well systemd is too intrusive for an init system
[11:18:50] it wants DBUS for example ...
[11:19:10] it changes early boot logging as well
[11:20:26] addressed issue on https://gerrit.wikimedia.org/r/104932 meanwhile
[11:20:58] (03CR) 10Alexandros Kosiaris: [C: 032] beta: parsoid switch to jenkins deployed parsoid [operations/puppet] - 10https://gerrit.wikimedia.org/r/104932 (owner: 10Hashar)
[11:21:32] Ubuntu is probably going to stick with upstart anyway so we are stuck with it as well
[11:21:43] though if we switch to Debian ...
[11:21:51] ahahaha
[11:23:54] hashar: merged
[11:27:18] thanks!
[11:29:42] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[11:30:33] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.578 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[11:35:11] (03PS1) 10Mark Bergsma: Update A/AAAA records of SLD project domain records [operations/dns] - 10https://gerrit.wikimedia.org/r/104937
[11:35:12] (03PS1) 10Hashar: beta: parsoid localsettings.js [operations/puppet] - 10https://gerrit.wikimedia.org/r/104938
[11:37:58] (03CR) 10Mark Bergsma: [C: 032] Update A/AAAA records of SLD project domain records [operations/dns] - 10https://gerrit.wikimedia.org/r/104937 (owner: 10Mark Bergsma)
[11:38:10] akosiaris1: and a last one to publish the parsoid configuration on beta https://gerrit.wikimedia.org/r/#/c/104938/ :d
[11:40:34] I am out to lunch be back in a few
[11:40:35] :-)
[11:46:02] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[11:55:13] (03PS1) 10Mark Bergsma: Swap new and old bits-lb LVS service IPs for esams [operations/puppet] - 10https://gerrit.wikimedia.org/r/104940
[11:56:50] (03CR) 10Mark Bergsma: [C: 032] Swap new and old bits-lb LVS service IPs for esams [operations/puppet] - 10https://gerrit.wikimedia.org/r/104940 (owner: 10Mark Bergsma)
[12:05:08] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[12:05:08] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[12:06:47] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 12:06:43 UTC 2014
[12:07:07] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 12:07:01 UTC 2014
[12:07:08] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 12:06:43 PM UTC
[12:07:21] (03PS1) 10Mark Bergsma: Update bits-lb.esams IP addresses to the new Zero scheme [operations/dns] - 10https://gerrit.wikimedia.org/r/104941
[12:08:07] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 12:07:01 PM UTC
[12:13:31] (03CR) 10Mark Bergsma: [C: 032] Update bits-lb.esams IP addresses to the new Zero scheme [operations/dns] - 10https://gerrit.wikimedia.org/r/104941 (owner: 10Mark Bergsma)
[12:25:46] (03PS1) 10Mark Bergsma: Swap old and new bits-lb.eqiad IPv6 LVS service IPs [operations/puppet] - 10https://gerrit.wikimedia.org/r/104945
[12:27:10] (03CR) 10Mark Bergsma: [C: 032] Swap old and new bits-lb.eqiad IPv6 LVS service IPs [operations/puppet] - 10https://gerrit.wikimedia.org/r/104945 (owner: 10Mark Bergsma)
[12:52:41] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[12:52:41] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[12:55:29] (03PS1) 10Mark Bergsma: Update AAAA record of bits-lb.eqiad to the new Zero scheme [operations/dns] - 10https://gerrit.wikimedia.org/r/104947
[12:58:13] (03CR) 10Mark Bergsma: [C: 032] Update AAAA record of bits-lb.eqiad to the new Zero scheme [operations/dns] - 10https://gerrit.wikimedia.org/r/104947 (owner: 10Mark Bergsma)
[13:07:14] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 13:06:36 UTC 2014
[13:07:22] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 13:07:12 UTC 2014
[13:07:41] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:06:36 PM UTC
[13:07:41] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:07:12 PM UTC
[13:11:18] (03PS1) 10Mark Bergsma: Add new mobile-lb LVS service IPs (Zero scheme) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104948
[13:12:11] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[13:12:50] (03CR) 10Mark Bergsma: [C: 032] Add new mobile-lb LVS service IPs (Zero scheme) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104948 (owner: 10Mark Bergsma)
[13:20:57] (03PS1) 10Mark Bergsma: Remove site pmtpa from the protoproxy configuration [operations/puppet] - 10https://gerrit.wikimedia.org/r/104949
[13:23:06] (03CR) 10Mark Bergsma: [C: 032] Remove site pmtpa from the protoproxy configuration [operations/puppet] - 10https://gerrit.wikimedia.org/r/104949 (owner: 10Mark Bergsma)
[13:24:18] akosiaris1: i have some more redirects if you're still in the merging mood
[13:27:02] akosiaris1: also, did you deploy what you already merged?
[13:30:49] (03PS1) 10Mark Bergsma: Add new mobile LVS service IPs to protoproxies (Zero scheme) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104950
[13:31:02] huh, 2 issues: 1 the redirect is broken and 2) it's cached in varnish as the older state
[13:31:14] (krwikimedia)
[13:32:22] (03CR) 10Mark Bergsma: [C: 032] Add new mobile LVS service IPs to protoproxies (Zero scheme) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104950 (owner: 10Mark Bergsma)
[13:32:50] jeremyb: sure and yes
[13:48:33] (03PS15) 10Hashar: Enable Wikidata build on beta labs [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/95996 (owner: 10Aude)
[13:48:44] RECOVERY - Varnish HTTP text-backend on cp1065 is OK: HTTP OK: HTTP/1.1 200 OK - 189 bytes in 0.001 second response time
[13:48:44] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:19:11 PM UTC
[13:48:44] RECOVERY - SSH on cp1065 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[13:48:44] RECOVERY - puppet disabled on cp1065 is OK: OK
[13:48:44] RECOVERY - Disk space on cp1065 is OK: DISK OK
[13:48:45] RECOVERY - Varnish HTCP daemon on cp1065 is OK: PROCS OK: 1 process with UID = 111 (vhtcpd), args vhtcpd
[13:48:45] RECOVERY - DPKG on cp1065 is OK: All packages OK
[13:48:46] RECOVERY - RAID on cp1065 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[13:48:58] icinga-wm: no flooding, kthx!
[13:49:24] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Thu Jan 2 13:49:15 UTC 2014
[13:49:44] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:49:15 PM UTC
[13:49:51] akosiaris1: so, we have some (probably widespread) redirect problems with the new system uncovered by the latest kr.wikimedia.org change. https://bugzilla.wikimedia.org/54883#c5 (but definitely effecting more than just that new one)
[13:50:01] (03CR) 10Hashar: [C: 032] "with aude" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/95996 (owner: 10Aude)
[13:50:47] speaking which, Tim-away are you here? i guess nick indicates no
[13:50:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:49:15 PM UTC
[13:50:54] (03Merged) 10jenkins-bot: Enable Wikidata build on beta labs [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/95996 (owner: 10Aude)
[13:51:12] hashar: re jenkins jobs, i should be trusted, no? :)
[13:51:49] jeremyb: probably, add your email in integration/zuul-config.git , there is a few examples in the history :)
[13:51:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:49:15 PM UTC
[13:51:57] jeremyb: yes
[13:52:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:49:15 PM UTC
[13:53:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:49:15 PM UTC
[13:54:01] PROBLEM - LDAP on virt1000 is CRITICAL: Connection refused
[13:54:40] PROBLEM - LDAPS on virt1000 is CRITICAL: Connection refused
[13:54:50] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Thu Jan 2 13:54:42 UTC 2014
[13:54:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:49:15 PM UTC
[13:55:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[13:56:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[13:57:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[13:58:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[13:59:30] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[13:59:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:00:20] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.149 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219
[14:00:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:01:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:02:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:03:50] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:04:35] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[14:04:46] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC
[14:04:46] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:04:58] 1970?
[14:05:00] (03PS1) 10Hashar: missing extension-list-labs [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104953
[14:05:45] (03CR) 10Hashar: "extension-list-lab got removed by that change which caused messages to no more be updating :( Fixed by https://gerrit.wikimedia.org/r/#/" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104741 (owner: 10Dan-nl)
[14:05:45] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:05:52] (03CR) 10Hashar: [C: 032] missing extension-list-labs [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104953 (owner: 10Hashar)
[14:06:00] (03Merged) 10jenkins-bot: missing extension-list-labs [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104953 (owner: 10Hashar)
[14:06:45] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:07:15] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 14:07:14 UTC 2014
[14:07:25] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 14:07:19 UTC 2014
[14:07:35] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:07:14 PM UTC
[14:07:39] aude: that's the beginning of time
[14:07:45] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:07:19 PM UTC
[14:07:45] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:07:48] yep
[14:08:26] bahh
[14:09:06] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:09:32] * jeremyb fights with labs
[14:09:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:09:56] i almost feel like it would be worth it to take an apache out of rotation and test there
[14:10:42] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Thu Jan 2 14:10:30 UTC 2014
[14:10:42] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 01:54:42 PM UTC
[14:10:47] (03PS1) 10Hashar: Wikibase: fix extension-list paths [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104954
[14:11:19] (03PS2) 10Hashar: Wikibase: fix extension-list paths [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104954
[14:11:29] (03CR) 10Hashar: [C: 032] Wikibase: fix extension-list paths [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104954 (owner: 10Hashar)
[14:11:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:10:30 PM UTC
[14:12:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:10:30 PM UTC
[14:12:44] (03Merged) 10jenkins-bot: Wikibase: fix extension-list paths [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104954 (owner: 10Hashar)
[14:13:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:10:30 PM UTC
[14:13:58] 3 minutes ago and it still is critical ???
[14:13:59] grrrrr
[14:14:37] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:10:30 PM UTC
[14:15:02] (03PS1) 10Mark Bergsma: Swap old and new mobile-lb.eqiad LVS service IPs (IPv6) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104955
[14:15:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:10:30 PM UTC
[14:15:45] RECOVERY - Puppet freshness on cp1065 is OK: OK
[14:16:07] akosiaris1: that's a box that had chronic problems a week or two ago. idk what happened since then
[14:16:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:15:35 PM UTC
[14:16:56] jeremyb: yeah I am aware. But this is not a problem of the box... crappy icinga has problems now
[14:17:16] (03CR) 10Mark Bergsma: [C: 032] Swap old and new mobile-lb.eqiad LVS service IPs (IPv6) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104955 (owner: 10Mark Bergsma)
[14:17:45] akosiaris1: or snmpt (sp?)
[14:17:53] or the thing it calls
[14:17:55] or something
[14:20:05] (03PS1) 10Hashar: beta: make mw-update-l10n verbose [operations/puppet] - 10https://gerrit.wikimedia.org/r/104956
[14:20:40] (03PS1) 10Faidon Liambotis: icinga: capitalize Faidon's name & remove Asher [operations/puppet] - 10https://gerrit.wikimedia.org/r/104957
[14:21:00] (03CR) 10Faidon Liambotis: [C: 032 V: 032] icinga: capitalize Faidon's name & remove Asher [operations/puppet] - 10https://gerrit.wikimedia.org/r/104957 (owner: 10Faidon Liambotis)
[14:25:33] what's up with virt100* and icinga?
[14:26:35] PROBLEM - Host mw27 is DOWN: PING CRITICAL - Packet loss = 100% [14:27:15] RECOVERY - Host mw27 is UP: PING OK - Packet loss = 0%, RTA = 35.33 ms [14:28:34] mark: virt1000 had its opendj crash due to "too many open files" [14:28:42] puppet restarted it [14:29:36] that one also affected DNS for labs... the crappy freshness checks ? I really don't know [14:29:50] no i mean, the virt* boxes seem to be breaking the icinga config dependencies [14:29:58] PROBLEM - Apache HTTP on mw27 is CRITICAL: Connection refused [14:30:36] akosiaris1: when was the opendj problem? [14:30:49] sudo's been very slow since at least last night [14:30:49] some 39 minutes ago [14:30:56] RECOVERY - Apache HTTP on mw27 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.426 second response time [14:31:00] still having problems now [14:31:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:19:14 PM UTC [14:32:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:19:14 PM UTC [14:33:37] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:19:14 PM UTC [14:34:01] mark: seems to be ok now... did you run puppet manually ? [14:34:19] yes [14:34:26] it was missing the virt1002 host definition... [14:34:26] seems like subsequent puppet runs fix it [14:34:30] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:19:14 PM UTC [14:34:33] race condition ? [14:35:51] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:19:14 PM UTC [14:35:51] RECOVERY - Puppet freshness on cp1065 is OK: GRRRRRR [14:36:28] akosiaris1: hey, where do we update puppet volatile now? palladium, or all workers? 
[14:36:30] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:36:40] andre__, *: see last 2 commits on https://git.wikimedia.org/commit/operations%2Fapache-config.git/d768fd64ea6594b01ddc6eb78a91507b846e1b22 ; tested and working in labs. now someone has to modify the php to generate the right conf. i'll be back in ~30 mins (anyone feel free to work on that php fix...) [14:37:30] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:38:30] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:39:30] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:40:30] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:40:53] !log swift: setting weight of ms-be5 sde1 to 0, pending RT 6555 [14:41:09] paravoid: all workers [14:41:18] RECOVERY - Disk space on ms-be5 is OK: DISK OK [14:41:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:42:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:42:43] (03PS1) 10Mark Bergsma: Update mobile-lb.eqiad AAAA record [operations/dns] - 10https://gerrit.wikimedia.org/r/104959 [14:43:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:43:34] (03CR) 10Mark Bergsma: [C: 032] Update mobile-lb.eqiad AAAA record [operations/dns] - 10https://gerrit.wikimedia.org/r/104959 (owner: 10Mark Bergsma) [14:44:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:45:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 
02:35:34 PM UTC [14:46:40] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:47:34] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:47:57] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:47:57] manybubbles: i'm working a bit today, should I merge that priority CirrusSearch jobs change? [14:48:08] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical [14:48:22] ottomata: I'm happy with it. It can wait until you're working for reals though [14:48:26] when are you back? [14:48:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:49:01] monday 100% [14:49:18] but i'm kinda working a half day today, so I'm happy to do it now [14:49:24] !log hashar synchronized wmf-config 'Wikibase tweak for beta 976f2e9..7f80acb' [14:49:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:49:41] ottomata: cool. may as well then [14:49:54] no time like the present [14:50:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:51:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:52:28] (03PS2) 10Ottomata: Prioritize priority CirrusSearch jobs [operations/puppet] - 10https://gerrit.wikimedia.org/r/104763 (owner: 10Chad) [14:52:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:35:34 PM UTC [14:52:34] (03CR) 10Ottomata: [C: 032 V: 032] Prioritize priority CirrusSearch jobs [operations/puppet] - 10https://gerrit.wikimedia.org/r/104763 (owner: 10Chad) [14:54:25] ottomata: hello! [14:54:28] hiya! 
[14:54:29] happy new year :) [14:54:33] icinga is full of kafka errors [14:54:33] backaattcha! [14:54:43] ahahaha [14:54:46] yes yes, going to look at that in juuust a minute, i sent an email about that [14:54:51] happy new year :-) [14:54:54] they are not real [14:55:01] they are caused by ganglios problems [14:55:01] both varnishkafka & brokers [14:55:04] at least, they were on monday [14:56:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [14:57:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [14:58:18] thanks ottomata. got time to talk about elasticsearch plugins? [14:58:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [14:58:41] manybubbles: sorta, gonna try to fix this kafka icinga thing [14:58:48] we need to talk about that as a larger thing, right? [14:58:53] deploying jvm stuff? [14:59:06] been meaning to send an email to restart that discussion for about a week now [14:59:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [14:59:45] ottomata: at this point I just want my plugins. Larger thing or not. 
whatever it takes [15:00:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [15:00:35] RECOVERY - Varnishkafka Delivery Errors on cp4020 is OK: OK: kafka.varnishkafka.kafka_drerr.per_second is 0.0 [15:00:35] RECOVERY - Varnishkafka Delivery Errors on cp4012 is OK: OK: kafka.varnishkafka.kafka_drerr.per_second is 0.0 [15:01:04] RECOVERY - Varnishkafka Delivery Errors on cp4019 is OK: OK: kafka.varnishkafka.kafka_drerr.per_second is 0.0 [15:01:05] RECOVERY - Kafka Broker Messages In on analytics1021 is OK: OK: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate is 2751.14198192 [15:01:14] RECOVERY - Kafka Broker Messages In on analytics1022 is OK: OK: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate is 2752.83059598 [15:01:14] RECOVERY - Varnishkafka Delivery Errors on cp4011 is OK: OK: kafka.varnishkafka.kafka_drerr.per_second is 0.0 [15:01:26] ottomata: kafka still broken :-D [15:01:34] or no more hmm [15:01:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [15:01:51] naw, its ganglios that is broken [15:02:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [15:03:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [15:04:40] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [15:05:10] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC [15:05:10] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC [15:05:29] !log starting rolling upgrade of Elasticsearch servers. Going from 0.90.7 to 0.90.9. 
[15:05:39] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 02:53:10 PM UTC [15:07:09] could use two merges for beta, one to make mw-update-l10n verbose : https://gerrit.wikimedia.org/r/#/c/104956/ [15:07:10] next being to properly track beta parsoid config https://gerrit.wikimedia.org/r/#/c/104938/ [15:07:39] PROBLEM - Varnishkafka Delivery Errors on cp4020 is CRITICAL: STALE [15:07:39] PROBLEM - Varnishkafka Delivery Errors on cp4012 is CRITICAL: STALE [15:07:59] PROBLEM - Varnishkafka Delivery Errors on cp4019 is CRITICAL: STALE [15:08:09] PROBLEM - Kafka Broker Messages In on analytics1021 is CRITICAL: STALE [15:08:10] PROBLEM - Kafka Broker Messages In on analytics1022 is CRITICAL: STALE [15:08:10] PROBLEM - Varnishkafka Delivery Errors on cp4011 is CRITICAL: STALE [15:14:39] RECOVERY - Varnishkafka Delivery Errors on cp4020 is OK: OK: kafka.varnishkafka.kafka_drerr.per_second is 0.0 [15:14:39] RECOVERY - Varnishkafka Delivery Errors on cp4012 is OK: OK: kafka.varnishkafka.kafka_drerr.per_second is 0.0 [15:14:59] RECOVERY - Varnishkafka Delivery Errors on cp4019 is OK: OK: kafka.varnishkafka.kafka_drerr.per_second is 0.0 [15:15:09] RECOVERY - Kafka Broker Messages In on analytics1021 is OK: OK: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate is 2709.66566769 [15:15:10] RECOVERY - Kafka Broker Messages In on analytics1022 is OK: OK: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate is 2708.55707436 [15:15:10] RECOVERY - Varnishkafka Delivery Errors on cp4011 is OK: OK: kafka.varnishkafka.kafka_drerr.per_second is 0.0 [15:15:24] (03CR) 10Faidon Liambotis: "What's the deal we have with wikimedia.li & wikimedia.pl? We own the domains but use some commercial nameservers? 
What's legal's take on t" [operations/dns] - 10https://gerrit.wikimedia.org/r/86659 (owner: 10Dzahn) [15:18:14] (03CR) 10Alexandros Kosiaris: [C: 032] beta: make mw-update-l10n verbose [operations/puppet] - 10https://gerrit.wikimedia.org/r/104956 (owner: 10Hashar) [15:18:24] somehow I think ganglios is going to break again in a short while... [15:18:24] hmm [15:18:37] (03PS1) 10Hashar: beta: finish parsoid switching to jenkins job [operations/puppet] - 10https://gerrit.wikimedia.org/r/104961 [15:18:48] akosiaris: while you are it I got https://gerrit.wikimedia.org/r/104961 [15:19:06] akosiaris: which switch the beta parsoid to a new repository. I did a similar change earlier this morning [15:19:43] (03CR) 10Alexandros Kosiaris: [C: 032] beta: parsoid localsettings.js [operations/puppet] - 10https://gerrit.wikimedia.org/r/104938 (owner: 10Hashar) [15:21:12] Syslogs::Readable[messages]/File[/var/log/messages]/mode: mode changed '0640' to '0644' [15:21:12] \O/ [15:23:19] (03PS2) 10Alexandros Kosiaris: beta: finish parsoid switching to jenkins job [operations/puppet] - 10https://gerrit.wikimedia.org/r/104961 (owner: 10Hashar) [15:23:31] akosiaris: you will be praised :-] [15:24:39] hashar: I want a bard's song :-) [15:24:46] (03CR) 10Alexandros Kosiaris: [C: 032] beta: finish parsoid switching to jenkins job [operations/puppet] - 10https://gerrit.wikimedia.org/r/104961 (owner: 10Hashar) [15:34:51] (03PS1) 10Hashar: beta: parsoid: typo in file definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/104965 [15:35:06] akosiaris1: and a typo : file:// should be file:/// https://gerrit.wikimedia.org/r/104965 :-] [15:35:38] damn... 
I should have noticed that [15:35:49] well [15:35:59] I knew it it was too easy to be true [15:36:02] our puppet tests should have prevented that in the first place [15:36:06] (03CR) 10Alexandros Kosiaris: [C: 032] beta: parsoid: typo in file definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/104965 (owner: 10Hashar) [15:36:26] completed parsing of enwiki:0.8680641636207775 in 2701 ms [15:36:29] works like a charm [15:48:43] 'git' 'clone' '-q' 'ssh://gerrit.wikimedia.org:29418/mediawiki/extensions/Echo.git' 'Echo' [15:48:43] Permission denied (publickey). [15:48:43] fatal: The remote end hung up unexpectedly [15:48:43] git exit with status 128 [15:48:44] DAMN IT [15:50:37] hashar: are you back from holidays now? [15:50:48] chrismcmahon: yup [15:53:24] andre__: akosiaris1: i have a fix for redirects issue I was talking about before. [15:53:37] * jeremyb prepares commit [15:56:41] * andre__ wonders why he gets pinged here. [15:57:19] (03PS2) 10Andrew Bogott: Recomission virt1001, 1002, 1003. [operations/puppet] - 10https://gerrit.wikimedia.org/r/104676 [15:58:39] (03PS3) 10Tim Landscheidt: Recomission virt1001, 1002, 1003. [operations/puppet] - 10https://gerrit.wikimedia.org/r/104676 (owner: 10Andrew Bogott) [15:59:03] andre__: bug 54883 [15:59:28] ah. 
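The file:// vs file:/// fix merged above (https://gerrit.wikimedia.org/r/104965) is easy to reproduce: in a file URI the two slashes introduce an authority (host) component, so with only two slashes the first path segment gets swallowed as a hostname instead of staying part of the path. A quick sketch with Python's standard urllib — the `/data/localsettings.js` path here is made up for illustration, not the actual parsoid config path:

```python
from urllib.parse import urlparse

# With only two slashes, "data" is parsed as the authority (host),
# not as the first path segment of the file path.
broken = urlparse("file://data/localsettings.js")
print(repr(broken.netloc), broken.path)   # 'data' /localsettings.js

# Three slashes leave the authority empty and keep the full path.
fixed = urlparse("file:///data/localsettings.js")
print(repr(fixed.netloc), fixed.path)     # '' /data/localsettings.js
```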
[15:59:40] (03CR) 10Andrew Bogott: [C: 032] Recommission virt1001, 1002, 1003 [operations/puppet] - 10https://gerrit.wikimedia.org/r/104676 (owner: 10Andrew Bogott) [15:59:48] thanks :) [16:04:05] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC [16:04:24] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 01 Jan 1970 12:00:00 AM UTC [16:07:45] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 16:07:39 UTC 2014 [16:07:45] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 16:07:39 UTC 2014 [16:08:05] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 04:07:53 PM UTC [16:08:06] ^ me tinkering [16:08:15] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 04:07:53 PM UTC [16:08:28] wha? [16:09:08] happy new year andrewbogott :-D [16:09:15] same to you! [16:09:18] noticed some slowness when doing sudo , might be ldap borked [16:09:35] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 16:09:30 UTC 2014 [16:09:36] on the beta cluster? [16:09:40] yeah [16:09:59] 'some slowness' = seconds or minutes? [16:10:01] no more the case apparently, there is a slight delay but not as awful as earlier today [16:10:06] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 04:09:30 PM UTC [16:10:08] a few seconds, maybe 5 - 10 [16:10:19] just FYI, in case it happens again =D [16:10:25] RECOVERY - Puppet freshness on virt1003 is OK: puppet ran at Thu Jan 2 16:10:15 UTC 2014 [16:10:32] working fine right now [16:10:49] Hm, ok :( [16:12:14] good morning, comrades [16:12:20] 'morning! [16:12:49] jgage, you're working at the SF office, at least now and then, right? 
[16:14:10] yep, one of the few ;) [16:15:13] jgage: OK, I'm visiting the office next Friday, maybe we'll cross paths :) [16:16:04] andrewbogott cool, i'll keep an eye out for you. i sit next to leslie. [16:16:27] andrewbogott: woot :) [16:18:27] (03CR) 10BryanDavis: [C: 031] Revert "Configure Varnish not to cache scholarship app reqs" [operations/puppet] - 10https://gerrit.wikimedia.org/r/103768 (owner: 10BryanDavis) [16:18:45] (03PS1) 10Mark Bergsma: Swap old and new LVS service IPs for ulsfo [operations/puppet] - 10https://gerrit.wikimedia.org/r/104972 [16:19:43] hopefully by next week ulsfo's cross connects will be ready.... [16:20:18] LeslieCarr: Haven't you been saying that for the last 11 weeks? [16:20:24] hehehe [16:20:25] (03CR) 10Mark Bergsma: [C: 032] Swap old and new LVS service IPs for ulsfo [operations/puppet] - 10https://gerrit.wikimedia.org/r/104972 (owner: 10Mark Bergsma) [16:20:40] well only for the past 2 weeks -- before that i was hoping the links wouldbe ready [16:21:04] Ah, maybe I don't know the difference between a link and a cross connect :( [16:23:24] (03PS1) 10Mark Bergsma: Update ulsfo service IP addresses according to new Zero scheme [operations/dns] - 10https://gerrit.wikimedia.org/r/104976 [16:34:48] (03PS1) 10Hashar: beta: remove old parsoid updater [operations/puppet] - 10https://gerrit.wikimedia.org/r/104978 [16:35:15] off [16:37:08] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jan 2 16:36:59 UTC 2014 [16:39:25] (03CR) 10Mark Bergsma: [C: 032] Update ulsfo service IP addresses according to new Zero scheme [operations/dns] - 10https://gerrit.wikimedia.org/r/104976 (owner: 10Mark Bergsma) [16:42:48] (03PS1) 10Mark Bergsma: Remove old ulsfo LVS service IPs [operations/puppet] - 10https://gerrit.wikimedia.org/r/104979 [16:44:03] (03CR) 10Mark Bergsma: [C: 032] Remove old ulsfo LVS service IPs [operations/puppet] - 10https://gerrit.wikimedia.org/r/104979 (owner: 10Mark Bergsma) [16:54:46] (03PS3) 10BryanDavis: [WIP] 
Kibana puppet class [operations/puppet] - 10https://gerrit.wikimedia.org/r/104172 [16:55:03] (03CR) 10BryanDavis: [WIP] Kibana puppet class (033 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104172 (owner: 10BryanDavis) [17:08:28] (03PS1) 10Jeremyb: fix redirects for percent-encoded destinations [operations/apache-config] - 10https://gerrit.wikimedia.org/r/104984 [17:08:29] (03PS1) 10Jeremyb: write a \n at EOF after generating rest of conf [operations/apache-config] - 10https://gerrit.wikimedia.org/r/104985 [17:11:30] !log reedy synchronized php-1.23wmf9 'staging' [17:21:55] !log reedy updated /a/common to {{Gerrit|I49a405d8a}}: Wikibase: fix extension-list paths [17:22:00] (03PS1) 10Reedy: Add/update symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104988 [17:22:52] (03CR) 10jenkins-bot: [V: 04-1] Add/update symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104988 (owner: 10Reedy) [17:26:19] ^d: Can you rm -rf /a/common/docroot/bits/WikipediaMobileFirefoxOS.bak2 please from tin? 
[17:27:38] <^d> done [17:28:02] (03PS2) 10Reedy: Add/update symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104988 [17:28:03] thanks [17:28:33] (03PS3) 10Reedy: Add/update symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104988 [17:28:41] (03CR) 10Reedy: [C: 032] Add/update symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104988 (owner: 10Reedy) [17:30:47] (03Merged) 10jenkins-bot: Add/update symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104988 (owner: 10Reedy) [17:31:09] (03PS1) 10Reedy: Wrap inclusion of wmfConfigDir/extension-list-labs in file_exists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104989 [17:32:01] (03CR) 10Reedy: "Git doesn't like empty files" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104953 (owner: 10Hashar) [17:32:23] (03PS2) 10Reedy: Wrap inclusion of wmfConfigDir/extension-list-labs in file_exists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104989 [17:32:30] (03CR) 10Reedy: [C: 032] Wrap inclusion of wmfConfigDir/extension-list-labs in file_exists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104989 (owner: 10Reedy) [17:32:43] (03CR) 10jenkins-bot: [V: 04-1] Wrap inclusion of wmfConfigDir/extension-list-labs in file_exists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104989 (owner: 10Reedy) [17:33:56] (03CR) 10jenkins-bot: [V: 04-1] Wrap inclusion of wmfConfigDir/extension-list-labs in file_exists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104989 (owner: 10Reedy) [17:34:02] (03PS3) 10Reedy: Wrap inclusion of wmfConfigDir/extension-list-labs in file_exists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104989 [17:34:40] (03CR) 10Reedy: [C: 032] Wrap inclusion of wmfConfigDir/extension-list-labs in file_exists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104989 (owner: 10Reedy) 
[17:37:44] (03Merged) 10jenkins-bot: Wrap inclusion of wmfConfigDir/extension-list-labs in file_exists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/104989 (owner: 10Reedy) [17:39:12] !log reedy synchronized docroot and w [17:42:44] reedy@tin:/a/common$ scap 1.23wmf9 testwiki to 1.23wmf9 and build l10n cache [17:42:44] Syncing all versions. [17:42:44] Checking syntax of wmf-config and multiversion... done [17:42:44] Copying to tin from tin.eqiad.wmnet...ok [17:42:45] PHP Warning: opendir(/usr/local/apache/common-local/php-1.23wmf9/cache/l10n/upstream): failed to open dir: No such file or directory in /usr/local/bin/mergeCdbFileUpdates on line 76 [17:42:45] Could not open directory '/usr/local/apache/common-local/php-1.23wmf9/cache/l10n/upstream'. [17:42:46] /usr/local/bin/scap-2: line 48: die: command not found [17:43:32] AaronSchulz: ori ^ [17:43:44] Trying to build and deploy l10n cache for a new mw version [17:46:13] I notice it doesn't like 1.23wmf9 and syncs all anyway? [17:46:58] !log reedy started scap: 1.23wmf9 testwiki to 1.23wmf9 and build l10n cache [17:47:20] Updating LocalisationCache for 1.23wmf9... Updated 366 JSON file(s) in '/a/common/php-1.23wmf9/cache/l10n'. [17:47:22] looks sane at least [17:49:45] getting lots of errors [17:49:51] l10n related [17:50:13] (03CR) 10saper: [C: 031] "I hope that ns*.wikimedia.org ns servers are not used as resolvers anywhere (only as auth zone servers) so those changes don't really chan" [operations/dns] - 10https://gerrit.wikimedia.org/r/86659 (owner: 10Dzahn) [17:50:29] manybubbles: Where? 
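The scap failure at 17:42 is two stacked problems: mergeCdbFileUpdates opendir()s an l10n `upstream` directory that apparently does not exist yet (likely because 1.23wmf9 was freshly staged), and the `die` helper the wrapper falls back to is not defined at that point in scap-2's shell. A general defensive pattern for the first problem, sketched in Python — this is not the actual scap fix, just an illustration using the path from the log:

```python
import os
import sys

# Path taken verbatim from the PHP warning in the log.
upstream = "/usr/local/apache/common-local/php-1.23wmf9/cache/l10n/upstream"

def merge_upstream(path):
    # Fail with one clear message instead of an opendir() warning
    # cascade when the directory for a new MediaWiki version is
    # not there yet.
    if not os.path.isdir(path):
        sys.exit("Could not open directory '%s'." % path)
    return sorted(os.listdir(path))
```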
[17:50:36] Reedy: the logs [17:50:48] I tail -F fatal.log exception.log habitually now [17:51:08] when it moves so fast it is distracting then something is wrong [17:53:02] oh it is very bad now [17:53:44] That almost looks like scap is pushing out crap localisation files [17:54:15] https://ganglia.wikimedia.org/latest/graph.php?r=hour&z=xlarge&title=MediaWiki+errors&vl=errors+%2F+sec&x=0.5&n=&hreg[]=vanadium.eqiad.wmnet&mreg[]=fatal|exception>ype=stack&glegend=show&aggregate=1&embed=1 [17:54:32] It looks like they are all noexternallanglinks [17:54:37] do those myeah [17:54:39] yeah [17:54:51] those don't stop page rendering, do they? [17:54:54] hmmmm [17:55:13] magic words are bad [17:55:14] We show crappy exceptions [17:55:24] But we haven't had a flood fo complaints either... [17:55:50] the entrypoint file thing for merge messages works fine, doesn't it? [17:56:36] Presumably [17:56:42] I didn't think there was any need for me to test it [17:56:49] well, you know [17:57:06] files in /a/common/php-1.23wmf8/cache/l10n/ look to be of sane size etc [17:57:10] seems beta is fine [17:57:20] Wait for scap to finish and see how things lay [17:57:31] wikivoyage works fine, purging etc [17:57:43] there's been some change in the way/order localisation files are pushed out [17:57:52] oh [17:57:53] But I have honestly no idea [17:58:02] Almost like they're all being wiped [17:58:07] then repulled/generated/whatever [17:58:28] * Reedy the still lack of complaints [17:58:31] magic words might no tlike it [17:58:31] *notes the [17:58:37] * aude nods [18:00:50] Reedy: I don't get that when I run sync-common...I do get some rename() errors though [18:01:01] don't get what? 
[18:01:53] Nemo_bis: I like blue [18:02:08] :D [18:02:18] the opendir() error [18:02:31] actually those errors are from rsync itself [18:02:37] it's nicely filled of blue [18:05:06] scap needs a progress bar [18:05:13] (i think ori might've already said that) [18:05:25] so, about treating all exceptions as important ;) [18:05:40] meh [18:05:47] Users still haven't apparently noticed [18:06:17] well, or-i of 2 months ago would be filling bugs right now :) [18:06:31] Reedy: I just got a report [18:06:36] of? [18:06:41] and one on #wikimedia-commons of a lot of errors [18:07:00] co localisation reportedly regressed to earlier state (missing) [18:07:07] Eww [18:07:08] https://commons.wikimedia.org/wiki/Main_Page [18:07:09] is broken [18:07:14] (corsican, an Italian dialect) [18:07:21] :( [18:07:25] if you hit the "right" server [18:07:34] AaronSchulz: Are all the localisation cache files being nuked first? [18:07:51] 1 in 5 or 6 seem to die on commons [18:10:40] Reedy: they are excluded from rsync and rebuilt afterwards, the existing ones aren't nuke though (they get renamed over with new ones) [18:11:19] Reedy: is scap still running? [18:11:23] yup [18:11:53] Killing it would've probably made things even worse [18:16:06] just curious if it was running [18:17:46] It'd be nice if we could do progress based on server count finished vs total servers [18:19:56] I'm getting a Mediawiki internal error on commons [18:20:05] "Exception caught inside exception handler." 
[18:23:04] Yup [18:23:17] localisation seems not right for wikidata :( [18:23:36] 40 minutes and counting [18:23:55] Reedy, the whole site is down, and I just used it [18:24:02] No it's not [18:24:29] It depends if you're "lucky" [18:24:43] Commons is working [18:24:49] For me [18:24:56] I see it break 1 in 5 or 6 [18:25:09] aude: As usual we'll have to wait and then assess the damage [18:25:15] It's working for me now too [18:25:22] And it's back down [18:25:35] Yup [18:25:41] Because all servers aren't in the same state [18:25:53] :( [18:26:06] i think it's not finding wikibase localisation [18:26:52] reedy@fluorine:/a/mw-log$ du --si exception.log [18:26:53] 269M exception.log [18:26:53] reedy@fluorine:/a/mw-log$ du --si exception.log [18:26:53] 1.1G exception.log [18:26:57] It seems to be getting worse [18:27:14] Oh my [18:27:35] * AaronSchulz is trying to find where that is even defined [18:27:41] Yeah, I can't access it at all now. [18:27:51] I guess everyone who accesses the site creates an exception log [18:27:53] gg MediaWiki [18:28:09] I blame Jamesofur [18:28:11] j/k [18:28:11] gzip will be able to compress it down nicely at least [18:28:12] :P [18:28:19] ah, WikibaseClient.i18n.magic.php [18:28:21] Hey James! [18:28:27] it's supposed to be included with $wgExtensionEntryPointListFiles [18:28:37] maybe that doesn't work or we did something wrong [18:28:50] NOT IT! [18:28:51] https://git.wikimedia.org/commitdiff/operations%2Fmediawiki-config.git/e6fc8d9db38947776523bfdcc330bdecbaadd034 [18:28:55] i wonder if you broke it aude... [18:29:00] * aude wonders [18:29:02] :p [18:29:19] Is it only commons that seems to be broken?
[18:29:33] We could disable wikibase till it's fixed [18:29:33] wikivoyage also [18:29:36] an dwikidata [18:29:39] Reedy, I can still get to commons files on other wikis [18:29:55] Yeah, you're not using localisation stuff from commons [18:30:17] And scap is still going [18:30:26] so...in eval.php, <> works fine with /home on tin but not in /usr [18:30:27] (03PS1) 10Aude: Revert "Enable Wikidata build on beta labs" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105001 [18:30:32] with the later I get the exception [18:30:43] i would merge that and then we can try to figure otu what/how to do [18:30:51] (03CR) 10jenkins-bot: [V: 04-1] Revert "Enable Wikidata build on beta labs" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105001 (owner: 10Aude) [18:30:56] grrrr [18:30:58] rage [18:31:13] sync tin is presumably synced with itself, I guess it makes sense for it to get worse as scap runs [18:31:25] reedy@fluorine:/a/mw-log$ du --si exception.log [18:31:25] 2.2G exception.log [18:31:37] 5 minutes and it doubled in side [18:31:39] *size [18:32:06] using text() works in both /home and /usr [18:32:15] so I guess anything that invokes parse in /usr is broken [18:32:20] Does scap have --quick yet? [18:32:21] (with that same exception) [18:32:50] Reedy: since newlines were added the .json files since last scap, it will definitely be slow now [18:33:02] :| [18:33:26] * AaronSchulz expects the usually 43min [18:33:32] *usual [18:33:40] next run will be more interesting [18:33:55] ugh, can't type today [18:34:29] pages' [18:34:43] (03PS2) 10Aude: Revert "Enable Wikidata build on beta labs" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105001 [18:34:47] for commons [18:34:56] Watchmouse? [18:35:12] AaronSchulz: it's done [18:35:17] yep [18:35:19] just doing sync-wikiverisons now [18:35:20] !log reedy finished scap: 1.23wmf9 testwiki to 1.23wmf9 and build l10n cache [18:35:26] scap completed in 51m 16s. [18:35:36] 51 minutes? 
[18:35:44] Reedy: well that part is like 3 sec ;) [18:36:04] (03CR) 10Reedy: [C: 032] Revert "Enable Wikidata build on beta labs" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105001 (owner: 10Aude) [18:36:06] Step 1 [18:36:49] (03Merged) 10jenkins-bot: Revert "Enable Wikidata build on beta labs" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105001 (owner: 10Aude) [18:36:50] so, l10n_cache-en.cdb is the same on mw1017 and /usr and /home on tin [18:37:04] and probably the same everywhere else then [18:37:56] I'll run scap again then with 105001 merged [18:38:41] recovery page just came in [18:38:46] I'm probably late to the party but I'm getting a ton of MWExceptions in the last few minutes accessing commons [18:38:53] Yes. [18:38:59] party's almost over [18:39:26] Reedy: if the json transporting was broken then I'd expect those md5sum values not to match /a/common [18:41:13] like this: [86f366d4] 2014-01-02 18:37:42: Fatal exception of type MWException [18:42:15] dschwen, yep [18:42:36] yep, late to the party? [18:42:40] yup [18:42:42] by about 50 minutes [18:42:45] :) [18:42:51] not for me [18:42:57] Updating LocalisationCache for 1.23wmf8... Updated 366 JSON file(s) in '/a/common/php-1.23wmf8/cache/l10n'. [18:43:00] still getting the raw exception messages [18:43:02] Hm [18:43:04] !log reedy started scap: active rebuild localisation cache with updated wikidata config [18:43:04] Actually [18:43:08] I'm gonna cheat here [18:43:12] dschwen: like everyone else in the world [18:43:16] [2ff0f682] 2014-01-02 18:43:04: Fatal exception of type MWException [18:43:24] Lets do it quicker [18:43:43] sync-dir > scap [18:43:52] Reedy: cheat? [18:44:04] I'm not letting scap run through [18:44:12] I don't want to wait another 50 minutes [18:44:19] sync-dir php-1.23wmf8/cache/l10n/ Sync [18:44:30] can sync wmf9 after as that doesn't matter [18:44:56] Are we still populating tampa?
[18:44:58] * Reedy grumbles [18:45:27] 4.3G exception.log [18:45:45] those Ctrl+R monkeys, eh. [18:46:23] https://ganglia.wikimedia.org/latest/graph.php?c=Analytics%20cluster%20eqiad&m=cpu_report&r=hour&s=by%20name&hc=4&mc=2&st=1388688358&g=network_report&z=medium [18:46:25] page: broken again [18:46:34] Reedy: scap shouldn't be as slow next run [18:46:37] Analytics have had a good amount traffic [18:47:06] so eval.php works with parsing again on tin [18:47:27] apergos: We just need to wait for the localisation cache to be sync'd... [18:47:42] aude: you say all Wikivoyage projects and Wikidata too? [18:48:01] and commons [18:48:02] wonder why it thought things were fixed [18:48:13] It might have got lucky and hit a fixed server [18:48:22] meh [18:48:23] load average: 35.80, 23.58, 10.78 [18:48:33] * apergos grits teeth [18:48:36] twkozlowski: yes [18:49:17] thanks Reedy, aude -- preparing a mention for Tech News [18:49:31] I see it started around 17:42 UTC, so it's been just about an hour now [18:49:35] * aude hides [18:49:53] https://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=cpu_report&s=by+name&c=Miscellaneous+eqiad&h=tin.eqiad.wmnet&host_regex=&max_graphs=0&tab=m&vn=&hide-hf=false&sh=1&z=small&hc=4 [18:49:55] wheee [18:49:55] Reedy: so you are doing sync-dir now? [18:49:57] http://status.wikimedia.org/8777/308948/https-services---commons thinks everything is still rather well, 90 % uptime last hour :D [18:50:22] AaronSchulz: Yup. 
Has been for about 7 minutes [18:50:33] aspergos, looks like there is still partying going on, and pretty hard [18:50:48] sorry, that was a typo, not an asperger joke :-( [18:50:55] We're having fuuuun [18:50:59] tab completion ftw [18:52:08] says 60 % uptime btw [18:52:17] * Nemo_bis wonders why not just shut it down [18:52:48] happy 2014; we've decided to shut down commons, it's too much of a bother :-P [18:53:08] I meant status.wikimedia.org :P [18:53:15] * apergos hopes we aren't really gong to wait another 48 minutes [18:53:37] it shouldn't be [18:54:39] if the cdb files changed, then it will be a while [18:55:03] it's a good job that tin is a reasonable spec server [18:55:08] oh looky, an en wikivoyage page now [18:55:15] that worked? [18:55:27] so "$wmfConfigDir/extension-list-wikidata"; resolves to somethign php-1.23wmf8 specific [18:55:32] that says it's broken [18:55:53] that's probably the problem [18:56:19] isn't it something stupid like php-1.23wmf8/../wmf-config ? [18:57:15] yes [18:57:16] /a/common/php-1.23wmf8/../wmf-config [18:57:19] Yeah [18:57:19] > echo $wmfConfigDir; [18:57:19] /a/common/php-1.23wmf7/../wmf-config [18:57:23] and i don't see the new extension list [18:57:29] maybe as it was reverted, though [18:57:39] idk if it was there [18:57:48] we can bring it locally onto tin when things are fixed to test it [18:58:01] seemed to be what was needed for beta to work [18:58:20] Reedy: https://meta.wikimedia.org/wiki/Wikimedia_Forum#Wikivoyage_is_down.3F - Assume you know something :p [18:58:37] We do [18:58:51] should be fixed soon as localisation stuff finishes [18:58:53] PATIENCE YOUNG PADAWAN [18:58:53] Reedy: Good. Just thought I'll like you to an onwiki report :) [18:59:21] Patience? Ha :p [18:59:24] Typical they made the edit on a wiki with no issues ;) [18:59:43] :D [19:00:15] recovery page for en wikivoyage... 
[19:00:40] JohnLewis: I took the liberty to post a short answer [19:00:47] (on Meta) [19:01:02] and for commons [19:01:15] twkozlowski: Good :p [19:09:52] 8.6G exception.log [19:11:13] another commons whine just came in [19:11:24] Reedy: is sync-dir done? [19:11:27] No [19:11:31] * aude super impatient [19:11:44] still 10s of rsync processes running [19:13:09] AaronSchulz: It'll !log when it's done ;) [19:13:11] aude: not just you [19:13:40] * aude can't imagine how the new config didn't work [19:14:12] no problems with beta [19:14:31] dsh -cM -g mediawiki-installation -o -oSetupTimeout=30 -F30 -- ???sudo -u mwdeploy rsync -a --delete-delay --delay-updates --compress --delete --exclude=**/.svn/lock --exclude=**/.git/objects --exclude=**/.git/**/objects --exclude=**/cache/l10n/*.cdb --no-perms --exclude=cache/l10n tin.eqiad.wmnet::common/php-1.23wmf8/cache/l10n// /usr/local/apache/common-local/php-1.23wmf8/cache/l10n/?? [19:14:37] Is that actually going to do anything? [19:15:19] Reedy: so, sync-common-file uses MW_RSYNC_ARGS for the directory case, which will include << --exclude=**/cache/l10n/*.cdb >> afaik [19:15:19] excluding and targetting in the same command? [19:16:31] yes, right, sycn-common-file already excludes that anyway, heh [19:17:07] -rw-r--r-- 1 mwdeploy mwdeploy 2734554 Jan 2 18:38 l10n_cache-ab.cdb [19:17:12] yeah it won't exclude anything then since it's relative [19:17:31] I thought you were doing cache/ for second there [19:18:08] which explains why it is taking forever as you'd expect copying all the stuff [19:18:19] morebots is dead btw [19:19:16] again? i'll restart it [19:20:16] load average: 30.74, 30.81, 28.34 [19:20:31] netsplit about 24h ago [19:23:01] AaronSchulz: hey, heads up, others will be deploying today, when this is over (ie: things no longer broken), can you send out that email to engineering@? 
Thanks :) [19:23:24] (03PS1) 10Reedy: Sync EQIAD before PMTPA [operations/puppet] - 10https://gerrit.wikimedia.org/r/105006 [19:23:28] test.wikidata looks better [19:23:30] greg-g: It's probably not AaronSchulzs fault [19:23:46] ^^ Why are we syncing PMTPA at all currently? (I know I've asked that before) [19:23:50] no, but this is just in reference to the change in the command args [19:24:02] Aha [19:24:29] ie: benny and any other random deployer who doesn't read all of -operations scrollback doesn't have a clue that anything has changed. [19:24:30] !log restarted morebots [19:25:03] Waiting for tampa first just adds more frustration in cases like this [19:25:16] yeah, that's lame [19:25:19] Logged the message, Master [19:25:30] !log stuff has been broken [19:25:39] :) [19:25:43] * Reedy kicks morebots [19:25:51] greg-g: the old style works fine now [19:26:01] Logged the message, Master [19:26:07] no one has to use it differently now [19:26:11] cool [19:26:30] nothing worth bragging about then? ;) [19:26:31] 31 seconds for morebots to reply? [19:26:44] the extension branch think is probably worth an email though [19:26:50] when it first starts up, yes, it has some initialization stuff to work through [19:27:11] Reedy: does that apply for wmf9? [19:27:18] *thing [19:27:26] apergos: it's reticulating splines [19:27:32] What thing? [19:27:35] wmf9 needs doing next [19:27:44] I just did wmf8 to be "quicker" [19:28:18] still getting sporadic exceptions, is there still an ongoing sync? [19:28:42] Yup [19:29:10] 40 minutes apparently [19:29:13] so it's mostly up? 
[19:29:19] Probably [19:29:31] yeah [19:29:43] looks to be in the high mw1[01]\d{2} range [19:29:49] [11:27] AaronSchulz the extension branch thing [19:29:49] Reedy: I saw that was merged [19:29:58] Yup [19:30:04] almost 2 hours on, now :/ [19:30:09] I made sure I did it before deploying [19:30:14] Then had to kick git a few times [19:30:26] around 1h45min [19:30:56] twkozlowski: right, hence the "almost" [19:31:07] AaronSchulz: https://gerrit.wikimedia.org/r/#/c/104970/ git/gerrit sucks [19:31:21] is this one of those cases where deploy performance, including LU updates, is our main bottleneck to quick recovery? [19:31:54] :) [19:31:59] Yeah [19:32:29] Eloquence: yes! [19:32:41] just because of a missing magic word [19:32:42] Syncing hundreds of MB to hundreds of servers is never going to be quick in one way or another [19:32:43] mainly [19:32:57] Reedy, why would we have to sync hundreds of MB to hundreds of servers if only a few things have changed? [19:33:06] that ^ [19:33:08] for wikidata, it has no magic word so localized messages are just not shown [19:33:09] Because there's changes to all localisation files [19:33:15] Every language has localisations [19:33:16] but one can still edit wikidata [19:33:28] Reedy: you sentence is best completed "in the way we do things now." [19:33:31] your* [19:33:35] no, I understand that it's due to the localisation cache updates. but that seems to indicate that an incremental update mechanism for messages is badly called for [19:33:36] We support 369 languages [19:33:38] in theory scap probably would be faster than sync-dir on l10n...at least in theory [19:33:47] Incremental updates to binary files... :D [19:33:49] Eloquence: an incremental update mechanism has been in place for a few days [19:34:01] AaronSchulz: I was thinking that due to the seeding and mutliple sources [19:34:14] COMPUTERS I HATE YOU [19:34:15] ori: honest question: do you have performance numbers pre/post? 
[19:35:02] greg-g: there's the SAL, but the numbers are useless because of another unfortunate property of scap, which is that it is only as fast as the slowest target host [19:35:12] AaronSchulz: I guess sync-dir should probably get scap style deployment optimisations ;) [19:35:27] if a target host is saturated, it'll hang forever on that one host [19:35:33] Reedy: scap would exclude l10n/*.cdb and only sync the /upstream directory (MD5/json files) and then it would rebuild the cdbs per host from the json [19:35:36] * aude wish if the magic word was not used on a page, then it's missing ness shouldn't affect editing or viewing [19:35:49] magic words are brittle [19:35:50] ... [19:35:55] ori: I guess I'm just wondering how close you are to writing up the "We made scap awesomer, about this much" email [19:36:08] When can we start expensing holidays? [19:36:25] greg-g: i'm not going to write it [19:36:33] it's your sprint and aaron's code [19:36:46] ori: I like you niceness hack on search idx2001 ;) [19:36:47] s/you/aaron/ then [19:36:51] er, 1001 [19:36:51] aude: yeah, that is odd, just doing wfmessage( 'editing' )->parse() blew up in eval [19:37:04] ori: "my sprint" officially ended on the 13th :/ [19:37:16] Reedy, just request a personal day off if you're working on a holiday :) [19:37:22] ori: I don't think pointing fingers is the right thing here [19:37:57] so how about we do some test runs with small i18n changes and time it before announcing anything? [19:38:04] AaronSchulz: sounds good [19:38:29] I guess this could have been a test just now, but we played it "safe" with sync-dir [19:38:29] i don't thing nagging about e-mails is, either. if you want to help communicate this, you can look at the git commits messages, put them on an etherpad, start working them into clear prose, and optionally combine it with scap timing data from the SAL [19:38:43] so it will have to be next set of changes [19:39:43] ori: I was asking how close you were to it is all. 
I asked AaronSchulz about the "please send email about updated scap" and he said that old commands work now, so I bowed down on that (Benny needs to deploy a fix today, which is why I asked) [19:40:50] and i'm just pointing out how you could help that along [19:41:48] fair, I just thought the co-authors would be best to explain what is going on/changed :/ [19:41:59] AaronSchulz, is https://gerrit.wikimedia.org/r/#/c/103080/ the changeset that implements incremental updates? [19:42:04] (co-authors only because you were reviewing merges) [19:44:26] well that a random follow-ups [19:44:32] *and [19:44:38] * greg-g nods [19:44:46] *nod* that's pretty cool. look forward to seeing the effect. [19:44:48] Commons being down has been reported, correct? [19:45:00] StevenW: yep [19:45:01] * AaronSchulz is skipping letters like a bad CD drive today [19:45:11] maybe I should have went on vacation [19:45:16] StevenW: Almost feels like you're trolling ;) [19:45:21] AaronSchulz: :P [19:45:32] What, me? I would never. [19:45:34] Reedy, out of curiosity, were you aware of this work at all? [19:45:36] wikivoyage is back! [19:45:48] <\pi{r^2}> StevenW: https://meta.wikimedia.org/w/index.php?title=Tech/News/2014/02&curid=3243228&diff=6925038&oldid=6924574 [19:45:57] <\pi{r^2}> also see the SAL, etc. [19:46:20] wikidata is still missing localisation [19:47:19] \pi{r^2}: Now, this is only a draft, don't quote me on that yet! :-) [19:49:34] Yay [19:49:41] It's nearly stated syncing to the last server [19:49:53] 1217/1220 [19:50:05] (03CR) 10Chad: "I'd say just remove it entirely. Since Mark shut down the LVS & other services we basically can't fail over to it anymore even if we wante" [operations/puppet] - 10https://gerrit.wikimedia.org/r/105006 (owner: 10Reedy) [19:50:05] [fcd7c33c] 2014-01-02 19:49:48: Fatal exception of type MWException [19:50:23] * Reedy pets Bsadowski1 [19:51:03] wikidata is good now [19:51:27] yay! 
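[editor's note] The incremental scheme AaronSchulz describes above — exclude `l10n/*.cdb`, sync only the `upstream/` MD5 and json files, and rebuild each cdb per host when its json changed — can be sketched as follows. The `l10n_cache-*.cdb.json` / `.MD5` file names mirror the ones seen later in this log; the function name and boolean return are illustrative, not scap's actual API.

```python
import hashlib
import os


def needs_rebuild(upstream_dir, lang):
    """Decide whether the per-host cdb for `lang` must be rebuilt by
    comparing the recorded MD5 of the upstream json file against a fresh
    hash of its current contents. (Illustrative sketch, not scap's code.)"""
    json_path = os.path.join(upstream_dir, "l10n_cache-%s.cdb.json" % lang)
    md5_path = json_path.replace(".json", ".MD5")
    with open(json_path, "rb") as f:
        current = hashlib.md5(f.read()).hexdigest()
    with open(md5_path) as f:
        recorded = f.read().strip()
    return current != recorded
```

The payoff is that only the small json/MD5 pairs cross the network; the expensive cdb files are regenerated locally on each apache instead of being rsynced out from tin.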
[19:51:30] Now I'm getting spammed by 1000s of lines of errors Aa [19:51:34] DMAN IT [19:51:42] He left as I was trying to tab complete [19:51:48] mw1201: rsync: rename failed for "/usr/local/apache/common-local/php-1.23wmf8/cache/l10n/upstream/l10n_cache-gsw.cdb.MD5" (from upstream/.~tmp~/l10n_cache-gsw.cdb.MD5): No such file or directory (2) [19:51:49] mw1201: rsync: rename failed for "/usr/local/apache/common-local/php-1.23wmf8/cache/l10n/upstream/l10n_cache-gsw.cdb.json" (from upstream/.~tmp~/l10n_cache-gsw.cdb.json): No such file or directory (2) [19:51:49] mw1201: rsync: rename failed for "/usr/local/apache/common-local/php-1.23wmf8/cache/l10n/upstream/l10n_cache-gu.cdb.MD5" (from upstream/.~tmp~/l10n_cache-gu.cdb.MD5): No such file or directory (2) [19:52:37] !log aaron synchronized php-1.23wmf8/cache/l10n/upstream [19:52:53] !log reedy synchronized php-1.23wmf8/cache/l10n/ 'Sync' [19:52:53] "fixed" [19:53:04] \o/ [19:53:20] * aude sighs [19:53:27] exception.log is also quiet again (to confirm it's fixed) [19:54:12] (03CR) 10Reedy: "I guess in that case we should probably remove the tampa servers from any of the mw related dsh lists too..." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/105006 (owner: 10Reedy) [19:54:27] AaronSchulz: I got tonnes of spam at the end [19:54:28] mw1201: rsync: rename failed for "/usr/local/apache/common-local/php-1.23wmf8/cache/l10n/upstream/l10n_cache-gu.cdb.MD5" (from upstream/.~tmp~/l10n_cache-gu.cdb.MD5): No such file or directory (2) [19:54:28] etc [19:56:27] !log reedy updated /a/common to {{Gerrit|Ic5919deee}}: Revert "Enable Wikidata build on beta labs" [19:56:34] (03PS1) 10Reedy: Wikipedias to 1.23wmf8 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105014 [19:57:09] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf8 [19:58:26] https://ganglia.wikimedia.org/latest/graph.php?r=hour&z=xlarge&title=MediaWiki+errors&vl=errors+%2F+sec&x=0.5&n=&hreg[]=vanadium.eqiad.wmnet&mreg[]=fatal|exception>ype=stack&glegend=show&aggregate=1&embed=1 [19:58:27] yay [19:58:35] (03CR) 10Reedy: [C: 032] Wikipedias to 1.23wmf8 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105014 (owner: 10Reedy) [19:59:02] (03Merged) 10jenkins-bot: Wikipedias to 1.23wmf8 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105014 (owner: 10Reedy) [19:59:30] (03PS1) 10Reedy: testwiki to 1.23wmf9 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105015 [19:59:31] (03PS1) 10Reedy: Rest of phase1 to 1.23wmf9 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105016 [19:59:54] (03CR) 10Reedy: [C: 032] testwiki to 1.23wmf9 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105015 (owner: 10Reedy) [20:00:37] Reedy: probably the sync-dirs running at the same time (which doesn't really matter) [20:00:47] reedy@tin:/a/common$ scap 1.23wmf9 Re-build 1.23wmf9 localisation cache [20:00:47] Syncing all versions. 
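[editor's note] The `rsync: rename failed ... (from upstream/.~tmp~/...)` spam above is characteristic of two `--delay-updates` runs racing over the same staging directory: one sync's `--delete` sweeps away the other's `.~tmp~` dir before its renames complete — consistent with Reedy's "probably the sync-dirs running at the same time". One minimal guard is to serialize syncs behind an exclusive lock file; this is an illustrative sketch, not how scap or sync-dir actually coordinate.

```python
import errno
import os
import time


def with_sync_lock(lock_path, sync_fn, timeout=600):
    """Run sync_fn() while holding an exclusive lock file, so two deploy
    syncs cannot race over the same rsync --delay-updates staging dir."""
    deadline = time.time() + timeout
    while True:
        try:
            # O_CREAT|O_EXCL makes creation atomic: only one process wins.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except OSError as e:
            if e.errno != errno.EEXIST or time.time() >= deadline:
                raise
            time.sleep(1)  # another sync holds the lock; wait and retry
    try:
        os.write(fd, str(os.getpid()).encode())
        return sync_fn()
    finally:
        os.close(fd)
        os.unlink(lock_path)
```

A stale-lock check (is the recorded pid still alive?) would be needed in practice; it is omitted here to keep the race itself in focus.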
[20:00:49] * AaronSchulz was getting impatient [20:01:46] (03Merged) 10jenkins-bot: testwiki to 1.23wmf9 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105015 (owner: 10Reedy) [20:01:56] !log reedy started scap: 1.23wmf9 Re-build 1.23wmf9 localisation cache [20:01:57] if that kept taking forever I was going to run mergeCdbFileUpdates everywhere, but it finished soon after [20:03:09] Updated 366 CDB file(s) in '/usr/local/apache/common-local/php-1.23wmf9/cache/l10n'. [20:16:57] Reedy: so the script doesn't check out the branches automatically? [20:18:27] It does [20:18:43] But when you get a pub key error from gerrit when checking out, the whole script stops [20:19:21] But as we need them all branching... And it's very unlikely to be a big error (if it is, I'd know gerrit was down) [20:19:39] ah [20:21:01] so waiting a few seconds and trying again results in it working next time around [20:21:31] * AaronSchulz curious since an extension was on (no branch) [20:21:34] Reedy: heads up, bsitu has some fixes for Flow that he'd like in wmf9, when should he do that? [20:21:56] Scap is well on it's way with the 1.23wmf9 localisation cache [20:22:05] When that's done, wikis to swap, then I'm done... [20:22:10] * greg-g nods [20:22:11] Not debugging audes change tonight ;) [20:22:14] :) [20:22:16] though I suppose one could keep doing it that way and check out the HEAD hash of the branch as it changes [20:22:31] I was thinking about that too [20:22:44] as it rm -rf's the build dir every time and starts again [20:23:33] greg-g thanks, we'll prepare the version bump for Flow on 1.23wmf9 and wait for your OK. [20:24:41] * greg-g nods [20:25:13] spagewmf: I'm going to go get some food right now, I probably won't be back until after Reedy's done, just fyi. [20:25:22] greg-g: regarding https://wikitech.wikimedia.org/wiki/Deployments#Week_of_January_6 [20:25:40] StevenW: yessir? [20:25:50] will all merged patches to BetaFeatures go out during that update? 
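[editor's note] Reedy's point above — the branching script dies on a one-off gerrit pub-key error, yet "waiting a few seconds and trying again results in it working" — is the classic case for retrying transient failures instead of aborting the whole run. A generic retry wrapper sketch; which exception types count as transient is an assumption here, not what the branching script actually raises.

```python
import time


def retry(fn, attempts=3, delay=5, transient=(RuntimeError,)):
    """Call fn(), retrying up to `attempts` times on exceptions listed in
    `transient`, sleeping `delay` seconds between tries."""
    for i in range(attempts):
        try:
            return fn()
        except transient:
            if i == attempts - 1:
                raise  # out of retries: surface the real error
            time.sleep(delay)
```

Wrapped around each per-extension checkout, a single flaky gerrit response no longer kills the branch cut, while a genuinely down gerrit still fails loudly after the last attempt.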
[20:26:17] or just the search additions for those listed languages? [20:26:24] pretty sure BetaFeatures is an autobranched one... [20:26:30] Yeah from master [20:26:47] thanks to you both [20:26:48] https://www.mediawiki.org/wiki/MediaWiki_1.23/wmf9/Changelog#BetaFeatures [20:26:48] Reedy: is this the first message build for wmf9? [20:26:53] AaronSchulz: Nope [20:27:06] It was done first time around when wmf8 broke [20:27:29] !log reedy finished scap: 1.23wmf9 Re-build 1.23wmf9 localisation cache [20:27:34] yay [20:27:38] scap completed in 26m 53s. [20:30:00] * greg-g foods [20:31:26] Reedy: greg-g fine with looking at the change another time [20:31:40] (03CR) 10Reedy: [C: 032] Rest of phase1 to 1.23wmf9 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105016 (owner: 10Reedy) [20:31:58] aude: Probably wants some doing it manually on testwiki/mw1017 [20:32:00] (03Merged) 10jenkins-bot: Rest of phase1 to 1.23wmf9 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105016 (owner: 10Reedy) [20:32:04] still sucks that it's so complicated to change what extensions are used on beta vs production [20:32:19] ok [20:34:12] (03PS1) 10Reedy: Make logmsgbot report scap length to irc channel, but not log it [operations/puppet] - 10https://gerrit.wikimedia.org/r/105021 [20:35:11] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: rest of phase1 wikis to 1.23wmf9 [20:36:10] ori: did you get a chance to peek at https://gerrit.wikimedia.org/r/#/c/103619/ ? [20:36:46] * AaronSchulz will kill that 'refs/changes/66/22466/14' => 'core' line afterwards [20:36:46] AaronSchulz: i'm in the middle of that right now [20:36:47] nice [20:38:07] aude: Don't disagree with you. Just something like that (quite a big change) should've probably been deployed in a more supervised fashion to the cluster ;) [20:38:22] * aude nods [20:39:51] AaronSchulz: I had to rebase it again today! 
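[editor's note] The idea floated above — stop `rm -rf`-ing the build dir on every run and instead "check out the HEAD hash of the branch as it changes" — amounts to caching keyed on the remote hash. A sketch under that assumption; `remote_head` is a hypothetical callable standing in for something like `git ls-remote`, and the `.built-head` marker file is invented for illustration.

```python
import os


def needs_checkout(build_dir, branch, remote_head):
    """Return True when the build dir must be refreshed: it does not exist
    yet, or the HEAD it was built from differs from the branch's current
    remote HEAD. remote_head(branch) -> hash is a hypothetical helper."""
    marker = os.path.join(build_dir, ".built-head")
    if not os.path.isdir(build_dir) or not os.path.exists(marker):
        return True
    with open(marker) as f:
        built = f.read().strip()
    return built != remote_head(branch)


def record_checkout(build_dir, head):
    """After a successful build, remember which HEAD it was built from."""
    with open(os.path.join(build_dir, ".built-head"), "w") as f:
        f.write(head)
```

With this in place, re-running the build after a transient failure only re-fetches branches whose HEAD actually moved, instead of starting the whole rm-and-reclone cycle again.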
[20:40:01] i should've removed a couple of the other hacks then [20:40:05] The database is currently locked to new entries and other modifications, probably for routine database maintenance, after which it will be back to normal. [20:40:51] spagewmf: [20:40:52] [02-Jan-2014 20:39:44] Fatal error: Class 'Container' not found at /usr/local/apache/common-local/php-1.23wmf9/extensions/Flow/FlowActions.php on line 101 [20:41:00] I'm guessing that's one of the things you're gonna fix? [20:41:49] spagewmf: Should be good to go... [20:41:52] manybubbles: where? :/ [20:42:21] [20:42:01] what happens to db1007, db1028 [20:42:36] Reedy: yes [20:42:36] Reedy: exception.log I believe wikidatawiki [20:42:45] s7 snapshot is 1007 [20:42:51] PROBLEM - MySQL Slave Running on db68 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Duplicate entry 39425319 for key PRIMARY on query. Defaul [20:43:00] db1028 is extension1 master [20:43:01] PROBLEM - MySQL Slave Running on db1041 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Duplicate entry 39425319 for key PRIMARY on query. Defaul [20:43:01] PROBLEM - MySQL Slave Running on db1024 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Duplicate entry 39425319 for key PRIMARY on query. Defaul [20:43:11] PROBLEM - MySQL Slave Running on db1028 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Duplicate entry 39425319 for key PRIMARY on query. Defaul [20:43:18] springle-away, ^^^ [20:43:41] PROBLEM - MySQL Slave Running on db1007 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Duplicate entry 39425319 for key PRIMARY on query. 
Defaul [20:45:10] MaxSem: I think paging/ringing him might be better than irc ping ;) [20:45:11] Exception Caught: The database is currently locked to new entries and other modifications, probably for routine database maintenance, after which it will be back to normal. The administrator who locked it offered this explanation: The database has been automatically locked while the slave database servers catch up to the master [20:45:21] on wikidata [20:45:42] Reedy, I assume icinga is doing this right now;) [20:45:49] * aude rage [20:46:01] PROBLEM - MySQL Replication Heartbeat on db1024 is CRITICAL: CRIT replication delay 320 seconds [20:46:11] PROBLEM - MySQL Replication Heartbeat on db1028 is CRITICAL: CRIT replication delay 327 seconds [20:46:21] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 333 seconds [20:46:21] PROBLEM - MySQL Replication Heartbeat on db1041 is CRITICAL: CRIT replication delay 333 seconds [20:46:47] [21:38:07] aude: Don't disagree with you. Just something like that (quite a big change) should've probably been deployed in a more supervised fashion to the cluster ;) [20:46:56] Reedy: we were trying to get your review for a long time :( [20:47:00] I know [20:47:14] But just merging it and not testing it then causing other things to break [20:47:22] we'll have to schedule a time for it again [20:47:54] we didn't get feedback even on when it'd be reviewed. at some point we have to move forward - especially since this is blocking other things [20:48:00] i agree this could have been done better but... [20:48:05] we tried [20:48:12] i'm not blaming you [20:48:21] it was merged untested and left for me to fall over [20:48:28] fair enough :( [20:48:31] PROBLEM - Varnishkafka log producer on cp1047 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishkafka [20:48:32] sorry about that [20:49:11] (03PS1) 10Andrew Bogott: Add a global that turns off the ssh banner for a bastion. 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/105069 [20:51:53] How much longer on the DB lock on enwp? [20:51:59] (03PS2) 10Andrew Bogott: Add a global that turns off the ssh banner for a bastion. [operations/puppet] - 10https://gerrit.wikimedia.org/r/105069 [20:52:37] Till someone finds what's wrong and fixes it [20:52:51] why am i seeing things go by in recent changes (with current timestamps) if the database is locked? [20:53:57] Just to annoy you [20:55:48] <\pi{r^2}> Technical_13: impatient? [20:55:57] <\pi{r^2}> ;) [20:56:25] A little, it's past my nap time and I have two posts for talk pages waiting to go... [21:00:53] * andre__ updates the bug report in https://bugzilla.wikimedia.org/show_bug.cgi?id=59221 and comments on some Village Pumps ("Commons is down!") [21:02:14] now it's just read only :P [21:02:26] (03CR) 10Andrew Bogott: [C: 032] Add a global that turns off the ssh banner for a bastion. [operations/puppet] - 10https://gerrit.wikimedia.org/r/105069 (owner: 10Andrew Bogott) [21:02:51] Who's running a schema change? [21:03:05] (03PS2) 10Ryan Lane: Add redis config for keystone in labs [operations/puppet] - 10https://gerrit.wikimedia.org/r/104322 [21:04:04] Erm why is stuff in lockdown? [21:04:19] Just to annoy you [21:04:29] Qcoder00: there is a replication issue ongoing; staff are on it already. [21:04:30] Reedy: OK we have the bump to extensions/Flow for 1.23wmf9 ready, is it OK for bsitu to sync-dir it? [21:04:39] No [21:04:49] Reedy: how many presses of your arrow key did that take? ;-D [21:05:18] lol twkozlowski it's a macro to save time... [21:05:31] RECOVERY - Varnishkafka log producer on cp1047 is OK: PROCS OK: 1 process with command name varnishkafka [21:07:21] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay -0 seconds [21:07:41] RECOVERY - MySQL Slave Running on db1007 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [21:08:22] !log All? 
#Wikimedia wikis read only since about 20.40 UTC, s7 database replication halted [21:08:33] not that Twitter posting works, but who knows [21:09:01] RECOVERY - MySQL Slave Running on db1041 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [21:09:01] RECOVERY - MySQL Slave Running on db1024 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [21:09:01] RECOVERY - MySQL Replication Heartbeat on db1024 is OK: OK replication delay -1 seconds [21:09:11] RECOVERY - MySQL Slave Running on db1028 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [21:09:11] RECOVERY - MySQL Replication Heartbeat on db1028 is OK: OK replication delay 2 seconds [21:09:21] RECOVERY - MySQL Replication Heartbeat on db1041 is OK: OK replication delay -1 seconds [21:09:44] grr -!- morebots [local-more@208.80.153.164] has quit [Ping timeout: 240 seconds] [21:09:45] !log someone's schema change hit http://bugs.mysql.com/bug.php?id=61548 and broke replication on s7, I dropped triggers and renamed target table into… well, you will see. [21:09:50] ah [21:10:08] DOMAS SAVE[D] US [21:10:13] <\pi{r^2}> Yay! [21:10:17] can't log [21:10:22] and wikitech needs some magic tokens to log in [21:10:23] :( [21:10:56] ori: morebots please [21:11:16] or Ryan_Lane [21:11:35] domas: do you have an account on wikitech? [21:11:35] people add crap features, morebots then doesn't work [21:11:42] Ryan_Lane: dunno, probably [21:11:42] and you don't need a token [21:12:03] unless you enable two factor authentication [21:12:31] waiting for wikitech.wikimedia.org [21:12:39] some day things will work [21:12:53] waiting for? [21:13:01] it loads fine for me [21:13:04] * andrewbogott will fix morebots [21:13:12] one day you'll stop being a douchbag too domas [21:13:14] good for you [21:13:21] Ryan_Lane: it doesn't work [21:13:32] I'll keep waiting for that day, and you can keep waiting for things to work [21:13:46] Ryan_Lane: what did I do? 
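[editor's note] The flood of `MySQL Slave Running ... CRITICAL` alerts above boils down to one condition: the SQL thread stopped on a duplicate-key error while the IO thread kept running — the signature of the schema-change/trigger interaction domas logs (mysql bug 61548). A simplified version of that health check, taking a dict shaped like one row of `SHOW SLAVE STATUS`; the field names are real MySQL status fields, but the function is an illustration, not the icinga plugin itself.

```python
def slave_ok(status):
    """Replication is healthy only if both threads run and no error is set.
    `status` is a simplified SHOW SLAVE STATUS row as a dict."""
    return (status.get("Slave_IO_Running") == "Yes"
            and status.get("Slave_SQL_Running") == "Yes"
            and not status.get("Last_Error"))


# The s7 breakage as icinga reported it: IO thread up, SQL thread stopped.
broken = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "No",
          "Last_Error": "Duplicate entry 39425319 for key PRIMARY"}
healthy = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
           "Last_Error": ""}
```

Since the IO thread was still "Yes" on every affected slave, the breakage was in applying events, not in receiving them — which is why dropping the offending triggers on the masters was enough to let all four slaves recover within a minute of each other.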
[21:14:31] domas: I saved your !log on wiki [21:14:37] merci [21:15:00] domas: what error do you receive if any? [21:15:20] thanks of course for fixing things, but you could be a little less annoying about things and if you have issues report them nicely [21:15:25] wikitech doesn't seem to accept some unicode characters in passwords [21:15:47] Nemo_bis: yeah, that seemed to occur when opendj was upgraded [21:15:58] Ryan_Lane: what the fuck is wrong with you [21:16:03] I've been meaning to take a look at it. likely due to some default password policy [21:16:04] I have 40s page loads [21:16:16] well Ryan_Lane it's understandable to be a bit frustrated/anxious when you're fixing all projects being read only :) [21:16:17] sure I can point out that 40s page execution is on the slow side [21:16:26] Ryan_Lane: so fuck you and your attitudes [21:16:28] 40s? [21:16:32] yes, 40s [21:16:42] * andrewbogott too :( [21:16:42] where? wikitech? [21:17:02] https://www.dropbox.com/s/adqxgll5hmscsii/Screenshot%202014-01-02%2013.15.01.png [21:17:14] Ryan_Lane: learn to read [21:17:20] [13:12:30] waiting for wikitech.wikimedia.org [21:17:41] I guess consultants and not FTEs can be assholes [21:18:03] well, if you'd report your problem nicely rather than just being a negative person about everything, maybe you'd get a nicer response [21:18:14] my contract is short term and it's not for this [21:18:19] maybe I was looking at tons of different things [21:18:25] and yes, I reported that it doesn't load fast enough [21:18:33] what do you want me to do [21:18:41] go file bugzillla incident with "please fix it" ? [21:18:51] oops, now it's fast for me again. [21:18:54] maybe say in which way it's loading slow? login loading slow is a different problem [21:19:08] the wiki is fast for me, over a mifi [21:19:19] (03CR) 10Mark Bergsma: [C: 031] "Actually I didn't remove LVS at pmtpa, that's still fully functional. 
But Squid and SSL have been removed, which doesn't prevent the eqiad" [operations/puppet] - 10https://gerrit.wikimedia.org/r/105006 (owner: 10Reedy) [21:20:37] so, what happened with the db lock? [21:20:40] Ryan_Lane: *shrug*, all I knew few minutes ago was "omg site readonly, replication broken", I had to make mirriads of assumptions and make it work, by shooting someone's work in the head [21:20:57] Ryan_Lane: I don't know why you expect me to be extremely eloquent about whatever other problem I hit on the way [21:21:08] spagewmf: Should be good to go now [21:21:10] domas: heya, thanks much, can you say what happened? more than the !log, or is that about it? [21:21:13] Ryan_Lane: calling someone a "douchebag" right there [21:21:15] was not welcome [21:21:22] it's not a matter of being eloquent. it's a matter of not being a dick about things like morebots [21:21:35] it wasn't on the channel. it happens. it's an irc bot that writes to a wiki [21:21:43] a dick? I wrote that bot, and it was always on the channel [21:21:58] yes yes, your code is perfect. [21:22:02] but now apparently it is not on the channel and fails constantly since all the fancy twitter code was added [21:22:05] https://meta.wikimedia.org/wiki/User_talk:Midom#Attitudes_barnstar [21:22:23] Many pages are taking me ~10s to load right now. I'm not attentive enough to know if that's exceptional or not. [21:22:36] domas: I'm not sure it's related to the twitter posting [21:22:42] domas: As far as I know morebots fails mostly due to netsplit, and always has? [21:22:43] andrewbogott: on wikitech? that would be exceptional [21:22:45] maybe to something else [21:22:48] whatever [21:22:49] I guess I haven't looked at the source lately. [21:22:50] yes. 
netsplit [21:22:55] it's usually just labs problems or maybe in this case the wiki being unresponsive [21:22:58] and the library that domas chose to use it at fault [21:23:02] *is [21:23:08] domas: It's a legit problem, but not really self-inflicted… needs a total rewrite :( [21:23:22] though in this case, if wikitech is taking too long to respond it could also be an issue [21:23:27] greg-g: so, what happened - schema change that adds auto-incs has to be in special order [21:23:34] Well, we just refactored it to use a different irc lib, and it has the same failure case. [21:23:42] So domas may be blameless at this point. [21:23:53] greg-g: otherwise there's a nasty behavior that corrupts data and what not [21:24:40] greg-g: and breaks replication [21:25:08] domas: gotcha, could you see who started the schema change? [21:25:17] greg-g: I did not see it running anywhere [21:25:22] it could be leftover triggers [21:25:30] but they were everywhere [21:25:32] which is odd [21:25:44] interesting [21:25:49] folks who reported slow loading pages (domas, andrewbogott), which sites were those? [21:26:01] domas: what would you tell springle-away if he was here? :) [21:26:02] greg-g: fairly sure it was springle-away for https://bugzilla.wikimedia.org/show_bug.cgi?id=49189 [21:26:03] greg-g, springle has been working on the externallinks schema changes according to SAL [21:26:13] thanks both :) [21:26:18] apergos: wikitech, nothing to do with problems from earlier. [21:26:26] he should be waking up soon :) [21:26:36] yep [21:27:08] greg-g: http://bugs.mysql.com/bug.php?id=61548 :) [21:27:12] greg-g: he worked at mysql support [21:27:12] ok, I'm going to consider that not an emergency then and be metaphorically afk [21:27:14] he should get it [21:27:15] :) [21:27:15] thanks [21:27:23] andrewbogott: check for ldap issues with logstat.py [21:27:26] Ryan_Lane: My first instinct is to restart ldap -- did you just now do that? 
[21:27:30] I did not [21:27:38] I always make sure it's responding poorly first [21:27:38] * andrewbogott looks before leaping [21:27:39] I wonder what the fix is [21:27:39] (03PS1) 10Edenhill: Add support for %{Varnish:xid}x (X-Varnish: ..) [operations/software/varnish/varnishkafka] - 10https://gerrit.wikimedia.org/r/105093 [21:27:54] apergos: definitely not an emergency [21:28:22] labs eswiki replication doesn't catch up, btw [21:28:51] Guess you'll need to ask Coren|Travel nicely [21:29:14] Reedy: I? [21:29:21] domas: thanks [21:30:27] for a labsdb replication issue springle is probably also the right person to investigate, will prod him when he's around. [21:30:29] someone who cares needs to ask? ;) [21:30:57] !log bsitu synchronized php-1.23wmf9/extensions/Flow 'Update Flow to master' [21:31:19] Logged the message, Master [21:32:29] Ryan_Lane: logstat.py -something? [21:32:30] Reedy: heya, can you write up a quick post-mortem for this mornings first outage? [21:32:47] andrewbogott: python logstat.py [21:32:59] Eloquence: Thanks :) [21:33:06] oh! That makes more sense :) [21:36:11] Ryan_Lane: Search: 150928 Avg: 1232.2 ms Max: 8365 ms >100ms: 37763 (25%) >1000ms: 37467 (24%) [21:36:16] Expen$ive! [21:36:30] yep. that's a problem [21:36:31] that's on virt0? [21:36:34] yeah [21:36:56] restart opendj. we'll need to track down what queries need to be optimized [21:37:30] Let's see… I can't remember if dns needs restarting after opendj? Or only the other way 'round [21:37:30] it's also possible that there's still just a slow memory leak [21:37:35] it usually does [21:37:41] and it's random in how that breaks [21:37:48] it could break on either virt0 or virt1000 [21:37:53] 'k [21:39:03] domas, how is wikitech treating you now? [21:39:08] Faster, if not fast enough? [21:39:10] I can't log in [21:39:23] I'm not sure which password I should use [21:40:00] When you say 'not sure which...' [21:40:09] do you have a labs password? 
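[editor's note] The `logstat.py` output quoted above (`Search: 150928 Avg: 1232.2 ms Max: 8365 ms >100ms: 37763 (25%) >1000ms: 37467 (24%)`) is a plain aggregation over per-query latencies. A sketch of producing that summary line; the real script's parsing of the OpenDJ access log is omitted and the input is assumed to already be a list of millisecond timings.

```python
def summarize(times_ms):
    """Produce a logstat.py-style summary line from query times in ms."""
    n = len(times_ms)
    over100 = sum(1 for t in times_ms if t > 100)
    over1000 = sum(1 for t in times_ms if t > 1000)
    return ("Search: %d Avg: %.1f ms Max: %d ms "
            ">100ms: %d (%d%%) >1000ms: %d (%d%%)" % (
                n, sum(times_ms) / float(n), max(times_ms),
                over100, 100 * over100 // n, over1000, 100 * over1000 // n))
```

A profile like 25% of searches over 100 ms is exactly the cue Ryan_Lane describes: go hunting for unindexed LDAP filters and add indexes, with the daemon restart only buying time.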
[21:40:14] tried it too [21:40:15] Or, did you have? [21:40:31] 2fa maybe? [21:40:41] PROBLEM - Auth DNS on labs-ns1.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [21:40:43] well, see, exactly, I don't know [21:40:44] Sorry if you've been through all this already, I'm behind on my backscroll homework [21:40:53] I never had any 2FA set up [21:41:15] oh wait, I logged in [21:41:21] woo! [21:41:25] when it takes a minute for log in to work [21:41:28] I may not notice immediately [21:41:34] reasonable [21:41:40] oh snap, I should avoid stating the fact, cause then I will be called a douchebag [21:42:18] no. you should stop blaming other people's code for your own codes failures [21:42:30] simmer down, kids [21:42:41] lol [21:43:11] which of my code failed to log me in quick? [21:43:20] oh wait, opendj is leaking memory, let's restart it! [21:43:21] will fix it [21:43:23] I'm referencing the morebots comment you made [21:43:23] cron a restart [21:43:49] that's why I called you a douchebag. not because of wikitech [21:44:00] well, it still did not work [21:44:01] * andrewbogott admits that 'cron a restart' is seeming like a pretty good idea :/ [21:44:03] your wikitech issue report was just not helpful [21:44:03] because of some features added [21:44:23] :) [21:44:34] hah [21:44:36] andrewbogott: nah. the memory grows because some queries are using a large amount of memory and a large number of them are occuring [21:44:46] usually it's unindexed queries [21:45:04] I track down the queries, add indexes and the problem goes away [21:45:17] restarting the daemon is just a way to get it back into working condition. [21:45:18] Ryan_Lane: well, I have to admit, it took me multiple steps like "open another tab with console enabled, figure out which request is slow, get timing" [21:45:22] Ryan_Lane: why would a restart help then? Do you mean that all those queries are all /pending/? 
[21:45:25] the memory leak is slow [21:45:28] Ryan_Lane: about a problem that I did not really care much about at the time [21:46:03] and yes, I love vague reports, people can join in with "me toos", others can look at "is everything all right" [21:46:36] once you mentioned login was the issue, the issue was obvious ;) [21:47:06] well, unfortunately I don't always have your knowledge that "stuff leaks, just need to restart it once in a while" kind of obvious stuff [21:48:07] I haven't had to restart opendj in ages due to a leak [21:48:07] RECOVERY - Auth DNS on labs-ns1.wikimedia.org is OK: DNS OK: 9.002 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219 [21:48:07] I saw similar issues recently, can see a pattern, things leaking memory around, and not much care generally [21:48:08] there must be new queries occurring that are causing the issue. [21:48:14] when I see that, I restart the daemon, track down the queries and fix them [21:48:14] greg-g, Reedy: we updated Flow in 1.23wmf9, seems OK thanks! [21:48:26] leaving opendj stable for another 6 or so months till people start doing new odd queries [21:49:01] I don't see how the situation is any different from databases. bad queries cause problems. [21:49:22] track them down, fix them, maybe restart the server if it's in a bad state [21:50:08] spagewmf: whew, one thing didn't break! [21:50:42] PROBLEM - Auth DNS on labs-ns1.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [21:50:56] andrewbogott: pdns on virt1000? [21:51:12] andrewbogott: after restarting opendj it's necessary to check pdns on both [21:51:24] I did... [21:51:28] I can't wait till we can switch away from pdns backed ldap [21:51:32] err ldap backed pdns [21:51:53] * Damianz notes domas is even more obnoxious than he is and raises his aspirations [21:52:36] Damianz: yes, sure, I call people douchebags on a first chance. For example I have no idea who you are, but I'm sure you're a douchebag. 
Great to meet you, sir [21:52:46] Damianz: hah. not a good role model [21:52:55] !blame [21:53:46] :( [21:53:52] wrong channel or something [21:54:03] wm-bot: slacker [21:54:23] it's learning from the bad example of morebots the AWOL repeated offender [22:03:31] !log aaron started scap: active Timing test [22:03:31] PROBLEM - Varnishkafka log producer on cp1047 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishkafka [22:03:41] RECOVERY - Auth DNS on labs-ns1.wikimedia.org is OK: DNS OK: 8.754 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219 [22:03:53] Logged the message, Master [22:05:04] * Eloquence pets morebots [22:05:10] <^d> ori: You around? [22:05:46] ^d: yep [22:05:54] <^d> Think we could get https://gerrit.wikimedia.org/r/#/c/103768/ in today? [22:06:42] PROBLEM - Auth DNS on labs-ns1.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [22:07:56] the heck? [22:08:09] paravoid: around? [22:08:31] RECOVERY - Varnishkafka log producer on cp1047 is OK: PROCS OK: 1 process with command name varnishkafka [22:09:24] (03PS3) 10Ori.livneh: Revert "Configure Varnish not to cache scholarship app reqs" [operations/puppet] - 10https://gerrit.wikimedia.org/r/103768 (owner: 10BryanDavis) [22:10:26] (03CR) 10Ori.livneh: [C: 032 V: 032] Revert "Configure Varnish not to cache scholarship app reqs" [operations/puppet] - 10https://gerrit.wikimedia.org/r/103768 (owner: 10BryanDavis) [22:11:31] PROBLEM - Varnishkafka log producer on cp1047 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishkafka [22:12:27] (03PS1) 10Aaron Schulz: Removed called to undefined "die" method in scap-2 [operations/puppet] - 10https://gerrit.wikimedia.org/r/105101 [22:12:36] ori: ^ [22:12:37] AaronSchulz: i'm still on that change, btw [22:12:39] heh [22:12:41] (03CR) 10Ottomata: [C: 032 V: 032] Add support for %{Varnish:xid}x (X-Varnish: ..) 
[operations/software/varnish/varnishkafka] - 10https://gerrit.wikimedia.org/r/105093 (owner: 10Edenhill) [22:13:57] !log loaded VCL change from Id54312fd5 (scholarship app) on cp1043 & cp1044 (misc-eqiad) [22:14:05] ^d: ^ [22:14:14] <^d> :) [22:14:20] Logged the message, Master [22:14:40] (03CR) 10Ori.livneh: [C: 032 V: 032] Removed called to undefined "die" method in scap-2 [operations/puppet] - 10https://gerrit.wikimedia.org/r/105101 (owner: 10Aaron Schulz) [22:18:31] <^d> ori: Getting cache hits, pages are varying on Cookie as they should :) [22:18:37] ^d, ori: Scholarships looks like it's behaving [22:18:44] Jinx [22:18:46] <^d> Yep, looks all good to me. [22:18:47] <^d> :) [22:18:57] sweet! [22:23:41] RECOVERY - Auth DNS on labs-ns1.wikimedia.org is OK: DNS OK: 9.291 seconds response time. nagiostest.beta.wmflabs.org returns 208.80.153.219 [22:25:39] !log aaron finished scap: active Timing test [22:25:57] Logged the message, Master [22:26:22] !log reedy updated /a/common to {{Gerrit|I2c2836d40}}: Rest of phase1 to 1.23wmf9 [22:26:26] (03PS1) 10Reedy: Remove loginwiki from echowiki dblist [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105102 [22:26:41] PROBLEM - Auth DNS on labs-ns1.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [22:26:52] Logged the message, Master [22:27:01] (03CR) 10Reedy: [C: 032] Remove loginwiki from echowiki dblist [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105102 (owner: 10Reedy) [22:27:30] hm, 24.5min [22:27:43] (03Merged) 10jenkins-bot: Remove loginwiki from echowiki dblist [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105102 (owner: 10Reedy) [22:28:33] !log reedy synchronized echowikis.dblist 'Remove loginwiki' [22:28:56] Logged the message, Master [22:35:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:31:45 PM UTC [22:35:28] RECOVERY - Varnishkafka log producer on cp1047 is OK: PROCS 
OK: 1 process with command name varnishkafka [22:37:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:31:45 PM UTC [22:37:52] (03PS4) 10BryanDavis: [WIP] Kibana puppet class [operations/puppet] - 10https://gerrit.wikimedia.org/r/104172 [22:39:16] !log aaron started scap: active Timing test [22:39:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:31:45 PM UTC [22:39:53] Logged the message, Master [22:40:28] RECOVERY - Puppet freshness on wtp1003 is OK: puppet ran at Thu Jan 2 22:40:19 UTC 2014 [22:41:43] is ULSFO serving traffic right now? I remember emails saying it was turned on for people in oceania. (this is for the signpost's tech report) [22:42:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [22:42:18] RECOVERY - SSH on virt1007 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [22:43:00] !log aaron finished scap: active Timing test [22:43:03] oop, those extra varnishkafka and/or varnishncsa alerts on cp1047 are me, Snaps and I are testing somethin [22:43:08] 4min [22:43:24] Logged the message, Master [22:43:28] PROBLEM - Varnishkafka log producer on cp1047 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishkafka [22:43:33] * AaronSchulz goes to figure out how much time was rsync and how much cdb rebuilding... [22:44:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [22:46:19] (03PS1) 10BryanDavis: Add kibana.wikimedia.org [operations/dns] - 10https://gerrit.wikimedia.org/r/105105 [22:46:19] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [22:48:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [22:50:02] AaronSchulz: build CDB somewhere, torrent it out! 
[22:50:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [22:50:36] AaronSchulz: HHVM is interesting in the way that one can ship only bytecode changes [22:50:42] actually, I think I'll just do another dsh phase for that with full fanout since the work isn't shared at all [22:51:01] should go from 4min to 20sec or so [22:51:17] you could do parallel build! [22:51:40] fan out different parts of the tree to different hosts, then merge them into single DB built, then distribute the file [22:51:44] or multiple .db subfiles [22:51:49] there're so many ways! [22:51:49] ;-) [22:52:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [22:52:38] RECOVERY - Puppet freshness on virt1007 is OK: puppet ran at Thu Jan 2 22:52:33 UTC 2014 [22:54:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [22:54:38] RECOVERY - Host virt1007 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms [22:56:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [22:58:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [23:00:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [23:02:18] PROBLEM - Puppet freshness on wtp1003 is CRITICAL: Last successful Puppet run was Thu 02 Jan 2014 10:40:19 PM UTC [23:02:38] RECOVERY - Puppet freshness on wtp1003 is OK: puppet ran at Thu Jan 2 23:02:36 UTC 2014 [23:03:42] RECOVERY - Auth DNS on labs-ns1.wikimedia.org is OK: DNS OK: 0.118 seconds response time. 
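domas's "parallel build, then merge" suggestion above is the standard shard/merge pattern. The sketch below uses plain dicts in place of real CDB files (the on-disk CDB format and scap's actual helpers are not shown in the log, so everything here is illustrative): each worker builds records for a disjoint slice of the tree, then the shards are merged into one map.

```python
# Sketch of fanning a localisation-cache build out over workers and
# merging the results. build_shard is a hypothetical stand-in for
# whatever per-file work scap's CDB rebuild actually does.
from multiprocessing import Pool

def build_shard(paths):
    """Build one shard of the key->value map from a subset of the tree."""
    return {p: 'record-for-%s' % p for p in paths}

def parallel_build(all_paths, workers=4):
    # Round-robin the inputs so every worker gets a disjoint slice.
    chunks = [all_paths[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        shards = pool.map(build_shard, chunks)
    merged = {}
    for shard in shards:  # disjoint inputs, so no key collisions
        merged.update(shard)
    return merged

if __name__ == '__main__':
    files = ['messages/en.json', 'messages/de.json', 'messages/fr.json']
    print(len(parallel_build(files, workers=2)))
```

As AaronSchulz notes, the simpler win is just giving the (unshared) rebuild its own full-fanout dsh phase; sharding the build itself is the heavier-weight version of the same idea.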
nagiostest.beta.wmflabs.org returns 208.80.153.219 [23:05:45] (03CR) 10BryanDavis: [WIP] Kibana puppet class (033 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104172 (owner: 10BryanDavis) [23:06:41] PROBLEM - Auth DNS on labs-ns1.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [23:11:46] (03CR) 10BryanDavis: [WIP] Kibana puppet class (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/104172 (owner: 10BryanDavis) [23:12:14] greg-g: I've never seen https://www.mediawiki.org/wiki/MediaWiki_1.XX/wmfNN/Changelog before, I added links to Reedy's novellas to https://www.mediawiki.org/wiki/MediaWiki_1.23 [23:13:01] RECOVERY - NTP on virt1007 is OK: NTP OK: Offset -0.04505157471 secs [23:13:55] spagewmf: cool, I sometimes try to pull out the important changes and put them at the top. [23:15:29] what's the script that generates the Changelog wikitext ? [23:15:45] it's in make-release [23:15:48] I believe [23:15:50] * greg-g looks [23:17:08] greg-g: make-deploy-notes/make-deploy-notes I think [23:17:14] there [23:17:19] sorry, was multitasking with my bank [23:18:01] (03PS1) 10Aaron Schulz: Added a separate scap-rebuild-cdbs phase to scap [operations/puppet] - 10https://gerrit.wikimedia.org/r/105110 [23:18:36] (03PS2) 10Aaron Schulz: Added a separate scap-rebuild-cdbs phase to scap [operations/puppet] - 10https://gerrit.wikimedia.org/r/105110 [23:22:40] paravoid: hi, happy new year :) [23:26:14] average: i don't think he's about today [23:26:20] (03PS3) 10Aaron Schulz: Added a separate scap-rebuild-cdbs phase to scap [operations/puppet] - 10https://gerrit.wikimedia.org/r/105110 [23:26:55] Reedy: ah ok [23:37:41] RECOVERY - Auth DNS on labs-ns1.wikimedia.org is OK: DNS OK: 9.478 seconds response time. 
nagiostest.beta.wmflabs.org returns 208.80.153.219 [23:40:41] PROBLEM - Auth DNS on labs-ns1.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [23:48:41] PROBLEM - DPKG on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:48:51] PROBLEM - swift-container-server on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:48:51] PROBLEM - puppet disabled on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:48:51] PROBLEM - swift-object-updater on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:01] PROBLEM - swift-object-auditor on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:01] PROBLEM - swift-container-replicator on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:01] PROBLEM - RAID on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:01] PROBLEM - Disk space on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:01] PROBLEM - swift-object-replicator on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:11] PROBLEM - swift-account-auditor on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:11] PROBLEM - swift-container-auditor on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:11] PROBLEM - swift-object-server on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:21] PROBLEM - swift-account-reaper on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:21] PROBLEM - swift-account-replicator on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:31] PROBLEM - swift-container-updater on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [23:49:31] PROBLEM - swift-account-server on ms-be7 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[23:49:44] so, no one added the ipb_parent_block_id index [23:49:56] no wonder that DELETE query shows up high in ishmael [23:50:01] must be a table scan each query [23:50:23] :( [23:50:31] omg nested deletes [23:50:31] PROBLEM - Swift HTTP backend on ms-fe1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:50:45] nested blocks that is [23:51:21] RECOVERY - Swift HTTP backend on ms-fe1 is OK: HTTP OK: HTTP/1.1 200 OK - 343 bytes in 0.062 second response time [23:54:41] RECOVERY - DPKG on ms-be7 is OK: All packages OK [23:54:41] RECOVERY - puppet disabled on ms-be7 is OK: OK [23:54:41] RECOVERY - swift-container-server on ms-be7 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [23:54:42] RECOVERY - swift-object-updater on ms-be7 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater [23:54:51] RECOVERY - swift-container-replicator on ms-be7 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [23:54:51] RECOVERY - swift-object-auditor on ms-be7 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [23:54:51] RECOVERY - RAID on ms-be7 is OK: OK: no disks configured for RAID [23:54:51] RECOVERY - swift-object-replicator on ms-be7 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [23:55:01] RECOVERY - swift-account-auditor on ms-be7 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [23:55:01] RECOVERY - swift-container-auditor on ms-be7 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [23:55:01] RECOVERY - swift-object-server on ms-be7 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [23:55:11] RECOVERY - swift-account-reaper on ms-be7 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper 
[23:55:11] RECOVERY - swift-account-replicator on ms-be7 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [23:55:22] RECOVERY - swift-container-updater on ms-be7 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [23:55:22] RECOVERY - swift-account-server on ms-be7 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [23:59:05] (03PS1) 10Ori.livneh: Set $wgULSFontRepositoryBasePath to protocol-relative URL [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105115 [23:59:19] okay, lightning deploy time! [23:59:27] * MaxSem will be first