[00:51:11] PROBLEM - MySQL Replication Heartbeat on db1016 is CRITICAL: CRIT replication delay 308 seconds
[00:51:26] PROBLEM - MySQL Slave Delay on db1016 is CRITICAL: CRIT replication delay 323 seconds
[00:53:26] RECOVERY - MySQL Replication Heartbeat on db1016 is OK: OK replication delay -0 seconds
[00:53:37] RECOVERY - MySQL Slave Delay on db1016 is OK: OK replication delay 0 seconds
[00:59:50] PROBLEM - puppet last run on cp1008 is CRITICAL: CRITICAL: Puppet has 1 failures
[01:17:54] RECOVERY - puppet last run on cp1008 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[01:42:11] (CR) Springle: [C: 1] Add defines for working with mysql config files, and mysql client settings [puppet] - https://gerrit.wikimedia.org/r/169722 (owner: Ottomata)
[02:04:46] PROBLEM - puppet last run on amssq61 is CRITICAL: CRITICAL: puppet fail
[02:24:57] RECOVERY - puppet last run on amssq61 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[04:12:06] (PS7) Springle: prepare dbproxy1002 for trials [puppet] - https://gerrit.wikimedia.org/r/164293
[04:15:27] (PS8) Springle: prepare dbproxy1002 for trials [puppet] - https://gerrit.wikimedia.org/r/164293
[04:16:53] (CR) Springle: [C: 2] prepare dbproxy1002 for trials [puppet] - https://gerrit.wikimedia.org/r/164293 (owner: Springle)
[04:44:27] springle: hey
[04:45:06] springle: is there any reason to keep "mariadb::monitor_replication" under role::mariadb::dbstore?
[04:46:18] paravoid: the alternative being?
[04:46:40] it's defined in the submodule
[04:46:59] isn't dbstore purposefully lagged?
[04:47:08] oh right
[04:47:12] one is, one isn't
[04:47:15] oh
[04:47:25] i have yet to split it up
[04:48:25] I see some support for increased thresholds
[04:48:39] but it probably doesn't work, icinga has been in a WARNING state for weeks now
[04:48:41] this is about the icinga warnings?
[04:49:10] yeah :)
[04:49:17] sorry, XY problem etc. :)
[04:49:18] the thresholds do work. the issue is that the current lag mechanism stops the slave, which makes seconds_behind_master null
[04:49:58] the check script then warns because sql thread is stopped and/or lag is null (ie, wrong, though it doesn't know why)
[04:50:23] I see a lot of that
[04:50:25] but I also see
[04:50:26] WARNING slave_sql_lag Seconds_Behind_Master: 95730
[04:50:36] log_warn is 90000
[04:50:52] that's 2.5h of additional lag, right?
[04:51:00] yes
[04:51:21] yeah ok
[04:51:35] it shouldn't be doing that
[04:51:52] so is this a legitimate warning?
[04:52:34] with the thresholds we've set, yes. though i don't care unless it gets days behind :)
[04:52:38] i'll sort it out
[04:53:18] we should probably fix the checks :)
[04:53:27] i'd rather fix the lag
[04:53:33] heh
[05:11:04] * springle adds $happy_paravoid flag to replag check script :P
[05:11:08] :P
[05:23:06] (PS1) Springle: Option to silence some Icinga warnings for MariaDB replication. [puppet] - https://gerrit.wikimedia.org/r/170653
[05:29:39] (PS1) Springle: Class parameter to pass --no-warn-stopped to check_mariadb.pl [puppet/mariadb] - https://gerrit.wikimedia.org/r/170654
[05:32:03] (PS2) Springle: Class parameter to pass --no-warn-stopped to check_mariadb.pl [puppet/mariadb] - https://gerrit.wikimedia.org/r/170654
[05:37:39] (PS1) Springle: Fix incorrect role parameter. [puppet] - https://gerrit.wikimedia.org/r/170656
[05:38:38] (CR) Springle: [C: 2] Fix incorrect role parameter. [puppet] - https://gerrit.wikimedia.org/r/170656 (owner: Springle)
[05:55:43] \o/ \o/
[05:55:44] :)
[06:29:52] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:59] PROBLEM - puppet last run on amssq48 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:22] PROBLEM - puppet last run on elastic1024 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:37:35] _joe__: icinga is full of HHVM-related alerts too btw
[06:42:19] (CR) Faidon Liambotis: [C: 1] "WFM" [puppet/mariadb] - https://gerrit.wikimedia.org/r/170654 (owner: Springle)
[06:42:29] PROBLEM - puppet last run on db1024 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:46:34] RECOVERY - puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures
[06:48:04] RECOVERY - puppet last run on amssq48 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[06:49:19] RECOVERY - puppet last run on elastic1024 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[06:56:49] <_joe__> paravoid: again?
[06:56:56] yeah
[06:57:12] <_joe__> sorry just woke up
[06:57:44] 68 Matching Service Entries Displayed
[06:57:46] *sigh*
[06:57:51] (PS2) Springle: Option to silence some Icinga warnings for MariaDB replication. [puppet] - https://gerrit.wikimedia.org/r/170653
[06:57:53] <_joe__> uh?
[06:57:59] not just hhvm
[06:58:03] it's just a sad picture
[06:58:08] lots of alerts
[06:58:40] (CR) Springle: [C: 2] Option to silence some Icinga warnings for MariaDB replication. [puppet] - https://gerrit.wikimedia.org/r/170653 (owner: Springle)
[06:59:00] <_joe__> paravoid: hhvm I just see the last three installed servers that have graphite checks failing
[06:59:27] <_joe__> but that's because some changes must have broken the puppetization, clearly
[07:00:33] RECOVERY - puppet last run on db1024 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:01:01] I see: mw1029 (check_procs), mw1030/mw1031/mw1032 (busy threads/queue size), mw1189 (same)
[07:01:27] RECOVERY - DPKG on mw1113 is OK: All packages OK
[07:01:40] and mw1163 nutcracker
[07:01:54] RECOVERY - nutcracker port on mw1163 is OK: TCP OK - 0.000 second response time on port 11212
[07:02:08] (CR) Springle: [C: 2] Class parameter to pass --no-warn-stopped to check_mariadb.pl [puppet/mariadb] - https://gerrit.wikimedia.org/r/170654 (owner: Springle)
[07:02:54] <_joe__> paravoid: yes the queue size ones are to investigate, but it's a problem with the check
[07:04:15] <_joe__> and nutcracker was on a depooled host, but I don't see other alarms there
[07:05:11] <_joe__> anyways, I'll take a shower and see why the latest appservers have failing icinga checks for busy threads and queue size.
[07:05:18] :) [07:05:48] PROBLEM - puppet last run on cp3003 is CRITICAL: CRITICAL: puppet fail [07:08:44] (03PS1) 10Springle: dbproxy: use full path for included /etc/haproxy/conf.d config files [puppet] - 10https://gerrit.wikimedia.org/r/170662 [07:12:11] (03PS1) 10Springle: WIP dbproxy monitoring [puppet] - 10https://gerrit.wikimedia.org/r/170663 [07:19:32] (03PS1) 10Springle: update mariadb submodule [puppet] - 10https://gerrit.wikimedia.org/r/170664 [07:20:06] (03CR) 10Springle: [C: 032] update mariadb submodule [puppet] - 10https://gerrit.wikimedia.org/r/170664 (owner: 10Springle) [07:22:56] oh heh too tricky for own good [07:23:29] PROBLEM - puppet last run on db1073 is CRITICAL: CRITICAL: puppet fail [07:24:18] PROBLEM - puppet last run on es1001 is CRITICAL: CRITICAL: puppet fail [07:24:30] PROBLEM - puppet last run on db2019 is CRITICAL: CRITICAL: puppet fail [07:24:59] PROBLEM - puppet last run on db2009 is CRITICAL: CRITICAL: puppet fail [07:25:28] RECOVERY - puppet last run on cp3003 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [07:25:48] (03PS1) 10Springle: Don't get tricky with reusing variable names. [puppet/mariadb] - 10https://gerrit.wikimedia.org/r/170665 [07:25:49] PROBLEM - puppet last run on es2008 is CRITICAL: CRITICAL: puppet fail [07:25:59] PROBLEM - puppet last run on es1008 is CRITICAL: CRITICAL: puppet fail [07:26:28] PROBLEM - puppet last run on es2001 is CRITICAL: CRITICAL: puppet fail [07:27:37] (03CR) 10Springle: [C: 032] Don't get tricky with reusing variable names. 
[puppet/mariadb] - 10https://gerrit.wikimedia.org/r/170665 (owner: 10Springle) [07:27:48] PROBLEM - puppet last run on db1067 is CRITICAL: CRITICAL: puppet fail [07:27:52] PROBLEM - puppet last run on db1046 is CRITICAL: CRITICAL: puppet fail [07:27:52] PROBLEM - puppet last run on db2018 is CRITICAL: CRITICAL: puppet fail [07:28:38] PROBLEM - puppet last run on db1042 is CRITICAL: CRITICAL: puppet fail [07:28:59] PROBLEM - puppet last run on db2029 is CRITICAL: CRITICAL: puppet fail [07:29:31] <_joe__> eheh [07:29:58] PROBLEM - puppet last run on db1048 is CRITICAL: CRITICAL: puppet fail [07:29:59] PROBLEM - puppet last run on db2016 is CRITICAL: CRITICAL: puppet fail [07:30:24] (03PS1) 10Springle: Unbreak puppet on DBs [puppet] - 10https://gerrit.wikimedia.org/r/170666 [07:30:38] PROBLEM - puppet last run on db1071 is CRITICAL: CRITICAL: puppet fail [07:30:39] PROBLEM - puppet last run on db1036 is CRITICAL: CRITICAL: puppet fail [07:30:47] (03CR) 10Springle: [C: 032] Unbreak puppet on DBs [puppet] - 10https://gerrit.wikimedia.org/r/170666 (owner: 10Springle) [07:31:20] PROBLEM - puppet last run on db2023 is CRITICAL: CRITICAL: puppet fail [07:32:42] PROBLEM - puppet last run on db1062 is CRITICAL: CRITICAL: puppet fail [07:33:08] RECOVERY - puppet last run on db1073 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [07:33:39] RECOVERY - puppet last run on db1062 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [07:33:48] PROBLEM - puppet last run on db1070 is CRITICAL: CRITICAL: puppet fail [07:33:59] PROBLEM - puppet last run on db1063 is CRITICAL: CRITICAL: puppet fail [07:34:21] PROBLEM - puppet last run on es1002 is CRITICAL: CRITICAL: puppet fail [07:41:49] RECOVERY - puppet last run on es1001 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [07:43:28] RECOVERY - puppet last run on es1008 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures 
[07:43:38] RECOVERY - puppet last run on db2009 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [07:44:09] RECOVERY - puppet last run on db2019 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:44:18] RECOVERY - puppet last run on es2008 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [07:45:12] RECOVERY - puppet last run on es2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:46:09] RECOVERY - puppet last run on db1042 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [07:46:29] RECOVERY - puppet last run on db1067 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [07:46:39] RECOVERY - puppet last run on db2018 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [07:46:39] RECOVERY - puppet last run on db1046 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [07:47:38] RECOVERY - puppet last run on db2029 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [07:48:29] RECOVERY - puppet last run on db1036 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [07:48:30] RECOVERY - puppet last run on db1048 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [07:48:39] RECOVERY - puppet last run on db2016 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [07:50:09] RECOVERY - puppet last run on db1071 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [07:50:09] RECOVERY - puppet last run on db2023 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [07:52:01] RECOVERY - puppet last run on es1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:52:19] RECOVERY - puppet last run on db1070 is OK: OK: Puppet is currently enabled, 
last run 21 seconds ago with 0 failures [07:52:38] RECOVERY - puppet last run on db1063 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [07:53:08] (03PS1) 10Springle: Disable Icinga warnings from dbstore1001 delayed slave. [puppet] - 10https://gerrit.wikimedia.org/r/170667 [07:56:50] (03CR) 10Springle: [C: 032] Disable Icinga warnings from dbstore1001 delayed slave. [puppet] - 10https://gerrit.wikimedia.org/r/170667 (owner: 10Springle) [08:05:37] (03PS2) 10Springle: dbproxy: use full path for included /etc/haproxy/conf.d config files [puppet] - 10https://gerrit.wikimedia.org/r/170662 [08:05:44] (03CR) 10Springle: [C: 032] dbproxy: use full path for included /etc/haproxy/conf.d config files [puppet] - 10https://gerrit.wikimedia.org/r/170662 (owner: 10Springle) [08:11:00] enough breaking stuff for today [08:11:34] <_joe__> eww I hate collected resources in nagios [08:14:36] (03PS2) 10Giuseppe Lavagetto: HHVM: fix monitoring. [puppet] - 10https://gerrit.wikimedia.org/r/170060 [08:14:47] trailing dot! [08:14:48] <_joe__> paravoid: ^^ I plainly forgot to merge this :/ [08:17:24] (03PS1) 10QChris: Install maven on analytics clients [puppet] - 10https://gerrit.wikimedia.org/r/170668 [08:18:20] (03CR) 10Giuseppe Lavagetto: [C: 032] HHVM: fix monitoring. [puppet] - 10https://gerrit.wikimedia.org/r/170060 (owner: 10Giuseppe Lavagetto) [08:22:06] (03PS2) 10QChris: Install maven on analytics clients [puppet] - 10https://gerrit.wikimedia.org/r/170668 [08:25:38] <_joe__> springle: are we using haproxy in prod now for databases? [08:33:38] (03CR) 10QChris: Install maven on analytics clients (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/170668 (owner: 10QChris) [08:40:04] what's the status with CirrusSearch? 
[08:47:18] last that I know is that Nik disabled insource:// support as some regex searches could bring it down
[08:47:23] paravoid, ^
[08:47:38] oh, thanks
[09:01:43] <_joe__> paravoid: yes, it seems like there is a memleak in lucene that was triggered by regex searches, nik wrote an email I guess
[09:01:55] <_joe__> but cirrus is up and running, and lemme check the heap size there
[09:04:03] <_joe__> http://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Elasticsearch%20cluster%20eqiad&h=elastic1001.eqiad.wmnet&r=hour&z=default&jr=&js=&st=1415005330&v=8266078072&m=es_heap_used&vl=bytes&ti=es_heap_used&z=large looks like a few major GCs went on without issues during the weekend
[09:04:37] I'm wondering why we got old-search pages during the weekend
[09:05:37] root@search1019:/a/search/log# ls -lh
[09:05:37] total 47G
[09:05:45] -rw-r--r-- 1 lsearch search 23M Oct 30 23:59 log.2014-10-30
[09:05:45] -rw-r--r-- 1 lsearch search 22G Oct 31 23:59 log.2014-10-31
[09:05:45] -rw-r--r-- 1 lsearch search 26G Nov 1 23:59 log.2014-11-01
[09:05:45] -rw-r--r-- 1 lsearch search 20M Nov 2 23:59 log.2014-11-02
[09:06:02] (/a 100% full)
[09:27:50] <_joe__> smells like java stacktraces
[10:21:37] (CR) Filippo Giunchedi: "looks like we're duplicating work with https://gerrit.wikimedia.org/r/#/c/170300/ ?" [puppet] - https://gerrit.wikimedia.org/r/147487 (owner: Reedy)
[10:43:14] RECOVERY - Disk space on search1019 is OK: DISK OK
[10:56:04] (PS1) Faidon Liambotis: apt: do not download Translations [puppet] - https://gerrit.wikimedia.org/r/170678
[10:56:11] (CR) Giuseppe Lavagetto: hiera: mediawiki-based backend for labs (4 comments) [puppet] - https://gerrit.wikimedia.org/r/168984 (owner: Giuseppe Lavagetto)
[11:02:19] (Abandoned) Faidon Liambotis: apt: do not download Translations [puppet] - https://gerrit.wikimedia.org/r/170678 (owner: Faidon Liambotis)
[11:03:41] paravoid: Then do acquire::languages en?
[11:04:08] that's the default
[11:04:14] that's what we fetch now
[11:04:31] (well, unless you have a different LC_MESSAGES, which we don't)
[11:04:56] ah ok, then I didn't get the point
[11:05:01] thought we were fetching more stuff
[11:07:36] (PS3) Giuseppe Lavagetto: hiera: mediawiki-based backend for labs [puppet] - https://gerrit.wikimedia.org/r/168984
[11:08:22] <_joe_> godog: care to take another look? this should be ready to go live on labs
[11:09:29] <_joe_> YuviPanda: is there any documentation of the yaml files for the instance projects in wikitech? We should probably have a link in the project page under "configure"
[11:09:53] <_joe_> and encourage people to use it instead of declaring top-scope variables whenever possible.
[11:10:06] <_joe_> well, once this is merged
[11:10:16] _joe_: sure, I should be able to get to it before end of today
[11:10:18] * YuviPanda waves
[11:10:20] _joe_: nope, there's no docs...
[11:10:22] true.
[11:10:24] true
[11:10:26] also I wonder if I should get +2 today :)
[11:10:37] <_joe_> YuviPanda: eheh right!
[11:10:43] <_joe_> are you officially ops now?
[11:10:47] _joe_: I think so
[11:10:59] _joe_: mark should send an email to wmfall at some point, I think
[11:12:33] <_joe_> YuviPanda: anyways, I'm going to merge this now in beta, do a few tests, then merge it fully
[11:12:41] _joe_: cool! :)
[11:13:26] hey YuviPanda
[11:13:31] hi godog!
[11:14:03] (CR) Faidon Liambotis: [C: -2] "I'm not sure if I understand this." [puppet] - https://gerrit.wikimedia.org/r/145997 (https://bugzilla.wikimedia.org/67957) (owner: Ori.livneh)
[11:26:13] !log disable puppet on labsdb1004, labsdb1005 for postgresql reinitialization
[11:26:24] Logged the message, Master
[11:28:39] <_joe_> and of course the ruby-httpclient version we have on precise has a different api than the one on trusty
[11:28:47] * _joe_ sings in joy
[11:28:53] ruby ftw!
[11:29:41] <_joe_> Error 400 on SERVER: undefined method `ssl_version=' for #
[11:30:11] <_joe_> (yes, ruby-httpclient uses ssl3 by default, in 2014)
[11:30:43] well, ruby
[11:31:54] hey YuviPanda
[11:31:56] I have a question
[11:32:12] what's with labmon1001 and all the labs checks alerting us in production icinga?
[11:32:21] is this temporary until you set up shinken in labs?
[11:32:35] is it something you're aware of, in general?
[11:46:13] paravoid: hey!
[11:46:15] paravoid: yup, until shinken. which is fairly close, today or tomorrow
[11:50:36] PROBLEM - puppet last run on amssq36 is CRITICAL: CRITICAL: puppet fail
[11:50:48] ok, fair enough
[11:53:53] (PS4) Giuseppe Lavagetto: hiera: mediawiki-based backend for labs [puppet] - https://gerrit.wikimedia.org/r/168984
[11:55:03] <_joe_> the simple fact that I have to inspect the httpclient library to know how to set the SSL option I want, because subsequent versions of the library are mutually incompatible, says a lot.
[11:55:19] a recruiter is pestering me for a ruby position
[11:55:32] I didn't even respond to the first mail and I got a followup
[11:55:41] i can pass your name on if you're interested :P
[11:55:46] (PS1) Alexandros Kosiaris: osm: Fix rsync monitoring [puppet] - https://gerrit.wikimedia.org/r/170681
[11:55:55] "Exciting Ruby opportunities"
[11:56:15] i am excited :-)
[11:56:50] <_joe_> paravoid: ahahahah
[11:57:23] FTR I'll probably work somewhat close to EST hours
[11:57:24] <_joe_> paravoid: I have a couple of stalkers too, but not for ruby shops.
[11:57:51] <_joe_> paravoid: I can give you a PHP stalker in exchange for the ruby one
[12:05:20] he... so I need a scala stalker to beat you both ...
[12:05:40] well something jvm like... clojure would do I suppose
[12:09:25] RECOVERY - puppet last run on amssq36 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[12:12:35] PROBLEM - puppet last run on labsdb1005 is CRITICAL: CRITICAL: Puppet has 1 failures
[12:19:15] RECOVERY - puppet last run on labsdb1005 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[12:26:04] (PS1) Alexandros Kosiaris: labs: specify a admin user for postgres [puppet] - https://gerrit.wikimedia.org/r/170684
[12:44:24] (PS5) Giuseppe Lavagetto: hiera: mediawiki-based backend for labs [puppet] - https://gerrit.wikimedia.org/r/168984
[12:58:10] (CR) Giuseppe Lavagetto: [C: 2] "Works well in beta." [puppet] - https://gerrit.wikimedia.org/r/168984 (owner: Giuseppe Lavagetto)
[13:01:11] <_joe_> YuviPanda: https://wikitech.wikimedia.org/wiki/Hiera:Deployment-prep works well
[13:03:33] <_joe_> mmmh virt1000 is a labs puppetmaster, right? It doesn't seem to work as expected (it gets the prod hiera config)
[13:04:50] <_joe_> and... I think I was the one doing the dumb choice
[13:08:28] (PS1) Giuseppe Lavagetto: puppetmaster: use the correct hiera config [puppet] - https://gerrit.wikimedia.org/r/170689
[13:18:08] !log depool wtp1024.eqiad.wmnet in preparation for reimaging to trusty
[13:18:13] and let's see how this goes ...
[13:18:15] Logged the message, Master
[13:19:03] \o/
[13:37:00] <_joe_> parsoid on trusty means a not-too-old node version, right?
[13:37:16] (CR) Giuseppe Lavagetto: [C: 2] puppetmaster: use the correct hiera config [puppet] - https://gerrit.wikimedia.org/r/170689 (owner: Giuseppe Lavagetto)
[13:38:16] * _joe_ off to lunch
[13:49:44] _joe_: \o/!
[13:50:15] <_joe_> YuviPanda: and it works, which wasn't so obvious :P
[13:50:22] <_joe_> YuviPanda: we need docs and UI
[13:51:52] _joe_: yeah. On the wiki side it is just 'edit Hiera:'. Only projectadmins can do it
[13:52:12] <_joe_> yes I have seen the code
[13:52:22] <_joe_> but it should be doable from the configure project link
[13:52:34] <_joe_> and docs for using labs should be updated accordingly
[13:53:29] <_joe_> I evaluate the emotional cost of writing 1 line of ruby as 10 lines of docs. So you owe me several hundred lines of docs :P
[13:53:59] * _joe_ shamelessly tries to strongarm Yuvi
[13:55:26] PROBLEM - Host wtp1024 is DOWN: CRITICAL - Plugin timed out after 15 seconds
[13:56:17] _joe_: heh
[13:57:09] _joe_: I'll add a patch for a link from 'manage projects'
[13:57:54] <_joe_> YuviPanda: quite simply, as we move to hiera more and more, it would be crucial to have this at our disposal
[14:00:47] RECOVERY - Host wtp1024 is UP: PING OK - Packet loss = 0%, RTA = 1.40 ms
[14:06:18] (PS5) Rush: Make the 'real name' user profile field optional in phabricator [puppet] - https://gerrit.wikimedia.org/r/170433 (owner: 20after4)
[14:07:52] (CR) Rush: [C: 2 V: 2] Make the 'real name' user profile field optional in phabricator [puppet] - https://gerrit.wikimedia.org/r/170433 (owner: 20after4)
[14:11:51] (PS5) Rush: Change phab_update_tag script to remove library lock file [puppet] - https://gerrit.wikimedia.org/r/166406 (owner: Christopher Johnson (WMDE))
[14:36:00] (PS1) ArielGlenn: force daily rotation of nginx logs [puppet] - https://gerrit.wikimedia.org/r/170700
[14:38:07] !log repool wtp1024 with a weight of 1 instead of 15 for now
[14:38:15] Logged the message, Master
[14:38:18] !log wtp1024 re-installed as trusty
[14:38:24] Logged the message, Master
[14:40:16] (CR) ArielGlenn: [C: 2] force daily rotation of nginx logs [puppet] - https://gerrit.wikimedia.org/r/170700 (owner: ArielGlenn)
[15:00:04] manybubbles, anomie, ^d, marktraceur: Respected human, time to deploy SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20141103T1500). Please do the needful.
[15:00:15] uh, bot?
[15:00:30] your timezone seems off
[15:00:32] I think the bot didn't handle DST ending correctly
[15:00:42] hehe
[15:01:02] neither did the wiki: https://wikitech.wikimedia.org/wiki/Deployments#Monday.2C.C2.A0November.C2.A003
[15:01:10] It is 1500
[15:01:11] oh no!
[15:01:24] it just thinks I'm in UTC-5 instead of america/new york
[15:01:27] I can fix that one
[15:07:48] * anomie fixes Deployments to use PST, but sees the module's "us_in_dst" function is still b0rken
[15:11:29] anomie: Screw that, UTC or nothing
[15:12:59] yes please!
[15:15:37] (PS1) Giuseppe Lavagetto: puppetmaster: install ruby-httpclient in labs [puppet] - https://gerrit.wikimedia.org/r/170707
[15:15:56] (CR) Giuseppe Lavagetto: [C: 2 V: 2] puppetmaster: install ruby-httpclient in labs [puppet] - https://gerrit.wikimedia.org/r/170707 (owner: Giuseppe Lavagetto)
[15:16:23] Hm, a MatmaRex patch. But no MatmaRex in-channel.
[15:16:23] I guess he has like 50 minutes to get in
[15:17:12] marktraceur: Fancy doing another easy one?
[15:19:28] Reedy: I didn't claim yet...but put 'er in
[15:19:40] Oh, is it not window time yet?
[15:19:45] (PS3) Reedy: Add export-0.10.xsd [mediawiki-config] - https://gerrit.wikimedia.org/r/170351 (https://bugzilla.wikimedia.org/72417)
[15:19:56] No, jouncebot is Wrong
[15:20:14] It's only 07:00 in SF
[15:21:33] (PS1) Giuseppe Lavagetto: hiera: fix logging, caching of non-existence in mwyaml [puppet] - https://gerrit.wikimedia.org/r/170708
[15:22:49] (CR) Giuseppe Lavagetto: [C: 2] hiera: fix logging, caching of non-existence in mwyaml [puppet] - https://gerrit.wikimedia.org/r/170708 (owner: Giuseppe Lavagetto)
[15:49:15] marktraceur: I took Matma's patch off this morning's SWAT since the master-branch patch isn't merged yet.
[15:49:34] 'kay
[15:54:33] PROBLEM - puppet last run on elastic1017 is CRITICAL: CRITICAL: puppet fail
[16:00:04] anomie: csteipp emailed saying he'd review Matma's patch this morning. We'll get it out post-swat.
[16:00:05] manybubbles, anomie, ^d, marktraceur: Respected human, time to deploy SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20141103T1600). Please do the needful.
[16:00:39] Better.
[16:00:49] Who's doing it?
[16:00:58] marktraceur: You? ;)
[16:01:06] bd808: Or it could be done in the 16:00 PST slot
[16:01:37] marktraceur: I can do it if you don't want to
[16:01:54] anomie: I kinda don't
[16:02:15] * anomie starts SWAT
[16:02:22] (CR) Anomie: [C: 2] Add export-0.10.xsd [mediawiki-config] - https://gerrit.wikimedia.org/r/170351 (https://bugzilla.wikimedia.org/72417) (owner: Reedy)
[16:02:34] (Merged) jenkins-bot: Add export-0.10.xsd [mediawiki-config] - https://gerrit.wikimedia.org/r/170351 (https://bugzilla.wikimedia.org/72417) (owner: Reedy)
[16:02:42] anomie: should be good with a sync-docroot :)
[16:03:01] !log anomie Synchronized docroot and w: (no message) (duration: 00m 10s)
[16:03:09] Logged the message, Master
[16:03:10] Reedy: I didn't know about sync-docroot, thanks. I'd have just done sync-dir docroot/mediawiki/xml
[16:03:18] Reedy: ^^^ Test please
[16:03:29] https://www.mediawiki.org/xml/export-0.10.xsd
[16:03:30] LGTM
[16:03:37] * anomie is done with SWAT
[16:03:40] https://www.mediawiki.org/xml/
[16:03:41] Thanks
[16:06:37] (CR) Alexandros Kosiaris: [C: 2] osm: Fix rsync monitoring [puppet] - https://gerrit.wikimedia.org/r/170681 (owner: Alexandros Kosiaris)
[16:06:47] (CR) Alexandros Kosiaris: [C: 2] labs: specify a admin user for postgres [puppet] - https://gerrit.wikimedia.org/r/170684 (owner: Alexandros Kosiaris)
[16:07:11] oo, Reedy had something swat'd out?
[16:07:29] heh [16:08:07] :) [16:08:13] PROBLEM - Host ms-be2007 is DOWN: PING CRITICAL - Packet loss = 100% [16:11:05] (03CR) 10ArielGlenn: "so besides using copytruncate for now, since the api cluster at least with hhvm is a bit touchy, we should also toss 'notifempty', to forc" [puppet] - 10https://gerrit.wikimedia.org/r/130296 (owner: 10ArielGlenn) [16:13:03] RECOVERY - Host ms-be2007 is UP: PING OK - Packet loss = 0%, RTA = 43.61 ms [16:13:03] RECOVERY - puppet last run on elastic1017 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [16:17:13] PROBLEM - swift-account-replicator on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [16:17:15] PROBLEM - swift-object-auditor on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [16:17:21] Oh, I was going to ask, why isn't MatmaRex just deploying his own crap [16:17:22] Surely now we're paying him (right?) 
we can trust him with the servers [16:17:23] PROBLEM - swift-object-replicator on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [16:17:24] PROBLEM - swift-account-server on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [16:17:24] PROBLEM - swift-container-server on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [16:17:37] PROBLEM - swift-object-server on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [16:17:37] PROBLEM - swift-account-auditor on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [16:17:43] PROBLEM - swift-container-updater on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [16:17:53] PROBLEM - swift-account-reaper on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [16:17:59] PROBLEM - swift-container-auditor on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [16:17:59] PROBLEM - swift-object-updater on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-updater [16:17:59] PROBLEM - swift-container-replicator on ms-be2007 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [16:18:04] that's me [16:18:47] yeah I figured :) [16:21:38] (03PS1) 10Andrew Bogott: Added Yuvi to 'roots' [puppet] - 10https://gerrit.wikimedia.org/r/170717 [16:21:50] mark: +1? 
^ [16:22:41] (03PS2) 10Krinkle: contint: Minor clean up [puppet] - 10https://gerrit.wikimedia.org/r/168629 [16:22:48] !log added yuvi to 'Ops' ldap group [16:22:55] Logged the message, Master [16:24:38] (03PS2) 10Andrew Bogott: Add Yuvi to 'roots' [puppet] - 10https://gerrit.wikimedia.org/r/170717 [16:25:08] (03PS3) 10Krinkle: contint: Clean up order of statements [puppet] - 10https://gerrit.wikimedia.org/r/168629 [16:25:32] (03CR) 10Krinkle: contint: Clean up order of statements (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/168629 (owner: 10Krinkle) [16:25:41] (03PS3) 10Krinkle: contint: Move /srv/localhost/qunit resource out of qunit_localhost class [puppet] - 10https://gerrit.wikimedia.org/r/168630 [16:25:47] (03PS3) 10Krinkle: [WIP] contint: Apply contint::qunit_localhost to labs slaves [puppet] - 10https://gerrit.wikimedia.org/r/168631 [16:25:48] !log reboot ms-be2007, disk replaced but no corresponding raid0 LD [16:25:54] Logged the message, Master [16:27:18] springle: if you are working late tonight, can you have a look at https://bugzilla.wikimedia.org/show_bug.cgi?id=72908 ? 
[16:28:41] (03PS1) 10Giuseppe Lavagetto: monitoring: move stuff out of nagios.pp, part 1 [puppet] - 10https://gerrit.wikimedia.org/r/170718 [16:28:43] (03PS1) 10Giuseppe Lavagetto: monitoring: move stuff out of nagios.pp, part 2 [puppet] - 10https://gerrit.wikimedia.org/r/170719 [16:28:45] (03PS1) 10Giuseppe Lavagetto: monitoring: move stuff out of nagios.pp, part 3 [puppet] - 10https://gerrit.wikimedia.org/r/170720 [16:28:47] (03PS1) 10Giuseppe Lavagetto: monitoring: move stuff out of nagios.pp, part 4 [puppet] - 10https://gerrit.wikimedia.org/r/170721 [16:28:54] RECOVERY - swift-container-updater on ms-be2007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [16:28:54] RECOVERY - swift-account-auditor on ms-be2007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [16:28:54] RECOVERY - swift-object-server on ms-be2007 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [16:29:03] RECOVERY - swift-account-reaper on ms-be2007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [16:29:03] RECOVERY - swift-container-auditor on ms-be2007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [16:29:03] RECOVERY - swift-object-updater on ms-be2007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater [16:29:04] RECOVERY - swift-container-replicator on ms-be2007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [16:29:10] <_joe_> don't worry, I left the interesting parts behind for now. 
[16:29:33] RECOVERY - swift-account-replicator on ms-be2007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [16:29:45] RECOVERY - swift-object-auditor on ms-be2007 is OK: PROCS OK: 3 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [16:29:46] RECOVERY - swift-object-replicator on ms-be2007 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [16:29:46] RECOVERY - swift-account-server on ms-be2007 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [16:29:47] RECOVERY - swift-container-server on ms-be2007 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [16:29:50] (03PS2) 10Giuseppe Lavagetto: monitoring: move stuff out of nagios.pp, part 1 [puppet] - 10https://gerrit.wikimedia.org/r/170718 [16:30:58] (03CR) 10Giuseppe Lavagetto: [C: 032] monitoring: move stuff out of nagios.pp, part 1 [puppet] - 10https://gerrit.wikimedia.org/r/170718 (owner: 10Giuseppe Lavagetto) [16:31:34] RECOVERY - puppet last run on ms-be2007 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [16:34:03] (03PS2) 10Giuseppe Lavagetto: monitoring: move stuff out of nagios.pp, part 2 [puppet] - 10https://gerrit.wikimedia.org/r/170719 [16:34:53] <_joe_> !log rolling-restarting hhvm appservers [16:34:58] Logged the message, Master [16:38:19] (03CR) 10Giuseppe Lavagetto: [C: 032] monitoring: move stuff out of nagios.pp, part 2 [puppet] - 10https://gerrit.wikimedia.org/r/170719 (owner: 10Giuseppe Lavagetto) [16:39:33] PROBLEM - HHVM busy threads on mw1114 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [90.0] [16:40:04] <_joe_> it works, nice. 
[16:41:06] (03PS2) 10Giuseppe Lavagetto: monitoring: move stuff out of nagios.pp, part 3 [puppet] - 10https://gerrit.wikimedia.org/r/170720 [16:41:15] (03CR) 10Giuseppe Lavagetto: [C: 032] monitoring: move stuff out of nagios.pp, part 3 [puppet] - 10https://gerrit.wikimedia.org/r/170720 (owner: 10Giuseppe Lavagetto) [16:47:32] (03CR) 10BBlack: [C: 031] remove en2.wikipedia.org [dns] - 10https://gerrit.wikimedia.org/r/170138 (owner: 10Dzahn) [16:49:21] (03CR) 10BBlack: [C: 031] Switch *.{wap,mobile}.wikipedia.org to wikipedia-lb [dns] - 10https://gerrit.wikimedia.org/r/98055 (owner: 10Faidon Liambotis) [16:50:38] (03PS2) 10Giuseppe Lavagetto: monitoring: move stuff out of nagios.pp, part 4 [puppet] - 10https://gerrit.wikimedia.org/r/170721 [16:53:45] (03CR) 10Giuseppe Lavagetto: [C: 032] monitoring: move stuff out of nagios.pp, part 4 [puppet] - 10https://gerrit.wikimedia.org/r/170721 (owner: 10Giuseppe Lavagetto) [16:54:52] RECOVERY - HHVM busy threads on mw1114 is OK: OK: Less than 1.00% above the threshold [60.0] [16:57:25] !log repool wtp1024 at regular weight [16:57:32] Logged the message, Master [16:59:44] (03PS3) 10Andrew Bogott: Add Yuvi to 'roots' [puppet] - 10https://gerrit.wikimedia.org/r/170717 [17:06:44] (03PS5) 10ArielGlenn: script to monitor, clean up salt keys of deleted labs instances [puppet] - 10https://gerrit.wikimedia.org/r/168601 [17:07:35] (03CR) 10ArielGlenn: [C: 032] script to monitor, clean up salt keys of deleted labs instances [puppet] - 10https://gerrit.wikimedia.org/r/168601 (owner: 10ArielGlenn) [17:10:40] (03PS4) 10Krinkle: [WIP] contint: Apply contint::qunit_localhost to labs slaves [puppet] - 10https://gerrit.wikimedia.org/r/168631 [17:11:13] hello bblack [17:11:16] Did you see https://bugzilla.wikimedia.org/show_bug.cgi?id=72856 [17:17:19] * _joe_ off until ops meeting [17:18:35] (03PS1) 10Giuseppe Lavagetto: monitoring: convert monitor_group to monitoring::group [puppet] - 10https://gerrit.wikimedia.org/r/170727 [17:21:09] (03PS2) 
10Andrew Bogott: Add class and role for Openstack Horizon [puppet] - 10https://gerrit.wikimedia.org/r/170340 [17:21:22] andrewbogott: ^ yay [17:22:38] YuviPanda: I need to figure out if it will coexist with mediawiki… and also we should discuss 2fa during the Ops meeting today [17:23:02] andrewbogott: yeah. I haven't touched it at all yet, though [17:41:50] (03PS2) 10Yuvipanda: shinken: Fix typo [puppet] - 10https://gerrit.wikimedia.org/r/168133 [17:42:45] (03CR) 10Yuvipanda: [C: 032] shinken: Fix typo [puppet] - 10https://gerrit.wikimedia.org/r/168133 (owner: 10Yuvipanda) [17:42:51] ^ yay :) [17:45:56] (03PS1) 10RobH: setting mgmt ips for ms-be2013-2015 [dns] - 10https://gerrit.wikimedia.org/r/170736 [17:47:01] YuviPanda: zomg, h4x [17:52:34] (03CR) 10RobH: [C: 032] setting mgmt ips for ms-be2013-2015 [dns] - 10https://gerrit.wikimedia.org/r/170736 (owner: 10RobH) [17:56:19] (03PS2) 10Dereckson: Improving comments for wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170526 [17:58:14] (03Draft2) 10Dereckson: Improving comments for wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170526 [17:58:30] (03CR) 10Dereckson: "PS2: Rebased" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170526 (owner: 10Dereckson) [17:59:11] Reedy: heh :) [18:00:46] (03PS1) 10Dereckson: Adding tools.wikimedia.pl to Commons wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170740 (https://bugzilla.wikimedia.org/72897) [18:08:12] PROBLEM - puppet last run on db2036 is CRITICAL: CRITICAL: puppet fail [18:22:31] (03PS1) 10Gilles: Enable JPG thumbnail chaining on beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170747 (https://bugzilla.wikimedia.org/67525) [18:24:59] (03PS1) 10Dzahn: svn - move certificate installation into role [puppet] - 10https://gerrit.wikimedia.org/r/170748 [18:26:31] RECOVERY - puppet last run on db2036 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:27:23] (03PS1) 10Dzahn: gitblit - move certificate installation into role [puppet] - 10https://gerrit.wikimedia.org/r/170751 [18:31:26] (03PS1) 10Dzahn: svn - move from antimony to zirconium [puppet] - 10https://gerrit.wikimedia.org/r/170752 [18:32:14] mutante, why zirconium instead of /dev/null ? :P [18:33:03] because we want all the history [18:37:18] (03CR) 10Manybubbles: [C: 031] Adjust number of content shards for largest wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170173 (owner: 10Chad) [18:37:27] (03PS1) 10Andrew Bogott: Include the vendor repo in mediawiki checkouts. [puppet] - 10https://gerrit.wikimedia.org/r/170755 [18:38:32] (03CR) 10Manybubbles: [C: 04-1] "iirc we made it so you had to specify each of the index types if you use the array form of replica count specification. Right?" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170182 (owner: 10Chad) [18:39:16] (03CR) 10Andrew Bogott: [C: 032] Include the vendor repo in mediawiki checkouts. [puppet] - 10https://gerrit.wikimedia.org/r/170755 (owner: 10Andrew Bogott) [18:45:22] Reedy, does an extension need to be enabled on testwiki for localisation to be built properly these days? [18:45:40] Um [18:45:44] Definitely needs to be in extension-list [18:46:48] I remember we had this bug before, but not sure what the status of it is [18:47:13] (03CR) 10Chad: "I know, that's why the overrides are posted using the + syntax so the config will merge the arrays. Did that because I figured listing 'ge" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170182 (owner: 10Chad) [18:57:21] akosiaris: memory looks good so far; let's continue to monitor it until later tonight & proceed with the roll-out tomorrow? [18:57:31] (03PS1) 10Dzahn: blog - remove puppet and apache config [puppet] - 10https://gerrit.wikimedia.org/r/170760 [18:57:40] gwicke: yup [18:57:53] cool, thanks! [19:00:04] kaldari: Dear anthropoid, the time has come.
Please deploy Mobile (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20141103T1900). [19:05:24] (03CR) 10MaxSem: [C: 032] Add WikiGrok [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170450 (owner: 10MaxSem) [19:05:52] (03Merged) 10jenkins-bot: Add WikiGrok [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170450 (owner: 10MaxSem) [19:08:31] !log maxsem Started scap: Build localization cache for WikiGrok [19:08:37] Logged the message, Master [19:20:58] (03CR) 10RobH: [C: 04-1] "I say no, as the server is still online. So we need to keep it in puppet." [puppet] - 10https://gerrit.wikimedia.org/r/170760 (owner: 10Dzahn) [19:21:32] (03CR) 10Dzahn: [C: 031] "welcome YuviPanda" [puppet] - 10https://gerrit.wikimedia.org/r/170717 (owner: 10Andrew Bogott) [19:22:00] (03CR) 10Manybubbles: [C: 031] "Didn't realize that is what it did. Cool." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170182 (owner: 10Chad) [19:39:01] PROBLEM - HHVM busy threads on mw1114 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [90.0] [19:43:25] (03CR) 10RobH: "-1 revoked since it turns out the manifests didnt actually put in latest requirement for any package, so I suppose this really can sit in " [puppet] - 10https://gerrit.wikimedia.org/r/170760 (owner: 10Dzahn) [19:43:38] mutante: comment revoked, kill with fire ^ [19:43:41] !log maxsem Finished scap: Build localization cache for WikiGrok (duration: 35m 09s) [19:43:46] Logged the message, Master [19:44:10] (03CR) 10RobH: [C: 031] "meant to replace my -1 with +1" [puppet] - 10https://gerrit.wikimedia.org/r/170760 (owner: 10Dzahn) [19:45:16] Friends! AndyRussG and I are talking about a minor skin change to support a new CentralNotice feature, and we're wondering if there is documentation somewhere about how long these changes take to propagate--also how expensive it is to purge the entire cache in case we need to do an emergency rollback. 
[19:46:26] awight, 1) 1 week since nearest Wednesday 2) prohibitively costly [19:46:56] Which cache? [19:47:12] Reedy: all the varnish... [19:47:12] http, judging from context [19:47:16] yup [19:47:20] lol [19:47:27] MaxSem: thanks! is there a page about this... [19:47:31] yeah, that's really not gonna happen [19:47:47] <_joe_> clearing the whole cache is not advisable [19:47:49] Reedy: which, the rollback? yeah I assume that's a terrible situation [19:48:05] <_joe_> that is, if we want wikipedias to be reachable [19:48:12] but for a normally propagated change, Wednesday plus a week... [19:48:25] yeah, it's now tuesday/wednesday deploys [19:48:34] (03CR) 10MaxSem: [C: 032] Enable WikiGrok on test and test2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170451 (owner: 10MaxSem) [19:48:43] so as long as you get it in before branching, it'll be on all wikipedias by the end of the next wednesday [19:48:45] (03Merged) 10jenkins-bot: Enable WikiGrok on test and test2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/170451 (owner: 10MaxSem) [19:48:59] Reedy: ok rad [19:49:12] Sounds like breaking the site is easier than I thought :D [19:49:55] yup, just insert die( 'die motherfuckers' ) somewhere :P [19:50:10] nooo not subtle enough [19:50:36] how about