[00:00:37] No, because it didn't break the site completely [00:00:41] and I have other work [00:00:48] so it gets stuck in the mud. [00:00:55] I'm not sure I get the argument [00:01:07] pretty much the same reason hashar has root on it already. [00:01:10] you don't ask anyone but you wait days or weeks to get it fixed? :) [00:01:17] because we don't have a representative emulation [00:01:21] so we test in production basically [00:02:06] I think you should try asking ops to fix things for you when you encounter them for a while [00:02:15] and to get that puppetised better. It would likely go a whole lot easier to not be stabbing in the dark via puppet when puppetising the very thing itself. [00:02:20] and if it becomes a burden for us or we end up blocking you, then grant you access. [00:02:25] the fixing issues was just an example [00:02:35] like what? [00:02:39] please be specific [00:02:43] If I'm puppetising testswarm for example [00:02:47] I did that at jquery a few weeks back [00:03:08] nobody can review those changes because nobody else has even used testswarm around here before [00:03:08] the only way to ensure it works is to try it out [00:03:11] well, how? [00:03:20] you have integration.wmflabs don't you? [00:03:22] in production basically, at least until everything else is puppetised [00:03:31] yes, which is an empty apache server with some stuff on it [00:03:42] ? [00:03:48] not at all like gallium [00:03:52] but you're talking about bits you're puppetizing [00:03:57] yep [00:04:26] I agree it shouldn't be needed and in a few months I'll likely ask for it to be revoked and expect hashar to do the same. [00:04:27] so what prevents you from testing them in labs? [00:04:39] the environment isn't there [00:04:46] what environment?
[00:04:58] everything gallium that isn't puppetised right now [00:05:13] and stuff that is puppetised is often hardcoded in production settings, meaning it doesn't work in labs [00:05:15] but you're talking about *new* things [00:05:26] Yes, but they interact with existing things [00:05:35] I can't test zuul without jenkins and gerrit as well, for example. [00:07:04] so when hashar got that in production, we (he) basically set it up on gallium (using sudo for parts that need it) and put it in puppet once it works. [00:07:29] okay [00:07:38] I'm not entirely convinced [00:07:46] but it's limited root, so I'm not going to veto [00:07:46] that's okay, there's no rush. [00:07:53] it will get delayed a bit though [00:08:07] since we want to replace the certificate first [00:08:12] there's so much interaction with these apps you have to have it run to know it works, it's hard to review. And until everything is in puppet, it can't be tested in labs either. [00:08:17] sure [00:08:52] though hashar has access as well, which means in most cases we assign integration features for him to create because I can't right now. [00:09:45] New patchset: Dzahn; "remove boards.wp->boards.wm redirect (bug 46341) neither of them exist in DNS, there is "board", but not "boards"" [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/54801 [00:09:46] New patchset: Ryan Lane; "Add ssh key and authorized_keys for nova" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54802 [00:09:58] Anyway, no rush. For now I've got enough other work to focus on. [00:11:05] Krinkle: so i'm going to remove that from svn now.. right [00:11:10] the docs stuff [00:11:33] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54802 [00:11:40] mutante: doesn't puppet do that? [00:12:01] Krinkle: after a human merges it on sockpuppet, yes.
[00:12:09] and i was going to look at it to confirm it works [00:12:16] cool [00:13:00] mutante: is puppet for the svn server different then? Or do you always have to merge it on sockpuppet in addition to the puppet repo in git? [00:13:07] err: Failed to apply catalog: Could not find dependency Package[apache2] for File[/var/cache/svnusers] at /var/lib/git/operations/puppet/manifests/svn.pp:44 [00:13:10] there we go :p [00:13:19] Krinkle: always have to [00:13:34] now how did that ..looking [00:14:01] last puppet run was yesterday .. not an hour ago .. hmm [00:16:45] New patchset: Ryan Lane; "Properly reference private repo" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54803 [00:16:49] New patchset: MaxSem; "Beta doesn't have SSL" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54804 [00:19:28] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54803 [00:22:34] New patchset: Matthias Mullie; "Completely disable AFTv5 on enwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54807 [00:23:02] New patchset: Dzahn; "after removing docs generation this require for apache2 package broke pupet runs - remove dependency" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54808 [00:23:37] paravoid: I'm afraid something went wrong [00:23:43] https://doc.wikimedia.org/VisualEditor/master/ [00:23:46] Uncaught ReferenceError: Ext is not defined [00:24:11] Same run on http://integration.wmflabs.org/mw/extensions/VisualEditor/docs/ works fine [00:24:25] http://integration.wmflabs.org/mw/extensions/VisualEditor/docs/extjs/ext-all.js [00:24:28] https://doc.wikimedia.org/VisualEditor/master/extjs/ext-all.js [00:24:37] Ext JS Library 3.0.3 vs. 
Ext JS 4.1 [00:25:06] New patchset: Ryan Lane; "Change ownership of nova ssh config to nova" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54809 [00:25:53] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54809 [00:28:14] New patchset: Dzahn; "after removing docs generation this require for apache2 package broke puppet runs - remove dependency" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54808 [00:29:08] New review: Dzahn; "fix formey puppet runs" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/54808 [00:29:21] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54808 [00:31:56] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [00:33:06] paravoid: The version mismatch isn't the problem actually [00:33:48] paravoid: Looks like libjs-extjs has an odd layout in that the file you reference from /usr/share/javascript/extjs is incomplete [00:33:59] (in the patch) [00:34:37] libjs-extjs's distribution (non-standard from extjs point of view) has a subdirectory with adaptors, which need to be prepended to the main file to work. 
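The v3 layout Krinkle describes (an adaptor file that must be prepended to the main ext-all.js before it works) can be sketched as follows. All paths and file contents here are invented for illustration; they are not taken from the real libjs-extjs package.

```shell
# Illustrative sketch of the libjs-extjs v3 layout described above: the main
# ext-all.js is incomplete on its own and only works once an adaptor file is
# prepended to it. Everything below is a made-up stand-in for demonstration.
mkdir -p /tmp/extjs-demo/adapter/ext
printf '/* adaptor: Ext.lib base */\n' > /tmp/extjs-demo/adapter/ext/ext-base.js
printf '/* main library body */\n'     > /tmp/extjs-demo/ext-all.js
# Prepend the adaptor to produce a single usable file:
cat /tmp/extjs-demo/adapter/ext/ext-base.js /tmp/extjs-demo/ext-all.js \
  > /tmp/extjs-demo/ext-all-combined.js
cat /tmp/extjs-demo/ext-all-combined.js
```

In v4 (as noted below in the log) this concatenation step disappears, which is why referencing the packaged file directly only misbehaves with the older package.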
[00:34:39] Crap [00:36:38] New patchset: Milimetric; "cron job to regenerate mobile apps stats" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54811 [00:37:40] New patchset: Dzahn; "remove another requirement for apache2 package and fix unquoted resource names and unquoted file modes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54812 [00:38:54] New review: Dzahn; "yea, we hopefully don't need the entire svn.pp anymore soon :)" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/54812 [00:39:06] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54812 [00:41:05] RECOVERY - Puppet freshness on formey is OK: puppet ran at Wed Mar 20 00:41:02 UTC 2013 [00:42:31] paravoid: Okay, figured it out. It is indeed just a version mismatch. Forget the whole "adaptors" directory; that's something specific to v3 of extjs. In v4 (which jsduck depends on) there is just one extjs.js file which works as intended. [00:42:51] paravoid: The problem is basically that the libjs-extjs package in your repo is rather outdated. [00:43:14] 3.0.3 instead of 4.0+ (latest stable is 4.2.0) [00:43:31] our* [00:43:56] I'll take the debs you created as an example to see if I can package this one [00:50:44] New patchset: Milimetric; "cron job to regenerate mobile apps stats" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54811 [00:50:50] New patchset: Pyoungmeister; "run puppet by cron instead of via the agent" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54815 [00:52:08] puppet run on formey (SVN) is .. taking ..just a little .. while [00:52:30] filebucketing like every single file in there [00:52:38] Krinkle: it's still running :p [00:59:20] New patchset: Pyoungmeister; "run puppet by cron instead of via the agent" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54815 [01:05:49] rfaulkner: ping [01:06:14] preilly: what's up?
[01:07:20] New review: Matthias Mullie; "I won't merge it in myself - if you think current load may be "dangerous", feel free to merge it in,..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54807 [01:07:46] rfaulkner: I sent you a PM [01:08:00] Ryan_Lane: do you have a second for a puppet question? [01:09:54] give me a bit [01:12:21] preilly: what's up? [01:12:46] puppet agent --onetime --verbose --no-daemonize --no-splay --show_diff [01:13:02] Ryan_Lane: is there an easy way to get puppet to just ignore one file? [01:13:28] Ryan_Lane: e.g., can I get puppet to ignore php.ini in a VM [01:13:38] Ryan_Lane: but still do everything else as it normally would [01:14:03] hm [01:14:31] I don't believe so, if it's in the catalogue [01:14:51] like when you are doing recurse => true in a file {}? [01:14:54] file {} has an ignore => [01:14:56] Ryan_Lane: so I guess I could have php.ini include a new file that puppet doesn't manage, right? And put my changes in that file? [01:15:05] yep [01:15:11] Ryan_Lane: okay cool [01:15:18] Ryan_Lane: can I send you a PM? [01:15:21] sure [01:16:51] Krinkle: 19748 files later .. notice: /Stage[main]/Svn::Server/File[/var/mwdocs]/ensure: removed [01:16:55] :) [01:17:06] nice! [01:17:13] oh, right /var/mwdocs [01:17:19] that's all of svn :D [01:17:27] and then some [01:17:30] it went through every single file in there :P [01:17:38] Finished catalog run in 2203.85 seconds [01:17:52] Way to go puppet, who cares about efficiency [01:18:05] now it will check that it is absent until the end of time [01:18:07] Why would it catalog the entire subtree only to remove it?
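The workaround agreed on above (leave php.ini itself unmanaged and keep local changes in a separate file that php.ini includes) might look roughly like this as a Puppet manifest. The path, mode, and source below are hypothetical, not taken from the actual operations/puppet tree:

```
# Hypothetical sketch: puppet manages only a drop-in file, so hand edits to
# php.ini itself are left alone. Path and source are illustrative only.
file { '/etc/php5/conf.d/local-overrides.ini':
    ensure => present,
    owner  => 'root',
    group  => 'root',
    mode   => '0444',
    source => 'puppet:///files/php/local-overrides.ini',
}
```

This sidesteps the problem that a `file {}` resource already in the catalog cannot be selectively skipped (`ignore =>` only filters recursive copies).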
[01:18:15] but that is just until we can drop the whole svn.pp :) [01:18:16] yeah, but only the top [01:18:32] and afaik for that we just need to make pywikipedia people finally move [01:19:04] well, it checked, and then found that: info: FileBucket got a duplicate file {md5} [01:19:10] for everything [01:19:41] New patchset: Pyoungmeister; "run puppet by cron instead of via the agent" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54815 [01:23:07] New patchset: Pyoungmeister; "run puppet by cron instead of via the agent" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54815 [01:34:14] PROBLEM - Varnish traffic logger on cp1028 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:46:26] paravoid: So, where in the debian package (debs-ruby-jsduck) do you specify what version it fetches from rubygems? [01:46:39] In ./watch I see [01:46:41] version=3 [01:46:41] http://pkg-ruby-extras.alioth.debian.org/cgi-bin/gemwatch/jsduck .*/jsduck-(.*).tar.gz [01:46:48] yet it fetched 4.6.2 during the build [01:47:05] (as it should) [01:51:01] Krinkle: i think version=3 is the format of the watch file [01:51:07] yeah I know [01:51:38] I'm trying to find what is telling it to fetch that version [01:51:43] though I'm afraid nothing is [01:52:04] which means if you re-build the package without changing anything after a release is made, it will silently update [01:52:20] it would be automatically updating in a counter-intuitive way (at build time) [02:06:54] PROBLEM - Puppet freshness on arsenic is CRITICAL: Puppet has not run in the last 10 hours [02:07:37] Change merged: Andrew Bogott; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54804 [02:08:56] PROBLEM - Puppet freshness on db43 is CRITICAL: Puppet has not run in the last 10 hours [02:09:46] New review: Krinkle; "(1 comment)" [operations/debs/ruby-jsduck] (master) - https://gerrit.wikimedia.org/r/54691 [02:16:50] New patchset: Krinkle; "ExtJS: Use provided
copy instead of overwriting with libjs-extjs." [operations/debs/ruby-jsduck] (master) - https://gerrit.wikimedia.org/r/54818 [02:32:16] !log LocalisationUpdate completed (1.21wmf11) at Wed Mar 20 02:32:16 UTC 2013 [02:32:24] Logged the message, Master [02:34:14] PROBLEM - MySQL Slave Delay on db66 is CRITICAL: CRIT replication delay 207 seconds [02:34:25] PROBLEM - MySQL Replication Heartbeat on db66 is CRITICAL: CRIT replication delay 215 seconds [02:35:15] PROBLEM - MySQL Slave Delay on db1020 is CRITICAL: CRIT replication delay 185 seconds [02:35:54] PROBLEM - MySQL Replication Heartbeat on db1020 is CRITICAL: CRIT replication delay 209 seconds [02:37:54] PROBLEM - Puppet freshness on xenon is CRITICAL: Puppet has not run in the last 10 hours [02:38:54] PROBLEM - Puppet freshness on cp1034 is CRITICAL: Puppet has not run in the last 10 hours [02:44:04] !log LocalisationUpdate completed (1.21wmf12) at Wed Mar 20 02:44:03 UTC 2013 [02:44:10] Logged the message, Master [02:45:55] RECOVERY - MySQL Replication Heartbeat on db1020 is OK: OK replication delay 20 seconds [02:46:15] RECOVERY - MySQL Slave Delay on db1020 is OK: OK replication delay 1 seconds [03:08:18] New review: Ori.livneh; "Thanks very much for this, Andrew."
[operations/debs/python-jsonschema] (master) - https://gerrit.wikimedia.org/r/54782 [03:12:50] New patchset: Ryan Lane; "Give nova a shell on nova-compute" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54820 [03:14:39] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54820 [03:56:14] RECOVERY - MySQL Slave Delay on db66 is OK: OK replication delay 0 seconds [03:56:25] RECOVERY - MySQL Replication Heartbeat on db66 is OK: OK replication delay 0 seconds [04:23:49] Change merged: Tim Starling; [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/54522 [04:57:54] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [05:03:54] PROBLEM - SSH on lvs6 is CRITICAL: Server answer: [05:04:13] LeslieCarr: ^ [05:04:45] she's idle [05:05:52] i agree [05:06:55] RECOVERY - SSH on lvs6 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [05:13:56] RECOVERY - Puppet freshness on sq79 is OK: puppet ran at Wed Mar 20 05:13:51 UTC 2013 [05:39:25] PROBLEM - MySQL Slave Delay on db1047 is CRITICAL: CRIT replication delay 199 seconds [05:48:25] RECOVERY - MySQL Slave Delay on db1047 is OK: OK replication delay 21 seconds [05:49:26] RECOVERY - MySQL Slave Delay on db33 is OK: OK replication delay 23 seconds [05:49:29] RECOVERY - MySQL Replication Heartbeat on db33 is OK: OK replication delay 23 seconds [06:00:54] PROBLEM - Puppet freshness on mw1094 is CRITICAL: Puppet has not run in the last 10 hours [06:01:56] PROBLEM - Puppet freshness on colby is CRITICAL: Puppet has not run in the last 10 hours [06:01:56] PROBLEM - Puppet freshness on mw1052 is CRITICAL: Puppet has not run in the last 10 hours [06:01:56] PROBLEM - Puppet freshness on searchidx2 is CRITICAL: Puppet has not run in the last 10 hours [06:02:55] PROBLEM - Puppet freshness on mw1008 is CRITICAL: Puppet has not run in the last 10 hours [06:04:01] PROBLEM - Puppet freshness on mw1056 is CRITICAL: 
Puppet has not run in the last 10 hours [06:05:56] PROBLEM - Puppet freshness on cp1026 is CRITICAL: Puppet has not run in the last 10 hours [06:36:54] PROBLEM - Puppet freshness on db66 is CRITICAL: Puppet has not run in the last 10 hours [06:38:35] PROBLEM - MySQL Slave Delay on db1047 is CRITICAL: CRIT replication delay 196 seconds [06:59:55] PROBLEM - Puppet freshness on europium is CRITICAL: Puppet has not run in the last 10 hours [07:00:34] RECOVERY - MySQL Slave Delay on db1047 is OK: OK replication delay 0 seconds [07:25:35] PROBLEM - SSH on amslvs1 is CRITICAL: Server answer: [07:25:54] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: Puppet has not run in the last 10 hours [07:25:54] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours [07:25:54] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: Puppet has not run in the last 10 hours [07:25:55] PROBLEM - Puppet freshness on msfe1002 is CRITICAL: Puppet has not run in the last 10 hours [07:30:35] RECOVERY - SSH on amslvs1 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [07:41:53] hello [07:46:37] PROBLEM - MySQL Slave Delay on db33 is CRITICAL: CRIT replication delay 181 seconds [07:46:37] PROBLEM - MySQL Replication Heartbeat on db33 is CRITICAL: CRIT replication delay 181 seconds [07:55:20] New review: Hashar; "indeed :)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/54798 [07:56:31] !log gallium: reloaded apache2 to make sure recent changes are applied. [07:56:37] Logged the message, Master [08:00:31] New review: Hashar; "(2 comments)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54466 [08:01:56] New review: Hashar; "Not sure it is needed but why not. Make sure to get Timo trained by ops about the do and don't on a ..." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/53861 [08:05:38] RECOVERY - MySQL Slave Delay on db33 is OK: OK replication delay 0 seconds [08:05:38] RECOVERY - MySQL Replication Heartbeat on db33 is OK: OK replication delay 0 seconds [08:07:54] PROBLEM - Puppet freshness on mc1002 is CRITICAL: Puppet has not run in the last 10 hours [08:07:54] PROBLEM - Puppet freshness on virt1005 is CRITICAL: Puppet has not run in the last 10 hours [08:38:55] PROBLEM - Puppet freshness on db35 is CRITICAL: Puppet has not run in the last 10 hours [08:38:55] PROBLEM - Puppet freshness on labstore1 is CRITICAL: Puppet has not run in the last 10 hours [08:44:43] New review: Hashar; "It seems there is already such a packaging work being made:" [operations/debs/python-jsonschema] (master) C: -1; - https://gerrit.wikimedia.org/r/54782 [08:48:49] Waiting for 10.64.16.15: 111 seconds lagged [08:48:51] from https://www.wikidata.org/?maxlag=-1 [08:52:02] hm back to normal now, i guess thanks [08:59:13] legoktm: db33 had some lag one hour ago [08:59:22] legoktm: maybe that is the DB for wikidata :-] [08:59:33] well it all works fine now :) [09:33:11] Change abandoned: Hashar; "No point in keeping this change around, will craft a new one if needed." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/54466 [10:32:54] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [10:36:56] PROBLEM - Puppet freshness on mw1130 is CRITICAL: Puppet has not run in the last 10 hours [10:37:55] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [10:45:40] New patchset: Hashar; "slightly refactor Zuul daemon web access" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54845 [10:46:04] New review: Hashar; "New change is https://gerrit.wikimedia.org/r/54845" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54466 [11:10:10] New patchset: Hashar; "adjust Zuul daemon web access" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54845 [11:10:36] New review: Hashar; "patchset 2 makes it MUCH simpler" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54845 [11:22:59] New patchset: Hashar; "adjust Zuul daemon web access" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54845 [11:23:47] New review: Hashar; "Turns out ProxyPassMatch require a $1 :-] That last patchset works fine, verified it directly on g..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54845 [11:25:24] apergos: hi :-]  I could use a merge for the contint server, a slight tweak to a mod proxy rule https://gerrit.wikimedia.org/r/#/c/54845/3/modules/contint/files/apache/proxy_zuul,unified [11:25:38] apergos: which I have already tested on the server so that is not going to cause any trouble :-] [11:28:53] what gets served if the main doc root is requested (or some subpage under it)? [11:30:24] apergos some 404 :-] [11:30:53] no index pages or anything like that? 
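For readers unfamiliar with the directive under review here: a ProxyPassMatch rule needs the $1 back-reference hashar mentions, so that only matching requests reach the backend while everything else falls through to the DocumentRoot. The rule below is a guessed sketch of that general shape; the real proxy_zuul file, port, and paths may differ:

```
# Hypothetical sketch of a ProxyPassMatch rule like the one being merged:
# only requests matching the pattern are proxied to the Zuul daemon, and the
# captured path ($1, which ProxyPassMatch requires) is passed along. Other
# /zuul/ requests fall back to the Apache DocumentRoot.
ProxyPassMatch ^/zuul/(status.*)$ http://localhost:8001/$1
```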
[11:31:00] the /zuul/randomthing will resolve back to the Apache DocumentRoot which is /org/wikimedia/integration/zuul/ [11:31:10] which I am writing right now :-] [11:31:15] ok [11:31:19] so yeah the change will make /zuul/ a 404 :-] [11:31:29] but that is not referenced anywhere and will soon be populated [11:31:38] all right [11:31:51] ProxyPassMatch is a nice trick [11:32:35] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54845 [11:33:04] apergos: i will take care of puppet on the box to save you some time [11:33:31] ok, it's all yours [11:36:05] New review: Hashar; "Works fine thanks!" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54845 [11:36:10] thank you apergos [11:36:48] yw [11:38:33] my english is really awful [11:38:56] it took me a few minutes to understand the lyrics of http://youtu.be/4Xy6AH_tZWs [11:39:27] http://en.wikipedia.org/wiki/Step_into_My_Office,_Baby :-] [11:40:57] lyrics are usually hard [11:41:07] even in your mother tongue [11:41:25] they don't exactly sing them with clear articulation a lot of the time [11:41:51] at least I managed to find a few sentences and that was enough for google to find out the song for me :-] [11:42:08] :) [11:42:39] Platonides: while you are around, I removed the python pool counter daemon https://gerrit.wikimedia.org/r/54832 :-D [11:42:49] I don't mind [11:42:56] but beware of domas ;) [11:43:11] just cast your vote on the change ? :-D [11:47:27] btw, do you have many conference calls? [11:49:49] 2 on monday evening [11:49:56] some hangout from time to time during the evening [11:50:00] maybe once per week [11:50:04] rest is IRC :-] [11:50:15] most of my team seems to prefer writing over voice [11:54:36] Platonides: ^^^ [11:55:24] looks like you're trying to answer my question on https://xkcd.com/802/ in WMF :p [11:55:33] spoken language vs. written etc. [11:58:44] Nemo_bis: what question ?
:-] [11:59:31] hashar: if the proportion in WMF is the same as there [11:59:52] oh, I found IRC: it's between the sea of protocol confusion and the sea of memes [11:59:59] yeah seeing it :-] [12:00:01] I love those islands [12:00:10] usenet was a nice place to live [12:00:20] together with troll bay and Wikipedia talkpages [12:00:51] yeah troll bay is enjoyable [12:00:56] much more than the sea of memes :-] [12:01:22] bah I got 95 bugs :/ [12:01:26] heh [12:01:33] you flooder non-+q-user [12:02:16] argh [12:02:22] PHP_CodeSniffer uses the pear bugtracker :/ [12:02:55] <^demon> Why anyone's still using anything related to pear is beyond me :\ [12:04:12] <^demon> At least it's not pear2 ;-) [12:07:34] F*** PEAR [12:07:51] Email is already in use for an existing account [12:07:54] PROBLEM - Puppet freshness on arsenic is CRITICAL: Puppet has not run in the last 10 hours [12:07:58] grmblb [12:10:02] PROBLEM - Puppet freshness on db43 is CRITICAL: Puppet has not run in the last 10 hours [12:10:27] ^demon: I have worked a bit on having the old parser tests system report junit :D [12:10:39] ^demon: that is not a pretty patch though [12:10:48] <^demon> Heh. [12:11:56] will add you to review :-D [12:12:58] writing it, I figured out I haven't written any new feature in MediaWiki for quite a long time now [12:21:57] I guess you don't know why -with parserfunctions installed- a number of tests are failing with "" is not a valid magic word for "if" [12:22:27] I hate these test bugs which only happen due to some interaction with other tests [12:24:39] <^demon> hashar: Will upgrading zuul require some downtime? [12:24:41] New patchset: Milimetric; "cron job to regenerate mobile apps stats" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54811 [12:24:50] thanks gerrit-wm! :) [12:31:03] ^demon: yeah [12:31:11] ^demon: but I can do it during European mornings :-] [12:31:14] <^demon> Maybe we should schedule a gerrit update for the same window.
I've got a couple of things I'm wanting to pull into our install. Makes sense to take both services down at the same time to minimize disruption. [12:31:23] ^demon: and I need some python modules to be installed on gallium first. [12:38:54] PROBLEM - Puppet freshness on xenon is CRITICAL: Puppet has not run in the last 10 hours [12:39:54] PROBLEM - Puppet freshness on cp1034 is CRITICAL: Puppet has not run in the last 10 hours [13:03:16] New review: Faidon; "+2 on the idea, that's how I've been running puppet on every other setup I've ever been." [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/54815 [13:07:23] Change merged: Faidon; [operations/debs/ruby-jsduck] (master) - https://gerrit.wikimedia.org/r/54818 [13:09:23] New review: Faidon; "Definitely. Please just import the package from e.g. Debian experimental and fork your changes from ..." [operations/debs/python-jsonschema] (master) C: -1; - https://gerrit.wikimedia.org/r/54782 [13:12:29] New review: Faidon; "I don't see the point for this (and what's makes it different than the rest), and I'm not a fan of e..." [operations/puppet] (production) C: -2; - https://gerrit.wikimedia.org/r/54798 [13:12:32] New patchset: Mark Bergsma; "Purge buffered data when fflush() doesn't work" [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/54683 [13:13:20] found the memleak? [13:13:34] no [13:13:54] i have a suspicion that it could be it [13:24:10] hello? [13:24:15] I run git status on stat1 [13:24:20] and it doesn't work [13:24:25] on valid git repos [13:24:33] what should I do? [13:24:48] "it doesn't work"? [13:24:56] New review: Faidon; "I don't see the rest of my comments addressed, pip being the most important one. Also, you seemed to..." 
[operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/53587 [13:26:11] mark: average_drifter means that it just sits there hanging [13:26:26] I've noticed the same thing, and additionally, saving files from vim takes a crazy long time [13:26:55] it's almost like the filesystem is borked somehow [13:27:01] or ssh [13:27:13] New review: Faidon; "Looks great; can someone (ottomata?) merge, build a package and put it up in apt?" [operations/debs/python-voluptuous] (master) C: 2; - https://gerrit.wikimedia.org/r/44408 [13:27:17] which git repo can I try? [13:27:29] so in vim issuing a save makes it hang for up to 60 seconds [13:28:02] saving where? [13:28:26] i'm working in my home directory, /home/dandreescu [13:28:43] /home/spetrea/wikistats/pageviews_reports is an example of a git repository where you can try git status [13:30:13] i can keep issuing vim commands and after about 60 seconds the screen refreshes and reflects my changes. so it's exciting, but not very productive :) [13:30:20] no that's a symlink [13:30:38] try /home/spetrea/wikistats/wikistats [13:31:07] ha, yeah, now it's fine - of course! 
[13:31:20] that symlink is best removed [13:31:36] but the problem seems intermittent, with the most consistent thing being the vim problem I described [13:31:42] i'll try editing with nano [13:31:53] i doubt vi is the problem [13:32:52] heh, crazy - editing with nano works perfectly [13:33:30] note that stat1 is running some jobs and its disks are very loaded at times [13:33:51] 100% utilized actually [13:33:56] that may easily explain some slowness [13:34:30] wonder if it's just vim's .swp and history tracking that makes it choke [13:34:39] anyway, thanks, I suspected it was spiked [13:49:02] New patchset: Hashar; "(bug 44041) adapt role::cache::mobile for beta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44709 [13:49:48] New review: Hashar; "rebased" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/44709 [13:57:15] hashar: can we do loopback-mounted filesystem images in labs? [13:57:18] then we could use XFS [13:57:31] mount -o ? [13:57:35] grrp [13:57:53] mount -t xfs /some/file.img /srv/sda3 -o loop,... [13:58:05] that would be a lot closer to production than what you do now [13:58:05] we can try it out :-] [13:58:59] I am not sure whether XFS matters or not [13:59:08] since we are mostly testing out MediaWiki itself [13:59:34] each instance has a /dev/vdb virtual device, maybe it can be formatted using XFS [13:59:35] it's definitely a goal to get the two environments as close to each other as possible [13:59:56] why not use loopback images? [13:59:58] then you can have two, format them with xfs however you want [14:00:13] and since it's already virtualized anyway, another layer of indirection is not really gonna matter anymore anyway [14:00:51] that is for varnish isn't it ?
[14:00:57] for varnish yes [14:01:06] so we can keep the storage backends the same as in production [14:01:09] (just a bit smaller I guess ;) [14:01:27] VCL refers to storage for example, it sucks if it has to be different in labs [14:01:32] and I don't see why it needs to be [14:01:41] ahh now I see what you mean [14:02:01] so /dev/vdb will hold some XFS images which would be mounted as /srv/sda3 and /srv/sdb3 [14:02:09] yes [14:02:12] XFS images as normal files [14:02:15] linux supports that [14:02:25] those images can live wherever [14:02:25] if you ever find out how to generate the file using puppet .. go for it :-] [14:02:32] easy [14:02:36] just run mkfs :) [14:02:51] we already do that in puppet in some places [14:02:53] at least for swift [14:02:58] and squid I think [14:03:19] so make the file, give it a certain size [14:03:21] then run mkfs on it [14:03:23] ah files/squid/setup-aufs-cachedirs [14:03:31] yeah but you don't need that script [14:03:39] just run dd to create the file [14:03:43] then mkfs it [14:03:45] then mount it [14:05:59] aharhazrar [14:06:04] I hate our role::cache class [14:06:09] i like it [14:06:25] and then the size of the storage backends can be put in a hash in the ::configuration class [14:06:29] the upload cache uses the LVS service IPs which do not exist in labs :-] [14:06:55] PROBLEM - Puppet freshness on mw1006 is CRITICAL: Puppet has not run in the last 10 hours [14:07:00] for the bits cache I had to do some case $::realm to have labs use the role::cache::configuration::backends instead of LVS IPs [14:07:06] damn that is going to be ugly again [14:07:15] we should really make LVS work [14:07:54] yeah ideally [14:08:41] first filing a bug about that XFS trick [14:11:55] mark: I have logged the XFS trick at https://bugzilla.wikimedia.org/show_bug.cgi?id=46359 [14:34:15] is there some current or recent caching problem where e.g. the enwiki [[Main Page]] was stale?
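The dd / mkfs / mount recipe mark spells out above can be sketched as a few commands. The image size and mount point are made up for illustration, and the privileged steps are shown as comments since they need root and xfsprogs:

```shell
# Sketch of the dd / mkfs / mount loopback recipe described above, with an
# assumed 1 GiB size and illustrative paths. Creating the sparse image file
# itself needs no privileges:
IMG=/tmp/varnish-storage.img
dd if=/dev/zero of="$IMG" bs=1M count=0 seek=1024 2>/dev/null  # 1 GiB sparse file
ls -lh "$IMG"
# The remaining steps need root and xfsprogs, so they are left as comments:
#   mkfs.xfs -f "$IMG"
#   mkdir -p /srv/sda3
#   mount -t xfs -o loop "$IMG" /srv/sda3
```

Because the file is sparse, it occupies almost no space until the filesystem inside it is actually written to, which suits a labs instance with limited disk.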
[14:34:31] there's a handful (at least) of OTRS tickets about it within the last day or so [14:34:57] actually, they were all within the last 6 hours [14:35:26] someone sent a response saying it was being investigated. but he didn't give a bug # or anything and he's offline now [14:35:30] * jeremyb_ scrolls up [14:47:47] New patchset: Hashar; "upload cache in labs no more rely on LVS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54863 [14:54:48] hmm the topics are a bit misleading [14:54:59] New patchset: Hashar; "upload cache in labs no more rely on LVS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54863 [14:55:00] New patchset: Hashar; "mobile cache no more rely on LVS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54864 [14:56:11] mark: so in labs I need to get rid of lvs_service_ips in the role::cache classes. That is very simple for mobile https://gerrit.wikimedia.org/r/#/c/54864/1/manifests/role/cache.pp,unified :-] [14:56:59] and a bit uglier for upload :-( [14:57:00] https://gerrit.wikimedia.org/r/#/c/54863/2/manifests/role/cache.pp,unified [14:57:50] New review: Hashar; "You might want to first look at the mobile change which is simpler https://gerrit.wikimedia.org/r/..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54863 [14:58:55] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [14:59:43] i don't really see the problem right now, i'll have to look at it in more detail later [15:01:58] both squid and varnish upload cache broken https://bugzilla.wikimedia.org/show_bug.cgi?id=46350 [15:04:44] New patchset: Hashar; "(bug 44041) adapt role::cache::mobile for beta" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/44709 [15:04:51] mutante: hehehe, silly people thinking they can undo SMTP transactions :-) [15:05:04] PROBLEM - MySQL Recent Restart on db1047 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:06:19] jeremyb_? who tries to do such thing? [15:06:55] New patchset: Hashar; "Varnish rules for Beta cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47567 [15:06:55] RECOVERY - MySQL Recent Restart on db1047 is OK: OK 143225 seconds since restart [15:07:08] New review: Hashar; "rebased" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47567 [15:07:13] Platonides: we got a message on an already existing thread that said: [15:07:16] $fname $lname would like to recall the message, "$subject". [15:07:31] Platonides: i assume it was autogenerated by MS outlook or something [15:09:40] jeremyb_: That's an outlookism. It has magic in it that tells an Exchange server to actually "unsend" the message from its local mailboxes. [15:09:51] right, i figured [15:10:04] orwell! [15:10:46] jeremyb_: I think that was a feature they added when the PHBs started yelling about their not being able to undo unwise reply-all. :-) [15:11:02] PHB? [15:11:10] Pointy-Haired Bosses [15:11:12] Platonides: what say you? [15:11:15] hah [15:11:58] !log Fixed purging for all sites but eqiad (which wasn't broken) [15:12:06] Logged the message, Master [15:16:32] mark: did you see my question earlier about stale pages? [15:16:47] 14:34:14 UTC [15:17:18] yes [15:17:23] * jeremyb_ is not quite understanding that last !log. it was broken and not broken? [15:17:37] it was broken for all caches except those in eqiad [15:17:49] in practice that just means, esams [15:17:56] ohhh, i guess i'm just half asleep still [15:18:22] so, !log is related to OTRS you think? [15:18:45] i have no idea what's in OTRS, but i've heard of caching problems in esams before [15:18:51] and that's definitely caused by the problem I just fixed [15:18:57] hrmm, ok [15:18:59] unfortunately the root of the problem is not fixed... [15:19:03] well, we'll see if more come in [15:19:09] what's that? 
[15:20:27] multicast routing not working well with asymmetric routing [15:22:35] PROBLEM - MySQL Slave Delay on db1047 is CRITICAL: CRIT replication delay 192 seconds [15:24:35] ok, thanks [15:25:34] RECOVERY - MySQL Slave Delay on db1047 is OK: OK replication delay 0 seconds [15:32:40] New patchset: Hashar; "zuul: need python-voluptuous 0.6.1" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54866 [15:32:46] New review: Hashar; "Once the package is available in our APT repository, we can get it deployed with https://gerrit.wik..." [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/44408 [15:33:21] New review: Hashar; "That change requires the python-voluptuous module to be added in our repository https://gerrit.wikim..." [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/54866 [15:34:25] Change merged: Ottomata; [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/44408 [15:35:39] New patchset: Hashar; "zuul: need python-voluptuous 0.6.*" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54866 [15:50:37] hmm, hashar, [15:50:48] python-voluptuous debuild didn't actually put anything in the .deb [15:50:53] ohh [15:50:55] $ dpkg -c python-voluptuous_0.6.1-1_all.deb [15:50:55] drwxr-xr-x root/root 0 2013-03-20 15:50 ./ [15:50:55] drwxr-xr-x root/root 0 2013-03-20 15:50 ./usr/ [15:50:55] drwxr-xr-x root/root 0 2013-03-20 15:50 ./usr/share/ [15:50:55] drwxr-xr-x root/root 0 2013-03-20 15:50 ./usr/share/doc/ [15:50:56] drwxr-xr-x root/root 0 2013-03-20 15:50 ./usr/share/doc/python-voluptuous/ [15:50:56] -rw-r--r-- root/root 225 2013-03-20 15:50 ./usr/share/doc/python-voluptuous/changelog.Debian.gz [15:50:57] -rw-r--r-- root/root 1936 2013-03-20 15:34 ./usr/share/doc/python-voluptuous/copyright [15:51:02] :-( [15:51:28] and I gave it a +2? 
[15:51:29] yay me [15:51:41] I am pretty sure I once managed to get a .deb out of it [15:51:48] here's what I did [15:51:54] could it be related to the debuild command used? [15:52:01] i jsut ran [15:52:02] debuild [15:52:09] did you merge with the source? [15:52:20] just use git-buildpackage [15:52:23] with --git-overlay [15:52:28] add a debian/gbp.conf [15:52:31] i uscaned to get the source [15:52:35] as hashar told me to [15:53:32] https://gerrit.wikimedia.org/r/gitweb?p=operations/debs/ruby-dimensions.git;a=blob;f=debian/gbp.conf;h=c87b3c6ecfdb2188e44e34cb3c33b64a93278e20;hb=942961588bcfba03e3bc6968db0532b8d600ed67 [15:53:40] add this [15:54:32] mkdir ../{tarballs,build-area}; uscan --download-current-version --destdir ../tarballs; git-buildpackage [15:56:09] i can probably hack to fix this: [15:56:10] dpkg-source: info: local changes detected, the modified files are: [15:56:11] python-voluptuous/.gitreview [15:56:11] python-voluptuous/gbp.conf [15:56:11] dpkg-source: error: aborting due to unexpected upstream changes, see /tmp/python-voluptuous_0.6.1-1.diff.C78bKU [15:56:14] but what is the proper thing? [15:56:22] git review, gbp.conf aren't in the tarball [15:56:50] and drop .gitreview? [15:56:53] k [15:56:54] or add it to .gitignore [15:57:06] can't we ignore it when building ? [15:57:07] you can have a post script in gbp.conf that cleans that up [15:58:09] ahh [15:58:10] filter= [15:58:13] no [15:58:37] prebuild = rm .gitreview [15:58:41] iirc [15:58:47] filter is for import-orig [15:58:49] (again, iirc) [15:58:52] paravoid, hashar, that looks way better [15:59:11] voluptuous.py and symlinks etc. [15:59:14] ok cool, adding to apt... [15:59:17] the other way is to not use overlay = True [15:59:23] and import all of the source into a different branch [15:59:33] e.g. 
upstream for the upstream source / master for the Debian branch [15:59:36] or master / debian [15:59:42] but this would mess too much with gerrit I think [15:59:57] that's what i'm doing for my packages [15:59:57] might want to update the gbp.conf to filter it out or use the branch so [16:00:03] non-overlay is the standard in Debian [16:00:32] Heja opsen, looks like there are still new cache purging issues for Europe, see https://bugzilla.wikimedia.org/show_bug.cgi?id=46350 - could somebody take a look at it, or shall I copy that to an RT ticket? [16:00:35] non-overlay should also be used if we're upstream and upstream source is in git/gerrit and going through our processes [16:00:54] andre__: I think mark fixed this some time ago [16:01:03] SAL indicates that [16:01:08] 15:12 mark: Fixed purging for all sites but eqiad (which wasn't broken) [16:01:50] ottomata: debian/gbp.conf belongs in the git tree btw [16:01:54] paravoid: Oh awesome timing, thanks for the hint! I should read up on SAL first next time, I guess. :) [16:01:54] PROBLEM - Puppet freshness on mw1094 is CRITICAL: Puppet has not run in the last 10 hours [16:01:56] add it and amend [16:02:02] andre__: that's okay :) [16:02:30] it's nice to actually have someone look at what people report as problems and escalate when needed [16:02:55] PROBLEM - Puppet freshness on colby is CRITICAL: Puppet has not run in the last 10 hours [16:02:55] PROBLEM - Puppet freshness on mw1052 is CRITICAL: Puppet has not run in the last 10 hours [16:02:56] PROBLEM - Puppet freshness on searchidx2 is CRITICAL: Puppet has not run in the last 10 hours [16:03:03] !log added python-voluptuous package to apt repo [16:03:09] Logged the message, Master [16:03:23] both dsc and deb, right? [16:03:23] paravoid, I should push my changes?
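The overlay workflow paravoid describes above (tarball from uscan, debian/ directory in git, git-buildpackage with `--git-overlay`) can be captured in a `debian/gbp.conf`. This is a sketch modelled on the ruby-dimensions example linked in the log; the branch names and directory paths are assumptions for illustration:

```ini
# debian/gbp.conf -- overlay layout, per the discussion above.
[DEFAULT]
upstream-branch = master
debian-branch = master

[git-buildpackage]
overlay = True
tarball-dir = ../tarballs
export-dir = ../build-area
# drop files that live only in git, not in the upstream tarball,
# so dpkg-source does not abort on "unexpected upstream changes"
prebuild = rm -f .gitreview
```

With that in place, the build flow quoted in the log is: `mkdir ../{tarballs,build-area}; uscan --download-current-version --destdir ../tarballs; git-buildpackage`.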
[16:03:50] the debian/gbp.conf , I guess so [16:03:54] hm, ok [16:03:55] PROBLEM - Puppet freshness on mw1008 is CRITICAL: Puppet has not run in the last 10 hours [16:03:58] debian/gbp.conf should be committed [16:04:09] i also modified changelog and removed .gitreview... [16:04:21] (had to modify changelog to put my name in there) [16:04:29] and you should include in apt both debian binaries (.deb) and debian sources (.dsc + .orig.tar.gz + .diff.gz or .debian.tar.gz) [16:04:40] i imported from the changes file [16:04:42] that does both, right? [16:04:54] PROBLEM - Puppet freshness on mw1056 is CRITICAL: Puppet has not run in the last 10 hours [16:05:16] ottomata: thank you andrew :-] [16:05:28] ottomata: right [16:05:33] then I could use a review of https://gerrit.wikimedia.org/r/#/c/54866/ which installs the package making sure to use 0.6* (not 0.7 or later) [16:06:19] hashar, can't you just say to ensure that a particular version of that package is installed [16:06:22] rather than using apt::pin? [16:06:33] ensure => 0.6.1 for example ? [16:06:54] PROBLEM - Puppet freshness on cp1026 is CRITICAL: Puppet has not run in the last 10 hours [16:07:01] On packaging systems that can retrieve new packages on their own, you can choose which package to retrieve by specifying a version number or latest as the ensure value [16:07:02] yeah [16:07:20] yes you can use that [16:07:25] even better, fix zuul to function with 0.7 :) [16:07:38] the pin is wrong [16:07:51] just do package { ...: ensure => '0.6.1-1' } [16:07:53] yeah 0.7 support is work in progress [16:08:41] patch coming [16:08:51] New review: Ottomata; "Use ensure => 0.6.1 (or 0.6.1-1?) on package { 'python-voluptuous' ...
}, instead of using apt::pin" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54866 [16:09:02] slooow [16:09:04] New patchset: Hashar; "zuul: need python-voluptuous 0.6.1-1" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54866 [16:09:15] ottomata: https://gerrit.wikimedia.org/r/#/c/54866/3/modules/zuul/manifests/init.pp,unified [16:10:01] 0.6.1-1 [16:12:52] moving out, will follow up later tonight :] [16:12:57] thanks for the packaging work! [16:14:26] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54866 [16:16:31] hashar, merged, laters! [16:34:22] mark ? [16:34:35] yes? [16:34:46] hi [16:35:00] i had some problems with stat1 today [16:35:00] New patchset: Ottomata; "Allowing rsync to /var/www from stat1 for hosting of public data." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54871 [16:35:47] mark: can you help me please to find a way to fix stat1 so i can roll out reports ? [16:36:05] or is it fixed now ? [16:36:06] what problems? [16:36:08] average_doc (are you also average_drifter?) [16:36:11] what's da prob? [16:36:16] git and vim were acing up on it [16:36:34] acting* [16:37:36] ottomata: its me , average , but im at the doc in the waiting room [16:37:40] haha, ok [16:37:45] what do you mean git and vim were acting up? [16:37:54] PROBLEM - Puppet freshness on db66 is CRITICAL: Puppet has not run in the last 10 hours [16:38:12] well i ran git status [16:38:21] and it took like 2minutes [16:38:38] that's probably just because erik zachte (or others?) 
is running jobs on stat1 [16:38:43] and i ran git checkout on some branch and it took centuries [16:38:44] utilizing all disk i/o bandwidth [16:38:49] vim also was unresponsive [16:39:08] not much we can do about that [16:39:38] mark , erik ran a job that only used 100% of one cpu out of the 16 available [16:39:53] and 100% of disk usage [16:40:05] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54871 [16:40:14] about disk, i myself ran very intense stuff on stat1 and it never behaved like that [16:41:07] it's not busy right now [16:41:13] so it probably works fine now [16:41:13] average_doc, if the process erik was running isn't multi threaded/ multi proc, its only going to run on one cpy [16:41:15] cpu [16:41:48] yeah, and there are 16 so.. its not likely that it would be causing this [16:41:57] as i said... [16:42:02] it's not cpu, but disk i/o that was the problem [16:42:14] it doesn't matter that 15 cpu cores are idling if it's disk i/o that you need [16:47:02] mark how did you assess disk i/o to be the problem on stat1 ? did you run iotop ? [16:47:19] iostat, iotop work [16:48:21] and did it confirm loads of io ? [16:48:32] New review: Ottomata; "(6 comments)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54811 [16:48:48] yes it did at the time [16:49:30] mark can you try again now with iostat or iotop on halfak's process please ? [16:50:04] iotop shows a few percent of disk i/o on sda [16:50:07] you should be fine right now [16:51:46] paravoid: I don't know what it is but we keep hitting more doom. [16:51:50] jsduck still won't run [16:51:55] /usr/lib/ruby/vendor_ruby/parallel.rb:275:in `write': Broken pipe (Errno::EPIPE) [16:52:03] " Aborted jsduck --" [16:52:13] i know what it is [16:52:16] it's ruby!
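When iostat or iotop aren't installed, the same disk-saturation check mark describes can be approximated by sampling /proc/diskstats twice. A rough sketch (the 1-second interval is arbitrary; field positions are per the kernel's iostats documentation: field 3 is the device name, 6 sectors read, 10 sectors written, cumulative since boot):

```shell
# Estimate per-device sector throughput from two /proc/diskstats samples,
# roughly what "iostat -d 1" would report.
interval=1
t0=$(awk '{print $3, $6, $10}' /proc/diskstats)
sleep "$interval"
report=$(awk -v t0="$t0" -v dt="$interval" '
BEGIN {
    # rebuild the first sample: name, sectors-read, sectors-written triples
    n = split(t0, f, /[ \t\n]+/)
    for (i = 1; i + 2 <= n; i += 3) { rd[f[i]] = f[i+1]; wr[f[i]] = f[i+2] }
    printf "%-12s %12s %12s\n", "device", "rd_sect/s", "wr_sect/s"
}
$3 in rd { printf "%-12s %12.0f %12.0f\n", $3, ($6 - rd[$3]) / dt, ($10 - wr[$3]) / dt }
' /proc/diskstats)
echo "$report"
```

A device pegged near its sequential sector rate while 15 cpus idle is exactly the pattern described above: the box is i/o-bound, not cpu-bound.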
[16:52:28] Krinkle: I've spent a considerable amount of my week for this [16:52:58] we don't really have the resources as a team to spend so many hours for a tiny fraction of a tiny project [16:53:20] and this sounds to me like a problem with the app itself, not the packaging [16:53:38] I'll be happy to build a new package for you when you find and fix that bug [16:53:53] but I can't write more ruby, at least not this week :) [16:54:22] I don't know. From a general perspective what I see is that it works when installed from gem, and it's not working after packaging. version mismatch or incorrect bundling. [16:54:38] The app itself is fine, it's probably another outdated package in the debian repo [16:54:43] I'll fix it. [16:54:58] YuviPanda: Can you vouch an irc nick for me? Email is easy to spoof headers [17:00:54] PROBLEM - Puppet freshness on europium is CRITICAL: Puppet has not run in the last 10 hours [17:04:24] paravoid: OK, I think I fixed it. It was a problem in the way I called the program, the package is fine [17:04:29] paravoid: Thanks for merging the extjs fix [17:04:40] "outdated package" [17:04:48] mark: That was the problem yesterday [17:04:50] and the day before [17:04:53] everything is outdated in the ruby world if it's not a git checkout from less than a week ago [17:04:58] that's just not reasonable [17:06:13] Krinkle: so what was the fix then? [17:06:14] the problem is that the software was depending on another gem, both of which are under active development. One of them didn't have a debian package so paravoid created it. 
The other did have a debian package so we just added a dependency on it [17:06:17] mark++ [17:06:37] but then it turned out that that package was 3 years old and not updated (the debian package didn't keep up with upstream) [17:06:37] New patchset: Ottomata; "Fixing rsync module name" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54876 [17:06:50] the API had changed completely, [17:06:52] anyway [17:07:05] for this latest problem [17:07:28] paravoid: Though you merged it, I'm still seeing the old error of yesterday. Did you push it to the apt repo? Or should a root run apt-get upgrade on gallium? [17:07:48] I didn't [17:07:50] later [17:09:16] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54876 [17:15:44] RECOVERY - Puppet freshness on db35 is OK: puppet ran at Wed Mar 20 17:15:38 UTC 2013 [17:24:36] New patchset: Milimetric; "cron job to regenerate mobile apps stats" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54811 [17:25:25] RECOVERY - Puppet freshness on mc1002 is OK: puppet ran at Wed Mar 20 17:25:17 UTC 2013 [17:26:39] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54811 [17:26:55] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: Puppet has not run in the last 10 hours [17:26:55] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: Puppet has not run in the last 10 hours [17:26:56] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: Puppet has not run in the last 10 hours [17:26:56] PROBLEM - Puppet freshness on msfe1002 is CRITICAL: Puppet has not run in the last 10 hours [17:33:28] New patchset: Milimetric; "run every hour and fix single quotes problems" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54877 [17:36:15] RECOVERY - Puppet freshness on labstore1 is OK: puppet ran at Wed Mar 20 17:36:04 UTC 2013 [17:37:43] Change merged: Ottomata; [operations/puppet] (production) -
https://gerrit.wikimedia.org/r/54877 [17:39:30] New patchset: Milimetric; "better synchronize running of cron job with sql" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54878 [17:49:07] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54878 [17:56:10] Change merged: RobH; [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/54801 [18:06:54] PROBLEM - Puppet freshness on hume is CRITICAL: Puppet has not run in the last 10 hours [18:08:56] PROBLEM - Puppet freshness on virt1005 is CRITICAL: Puppet has not run in the last 10 hours [18:15:37] RobH: http://bots.wmflabs.org/~wm-bot/searchlog/index.php?action=search&channel=%23wikimedia-labs [18:22:11] hashar: im getting your ssl certs for gallium [18:22:23] RobH: nice! thanks :-] [18:22:24] the mediawiki.org ones are forwarded? [18:22:32] RobH: yeah I commented on the RT ticket [18:22:32] redirected i mean. [18:22:45] RobH: we can get rid of the mediawiki.org entries if you want, no need to put cert there. [18:22:50] if it can save some certs [18:23:02] heh, i think they are free on the place im on [18:23:07] if not, i pay 60 and we get as many as we want [18:23:10] {doc,integration}.mediawiki.org use a permanent redirect of /* to their wm.o equivalents [18:23:16] but they wont be wildcard, so its one cert per fqdn [18:23:20] okk [18:23:23] works for me :-] [18:23:24] cool [18:23:28] thx [18:23:36] er? [18:23:36] we can't do one cert per fqdn [18:23:43] we won't give 4 IPs to gallium [18:23:44] RobH: might need to do some apache changes too [18:24:02] and we shouldn't do SNI for this either [18:24:02] paravoid: we already have multiple certs on the single IP of gallium [18:24:08] paravoid: added comment to not put them in /etc/ssl/certs :p [18:24:15] I think that is not supported by IE8 though but not really a problem [18:24:17] that was interesting.. [18:24:34] can we please just do StartSSL and be done with it?
[18:24:48] i think that's what rob is doing [18:24:52] StartSSL [18:24:55] paravoid: what do you think im doing dude [18:24:57] this is startssl. [18:25:00] oh? [18:25:07] i resolved the startssl ticket. [18:25:18] class 2? [18:25:31] just the first one so far [18:25:34] do we need class 2 and why? [18:25:42] (they took days to get this approved ;) [18:25:49] RobH: can you also update the apache configurations for gallium? They are somewhere under /modules/contint/ ;)D [18:25:57] because a) how else are you going to do gallium's fqdns? [18:26:07] paravoid: I validated wikimedia.org [18:26:11] i can make any wikimedia.org cert i want. [18:26:13] and b) we shouldn't float valid certificates for the wikimedia.org name without the foundation's name [18:26:36] hop time to get my daughter to bed :-) [18:26:40] paravoid: you lost me. [18:27:00] (a) is 20:23 < RobH> but they wont be wildcard, so its one cert per fqdn [18:27:04] I don't see what the issue is when I signed up and validated that my email and wikimedia.org domain are indeed under my(our) control [18:27:14] paravoid: because we dont do wildcard wikimedia.org certs on any server anymore [18:27:27] if we did, then there is no real reason not to use the same wildcard cert on everything [18:27:31] which is what everyone said we needed to stop doing. [18:27:43] no, but gallium needs 4 fqdns [18:27:49] at least 4 [18:28:00] paravoid: So what do you want me to do? [18:28:07] get a certificate with 4 SANs? [18:28:10] Because I thought I was doing what you guys asked. [18:28:19] i'll try to do that if this supports it [18:28:22] i have not gotten that far. [18:28:25] but sure, will try [18:28:31] needs 2, not 4. [18:28:39] how come? [18:28:45] how come what? [18:28:52] what are the two? [18:29:07] we want doc.wikimedia.org and integration.wikimedia.org. [18:29:07] and mediawiki.org [18:29:08] no [18:29:09] for the redirects [18:29:10] paravoid: what if we want to move doc to another server? [18:29:11] yes [18:29:17] ...
[18:29:17] Ryan_Lane: revoke and reissue [18:29:22] we want to certify a redirect? [18:29:25] yes [18:29:27] why not just get multiple individual certs? [18:29:28] why? [18:29:32] Ryan_Lane: and? SNI? [18:29:54] just give each their own ip [18:29:55] RobH: because people will type it and get certificate errors and will not cost us more than 30 extra seconds [18:29:56] * Ryan_Lane shrugs [18:30:02] Ryan_Lane: 4 IPs for gallium? :) [18:30:06] or even two? [18:30:14] I guess we could [18:30:14] so apache serves the https cert, then redirects the url [18:30:15] ? [18:30:22] meh. that does kind of suck... [18:30:25] rather than just rewriting/redirecting [18:30:26] ? [18:30:51] RobH: get one with all the FQDNs as SANs [18:30:53] RobH: you type doc.mediawiki.org, you have HTTPS Everywhere installed [18:31:04] wait... [18:31:08] so you connect to https://doc.mediawiki.org/ [18:31:12] why is doc.mediawiki.org on gallium? [18:31:26] why don't we put that on the cluster and redirect from there? [18:31:50] RobH: if you don't have that as a SAN, you'll get a certificate error; if you do, your browser will connect and get a HTTP redirect for doc.wikimedia.org [18:32:19] we should definitely move the redirect away from gallium [18:32:32] Ryan_Lane: the redirect is there because doc.wikimedia.org is there [18:32:49] that's not a really good reason, though. [18:32:58] no, just simpler so far :) [18:33:09] not if we need to deal with ssl because of it ;) [18:33:15] heh, I guess so [18:33:28] that eliminates one fqdn at least :) [18:33:36] otoh I'm not sure how complicated I'd like to see on nginx configs [18:33:37] so are we not including doc now? [18:33:47] so, we have doc.wm.o and integration.wm.o [18:34:07] paravoid: we can push it back to apache on the appservers if you would prefer ;) [18:34:30] I'll defer that to you :) [18:34:37] hahaha [18:34:41] no seriously [18:35:00] other than integration and doc, what else is there?
[18:35:10] you're taking point on https and I'm okay with you making that decision :) [18:35:24] we're doing all redirects at the appservers as is. [18:35:30] including silly ones like this [18:35:34] RobH: and the other point about class 2 is [18:35:41] class 2 means that we get O=Wikimedia Foundation [18:35:43] eventually we should move all of these on to varnish [18:35:46] you don't get that on class 1s [18:35:52] I don't think the https servers should be doing this at all [18:35:57] and I'm not sure I'd like wikimedia.org certificates without O=Wikimedia Foundation out there [18:36:06] dont we already have that? [18:36:21] we never did extended validation for rapidssl [18:36:27] I'm not talking about EV [18:36:51] it's DV vs. OV if you want to speak CA lingo [18:37:14] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 9.10149775862 (gt 8.0) [18:37:43] the difference being, the former will have a subject of description=(random string), C=US, CN=doc.wikimedia.org [18:37:51] while the latter will also say O=Wikimedia Foundation [18:37:57] paravoid: so in startssl lingo, you want me to escalate us to startssl verified? [18:38:04] I think so [18:38:09] that opens class 2/3 [18:38:17] (I'm sure that's the difference, I think that I prefer that) [18:38:42] ok, well, we have used class 1 i think, but if thats what we want [18:38:45] then we can do that. [18:38:50] Ryan_Lane: other than integration and doc, there is going to be "zuul.wm" [18:39:07] no [18:39:09] hashar was planning a zuul stats page, added to DNS as cname for gallium [18:39:16] ah. ok [18:39:17] we agreed not to do that [18:39:20] the other day [18:39:24] heh [18:39:27] ..o..k [18:39:38] sec [18:39:53] mutante: https://gerrit.wikimedia.org/r/#/c/54466/ [18:40:09] see my & Krinkle's comments [18:40:30] (and how it's abandoned) [18:41:05] i see.. gotcha [18:41:52] then let me just add for completeness..
we already removed the OLD mediawiki docs yesterday, on http://svn.mediawiki.org/doc/ [18:41:57] it redirects to the new one now [18:43:29] PROBLEM - Full LVS Snapshot on rdb1001 is CRITICAL: NRPE: Command check_lvs not defined [18:49:54] !log apache-graceful-all, remove boards.wm redirect [18:50:00] Logged the message, Master [18:50:50] welcome back logmsgbot [18:51:24] PROBLEM - Apache HTTP on mw1176 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:24] PROBLEM - Apache HTTP on mw1063 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:24] PROBLEM - Apache HTTP on mw1181 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:24] PROBLEM - Apache HTTP on mw1090 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:24] PROBLEM - Apache HTTP on mw1082 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:24] PROBLEM - Apache HTTP on mw1033 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:24] PROBLEM - Apache HTTP on mw1054 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:25] PROBLEM - Apache HTTP on mw1044 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:25] PROBLEM - Apache HTTP on mw1048 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:26] PROBLEM - Apache HTTP on mw1029 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:26] PROBLEM - Apache HTTP on mw1097 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:27] PROBLEM - Apache HTTP on mw1018 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:27] PROBLEM - Apache HTTP on mw1030 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:28] PROBLEM - Apache HTTP on mw1110 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:28] PROBLEM - Apache HTTP on mw1099 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:29] PROBLEM - Apache HTTP on mw1027 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:29] PROBLEM - Apache HTTP on mw1035 is 
CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:30] PROBLEM - Apache HTTP on mw1074 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:30] PROBLEM - Apache HTTP on mw1083 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:31] PROBLEM - Apache HTTP on mw1019 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:31] PROBLEM - Apache HTTP on mw1055 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:32] PROBLEM - Apache HTTP on mw1078 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:32] PROBLEM - Apache HTTP on mw1108 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:34] PROBLEM - LVS HTTPS IPv4 on wikivoyage-lb.pmtpa.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:40] RECOVERY - Apache HTTP on mw1035 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 2.886 second response time [18:51:41] PROBLEM - Apache HTTP on mw1046 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:41] PROBLEM - Apache HTTP on mw1079 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:41] PROBLEM - Apache HTTP on mw1068 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:41] RECOVERY - Apache HTTP on mw1083 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.088 second response time [18:51:41] RECOVERY - Apache HTTP on mw1033 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 4.848 second response time [18:51:41] PROBLEM - Apache HTTP on mw1095 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:51:42] RECOVERY - LVS HTTPS IPv4 on wikivoyage-lb.pmtpa.wikimedia.org is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 757 bytes in 4.969 second response time [18:51:42] RECOVERY - Apache HTTP on mw1019 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.077 second response time [18:51:43] RECOVERY - Apache HTTP on mw1099 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 7.115 second response time 
[18:51:50] RECOVERY - Apache HTTP on mw1030 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 4.945 second response time [18:51:50] RECOVERY - Apache HTTP on mw1082 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 5.616 second response time [18:51:50] RECOVERY - Apache HTTP on mw1074 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 9.565 second response time [18:51:50] RECOVERY - Apache HTTP on mw1110 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 8.607 second response time [18:51:50] RECOVERY - Apache HTTP on mw1090 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 1.439 second response time [18:51:51] RECOVERY - Apache HTTP on mw1108 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 8.992 second response time [18:51:51] RECOVERY - Apache HTTP on mw1063 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 9.063 second response time [18:51:52] RECOVERY - Apache HTTP on mw1029 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 9.202 second response time [18:51:58] w t f [18:52:17] looks like what happened to me the other day [18:53:57] RECOVERY - Apache HTTP on mw1068 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.056 second response time [18:53:57] RECOVERY - Apache HTTP on mw1095 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.066 second response time [18:54:07] RECOVERY - Apache HTTP on mw1181 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.058 second response time [18:54:08] RECOVERY - Apache HTTP on mw1097 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.072 second response time [18:54:08] RECOVERY - Apache HTTP on mw1054 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.092 second response time [18:54:08] RECOVERY - Apache HTTP on mw1048 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.051 second response time [18:54:08] RECOVERY - Apache HTTP on mw1055 is OK: HTTP OK: 
HTTP/1.1 301 Moved Permanently - 747 bytes in 0.075 second response time [18:54:08] RECOVERY - Apache HTTP on mw1018 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.053 second response time [18:54:08] RECOVERY - Apache HTTP on mw1027 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.054 second response time [18:54:09] RECOVERY - Apache HTTP on mw1078 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.063 second response time [18:54:09] RECOVERY - Apache HTTP on mw1176 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.062 second response time [18:54:10] RECOVERY - Apache HTTP on mw1044 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 1.389 second response time [18:54:17] RECOVERY - Apache HTTP on mw1046 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.062 second response time [18:54:18] RECOVERY - Apache HTTP on mw1079 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.217 second response time [18:54:43] testing random ones, they look fine and normal, no config problem [18:55:16] ok, now startssl is processing my personal details [18:55:24] then they'll let me authenticate our org. [18:57:28] ? [18:57:32] I thought chris was going to do that? [18:57:33] or me? [18:57:47] i decided if they steal my identity [18:57:50] i'll blame you ;] [18:57:51] why are you doing something you weren't comfortable doing? :) [18:57:59] because im tired of dealing with it. 
[18:58:07] dude, that wasn't what I was trying to accomplish [18:58:34] it's not my place to force you to do anything, much less anything with your personal data [18:58:48] i'm also very worries about my passport scan sitting on the office scanner memory, but meh [18:58:56] worried even [18:59:00] for purely technical reasons I think class 2 is much better [18:59:03] heh, that should have been fixed:) [18:59:10] there is now scan2mail for reals [18:59:13] mutante: not fileserver, in device memory [18:59:27] browse to printer, read local memory, have rob's passport data [18:59:34] * RobH has trust issues [18:59:39] RobH: there's an android app that uses the NFC to scan passports [18:59:42] powercycle it ? [18:59:45] hehe [18:59:50] and if you put your date of birth to the app to unlock the basic key [18:59:53] you even see your picture [18:59:55] and details [18:59:57] in your phone [19:00:04] jpeg2000 formatted [19:00:08] i am just waiting for NFC pick pockets in the subway :p [19:00:12] even had on the comment the software the passport authority used [19:00:13] * RobH has iphone for now [19:00:16] oops, touched your wallet there [19:00:21] gimme two more years on iphone4s then i jump to droid. [19:00:51] you can also change your name every once in a while :) [19:01:52] well, legally that creates even more papertrail and personal details [19:02:08] i would make a banshee reference, but pretty sure no one watches that. [19:03:19] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 192 seconds [19:04:09] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 212 seconds [19:05:02] New patchset: Asher; "converting the reset of s1 pmtpa to mariadb" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54889 [19:06:02] cmjohnson1: did the cat5 from equinix finally get dealt with ? [19:06:05] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54889 [19:06:06] the xc ? 
[19:06:12] binasher: ! [19:06:15] nice :) [19:06:19] lesliecarr [19:06:37] yes, it has been done for a while [19:06:41] okay [19:06:41] thanks [19:06:46] i keep forgetting [19:06:50] is it patched into the srx100 ? [19:07:38] patched to mr1 0/7 [19:07:38] iirc [19:07:46] yay [19:07:49] thank you [19:07:52] and sorry if i ask you again [19:08:17] so I updated https://rt.wikimedia.org/Ticket/Display.html?id=4773 [19:08:18] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 11.00767 (gt 8.0) [19:08:35] and i kinda need some input on the ticket from either Ryan_Lane or paravoid (when you have time, im waiting on startssl to verify so its not urgent) [19:09:29] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 13 seconds [19:09:29] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 8 seconds [19:10:40] !log installing package upgrades on formey [19:10:47] Logged the message, Master [19:11:08] !log installing package upgrades on mw1170-1179 [19:11:14] Logged the message, Mistress of the network gear. [19:11:15] Ryan_Lane: I'll leave it up to you :) [19:13:12] PROBLEM - MySQL Replication Heartbeat on db33 is CRITICAL: CRIT replication delay 182 seconds [19:13:31] PROBLEM - mysqld processes on db60 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [19:13:31] PROBLEM - MySQL Slave Delay on db33 is CRITICAL: CRIT replication delay 184 seconds [19:15:32] paravoid: what? the https stuff? 
[19:15:37] the ticket above [19:16:04] PROBLEM - Apache HTTP on mw117 is CRITICAL: Connection refused [19:16:15] PROBLEM - Apache HTTP on mw1177 is CRITICAL: Connection refused [19:16:15] PROBLEM - Apache HTTP on mw1179 is CRITICAL: Connection refused [19:16:15] PROBLEM - Apache HTTP on mw1176 is CRITICAL: Connection refused [19:16:15] PROBLEM - Apache HTTP on mw1178 is CRITICAL: Connection refused [19:16:24] PROBLEM - Apache HTTP on mw1171 is CRITICAL: Connection refused [19:16:25] PROBLEM - Apache HTTP on mw1174 is CRITICAL: Connection refused [19:16:33] that's mine [19:16:34] PROBLEM - Apache HTTP on mw1175 is CRITICAL: Connection refused [19:16:35] yay [19:16:43] turns out the upgrades aren't as graceful as hoped [19:17:15] PROBLEM - Apache HTTP on mw1172 is CRITICAL: Connection refused [19:17:15] PROBLEM - Apache HTTP on mw1173 is CRITICAL: Connection refused [19:17:15] RECOVERY - Apache HTTP on mw117 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.187 second response time [19:17:27] starting [19:17:55] started them back up [19:18:14] RECOVERY - Apache HTTP on mw1172 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.057 second response time [19:18:15] RECOVERY - Apache HTTP on mw1173 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.053 second response time [19:18:15] RECOVERY - Apache HTTP on mw1177 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.083 second response time [19:18:15] RECOVERY - Apache HTTP on mw1179 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.056 second response time [19:18:15] RECOVERY - Apache HTTP on mw1176 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.054 second response time [19:18:15] RECOVERY - Apache HTTP on mw1178 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.055 second response time [19:18:24] RECOVERY - Apache HTTP on mw1171 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.054 second response time 
[19:18:25] RECOVERY - Apache HTTP on mw1174 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.057 second response time [19:18:39] RECOVERY - Apache HTTP on mw1175 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.052 second response time [19:18:39] !log upgrading mw118* and then restarting apache2 on these boxes [19:18:42] Logged the message, Mistress of the network gear. [19:19:01] New review: Dzahn; "so, /var/mwdocs/phase3 is still there and a /var/mwdocs/phase3-svn which isn't mentioned on this. i ..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/53954 [19:20:49] New patchset: Aude; "add allowDataTransclusion Wikibase setting, off by default, on for test2wiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54893 [19:21:14] RECOVERY - MySQL Replication Heartbeat on db33 is OK: OK replication delay 0 seconds [19:21:25] PROBLEM - Apache HTTP on mw118 is CRITICAL: Connection refused [19:21:25] RECOVERY - MySQL Slave Delay on db33 is OK: OK replication delay 0 seconds [19:21:34] PROBLEM - Host db60 is DOWN: PING CRITICAL - Packet loss = 100% [19:22:15] PROBLEM - mysqld processes on db32 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [19:23:24] RECOVERY - Apache HTTP on mw118 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.174 second response time [19:23:24] RECOVERY - Host db60 is UP: PING OK - Packet loss = 0%, RTA = 26.57 ms [19:24:09] !log .tar.gz'ing /var/mwdocs on formey (SVN) and then manually delete it [19:24:10] Logged the message, Master [19:27:12] Can someone run sync-common as root on mw1085 please? 
[19:27:14] PROBLEM - Apache HTTP on mw1193 is CRITICAL: Connection refused [19:27:15] PROBLEM - Apache HTTP on mw1194 is CRITICAL: Connection refused [19:27:15] PROBLEM - Apache HTTP on mw1199 is CRITICAL: Connection refused [19:27:15] PROBLEM - Apache HTTP on mw1196 is CRITICAL: Connection refused [19:27:15] PROBLEM - Apache HTTP on mw1197 is CRITICAL: Connection refused [19:27:15] PROBLEM - Apache HTTP on mw1190 is CRITICAL: Connection refused [19:27:15] PROBLEM - Apache HTTP on mw1198 is CRITICAL: Connection refused [19:27:24] PROBLEM - Apache HTTP on mw1195 is CRITICAL: Connection refused [19:27:25] PROBLEM - Apache HTTP on mw1191 is CRITICAL: Connection refused [19:27:25] PROBLEM - Apache HTTP on mw1192 is CRITICAL: Connection refused [19:27:27] Reedy: ok [19:27:58] !log sync-common on mw1085 [19:28:05] Logged the message, Mistress of the network gear. [19:28:15] RECOVERY - Apache HTTP on mw1193 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.064 second response time [19:28:15] RECOVERY - Apache HTTP on mw1194 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.070 second response time [19:28:15] RECOVERY - Apache HTTP on mw1199 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.120 second response time [19:28:15] RECOVERY - Apache HTTP on mw1196 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.064 second response time [19:28:15] RECOVERY - Apache HTTP on mw1197 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.104 second response time [19:28:15] RECOVERY - Apache HTTP on mw1190 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.058 second response time [19:28:15] RECOVERY - Apache HTTP on mw1198 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.108 second response time [19:28:24] RECOVERY - Apache HTTP on mw1195 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.059 second response time [19:28:25] RECOVERY - Apache HTTP on mw1191 is OK: HTTP OK: 
HTTP/1.1 301 Moved Permanently - 747 bytes in 0.064 second response time [19:28:25] RECOVERY - Apache HTTP on mw1192 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.072 second response time [19:28:43] LeslieCarr: mutante Reedy we need help to get https://gerrit.wikimedia.org/r/#/c/52797/ merged and deployed [19:29:04] it's a second cron job for wikidata dispatching [19:29:07] to clients [19:30:45] aude: why do you want the same one every 5 and every 7 minutes ? [19:30:54] PROBLEM - Host db60 is DOWN: PING CRITICAL - Packet loss = 100% [19:31:24] LeslieCarr: the duration doesn't matter at all. the task runs continuously [19:31:43] some point, maybe we want it to be a daemon but not yet [19:32:04] okay, i'll deploy it but with a comment that we really should have it be a daemon [19:32:10] if it's running continuously [19:32:19] are you around to monitor this ? [19:32:24] RECOVERY - Host db60 is UP: PING OK - Packet loss = 0%, RTA = 26.59 ms [19:32:29] LeslieCarr: agree [19:32:34] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 0.0 [19:32:34] yes we are around [19:33:02] New review: Lcarr; "Approving this with the caveat that we would like this to be a proper daemon. aude is here to monito..." [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/52797 [19:33:05] sigh [19:33:14] now startssl is saying they need a copy of my cell phone bill [19:33:17] w.t.f. [19:33:21] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52797 [19:33:21] thanks LeslieCarr :) [19:33:33] these guys are a pain in the ass [19:34:33] root@formey:/var/mwdocs# rm -rf phase3 [19:34:47] Huh. I didn't know startssl. [19:34:58] cheap certs [19:35:11] I must say that free "normal" certs sounds... nice! 
[19:35:24] PROBLEM - Apache HTTP on mw1202 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:35:37] well, we pay 60 to verify im actually rob halsell [19:35:47] then we pay an additional 60 to verify i am an agent for wikimedia foundation [19:36:02] per year, and thats the cost of one cert [19:36:07] so yea, cheap (nearly free) [19:36:36] !log restarting hung apache on mw1202 [19:36:42] Logged the message, Master [19:37:14] Can someone run sync-common as root on mw23 as well please? [19:37:15] RECOVERY - Apache HTTP on mw1202 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.068 second response time [19:37:20] 2 * 60? [19:37:21] yep [19:37:23] are you sure? [19:37:28] Reedy: will do now [19:37:29] I've never done OV [19:37:37] if cacert.org would just get their root ca cert into browsers (they have been trying for years, it's just complicated)... [19:38:14] PROBLEM - Apache HTTP on mw23 is CRITICAL: HTTP CRITICAL: HTTP/1.0 500 Internal Server Error - 50035 bytes in 0.152 second response time [19:38:22] https://bugzilla.mozilla.org/show_bug.cgi?id=215243 [19:38:25] !log sync-common on mw23 [19:38:31] Logged the message, RobH [19:39:49] oh fenari... *** System restart required *** [19:40:02] every time i login and see that, it makes me feel bad. [19:40:56] wiki.cacert.org/InclusionStatus [19:41:04] .hushlogin ? [19:41:20] cacert is almost abandoned [19:41:24] they still use MD5 for example [19:41:32] hasn't made any progress for years [19:41:37] Is logmsgbot broken? [19:41:38] I used to support it a lot, but not lately [19:41:40] sigh, i would have assurer points for it [19:41:45] I'm a trusted assurer for years though [19:41:54] so if anyone wants assuring... [19:42:05] Reedy: hey, welcome back [19:42:15] I'd like to be assured that the world is inherently good and that it will all work out. [19:42:21] same here, i should have points. once got my passport checked by a cacert guy [19:42:25] I never left! ;) [19:42:39] to europe? 
[19:42:40] :) [19:43:08] 27 hour day and counting! [19:43:15] ouch [19:43:34] 3 flights, 4 baggage screening/security [19:43:52] 90 minutes driving [19:43:53] ? [19:43:54] What else... [19:44:02] how's screening/security > flights?! [19:44:20] had to go into the other side of the international terminal at sfo [19:44:32] because there's no way to move between them without going through security [19:44:52] * Reedy kicks logmsgbot [19:45:05] Reedy: thats cuz sfo is horrible airport [19:45:24] !log closed.dblist to 1.21wmf12 [19:45:31] Logged the message, Master [19:45:47] !log wikimania, private and fishbowl dblists to 1.21wmf12 [19:45:53] Logged the message, Master [19:46:35] !log special and wikimedia dblists to 1.21wmf12 [19:46:42] Logged the message, Master [19:46:57] !log Synchroni[sz]ed php-1.21wmf12/extensions for Wikidata deploy [19:47:03] Logged the message, Master [19:47:12] !log wikinews and wikisource to 1.21wmf12 [19:47:14] Reedy: logmsgbot seems to work. [19:47:20] Logged the message, Master [19:47:42] I've not seen it say anything [19:48:08] you never stop do you [19:48:08] Oh, I'm getting my bots confused. [19:48:13] morebots is fine. [19:48:13] I am a logbot running on wikitech-static. [19:48:13] Messages are logged to wikitech.wikimedia.org/wiki/Server_Admin_Log. [19:48:13] To log a message, type !log . 
[19:48:46] !log wikivoyage and wiktionary to 1.21wmf12 [19:48:53] Logged the message, Master [19:49:03] New patchset: Aude; "add allowDataTransclusion Wikibase setting, off by default, on for test2wiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54893 [19:49:07] paravoid: I came off the SFO-LHR flight with a red spotted t-shirt [19:49:14] Reedy: https://gerrit.wikimedia.org/r/#/c/54893/ [19:49:16] please [19:49:32] err, one se [19:49:33] c [19:49:35] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54893 [19:49:39] Reedy: earlier i could summon it by logging :) 11:59 < mutante> !log apache-graceful-all, remove boards.wm redirect 11:59 -!- logmsgbot [~logmsgbot@fenari.wikimedia.org] has joined #wikimedia-operations [19:49:40] lol [19:50:00] ready [19:50:05] whatever, it's fine [19:50:21] i just wanted to update the commit message but not a big deal [19:50:32] !log wikiversity and wikiquote to 1.21wmf12 [19:50:39] Logged the message, Master [19:51:14] !log sync'd wmf-config/InitialiseSettings.php [19:51:20] Logged the message, Master [19:51:28] thanks Reedy [19:52:47] !log wikibooks to 1.21wmf12 [19:52:53] Logged the message, Master [19:53:10] mutante: so yeah the zuul.wikimedia.org got abandoned :D [19:53:29] New patchset: Reedy; "Everything non wikipedia to 1.21wmf12" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54899 [19:54:08] Reedy: what do we do to rebuild localisation for wmf12 (or just wikibase)? [19:54:08] hashar: yea, paravoid told me. DNS clean up? [19:54:18] i think that's needed to make our new parser function work [19:54:53] scap [19:54:55] as usual [19:55:02] can do that in a minute [19:55:06] ok [19:55:08] thanks [19:55:17] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54899 [19:56:13] mutante: yup. 
Cname can die https://rt.wikimedia.org/Ticket/Display.html?id=4776 [19:59:35] !log DNS update - remove zuul CNAME [19:59:41] Logged the message, Master [20:08:34] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 9.30974991525 (gt 8.0) [20:13:14] RECOVERY - Apache HTTP on mw23 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 747 bytes in 0.117 second response time [20:17:12] RobH, mind if I do 4726 real quick? Stat1 Access for Henrique Andrade [20:17:14] ? [20:17:59] New patchset: RobH; "adding new apaches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54903 [20:18:02] ottomata: go for it [20:18:08] no one complained, means compliance =] [20:18:16] cool [20:19:14] cmjohnson1: So I have a task for ya, but no RT # [20:19:25] if ya wanna do some installs, it seems mw1209-1220 arent installed yet [20:19:31] i don't work w/out rt ;-] [20:19:44] yeah..okay i will do it [20:19:50] biting the hand that feeds, i see how it is [20:19:52] ;] [20:20:07] checking to see if they are setup in puppet already [20:20:24] RECOVERY - mysqld processes on db60 is OK: PROCS OK: 1 process with command name mysqld [20:20:25] yep, in site.pp [20:21:05] and in dhcp lease file [20:21:22] ok [20:21:57] https://rt.wikimedia.org/Ticket/Display.html?id=4777 [20:22:07] So dunno if dns is setup (prolly) or network ports [20:22:24] but, those should be able to be installed and puppet signed and such [20:22:35] chmod g+rw /home/wikipedia/common/php-1.21wmf11/.git/modules/extensions/FormPreloadPostCache/index.lock [20:22:41] cmjohnson1: So do OS setup, then puppet runs, and once they are fully puppetized we can push into service [20:22:50] (i'll walk ya through it when we get to there) [20:22:53] ^ Can someone fix the permissions on that file please [20:22:59] robh: ok [20:23:04] PROBLEM - Host db32 is DOWN: PING CRITICAL - Packet loss = 100% [20:23:31] Reedy: done [20:23:48] Reedy: it gave error for that on the sync script, needs rerun on mw23? 
[20:23:54] or not vital just annoying? [20:24:09] (i assumed the latter) [20:24:51] ^^^^ Thing we need to fix [20:24:56] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54903 [20:25:12] greg-g: ? [20:25:16] "what do these error messages mean? there's so many of them and I've been taught to ignore them..." (not your fault RobH ) [20:25:23] its a .git file [20:25:29] its a known issue, happens all the time. [20:25:34] RECOVERY - Host db32 is UP: PING OK - Packet loss = 0%, RTA = 26.50 ms [20:26:10] not saying its not annoying [20:26:24] RECOVERY - Apache HTTP on mw1085 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 742 bytes in 0.272 second response time [20:26:27] but its not exactly a major issue that im aware of [20:26:48] right, and for new people doing deploys, or those who do them infrequently, too much mental load on error messages that are not important. That's all, ignore my interjection :) [20:27:11] right, not major, and hopefully dealt with through git-deploy-glorious-future [20:27:29] i thought they only happened when folks did shit as root [20:27:47] (syncs) [20:27:50] but dunno [20:27:57] well, syncing is part of deploying, no? [20:28:01] * aude waiting on localisation cache update [20:28:12] yep, just saying i dunno why it came up with permission error [20:28:21] ah, right, I see [20:28:27] root's root and all [20:28:29] New patchset: Ottomata; "Adding Henrique Andrade and giving access on stat1." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54908 [20:29:50] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54908 [20:32:42] aude: / Reedy: everything all good on your deploys? aude are you still waiting to do something? 
[20:33:15] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [20:34:15] localisation cache seems to be fixed [20:36:14] RECOVERY - mysqld processes on db32 is OK: PROCS OK: 1 process with command name mysqld [20:36:39] New patchset: Aude; "enable Wikibase data inclusion on test2wiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54909 [20:37:12] aude: thanks [20:37:15] PROBLEM - Puppet freshness on mw1130 is CRITICAL: Puppet has not run in the last 10 hours [20:37:49] not sure how to avoid that in the future, when introducing new parser functions [20:37:59] except be sure to run localisation cache rebuild before enabling new stuff [20:38:15] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [20:38:21] seems super nasty that not finding a magic word is a fatal error [20:38:35] PROBLEM - Packetloss_Average on locke is CRITICAL: CRITICAL: packet_loss_average is 8.6879117094 (gt 8.0) [20:39:39] New patchset: Ottomata; "Changing handrade's ssh key" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54910 [20:40:20] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54909 [20:40:24] PROBLEM - mysqld processes on db60 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [20:41:01] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54910 [20:41:25] RECOVERY - mysqld processes on db60 is OK: PROCS OK: 1 process with command name mysqld [20:41:57] !log Sync'd wmf-config/InitialiseSettings.php [20:42:03] Logged the message, Master [20:45:14] PROBLEM - MySQL Replication Heartbeat on db67 is CRITICAL: CRIT replication delay 229 seconds [20:46:59] New patchset: Brion VIBBER; "Updated IP test range for Vimpelcom" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54911 [20:47:24] PROBLEM - mysqld processes on db63 is CRITICAL: PROCS CRITICAL: 0 
processes with command name mysqld [20:49:24] RECOVERY - mysqld processes on db63 is OK: PROCS OK: 1 process with command name mysqld [20:49:37] New review: Brion VIBBER; "We think it's right. :)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/54911 [20:49:42] New review: Yurik; "so do i" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/54911 [20:52:15] PROBLEM - MySQL Slave Running on db69 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error You cannot ALTER a log table if logging is enabled on query [20:52:15] PROBLEM - MySQL Slave Running on db71 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error You cannot ALTER a log table if logging is enabled on query [20:52:24] PROBLEM - MySQL Slave Running on db59 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error You cannot ALTER a log table if logging is enabled on query [20:52:57] !log updated CentralAuth to master on 1.21wmf11 and 1.21wmf12 [20:53:03] Logged the message, Master [20:53:15] PROBLEM - MySQL Slave Delay on db67 is CRITICAL: CRIT replication delay 693 seconds [20:53:15] PROBLEM - MySQL Slave Delay on db63 is CRITICAL: CRIT replication delay 459 seconds [20:53:58] broadcast: I am done with my deploy - all clear [20:56:29] Reedy: pgehres: Is logmsgbot not working? I noticed you both made manual entries for something that is usually triggered by the sync- script. [20:56:42] well, the sync script did not log anything [20:57:00] And you didn't think that was strange? Or has that been broken for longer? [20:57:11] I did think that was strange, yes [20:57:16] ok :) [20:57:29] Change abandoned: Demon; "Solution in search of a problem. If we ever rewrite this from the beginning, we can use maven, but f..." 
[operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/53572 [20:57:29] I have no way of kicking the bot [20:57:48] Just checking if you noticed it before or it being new [20:57:51] So its new [20:57:53] nope [20:57:55] yeah [20:58:10] [ logmsgbot Idle: 2:07:31 ] [20:58:16] Hm.. at least 2 hours [20:58:16] odd. [20:59:15] RECOVERY - MySQL Slave Delay on db63 is OK: OK replication delay 0 seconds [20:59:21] Reedy: while you're around, I need to run a foreachwiki on hume. anything other than "foreachwiki extensions/CentralAuth/maintenance/migratePass0.php" ? [20:59:27] New patchset: Asher; "db1049 -> mariadb" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54946 [20:59:58] New patchset: Pyoungmeister; "run puppet by cron instead of via the agent" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54815 [21:00:17] paravoid: about? [21:00:22] pgehres: Your entries are written to /var/log/logmsg on fenari [21:00:23] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54946 [21:00:31] !log reedy synchronized wmf-config/InitialiseSettings.php [21:00:31] !log pgehres synchronized php-1.21wmf12/extensions/CentralAuth 'Updating CentralAuth to master' [21:00:31] !log pgehres synchronized php-1.21wmf11/extensions/CentralAuth 'Updating CentralAuth to master' [21:00:33] !log asher synchronized wmf-config/db-eqiad.php 'pulling db1049 from s1' [21:00:37] Logged the message, Master [21:00:40] Oh, I didn't mean to enter them [21:00:42] :-/ [21:00:43] Logged the message, Master [21:00:46] lol [21:00:51] Logged the message, Master [21:00:57] Logged the message, Master [21:01:11] pgehres: Whatever daemon is supposed to read them/buffer them from there to logmsgbot isn't working [21:01:21] notpeter: ? [21:01:42] paravoid: patchset for you! 
[21:01:50] :) [21:01:52] I do puppet the way you want :) [21:02:44] oh [21:02:48] except for the logging thing [21:03:03] I like the idea of putting puppet into its own log [21:03:23] and I see no reason to not do so? [21:03:30] and Ryan_Lane wants it ;) [21:03:35] this isn't going to be the puppet logs [21:03:44] well, it might be some of them [21:03:49] puppet does support syslog though [21:03:55] and will keep writing to syslog [21:04:11] I just tailed a run with that invocation, and the output all showed up in there [21:04:16] so [21:04:31] it might also wind up in syslog... not sure, though [21:04:35] okay, push it like that [21:04:41] The logmsgbot process is running fine on fenari [21:04:42] when we fix logging properly we'll take a look then [21:04:46] Must've lost connection to irc I guess [21:04:52] Could someone who's root on fenari restart it? [21:04:53] nobody 4177 0.0 0.1 195328 4436 ? S Feb27 6:52 python /usr/ircecho/bin/ircecho --infile=/var/log/logmsg #wikimedia-operations logmsgbot irc.freenode.net [21:04:56] paravoid: ok, sounds reasonable [21:04:56] because right now logging is all messed up :) [21:05:08] truuuueeeeeee [21:05:18] who removed zuul.w.o and didnt update the ticket ;P [21:05:29] @hourly root [ ! 
-d /var/cache/dsa ] || touch /var/cache/dsa/cron.alive [21:05:32] 34 */4 * * * root if [ -x /usr/sbin/puppetd ]; then sleep $(( $RANDOM \% 7200 )); if [ -x /usr/bin/timeout ]; then TO="timeout --kill-after=900 3600"; else TO=""; fi; tmp="$(tempfile)"; egrep -v '^(#|$)' /etc/dsa/cron.ignore.dsa-puppet-stuff > "$tmp" && $TO /usr/sbin/puppetd -o --no-daemonize 2>&1 | egrep --text -v -f "$tmp"; rm -f "$tmp"; fi [21:05:37] @daily root find /var/lib/puppet/clientbucket/ -type f -mtime +30 -atime +30 -exec rm {} \+ [21:05:40] that's what we have in debian btw [21:06:28] thanks for pushing this btw, I've been slacking off not doing it for a long time :) [21:06:42] ARGH [21:06:51] all the new apaches im about to deploy attach to a fastiron [21:06:55] ;_; [21:07:15] PROBLEM - mysqld processes on db1049 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [21:07:20] paravoid: well, any chance I can get to free up roughly 400 gigs of ram across the cluster seems pretty worthwhile to me :) [21:07:43] and ryan was ranting about how much ram it was eating up [21:08:27] puppet 3.0 changed that [21:08:29] in a very silly way :) [21:08:32] heh [21:08:37] the agent doesn't do puppet runs [21:08:40] it just forks [21:08:44] and spawns a puppet run [21:08:49] :/ [21:08:51] so it's basically an advanced cron now [21:08:55] yeah [21:08:59] cron + kick I guess [21:10:04] PROBLEM - Host db1049 is DOWN: PING CRITICAL - Packet loss = 100% [21:11:25] RECOVERY - Host db1049 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms [21:12:14] RECOVERY - mysqld processes on db1049 is OK: PROCS OK: 1 process with command name mysqld [21:14:53] pgehres: Probably not. You might need to sudo -u apache first [21:15:11] Reedy: k, thx. found a bug, fixing that first [21:15:27] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54815 [21:15:44] New patchset: Mattflaschen; "Add labs redis subclassing the main one and setting directory." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/54970 [21:19:06] New review: Demon; "Should rebase this on top of https://gerrit.wikimedia.org/r/#/c/53973/" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/37570 [21:19:45] RECOVERY - Puppet freshness on cp1034 is OK: puppet ran at Wed Mar 20 21:19:42 UTC 2013 [21:22:09] I'm an asshole. please don't merge anything on sockpuppet for a sec [21:24:16] notpeter: ha ha ha [21:24:42] New patchset: Pyoungmeister; "correct my invocation of fqdn_rand" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54972 [21:25:15] New review: Ottomata; "If a package already exists, do we need to rebuild it ourselves? Can we just stick it in apt?" [operations/debs/python-jsonschema] (master) - https://gerrit.wikimedia.org/r/54782 [21:26:03] paravoid ^? [21:26:09] New review: Faidon; "We can if the version suffices. I think (but I'm not sure) it's not, so we just need to forward-port..." [operations/debs/python-jsonschema] (master) - https://gerrit.wikimedia.org/r/54782 [21:26:12] you are so fast [21:26:15] PROBLEM - NTP on db1049 is CRITICAL: NTP CRITICAL: Offset unknown [21:26:51] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54972 [21:27:04] New review: Ottomata; "When I took this over yesterday, Ori was building the 0.8.0 version anyway. I think he might be fin..." [operations/debs/python-jsonschema] (master) - https://gerrit.wikimedia.org/r/54782 [21:27:25] notpeter: so I've used that erb trick in the past [21:27:29] notpeter: it has two nice things [21:27:40] one is that you don't see sleep all the time in ps [21:27:55] the other is that you can easily cat /etc/cron.d/puppet and see when runs are happening for that machine [21:28:42] ori-l [21:28:48] are you ok with version 0.8.0 of python-jsonschema? 
[21:29:00] if so, there is a .deb already in debian experimental that we can just stick in our apt repo [21:29:08] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54678 [21:29:17] cmjohnson1: hey you at the dc today ? [21:29:27] not now [21:29:29] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54677 [21:29:39] New patchset: Pyoungmeister; "Revert "run puppet by cron instead of via the agent"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54974 [21:29:48] New patchset: Pyoungmeister; "Revert "correct my invocation of fqdn_rand"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54975 [21:29:53] I can go if needed? I am not far away (lesliecarr) [21:30:09] paravoid: what is the correct way to use fqdn_rand in a template? my version of the template is failing, and there is a seemingly open bug about using it in templates [21:30:14] no emergency, just trying to get that albatross of a copper link up [21:30:15] RECOVERY - NTP on db1049 is OK: NTP OK: Offset 2.193450928e-05 secs [21:30:24] RECOVERY - Puppet freshness on db43 is OK: puppet ran at Wed Mar 20 21:30:16 UTC 2013 [21:30:34] i shall ticketize [21:31:01] ok [21:31:04] last time I didn't do it with fqdn_rand [21:31:06] New review: Dr0ptp4kt; "Agreed looks good." [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/54911 [21:31:11] it didn't exist back then [21:32:16] (looking now) [21:32:42] cmjohnson1: https://rt.wikimedia.org/Ticket/Display.html?id=4780 [21:33:21] paravoid: I can also just pull the use of fqdn_rand into the class and then sub it into the template [21:33:24] that is sure to work [21:33:26] okay..will look at in the morning [21:33:57] New patchset: Asher; "adding db105[12]" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54976 [21:34:16] Can someone with root on fenari kill/restart logmsgbot? It's been broken for about 2 hours now. 
Deployments are not being logged to server admin log. [21:34:27] notpeter: or even not use the template at all I guess [21:34:30] got it Krinkle [21:34:35] cron { ...: minute => fqdn_rand(NNN) } [21:34:37] should work [21:35:01] !log restarted ircecho on fenari [21:35:08] Logged the message, Mistress of the network gear. [21:35:09] I suggested a template since that's how I did it last time [21:35:17] I don't see a good reason now though [21:35:24] paravoid: ok, cool [21:35:42] (although I would like to see the cron resource to be replaced by something that puts it in /etc/cron.d instead of crontab's root) [21:36:27] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54976 [21:36:33] Krinkle: happy now ? [21:37:57] New patchset: Pyoungmeister; "pulling fqdn_rand into class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54977 [21:38:48] LeslieCarr: Assuming it automatically restarts..., yes :) [21:38:57] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54977 [21:39:10] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54514 [21:40:05] hehe [21:40:46] service { "ircecho": require => Package[ircecho], ensure => running; [21:40:50] Seems so [21:41:10] Though I suppose that's only ensured on puppet run, not in the system itself [21:41:21] so will take < 30 minutes [21:42:28] exactly but it should be running right now [21:42:31] New patchset: Asher; "missing curly bracket" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54978 [21:42:46] i see it went back on #tech but didn't see #operations in its init file [21:42:49] lemme check its puppet [21:43:04] New review: Dzahn; "i deleted /var/mwdocs/phase3 and /var/mwdocs/phase3-svn manually after tar.gz'ing them and moving to..." 
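The `fqdn_rand` suggestion above works because the function returns a stable pseudo-random number per host: each machine always lands on the same cron minute, while the fleet as a whole spreads across the interval, so no `sleep $RANDOM` splay is needed. The idea can be sketched as follows (a simplification, not Puppet's exact algorithm; real `fqdn_rand` hashes the node's FQDN together with optional extra seed arguments):

```python
import hashlib

def fqdn_rand(max_value: int, fqdn: str, seed: str = "") -> int:
    """Stable per-host pseudo-random integer in [0, max_value).

    Hashing the hostname (plus an optional seed so different cron
    jobs on one host get different offsets) keeps each host's
    schedule fixed across runs while spreading hosts roughly
    evenly over the interval.
    """
    digest = hashlib.md5((seed + fqdn).encode("utf-8")).hexdigest()
    return int(digest, 16) % max_value

# Pick the cron minute for a hypothetical host:
minute = fqdn_rand(60, "mw1018.eqiad.wmnet")
```

This also explains the advantages mentioned earlier: nothing sleeps in `ps`, and the chosen minute is visible directly in the generated crontab.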
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/53954 [21:43:13] yeah, it only is going to #wikimedia-tech according to its puppet config [21:43:31] need me to fix that up to both tech and operations ? [21:43:32] ^demon: I'm not really a huge fan of this: https://gerrit.wikimedia.org/r/#/c/53173/4/manifests/gerrit.pp,unified [21:43:43] paravoid: any thoughts on this? https://gerrit.wikimedia.org/r/#/c/53173/4/manifests/gerrit.pp,unified [21:44:15] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54978 [21:44:15] umm, yes [21:44:24] not a big fan either :) [21:44:37] logrotate? [21:44:44] RECOVERY - Puppet freshness on mw1094 is OK: puppet ran at Wed Mar 20 21:44:34 UTC 2013 [21:44:46] binasher: https://rt.wikimedia.org/Ticket/Display.html?id=4712 [21:44:48] rdp1/2 [21:44:54] need you to tech review please =] [21:45:09] <^demon> Ryan_Lane: We can skip that. [21:45:24] RECOVERY - Puppet freshness on mw1130 is OK: puppet ran at Wed Mar 20 21:45:20 UTC 2013 [21:45:39] Failed to parse template base/puppet.cron.erb: undefined method `+' for nil:NilClass [21:45:40] \o/ [21:45:41] Krinkle: want me to add in operations in logmsgbot's configuration ? [21:46:24] RECOVERY - Puppet freshness on mw1056 is OK: puppet ran at Wed Mar 20 21:46:22 UTC 2013 [21:46:31] LeslieCarr: uh? So it did restart, but came back in -tech [21:46:42] New patchset: Pyoungmeister; "Revert "pulling fqdn_rand into class"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54980 [21:47:01] LeslieCarr: that's odd. 
Yeah, it should be updated in site.pp fenari logmsgbot #wikimedia-operations [21:47:04] that's where it was [21:47:17] presumably someone updated it from -tech to -operations, but only did so on fenari [21:47:31] https://wikitech.wikimedia.org/wiki/Logmsgbot [21:47:43] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54980 [21:47:51] yep [21:47:56] that's what looks like happened [21:48:01] somebody with root on fenari [21:48:12] improperly using their root powers instead of puppetizing!!! [21:48:15] Krinkle: i removed /var/mwdocs manually, puppet run down to 34 seconds [21:48:17] LeslieCarr: I can submit a puppet change, or are you doing so already? [21:48:26] i am doing one right now :) [21:48:29] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54975 [21:48:31] but thank you for the offering! [21:48:39] New patchset: Pyoungmeister; "Revert "run puppet by cron instead of via the agent"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54974 [21:49:24] RECOVERY - MySQL Slave Running on db59 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [21:49:43] New patchset: Jforrester; "Add wikibugs IRC bot to #mediawiki-visualeditor" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/37570 [21:50:26] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54974 [21:50:31] New review: Jforrester; "Rebased onto r 53973 per Chad." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/37570 [21:50:49] New patchset: Lcarr; "fixing logmsgbot to be in #operations and icinga's mysql packages" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54982 [21:50:55] andrewbogott: RT needs PHP? 
[21:51:00] my broken crap is backed out [21:51:10] feel free to re-fetch and then merge on sockpuppet [21:51:25] PROBLEM - MySQL Slave Delay on db59 is CRITICAL: CRIT replication delay 3412 seconds [21:52:20] RECOVERY - MySQL Slave Running on db71 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [21:52:20] RECOVERY - MySQL Slave Running on db69 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error: [21:52:20] mutante: I don't remember at the moment… leave a comment in gerrit and I'll see if I can figure out why that's in there. [21:52:24] RECOVERY - Puppet freshness on mw1052 is OK: puppet ran at Wed Mar 20 21:52:16 UTC 2013 [21:52:54] RECOVERY - Puppet freshness on mw1006 is OK: puppet ran at Wed Mar 20 21:52:47 UTC 2013 [21:53:01] merging what's on sockpuppet binasher Ryan_Lane [21:53:15] ok [21:54:07] sigh, nothing feels more wrong than installing something on my nice pretty macbook with gem [21:54:24] PROBLEM - MySQL Slave Delay on db69 is CRITICAL: CRIT replication delay 3396 seconds [21:54:25] PROBLEM - MySQL Slave Delay on db71 is CRITICAL: CRIT replication delay 3482 seconds [21:54:45] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54982 [21:55:35] New review: Krinkle; "The log buffers the perl script writes to are determined by wikimedia/bugzilla/wikibugs.git:/wikibugs." [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/37570 [21:57:04] hey dudlees, are sq*.wikimedia.org machines in pmtpa? [21:57:07] (notpeter?)
[21:57:58] ottomata: si senor [21:58:27] the easiest way to tell if a machine with a number is in tampa/eqiad/esams is that tampa are 0-999 , eqiad 1000-1999, esams 3000-3999 [21:58:33] sq = pmtpasq [21:58:51] New review: Krinkle; "Depends on I04e82cb392f1ad in wikimedia/bugzilla/wikibugs.git" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/37570 [21:58:54] esams es = amssq and knsq [21:59:59] New patchset: Mattflaschen; "Add labs redis subclassing the main one and setting directory." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54970 [22:00:17] ok cool, tanks! [22:00:17] andrewbogott: ok, i got one other comment, --prompt-for-dba-password kind of sounds like manual interaction needed [22:00:34] RECOVERY - Puppet freshness on searchidx2 is OK: puppet ran at Wed Mar 20 22:00:26 UTC 2013 [22:00:36] New patchset: Pyoungmeister; "run puppet by cron instead of via the agent" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54985 [22:01:25] RECOVERY - MySQL Slave Delay on db59 is OK: OK replication delay 17 seconds [22:01:34] New review: Dzahn; "(2 comments)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47026 [22:02:58] !log stopping puppet on gadolinium to test for packet loss weirdness [22:03:04] Logged the message, Master [22:04:24] !log restarting ircecho on fenari to ensure logmsgbot is on operations and tech irc channels [22:04:25] RECOVERY - MySQL Slave Delay on db69 is OK: OK replication delay 0 seconds [22:04:27] and yay [22:04:28] it's back [22:04:29] :) [22:04:30] Logged the message, Mistress of the network gear. [22:04:39] lol [22:04:54] RECOVERY - Puppet freshness on neon is OK: puppet ran at Wed Mar 20 22:04:47 UTC 2013 [22:05:24] RECOVERY - MySQL Slave Delay on db71 is OK: OK replication delay 0 seconds [22:08:09] New review: Mattflaschen; "This is essentially meant to subclass modules/redis/manifests/init.pp." 
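notpeter's numbering convention above (Tampa 0-999, eqiad 1000-1999, esams 3000-3999, with `amssq*`/`knsq*` also living in esams) can be sketched as a small helper. This is a hypothetical function for illustration; it encodes only the ranges quoted in the channel:

```python
import re

def guess_site(hostname):
    """Guess the datacenter for a Wikimedia-style hostname, per the
    convention above: 0-999 pmtpa, 1000-1999 eqiad, 3000-3999 esams,
    with amssq*/knsq* also in esams. Hypothetical helper, not a real tool."""
    short = hostname.split('.')[0]
    if short.startswith(('amssq', 'knsq')):
        return 'esams'
    m = re.search(r'(\d+)$', short)
    if not m:
        return None
    n = int(m.group(1))
    if n < 1000:
        return 'pmtpa'
    if n < 2000:
        return 'eqiad'
    if 3000 <= n < 4000:
        return 'esams'
    return None

print(guess_site('sq67.wikimedia.org'))  # squids numbered below 1000 -> pmtpa
print(guess_site('mw1211'))              # 1000-1999 -> eqiad
print(guess_site('knsq23'))              # knsq prefix -> esams
```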
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/54970 [22:14:44] <^demon> ori-l: You still need operations/debs/python-jsonschema deleted, right? [22:15:05] Ok, so as far as I can tell, we have 5 servers not replicated from Tampa to Ashburn (not counting labs) [22:15:11] putting in here in case someone sees something I missed [22:15:46] ekrem - irc server, kaulen - bz server, streber - rt server, hooper - etherpad & racktables, & hume - batch dev host [22:16:06] mutante: You worked on BZ and its puppetizations recently? [22:16:11] there's a few miscellaneous things [22:16:19] what misc stuff am i missin? [22:16:28] * RobH is making tickets for these items now [22:16:32] multicast relay [22:16:38] observium, torrus [22:16:44] maybe some db masters [22:16:44] recall server for the relay? [22:16:56] maerlant? [22:16:59] yea the db masters for most misc is replicated to slaves in ashburn, i think. [22:17:16] maerlant sounds correct [22:18:22] RobH: well, kind of, that bugzilla reporter because it's our custom stuff [22:18:54] rancid as well is on streber [22:19:14] RobH: fenari ?:) [22:20:09] (squid::cachemgr) [22:20:15] oh, on spence ishmael is unpuppetized [22:20:17] (misc::noc-wikimedia) [22:20:22] (misc::extension-distributor) [22:20:31] https://rt.wikimedia.org/Ticket/Display.html?id=2183 and https://rt.wikimedia.org/Ticket/Display.html?id=4616 [22:20:55] New review: Mattflaschen; "This fails on the puppet1 machine (https://wikitech.wikimedia.org/wiki/Nova_Resource:I-000005e8) with:" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54970 [22:21:38] New review: Mattflaschen; "Non-gerritized labs link: http://goo.gl/zcaqO" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54970 [22:23:44] RobH: formey (svn), yes, it's still not dead :p [22:23:53] pywikipedia using it [22:24:10] (role::gerrit::replicationdest) [22:24:16] (svn::server) [22:25:27] emery, ersch, tarin [22:25:35] erzurumi [22:28:02] mutante:
http://etherpad.wikimedia.org/EQIAD-rollout-sequence [22:31:03] New patchset: Demon; "People only want notifs on new changes, not new patches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54989 [22:31:17] <^demon> Ryan_Lane: We're still too spammy :\ ^ [22:38:42] PROBLEM - Puppet freshness on arsenic is CRITICAL: Puppet has not run in the last 10 hours [22:39:42] PROBLEM - Puppet freshness on xenon is CRITICAL: Puppet has not run in the last 10 hours [22:40:03] PROBLEM - Apache HTTP on mw1211 is CRITICAL: Connection refused [22:40:15] PROBLEM - Apache HTTP on mw1210 is CRITICAL: Connection refused [22:40:16] binasher: jfyi - I am running migratePass0, but the slaves seem to be doing just peachy [22:40:32] PROBLEM - udp2log log age for gadolinium on gadolinium is CRITICAL: NRPE: Command check_udp2log_log_age-gadolinium not defined [22:40:53] Ryan_Lane: ping [22:41:05] PROBLEM - Apache HTTP on mw1212 is CRITICAL: Connection refused [22:41:06] PROBLEM - Apache HTTP on mw1209 is CRITICAL: Connection refused [22:41:50] pgehres: thanks for the heads up [22:42:35] PROBLEM - Apache HTTP on mw1213 is CRITICAL: Connection refused [22:43:15] PROBLEM - Apache HTTP on mw1214 is CRITICAL: Connection refused [22:44:27] checking out the mediawikis [22:44:40] notpeter: So I am going to be moving some of our misc services to eqiad [22:44:49] was wondering what we will be doing for the db9 bound hosts for this. 
[22:44:55] oh docroot fail [22:44:59] (if you have time to discuss now, if not this can wait) [22:45:09] I have a plan [22:45:23] and I want to implement it after I am done spinning up the pre-labsdb-dbs [22:45:28] and the labsdbs [22:45:34] oh nm, those are new :) [22:45:35] !log ignore mw1209+ they arent deployed [22:45:41] Logged the message, RobH [22:45:48] :) [22:45:57] im operating on a 3 minute time delay [22:45:58] I want to migrate the m1 shard (the db9/10 stuff) to galera cluster [22:46:11] <^demon> Can I get a merge on https://gerrit.wikimedia.org/r/#/c/54989/? [22:46:29] notpeter: What is the projected time frame on that? CT would like me to move a number of services to eqiad by the end of two weeks time (from our meeting today) [22:46:41] so if we can start a service at a time that would rock. [22:47:07] example: we'd like to get bugzilla and RT running out of ashburn by 4/3 [22:47:34] granted, RT needs a bit of puppet polish, but bz should be mostly ok. [22:47:41] uh [22:47:45] RobH: is your time delay for fcc buzzing? [22:47:54] ok, that's much sooner than I have time to do anything significant [22:47:55] RECOVERY - Packetloss_Average on locke is OK: OK: packet_loss_average is 0.0 [22:48:02] nope, just lack of I/O capability. [22:48:20] notpeter: Thats fine, but then I need to pass that back to CT so he knows why it isnt done. [22:48:26] hrm, we need to get you a ssd [22:48:51] RobH: ok, cool [22:51:14] hey all! I've got some mobile varnish config updates I need to get merged in time to go live for a 9pm test: https://gerrit.wikimedia.org/r/#/c/54911/ [22:51:18] who wants to +2 me? [22:51:30] looking [22:51:36] it's the gerrit equivalent of a high-five [22:51:36] who is Dr0ptp4kt [22:51:40] notpeter: heh https://rt.wikimedia.org/Ticket/Display.html?id=2187 [22:51:42] it had a ticket.
[22:51:49] LeslieCarr: that's adam baso, our new guy [22:51:59] RobH: cool [22:52:36] PROBLEM - NTP on mw1214 is CRITICAL: NTP CRITICAL: Offset unknown [22:52:46] done [22:52:46] \o/ thanks [22:52:47] shall i merge it ? [22:52:53] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54911 [22:53:05] PROBLEM - NTP on mw1213 is CRITICAL: NTP CRITICAL: Offset unknown [22:56:30] \o/ whee [22:57:07] RECOVERY - NTP on mw1213 is OK: NTP OK: Offset 0.004801511765 secs [22:58:50] RECOVERY - NTP on mw1214 is OK: NTP OK: Offset -0.001222968102 secs [23:00:59] PROBLEM - Apache HTTP on mw1216 is CRITICAL: Connection refused [23:01:20] PROBLEM - Apache HTTP on mw1215 is CRITICAL: Connection refused [23:01:51] * TimStarling stabs logrotate [23:02:44] http://paste.tstarling.com/p/AeVKDA.html [23:03:59] if you replace that if body with: [23:04:06] /* seen this log file before */ [23:04:09] continue; [23:04:35] then it would be possible to, say, have a different log expiry for API logs than other logs [23:04:44] and thus avoid exhausting disk space on fluorine [23:06:04] * TimStarling stabs again [23:08:43] is there any software that's like logrotate except better? [23:09:37] maybe I'll just have a cron job delete some extra log files [23:12:43] why not have logs in different directories? [23:13:23] RoanKattouw: so we reclaimed xenon and caesium iirc [23:13:29] PROBLEM - NTP on mw1215 is CRITICAL: NTP CRITICAL: Offset unknown [23:13:33] and only wtp1001-1104 are doing parsoid in eqiad [23:13:43] I think some other things use the same multiplexer script so it's not so easy to hack [23:13:48] i think im right on this, but wanted to confirm [23:13:57] and it would be adding special cases to it when it's not really needed [23:14:16] I think a cron job is the best solution, because there are some other issues with logrotate that a cron job is best suited to fix [23:14:47] like what? 
[23:15:25] if a file is deleted from the log directory, logrotate stops rotating it [23:15:30] RECOVERY - NTP on mw1215 is OK: NTP OK: Offset -0.002371907234 secs [23:15:41] which means that old archives are kept forever, which is a potential legal compliance issue [23:16:32] also if MW stops sending log entries for a particular log, say because the code is gone, the file stays in the log directory with zero size forever [23:16:36] so a cron job can clean those up [23:18:43] Ryan_Lane: ping (2) [23:19:22] this could be an annoyance when viewing [23:19:37] but you could also instruct rsyslog to write to /year/month/day/foo.log or something [23:19:48] New patchset: Asher; "pulling es100[69], returning pmtpa dbs" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54996 [23:19:50] and then just remove periods via cron [23:20:12] preilly: hey :) [23:21:15] paravoid: howdy [23:21:25] New patchset: Asher; "pulling es100[69], returning pmtpa s1 dbs" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54996 [23:23:36] Change merged: Asher; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/54996 [23:29:07] pgehres: Wed Mar 20 23:21:49 UTC 2013 hume commonswiki Error selecting database commonswiki on server 10.0.6.47 [23:29:23] pgehres: the dberror log is filling up with stuff like that [23:29:33] huh [23:29:41] i am done with commons ... [23:29:43] attempts to access various db's from s7 slaves that aren't on s7 [23:29:58] latest is Wed Mar 20 23:26:52 UTC 2013 hume mediawikiwiki Error selecting database mediawikiwiki on server 10.0.6.78 [23:30:09] RoanKattouw: ^^ [23:30:18] any for metawiki? 
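The cron-job cleanup Tim describes (which became the patchset at the end of this log, r/55003) boils down to two rules: delete rotated archives past a retention window, and prune logs that have sat empty for a while. A hedged Python sketch of that shape, run against a throwaway directory rather than the real fluorine path `/a/mw-log`; the retention day counts here are illustrative, not the values in the actual change:

```python
import os
import tempfile
import time

def clean_logs(logdir, archive_days=30, empty_days=7):
    """Delete rotated archives (*.gz) older than archive_days, and
    zero-length *.log files untouched for empty_days -- the two logrotate
    failure modes described above (archives kept forever, dead empty logs).
    Day counts are illustrative, not those of the real change."""
    now = time.time()
    removed = []
    for name in os.listdir(logdir):
        path = os.path.join(logdir, name)
        st = os.stat(path)
        age_days = (now - st.st_mtime) / 86400
        if name.endswith('.gz') and age_days > archive_days:
            os.remove(path)
            removed.append(name)
        elif name.endswith('.log') and st.st_size == 0 and age_days > empty_days:
            os.remove(path)
            removed.append(name)
    return sorted(removed)

# Demo: one stale archive, one fresh archive, one long-empty log.
d = tempfile.mkdtemp()
for name, age_days, size in [('exception.log-20130208.gz', 40, 1),
                             ('exception.log-20130320.gz', 0, 1),
                             ('dead-feature.log', 10, 0)]:
    path = os.path.join(d, name)
    with open(path, 'wb') as f:
        f.write(b'x' * size)
    past = time.time() - age_days * 86400
    os.utime(path, (past, past))     # back-date the mtime

print(clean_logs(d))  # stale archive and empty log go; fresh archive stays
```

Run from cron, a job like this also addresses the compliance concern: an archive whose live log was deleted still ages out, instead of being kept forever.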
[23:30:23] that is what is currently running [23:30:46] pgehres: metawiki is actually on s7 [23:30:54] ugh, right [23:31:08] we already fixed one bug related to that [23:31:17] !log authdns-update for lanthanum [23:31:23] Logged the message, RobH [23:31:25] pgehres: are you going to give us even more of those sweet account unification stats? [23:31:41] of course, once i unify some accounts :-) [23:32:29] binasher: i am going to let metawiki finish since it isn't exploding and then explore a theory [23:33:16] pgehres: ok, but try to kill as soon after finishing as you can [23:34:07] running wiki by wiki :-) [23:34:22] any stacktrace to the line that is exploding? [23:34:46] PROBLEM - Apache HTTP on mw1217 is CRITICAL: Connection refused [23:35:17] PROBLEM - Apache HTTP on mw1220 is CRITICAL: Connection refused [23:35:28] New patchset: Asher; "es100[69] to mariadb" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54998 [23:35:36] PROBLEM - Apache HTTP on mw1218 is CRITICAL: Connection refused [23:35:39] !log asher synchronized wmf-config/db-pmtpa.php 'returning s1 servers' [23:35:45] Logged the message, Master [23:35:55] binasher: do I have access to the dberror log? 
[23:36:27] !log asher synchronized wmf-config/db-eqiad.php 'pulling es100[69]' [23:36:37] Logged the message, Master [23:36:49] pgehres: fluorine:/a/mw-log/dberror.log [23:36:52] looks like your account is there [23:37:29] There was a bug previously where MW connected to the wrong server [23:37:32] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54998 [23:37:52] But then it complained it couldn't find centralauth on the target DB, now it's reversed [23:37:54] New patchset: RobH; "lanthanum to replace ekrem, after it works" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54999 [23:38:36] New patchset: Pyoungmeister; "run puppet by cron instead of via the agent" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54985 [23:38:55] binasher: my account may be there but: "ssh_exchange_identification: Connection closed by remote host" [23:39:06] soooo why are the labstores the only thing in dhcpd leases with ip addresses? (the rest are standardized to fqdn) notpeter (or binasher)? [23:39:17] PROBLEM - Apache HTTP on mw1219 is CRITICAL: Connection refused [23:39:20] oh wiat, labstore, not labdb [23:39:27] sorry guys, wrong folks to ping. [23:39:33] Ryan_Lane: ^? (you the right person?) 
[23:39:36] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54985 [23:39:59] RobH: me no no labstores [23:40:09] yea, i ralized after pinging you it wasnt yers [23:40:12] sorry about that ;] [23:40:16] no prob [23:40:39] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/54999 [23:41:06] PROBLEM - mysqld processes on es1006 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [23:41:16] PROBLEM - mysqld processes on es1009 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [23:42:18] New patchset: Asher; "class typo" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/55000 [23:42:28] pgehres: Does it give you any more info than "error selecting database"? [23:42:35] Like, a stack trace or something [23:42:38] RoanKattouw: i see no error [23:42:43] :O [23:42:45] i knew nothing until asher yelled at me [23:42:54] hah [23:42:59] and /me cannot see flourine logs [23:43:05] I looked but there's nothing there [23:43:10] MW seems to be ignoring the error [23:43:18] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/55000 [23:43:46] RoanKattouw: any ideas on instrumenting this a bit more [23:43:58] Not really, let me see [23:44:23] i wish we could just add --verbose :-) [23:47:07] RECOVERY - mysqld processes on es1006 is OK: PROCS OK: 1 process with command name mysqld [23:47:17] RECOVERY - mysqld processes on es1009 is OK: PROCS OK: 1 process with command name mysqld [23:47:38] PROBLEM - NTP on mw1219 is CRITICAL: NTP CRITICAL: Offset unknown [23:47:38] PROBLEM - NTP on mw1220 is CRITICAL: NTP CRITICAL: Offset unknown [23:47:57] pgehres: I am going to do some instrumentation using live hacks on hume [23:48:05] * pgehres approves [23:48:21] i would avoid using commonswiki as a test [23:49:52] LeslieCarr: so its delete interface-range vlan-private1-eqiad member-range ge-4/0/38 to ge-4/0/42 right? 
[23:50:01] cuz it throws error at the to [23:50:07] syntax error, expecting ';', [Enter], or '|'. [23:50:09] New patchset: Asher; "return es100[69], pmtpa s2 dbs" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/55001 [23:50:26] get rid of everything after the "to" [23:50:36] RECOVERY - NTP on mw1220 is OK: NTP OK: Offset -0.002127170563 secs [23:50:43] Change merged: Asher; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/55001 [23:51:22] !log asher synchronized wmf-config/db-pmtpa.php 'returning s2 servers' [23:51:28] Logged the message, Master [23:52:05] !log asher synchronized wmf-config/db-eqiad.php 'returning es100[69]' [23:52:11] Logged the message, Master [23:52:39] RECOVERY - NTP on mw1219 is OK: NTP OK: Offset -0.002117753029 secs [23:53:27] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:54:50] robh: mw1209-20 ready to be put in service...anything besides adding to node_groups? [23:56:07] a few things, yep [23:56:16] we have to add to lvs as well, lets plan on doing that tomorrow morning [23:56:21] * RobH is getting ready to head out [23:56:27] !log freeing up roughly 500G of ram across our cluster by calling puppet via cron :) [23:56:29] i rather not push new services this late in day [23:56:33] Logged the message, notpeter [23:56:38] cmjohnson1: we have to add to nodegroups and lvs/pybal [23:56:47] !log yay peter! [23:56:53] Logged the message, RobH [23:56:58] \o/ [23:57:06] robh: okay sounds good [23:57:16] you cant hear it, but the server fans, they sound like applause [23:57:18] notpeter: ^ [23:57:28] hahahaha [23:57:30] awesome :) [23:58:17] New patchset: Pyoungmeister; "WIP: first bit of stuff for taming the mysql module and making the SANITARIUM" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/53907 [23:58:38] \m/ [23:59:09] notpeter: does the SANITARIUM use \m/ariadb?
[23:59:45] New patchset: Tim Starling; "Add a cron job to clean up old MW logs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/55003