[00:00:04] meh
[00:00:07] * marxarelli is hacking the mainframe, almost through the firewall
[00:00:21] * bd808 sends SPIKE!
[00:02:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:02:45] bd808: nice. coming down at a decent 2MB/s
[01:38:30] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[03:43:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:14:43] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983592 (scfc) NEW
[04:59:30] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[05:24:30] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:36:34] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[06:01:34] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:15:08] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983660 (yuvipanda) Oh, just whatever any UA classifier considers as browsers vs bots UA.
[06:17:32] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[06:19:19] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983663 (Ironholds) Our pageviews definition is based, in large parts, on MediaWiki's structure. Applying it to the tool labs structure would require an entirely new setup - one with variable accura...
[06:21:58] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983664 (yuvipanda) What I specifically have in mind is: 1. Parse the nginx logs directly once an hour, and update the count publicly somewhere. This would be good enough for tools, since they don'...
[06:22:08] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983665 (yuvipanda) a: yuvipanda
[06:27:32] !log dumps Deleted dumps-incr, replaced with dumps-3
[06:28:54] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983667 (Ironholds) The counting methodology will /not/ be consistent. It's based on things like MediaWiki directory names and specific hosts ;p.
[06:29:48] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983668 (yuvipanda) Ok, so 'as consistent as possible'? Which I suppose boils down to just deciding which UAs to bucket as 'humans' and which as 'bots' and nothing more.
[06:31:07] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983669 (Ironholds) Or, say, MIME type filtering. But yes. So, you want ua-parser ;p
[06:32:44] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983677 (scfc) Thanks for merging, @yuvipanda. To complete the fix, OpenStack's dnsmasq needs to be restarted. In the past, this has sometimes led to outages due to the wrong dnsmasq...
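
To make the plan at [06:21:58] concrete — parse the nginx logs, bucket UAs into 'humans' and 'bots' per [06:29:48], and drop asset hits per the MIME-type filtering suggested at [06:31:07] — here is a minimal Python sketch. The /toolname/... URL layout and the regex standing in for a real classifier like ua-parser are assumptions, not the actual toollabs setup:

    import re
    from collections import Counter

    # Matches nginx's default "combined" log format; toollabs' real log
    # format and location are assumptions here.
    LINE = re.compile(
        r'\S+ \S+ \S+ \[[^\]]+\] '
        r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
        r'(?P<status>\d+) \S+ "[^"]*" "(?P<ua>[^"]*)"'
    )
    # Stand-in for the MIME-type filtering from [06:31:07]: skip asset hits.
    ASSETS = ('.js', '.css', '.png', '.jpg', '.gif', '.ico', '.svg')
    # Crude stand-in for a real UA classifier such as ua-parser.
    BOTS = re.compile(r'bot|crawler|spider|curl|wget', re.IGNORECASE)

    def count_views(logfile):
        humans, bots = Counter(), Counter()
        with open(logfile) as f:
            for line in f:
                m = LINE.match(line)
                if not m:
                    continue
                path = m.group('path').split('?')[0]
                if path.lower().endswith(ASSETS):
                    continue
                parts = path.split('/')
                # Hypothetical /toolname/... URL layout.
                tool = parts[1] if len(parts) > 1 and parts[1] else '(root)'
                bucket = bots if BOTS.search(m.group('ua')) else humans
                bucket[tool] += 1
        return humans, bots

Run hourly over the rotated log and publish the two counters somewhere readable; as Ironholds notes above, the result is "as consistent as possible" rather than comparable to the production pageview definition.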
[06:33:27] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983680 (scfc)
[06:35:32] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983686 (yuvipanda) Yup, I wrote that sketchy outline :) Have restarted dnsmasq now, and dig toolserver.org returns the internal IP \o/
[06:35:43] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983687 (yuvipanda) Open>Resolved a: yuvipanda
[06:37:16] PROBLEM - Puppet failure on tools-exec-07 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[06:37:48] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983689 (yuvipanda) In the future, you can use https://wikitech.wikimedia.org/w/api.php?action=query&list=novainstances&niproject=toolserver-legacy&niregion=eqiad&format=json and simil...
[07:02:13] RECOVERY - Puppet failure on tools-exec-07 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:06:35] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983696 (yuvipanda) Cool. Can you tell me what exactly mime filtering is used for?
[07:26:44] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983697 (Ironholds) Filtering out calls to JS/image/css assets
[08:02:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:17:35] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983719 (scfc) >>! In T87086#983689, @yuvipanda wrote: > In the future, you can use https://wikitech.wikimedia.org/w/api.php?action=query&list=novainstances&niproject=toolserver-legacy...
[08:18:31] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[08:18:49] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983720 (yuvipanda) It is directly from OpenStack, yeah. Same info that drives Special:NovaInstance :) Also realized that all this info is available in LDAP too.
[08:43:30] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:19:29] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[10:14:35] Labs-Team: Allow everyone in ops group in LDAP to login to all Labs instances - https://phabricator.wikimedia.org/T87094#983801 (yuvipanda) NEW
[10:44:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:31:21] PROBLEM - Puppet failure on tools-webgrid-02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[11:56:31] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[12:01:19] RECOVERY - Puppet failure on tools-webgrid-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:22:42] !log tools.delinker Delinked myself, I'm no longer a maintainer of this tool.
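
The novainstances query yuvipanda links at [06:37:48] can be scripted. A minimal sketch, assuming only the usual MediaWiki API convention that a list=novainstances result appears under query.novainstances; the per-instance field names are not shown in the log, so the sketch just dumps each record:

    import json
    import urllib.request

    # URL copied verbatim from [06:37:48].
    URL = ('https://wikitech.wikimedia.org/w/api.php?action=query'
           '&list=novainstances&niproject=toolserver-legacy'
           '&niregion=eqiad&format=json')

    with urllib.request.urlopen(URL) as resp:
        data = json.load(resp)

    # Field names per instance are an unknown here, so print whole records.
    for instance in data.get('query', {}).get('novainstances', []):
        print(json.dumps(instance, indent=2))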
[12:24:37] * multichill wonders where the log bot went
[12:46:32] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:57:32] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[12:59:44] Wikimedia-Labs-General: dewiki-20140918-pages-meta-history{3,4}.bz2 dumps are missing - https://phabricator.wikimedia.org/T73967#983865 (Giftpflanze) a: ArielGlenn
[13:00:55] Tool-Labs: Missing Toolserver features in Tools (tracking) - https://phabricator.wikimedia.org/T60791#983873 (Giftpflanze)
[13:00:56] Tool-Labs: struct::stack/TclOO version mismatch - https://phabricator.wikimedia.org/T68984#983872 (Giftpflanze) declined>Resolved
[13:04:45] Tool-Labs: Missing Toolserver features in Tools (tracking) - https://phabricator.wikimedia.org/T60791#983884 (valhallasw)
[13:23:39] Tool-Labs: IRC: Too many user connections (global) - https://phabricator.wikimedia.org/T66802#983928 (Giftpflanze) Is this still an issue?
[13:26:41] Tool-Labs: Dumps not updating again. - https://phabricator.wikimedia.org/T74154#983930 (Giftpflanze) stalled>Resolved Everything is fine, obviously. Closing.
[13:35:21] PROBLEM - Free space - all mounts on tools-redis is CRITICAL: CRITICAL: tools.tools-redis.diskspace._var_lib.byte_percentfree.value (<22.22%)
[13:42:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:53:10] i can't attach to processes running on the tools grid with gdb :(
[13:56:02] gifti: that's weird. I assume you're ssh'ing to the right exec host?
[13:56:33] yes, ofc
[13:56:54] gifti: the only sort-of-weird thing I can think of is that there's two processes, one bash and one the actual process
[13:57:03] what error do you get?
[13:58:32] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[13:58:56] job 7386430, tools-exec-06, pid 21194; gdb --pid=21194
[13:58:58] Attaching to process 21194
[13:58:59] Could not attach to process. If your uid matches the uid of the target
[13:58:59] process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
[13:58:59] again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
[13:59:26] ptrace: Operation not permitted.
[13:59:50] ptrace probably unrelated
[14:01:53] i will now make semolina pudding, brb
[14:02:52] gifti: probably http://askubuntu.com/questions/41629/after-upgrade-gdb-wont-attach-to-process
[14:08:39] valhallasw`cloud: duh, do you think it can be changed for (tool) labs?
[14:09:58] it is
[14:10:06] i’m about to get on a flight
[14:10:16] matanya: was talking about it earlier
[14:10:38] i’ll be out of action for about 36h now however :(
[14:10:42] phabit!
[14:10:46] valhallasw`cloud: gifti ^
[14:10:57] there might be security considerations
[14:12:55] PROBLEM - Host tools-webgrid-generic-01 is DOWN: CRITICAL - Host Unreachable (10.68.17.152)
[14:43:01] gifti: you might be able to work around it by starting your process in a screen (and then you can ssh into the exec host, open a new tab in the screen and gdb there, I think?)
[14:43:30] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:43:48] what? omg
[14:44:22] what's so 'omg' about that? the screen is the parent process, so should be able to start gdb
[14:44:25] I think?
[14:48:41] doesn't every screen use a separate process?
[14:48:50] tmux would be different
[14:58:40] Nemo_bis: maybe, yeah, but they'd at least have a shared parent process, I think? I'm not entirely sure from that SO link what the specific requirements are
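
The error gifti pastes at [13:58:58] is Yama's ptrace restriction, which is also what the askubuntu link at [14:02:52] describes: with kernel.yama.ptrace_scope=1 (the Ubuntu default), only an ancestor of the target process may attach, which is why gdb fails from a fresh ssh session even when the uids match. A minimal check:

    # Meaning of kernel.yama.ptrace_scope (the setting named at [13:58:59]):
    #   0  classic: any process with the same uid may attach
    #   1  restricted: only an ancestor of the target (or a tracer the
    #      target declared via prctl(PR_SET_PTRACER)) may attach
    #   2  admin-only: attaching requires CAP_SYS_PTRACE
    #   3  attaching disabled entirely
    with open('/proc/sys/kernel/yama/ptrace_scope') as f:
        scope = int(f.read().strip())
    print('ptrace_scope =', scope)
    if scope >= 1:
        # Relaxing it system-wide needs root, e.g.:
        #   sudo sysctl kernel.yama.ptrace_scope=0
        print('gdb can only attach from an ancestor of the target process')

This also explains the doubt about the screen workaround above: a gdb started in another screen window is a sibling of the target, not an ancestor, so whether it helps depends on the process tree screen actually builds.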
[14:59:31] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:24:33] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:35:30] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[16:04:23] is someone around who knows how to connect to the database replicas from labs instances other than tools-login?
[16:04:54] tools or non-tools?
[16:05:07] non-tools
[16:05:30] there is a guide at https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database#Connecting_to_the_database_replicas_from_other_Labs_instances
[16:05:35] but it seems to be outdated
[16:05:46] hm
[16:06:04] i once did that, but that's some time ago
[16:06:34] I think I managed to guess the correct IP address but I get a SQL error 111
[16:09:05] got it... I had the wrong ip
[16:12:11] yay :)
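
For the question at [16:04:23], a minimal sketch of a replica connection following the Help page linked at [16:05:30]. The replica.my.cnf location and the enwiki.labsdb hostname are era-specific assumptions, so treat both as placeholders; the "SQL error 111" at [16:06:34] is errno 111 (ECONNREFUSED), i.e. the client reached a wrong address or port, which matches the wrong-IP diagnosis at [16:09:05]:

    import configparser
    import os.path

    import pymysql  # any MySQL client works; pymysql is only an example

    # Credentials conventionally live in ~/replica.my.cnf; the values in
    # that file are single-quoted, hence the strip() calls.
    cfg = configparser.ConfigParser()
    cfg.read(os.path.expanduser('~/replica.my.cnf'))

    conn = pymysql.connect(
        host='enwiki.labsdb',  # example of the <dbname>.labsdb convention
        user=cfg['client']['user'].strip("'"),
        passwd=cfg['client']['password'].strip("'"),
        db='enwiki_p',
    )
    # Error 111 here would mean a wrong address/port, not an auth failure.
    with conn.cursor() as cur:
        cur.execute('SELECT COUNT(*) FROM page')
        print(cur.fetchone())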
[16:25:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:41:28] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 88.89% of data above the critical threshold [0.0]
[17:46:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:57:33] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[18:17:07] !log tools.lolrrit-wm manually restarted grrrit-wm
[18:18:33] !log tools.lolrrit-wm 2015-01-17T18:17:20.057Z - error: Caught error in redisClient.brpop: Redis connection to tools-redis:6379 failed - connect ECONNREFUSED
[18:18:38] FumingPanda: ^
[18:21:25] valhallasw`cloud: ^ ?
[18:21:43] tools-redis dead?
[18:21:45] nc -C tools-redis 6379 isn't working for me
[18:21:56] well, wikibugs is still working
[18:22:01] :|
[18:22:08] hmm, just worked for me now
[18:23:01] * legoktm crosses fingers
[18:23:34] maybe also restart gerrit-to-redis?
[18:24:24] well
[18:24:26] 2015-01-17 18:24:09,588 Attempting to Redis connection to tools-redis
[18:24:27] 2015-01-17 18:24:09,588 Redis connection to tools-redis succeded
[18:24:35] that's from the gerrit-to-redis logs
[18:24:47] ah ok
[18:24:49] ok
[18:24:52] grrrit-wm should be good now
[18:25:14] PROBLEM - Puppet failure on tools-exec-10 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[18:25:48] hmm
[18:25:52] I just rebased https://gerrit.wikimedia.org/r/#/c/184599/
[18:27:32] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:31:19] I think it might be gerrit-to-redis.
[18:32:10] legoktm: shit. There was a disk space warning for tools-redis just before I left
[18:32:21] Can you see how much space is left there?
[18:32:43] FumingPanda: I can't login to tools-redis
[18:32:49] Use tools.wmflabs.org/nagf
[18:32:50] I'm on security line
[18:32:51] !log tools.gerrit-to-redis restarted
[18:32:52] ok
[18:34:51] How much?
[18:34:52] FumingPanda: yeah it's out of disk space. Someone started putting a bunch of stuff in earlier this week and there's a huge increase in network traffic at that time
[18:35:01] Hmm
[18:35:04] Worst timing
[18:35:11] I don't have my keyboard
[18:35:14] It is checked in
[18:35:25] https://graphite.wmflabs.org/render/?title=tools-redis+Disk+space+last+week&width=400&height=250&from=-1week&hideLegend=false&uniqueLegend=true&target=aliasByNode%28tools.tools-redis.diskspace.%2A.byte_avail.value%2C-3%2C-2%29
[18:35:31] 1/14
[18:35:33] See if anyone else from ops is around?
[18:35:46] how do you fill up 62GB???
[18:36:14] !log tools tools-redis is out of disk space
[18:36:23] oh, was it me?
[18:36:23] Heh
[18:36:56] gifti: what did you dooooooooooo :P
[18:37:14] i like to store millions of values in an array
[18:37:52] :<
[18:37:54] https://pbs.twimg.com/media/B5bZ12TCcAA60_a.jpg !!!
[18:38:13] gifti: is it one key? can we just delete it for now?
[18:38:22] yes, please
[18:38:30] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[18:38:39] i forgot to clean up
[18:39:01] what's the key?
[18:39:15] mom
[18:39:50] GET mom
[18:39:50] -LOADING Redis is loading the dataset in memory
[18:40:15] no, i mean, i need a sec to look it up ^^
[18:40:34] ohhhhh
[18:40:38] :P
[18:40:47] well if someone had stuff in a key named 'mom' it's gone now
[18:41:44] jfaT7tPpI+ixCZFQJwXZez60r74dKbkUjqVC+i89P7N=-urllist and others -*
[18:42:54] DEL jfaT7tPpI+ixCZFQJwXZez60r74dKbkUjqVC+i89P7N=-urllist
[18:42:54] -LOADING Redis is loading the dataset in memory
[18:43:41] yeah, that's what i'm getting, too
[18:44:33] Getting harassed at security again
[18:45:05] I'll replace tools-redis with a bigger instance so
[18:45:07] Soon
[18:49:20] I think we killed redis
[18:49:41] I can't connect to it anymore
[18:49:47] haha
[18:49:55] can we reboot it?
[18:50:30] Security dickbags
[18:50:42] hm, 01/14, that wasn't me
[18:54:55] i’m here
[18:57:12] legoktm: can you help?
[18:57:19] i have only one usable hand
[18:57:40] SuperFumingPanda: I'm not sure what I can do
[18:57:47] yeah stay on
[18:57:54] i’m on tools-redis
[18:58:46] okay
[18:59:13] hmm meh
[18:59:21] RANDOMKEY error: LOADING Redis is loading the dataset in memory
[18:59:31] haha
[19:02:14] legoktm: i could just flush everything :P
[19:03:29] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:05:26] legoktm: valhallasw`cloud thoughts on just flushing it?
[19:05:47] fine by me
[19:05:53] sorry, internet dq'd me
[19:07:46] SuperFumingPanda: it's not intended to be super persistent so I think a flush is ok
[19:09:58] !log tools tools-redis has too much data, can not handle. flushing everything (with a backup)
[19:11:35] !log tools moving aof file from /var/lib/redis to /home/yuvipanda. Have stopped redis server
[19:11:59] legoktm: can you relog these when the bot is up?
[19:15:13] !log waaay tooo slow to copy to nfs, just flushing db.
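
The -LOADING replies above mean the server is still replaying its dataset into memory and refuses commands until that finishes; once it is reachable, a targeted cleanup is gentler than flushing everything. A minimal sketch for finding the biggest collections (the host, port, and the urllist key are from the log; the rest is generic redis-py and an assumption about what to measure):

    import redis

    # Walk the keyspace and report the biggest collections, so a single
    # runaway key (like the ...-urllist one at [18:41:44]) can be DELeted
    # instead of flushing the whole instance.
    r = redis.StrictRedis(host='tools-redis', port=6379)

    sizers = {'list': r.llen, 'set': r.scard, 'hash': r.hlen, 'zset': r.zcard}
    sizes = []
    for key in r.scan_iter(count=1000):
        kind = r.type(key).decode()
        sizes.append((sizers[kind](key) if kind in sizers else 0, key))

    for size, key in sorted(sizes, reverse=True)[:10]:
        print(size, key)
    # Caveat: on the Redis of this era (pre-4.0, so no UNLINK/lazy freeing),
    # DEL of a huge key blocks the server while the value is reclaimed.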
[19:15:48] SuperFumingPanda: also fine by me
[19:15:53] !log restarted redis on tools-redis after flushing db
[19:19:34] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [0.0]
[19:25:22] RECOVERY - Free space - all mounts on tools-redis is OK: OK: All targets OK
[19:37:53] Tool-Labs, Labs-Team: Migrate tools-redis to a bigger instance - https://phabricator.wikimedia.org/T87107#984097 (Legoktm) NEW
[19:49:19] PROBLEM - Free space - all mounts on tools-exec-14 is CRITICAL: CRITICAL: tools.tools-exec-14.diskspace.root.byte_percentfree.value (<100.00%)
[20:06:29] 2015-01-17 20:04:29,345 Redis connection to tools-redis succeded
[20:06:29] 2015-01-17 20:04:29,351 Secsh channel 1 opened.
[20:06:29] 2015-01-17 20:05:18,183 Pushed to 0 clients
[20:06:35] legoktm: ^ let me restart grrrit-wm again
[20:10:39] legoktm: seems grrrit-wm is not actually subscribing
[20:10:46] fscking node.js bot :P
[20:11:52] :<<<<<<<<<<<<
[20:12:40] although
[20:12:48] I'm a bit confused by the '0 clients'. how does it know that
[20:13:31] ohhhh
[20:13:33] I think I know
[20:13:37] due to that flush
[20:13:43] the magic lua also flushed away
[20:15:06] no, it's even worse. The clients it sent to were stored in redis >_<
[20:15:56] or not. I'm super confused.
[20:17:57] valhallasw`cloud: there is a way to register the list of clients to push messages to
[20:18:24] which is a... zmq/redis/mysql hybrid
[20:18:28] you've created a monster!
[20:18:43] Wait I used zmq?
[20:18:53] I was a monster yes
[20:18:55] yeah. between registrar.py and subscriptions.py
[20:19:03] okay so the subscriptions are in mysql
[20:19:08] Wow I didn't know I was that bad
[20:19:10] I don't know how to get them to redis though
[20:19:16] lololol
[20:19:25] I wrote this years ago
[20:19:32] In my defence :p
[20:19:35] And it mostly worked
[20:19:42] Flight on runway now
[20:19:53] self.db.insert(key, msg['service'], msg['description'], msg['createdby'])
[20:19:54] self.redis.pipeline().sadd(clients_key, key).save().execute()
[20:20:03] ^ add key = 'add to both redis and mysql' :P
[20:20:09] Where is that
[20:20:25] in registrar.py which does the client registration
[20:20:27] Feel free to simplify if required
[20:20:29] :)
[20:20:31] Ah
[20:20:33] I see
[20:20:43] and nothing restores it from the db?
[20:20:55] working on it
[20:24:16] there's also something weird in that there's only a single key in the db
[20:24:24] although it was sending messages to five clients before
[20:24:25] w/e
[20:24:28] FlyingPanda: Did anyone kill the log bot?
[20:24:32] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:24:59] It died. valhallasw`cloud and legoktm have access
[20:25:22] ok let me see if I can revive it quickly
[20:25:47] !log Live, dammit, live!
[20:25:48] 4990548 0.43791 labs-logbo tools.morebo dr 10/21/2014 20:51:21 continuous@tools-exec-01.eqiad 1
[20:25:48] It's documented on Wikitech
[20:25:53] yeah and in the README
[20:26:37] valhallasw`cloud: We missed you today at the New Year's drinks
[20:26:59] multichill: oh dear :-p
[20:27:24] labslogbot queued again
[20:27:46] \o/
[20:27:57] !log tools.morebots kicked labs-morebots, seems alive again
[20:28:01] Logged the message, Master
[20:28:01] or not.
[20:28:05] :>
[20:29:26] !log gerrit-to-redis subscriptions are dead after yuvi flushed redis. Trying to recreate from mysql
[20:29:26] gerrit-to-redis is not a valid project.
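
A sketch of the restore valhallasw is working on at [20:20:55], inverting the registrar.py write path quoted at [20:19:53]: every registered client key goes into both a MySQL table and a Redis set, so the set can be rebuilt from the table. The table name, column name, database name, and CLIENTS_KEY value here are all guesses — the real ones live in registrar.py (the actual script landed as https://gerrit.wikimedia.org/r/185644):

    import os.path

    import pymysql
    import redis

    # Rebuild the Redis client set from MySQL after the flush.
    db = pymysql.connect(
        read_default_file=os.path.expanduser('~/.my.cnf'),
        db='gerrittoredis',  # hypothetical database name
    )
    r = redis.StrictRedis(host='tools-redis', port=6379)

    CLIENTS_KEY = 'clients'  # hypothetical; must match registrar.py

    with db.cursor() as cur:
        cur.execute('SELECT `key` FROM subscriptions')  # hypothetical schema
        pipe = r.pipeline()
        for (key,) in cur:
            pipe.sadd(CLIENTS_KEY, key)
        pipe.execute()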
[20:29:30] !log tools.gerrit-to-redis subscriptions are dead after yuvi flushed redis. Trying to recreate from mysql
[20:29:33] Logged the message, Master
[20:36:24] (CR) Merlijn van Deen: "fdfdsfadsa" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/184599 (owner: Merlijn van Deen)
[20:36:27] \o/
[20:36:29] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[20:36:30] !log Fixed!
[20:36:31] Message missing. Nothing logged.
[20:36:39] :<
[20:37:18] !log tools.delinker Delinked myself, I'm no longer a maintainer of this tool.
[20:37:20] Logged the message, Master
[20:38:18] valhallasw`cloud: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.delinker/SAL <- that worked
[20:38:30] yeah, just had to restart it
[20:41:50] god I hate gerrit
[20:45:12] ...but I hate Joe even more. Who uses that editor?
[20:45:26] Now save the file and leave joe, by typing ^KX where x is for "exit" YES BRILLIANT
[20:45:42] me!
[20:45:50] of course ^KX doesn't work, because... >_<
[20:46:53] (PS1) Merlijn van Deen: hacky script to dump mysql subscriptions to redis [labs/tools/gerrit-to-redis] - https://gerrit.wikimedia.org/r/185644
[21:02:20] valhallasw`cloud: Oh, why did you add me to the gerrit bot tool group? I forgot
[21:05:14] Hi
[21:05:20] Referred here by operations
[21:05:35] Who do I ask about setting up an OSM stack on labs?
[21:05:53] I had a rather specialist project in mind... to support content on Wikimedia projects.
[21:06:24] multichill: uuuuuuh
[21:06:33] multichill: gerrit-reviewer-bot?
[21:06:45] not sure :-p
[21:07:02] tools.gerrit-reviewer-bot
[21:07:20] I think it was so you could kick it if it died again
[21:07:26] back in the day when it still used ssh instead of email
[21:14:00] !log deployed labs-morebots with https://gerrit.wikimedia.org/r/#/c/180890/ applied (stored in ~/src/adminbot)
[21:14:00] deployed is not a valid project.
[21:14:05] !log tools.morebots deployed labs-morebots with https://gerrit.wikimedia.org/r/#/c/180890/ applied (stored in ~/src/adminbot)
[21:14:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.morebots/SAL, Master
[21:14:55] \o/
[21:21:30] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:57:32] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[22:20:24] Tool-Labs: WIWOSM not working in Wikipedias - https://phabricator.wikimedia.org/T87038#984199 (pere_prlpz) Now, WIWOSM is working again (in cawiki). The other two problems still remain.
[22:42:35] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:01:00] Tool-Labs: Install valgrind on bastions - https://phabricator.wikimedia.org/T87117#984224 (Giftpflanze) NEW
[23:58:34] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]