[00:00:04] meh
[00:00:07] * marxarelli is hacking the mainframe, almost through the firewall
[00:00:21] * bd808 sends SPIKE!
[00:02:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:02:45] bd808: nice. coming down at a decent 2MB/s
[01:38:30] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[03:43:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:14:43] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983592 (scfc) NEW
[04:59:30] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[05:24:30] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:36:34] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[06:01:34] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:15:08] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983660 (yuvipanda) Oh, just whatever any UA classifier considers as browsers vs bots UA.
[06:17:32] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[06:19:19] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983663 (Ironholds) Our pageviews definition is based, in large parts, on MediaWiki's structure. Applying it to the tool labs structure would require an entirely new setup - one with variable accura...
[06:21:58] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983664 (yuvipanda) What I specifically have in mind is: 1. Parse the nginx logs directly once an hour, and update the count publicly somewhere. This would be good enough for tools, since they don'...
[06:22:08] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983665 (yuvipanda) a: yuvipanda
[06:27:32] !log dumps Deleted dumps-incr, replaced with dumps-3
[06:28:54] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983667 (Ironholds) The counting methodology will /not/ be consistent. It's based on things like MediaWiki directory names and specific hosts ;p.
[06:29:48] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983668 (yuvipanda) Ok, so 'as consistent as possible'? Which I suppose boils down to just deciding which UAs to bucket as 'humans' and which as 'bots' and nothing more.
[06:31:07] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983669 (Ironholds) Or, say, MIME type filtering. But yes. So, you want ua-parser ;p
[06:32:44] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983677 (scfc) Thanks for merging, @yuvipanda. To complete the fix, OpenStack's dnsmasq needs to be restarted. In the past, this has sometimes led to outages due to the wrong dnsmasq...
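
To make the plan at [06:21:58] concrete — parse the nginx logs, bucket UAs into 'humans' and 'bots' per [06:29:48], and drop asset hits per the MIME-type filtering suggested at [06:31:07] — here is a minimal Python sketch. The /toolname/... URL layout and the regex standing in for a real classifier like ua-parser are assumptions, not the actual toollabs setup:

    import re
    from collections import Counter

    # Matches nginx's default "combined" log format; toollabs' real log
    # format and location are assumptions here.
    LINE = re.compile(
        r'\S+ \S+ \S+ \[[^\]]+\] '
        r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
        r'(?P<status>\d+) \S+ "[^"]*" "(?P<ua>[^"]*)"'
    )
    # Stand-in for the MIME-type filtering from [06:31:07]: skip asset hits.
    ASSETS = ('.js', '.css', '.png', '.jpg', '.gif', '.ico', '.svg')
    # Crude stand-in for a real UA classifier such as ua-parser.
    BOTS = re.compile(r'bot|crawler|spider|curl|wget', re.IGNORECASE)

    def count_views(logfile):
        humans, bots = Counter(), Counter()
        with open(logfile) as f:
            for line in f:
                m = LINE.match(line)
                if not m:
                    continue
                path = m.group('path').split('?')[0]
                if path.lower().endswith(ASSETS):
                    continue
                parts = path.split('/')
                # Hypothetical /toolname/... URL layout.
                tool = parts[1] if len(parts) > 1 and parts[1] else '(root)'
                bucket = bots if BOTS.search(m.group('ua')) else humans
                bucket[tool] += 1
        return humans, bots

Run hourly over the rotated log and publish the two counters somewhere readable; as Ironholds notes above, the result is "as consistent as possible" rather than comparable to the production pageview definition.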
[06:33:27] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983680 (scfc)
[06:35:32] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983686 (yuvipanda) Yup, I wrote that sketchy outline :) Have restarted dnsmasq now, and dig toolserver.org returns the internal IP \o/
[06:35:43] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983687 (yuvipanda) Open>Resolved a: yuvipanda
[06:37:16] PROBLEM - Puppet failure on tools-exec-07 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[06:37:48] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983689 (yuvipanda) In the future, you can use https://wikitech.wikimedia.org/w/api.php?action=query&list=novainstances&niproject=toolserver-legacy&niregion=eqiad&format=json and simil...
[07:02:13] RECOVERY - Puppet failure on tools-exec-07 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:06:35] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983696 (yuvipanda) Cool. Can you tell me what exactly mime filtering is used for?
[07:26:44] Tool-Labs: Provide page view metrics for individual tools on toollabs - https://phabricator.wikimedia.org/T87001#983697 (Ironholds) Filtering out calls to JS/image/css assets
[08:02:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:17:35] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983719 (scfc) >>! In T87086#983689, @yuvipanda wrote: > In the future, you can use https://wikitech.wikimedia.org/w/api.php?action=query&list=novainstances&niproject=toolserver-legacy...
[08:18:31] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[08:18:49] Wikimedia-Labs-Infrastructure: toolserver.org is not accessible from Labs instances - https://phabricator.wikimedia.org/T87086#983720 (yuvipanda) It is directly from OpenStack, yeah. Same info that drives Special:NovaInstance :) Also realized that all this info is available in LDAP too.
[08:43:30] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:19:29] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[10:14:35] Labs-Team: Allow everyone in ops group in LDAP to login to all Labs instances - https://phabricator.wikimedia.org/T87094#983801 (yuvipanda) NEW
[10:44:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:31:21] PROBLEM - Puppet failure on tools-webgrid-02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[11:56:31] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[12:01:19] RECOVERY - Puppet failure on tools-webgrid-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:22:42] !log tools.delinker Delinked myself, I'm no longer a maintainer of this tool.
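
The novainstances query yuvipanda links at [06:37:48] can be scripted. A minimal sketch, assuming only the usual MediaWiki API convention that a list=novainstances result appears under query.novainstances; the per-instance field names are not shown in the log, so the sketch just dumps each record:

    import json
    import urllib.request

    # URL copied verbatim from [06:37:48].
    URL = ('https://wikitech.wikimedia.org/w/api.php?action=query'
           '&list=novainstances&niproject=toolserver-legacy'
           '&niregion=eqiad&format=json')

    with urllib.request.urlopen(URL) as resp:
        data = json.load(resp)

    # Field names per instance are an unknown here, so print whole records.
    for instance in data.get('query', {}).get('novainstances', []):
        print(json.dumps(instance, indent=2))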
[12:24:37] * multichill wonders where the log bot went
[12:46:32] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:57:32] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[12:59:44] Wikimedia-Labs-General: dewiki-20140918-pages-meta-history{3,4}.bz2 dumps are missing - https://phabricator.wikimedia.org/T73967#983865 (Giftpflanze) a: ArielGlenn
[13:00:55] Tool-Labs: Missing Toolserver features in Tools (tracking) - https://phabricator.wikimedia.org/T60791#983873 (Giftpflanze)
[13:00:56] Tool-Labs: struct::stack/TclOO version mismatch - https://phabricator.wikimedia.org/T68984#983872 (Giftpflanze) declined>Resolved
[13:04:45] Tool-Labs: Missing Toolserver features in Tools (tracking) - https://phabricator.wikimedia.org/T60791#983884 (valhallasw)
[13:23:39] Tool-Labs: IRC: Too many user connections (global) - https://phabricator.wikimedia.org/T66802#983928 (Giftpflanze) Is this still an issue?
[13:26:41] Tool-Labs: Dumps not updating again. - https://phabricator.wikimedia.org/T74154#983930 (Giftpflanze) stalled>Resolved Everything is fine, obviously. Closing.
[13:35:21] PROBLEM - Free space - all mounts on tools-redis is CRITICAL: CRITICAL: tools.tools-redis.diskspace._var_lib.byte_percentfree.value (<22.22%)
[13:42:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:53:10] i can't attach to processes running on the tools grid with gdb :(
[13:56:02] gifti: that's weird. I assume you're ssh'ing to the right exec host?
[13:56:33] yes, ofc
[13:56:54] gifti: the only sort-of-weird thing I can think of is that there's two processes, one bash and one the actual process
[13:57:03] what error do you get?
[13:58:32] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[13:58:56] job 7386430, tools-exec-06, pid 21194; gdb --pid=21194
[13:58:58] Attaching to process 21194
[13:58:59] Could not attach to process. If your uid matches the uid of the target
[13:58:59] process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
[13:58:59] again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
[13:59:26] ptrace: Operation not permitted.
[13:59:50] ptrace probably unrelated
[14:01:53] i will now make semolina pudding, brb
[14:02:52] gifti: probably http://askubuntu.com/questions/41629/after-upgrade-gdb-wont-attach-to-process
[14:08:39] valhallasw`cloud: duh, do you think it can be changed for (tool) labs?
[14:09:58] it is
[14:10:06] i’m about to get on a flight
[14:10:16] matanya: was talking about it earlier
[14:10:38] i’ll be out of action for about 36h now however :(
[14:10:42] phabit!
[14:10:46] valhallasw`cloud: gifti ^
[14:10:57] there might be security considerations
[14:12:55] PROBLEM - Host tools-webgrid-generic-01 is DOWN: CRITICAL - Host Unreachable (10.68.17.152)
[14:43:01] gifti: you might be able to work around it by starting your process in a screen (and then you can ssh into the exec host, open a new tab in the screen and gdb there, I think?)
[14:43:30] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:43:48] what? omg
[14:44:22] what's so 'omg' about that? the screen is the parent process, so should be able to start gdb
[14:44:25] I think?
[14:48:41] doesn't every screen use a separate process?
[14:48:50] tmux would be different
[14:58:40] Nemo_bis: maybe, yeah, but they'd at least have a shared parent process, I think? I'm not entirely sure from that SO link what the specific requirements are
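
The error gifti pastes at [13:58:58] is Yama's ptrace restriction, which is also what the askubuntu link at [14:02:52] describes: with kernel.yama.ptrace_scope=1 (the Ubuntu default), only an ancestor of the target process may attach, which is why gdb fails from a fresh ssh session even when the uids match. A minimal check:

    # Meaning of kernel.yama.ptrace_scope (the setting named at [13:58:59]):
    #   0  classic: any process with the same uid may attach
    #   1  restricted: only an ancestor of the target (or a tracer the
    #      target declared via prctl(PR_SET_PTRACER)) may attach
    #   2  admin-only: attaching requires CAP_SYS_PTRACE
    #   3  attaching disabled entirely
    with open('/proc/sys/kernel/yama/ptrace_scope') as f:
        scope = int(f.read().strip())
    print('ptrace_scope =', scope)
    if scope >= 1:
        # Relaxing it system-wide needs root, e.g.:
        #   sudo sysctl kernel.yama.ptrace_scope=0
        print('gdb can only attach from an ancestor of the target process')

This also explains the doubt about the screen workaround above: a gdb started in another screen window is a sibling of the target, not an ancestor, so whether it helps depends on the process tree screen actually builds.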
[14:59:31] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:24:33] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:35:30] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[16:04:23] is someone around who knows how to connect to the database replicas from labs instances other than tools-login?
[16:04:54] tools or non-tools?
[16:05:07] non-tools
[16:05:30] there is a guide at https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database#Connecting_to_the_database_replicas_from_other_Labs_instances
[16:05:35] but it seems to be outdated
[16:05:46] hm
[16:06:04] i once did that, but that's some time ago
[16:06:34] I think I managed to guess the correct IP address but I get a SQL error 111
[16:09:05] got it... I had the wrong ip
[16:12:11] yay :)
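
For the question at [16:04:23], a minimal sketch of a replica connection following the Help page linked at [16:05:30]. The replica.my.cnf location and the enwiki.labsdb hostname are era-specific assumptions, so treat both as placeholders; the "SQL error 111" at [16:06:34] is errno 111 (ECONNREFUSED), i.e. the client reached a wrong address or port, which matches the wrong-IP diagnosis at [16:09:05]:

    import configparser
    import os.path

    import pymysql  # any MySQL client works; pymysql is only an example

    # Credentials conventionally live in ~/replica.my.cnf; the values in
    # that file are single-quoted, hence the strip() calls.
    cfg = configparser.ConfigParser()
    cfg.read(os.path.expanduser('~/replica.my.cnf'))

    conn = pymysql.connect(
        host='enwiki.labsdb',  # example of the <dbname>.labsdb convention
        user=cfg['client']['user'].strip("'"),
        passwd=cfg['client']['password'].strip("'"),
        db='enwiki_p',
    )
    # Error 111 here would mean a wrong address/port, not an auth failure.
    with conn.cursor() as cur:
        cur.execute('SELECT COUNT(*) FROM page')
        print(cur.fetchone())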
[16:25:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:41:28] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 88.89% of data above the critical threshold [0.0]
[17:46:31] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:57:33] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[18:17:07] !log tools.lolrrit-wm manually restarted grrrit-wm
[18:18:33] !log tools.lolrrit-wm 2015-01-17T18:17:20.057Z - error: Caught error in redisClient.brpop: Redis connection to tools-redis:6379 failed - connect ECONNREFUSED
[18:18:38] FumingPanda: ^
[18:21:25] valhallasw`cloud: ^ ?
[18:21:43] tools-redis dead?
[18:21:45] nc -C tools-redis 6379 isn't working for me
[18:21:56] well, wikibugs is still working
[18:22:01] :|
[18:22:08] hmm, just worked for me now
[18:23:01] * legoktm crosses fingers
[18:23:34] maybe also restart gerrit-to-redis?
[18:24:24] well
[18:24:26] 2015-01-17 18:24:09,588 Attempting to Redis connection to tools-redis
[18:24:27] 2015-01-17 18:24:09,588 Redis connection to tools-redis succeded
[18:24:35] that's from the gerrit-to-redis logs
[18:24:47] ah ok
[18:24:49] ok
[18:24:52] grrrit-wm should be good now
[18:25:14] PROBLEM - Puppet failure on tools-exec-10 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[18:25:48] hmm
[18:25:52] I just rebased https://gerrit.wikimedia.org/r/#/c/184599/
[18:27:32] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:31:19] I think it might be gerrit-to-redis.
[18:32:10] legoktm: shit. There was a disk space warning for tools-redis just before I left
[18:32:21] Can you see how much space is left there?
[18:32:43] FumingPanda: I can't login to tools-redis
[18:32:49] Use tools.wmflabs.org/nagf
[18:32:50] I'm on security line
[18:32:51] !log tools.gerrit-to-redis restarted
[18:32:52] ok
[18:34:51] How much?
[18:34:52] FumingPanda: yeah it's out of disk space. Someone started putting a bunch of stuff in earlier this week and there's a huge increase in network traffic at that time
[18:35:01] Hmm
[18:35:04] Worst timing
[18:35:11] I don't have my keyboard
[18:35:14] It is checked in
[18:35:25] https://graphite.wmflabs.org/render/?title=tools-redis+Disk+space+last+week&width=400&height=250&from=-1week&hideLegend=false&uniqueLegend=true&target=aliasByNode%28tools.tools-redis.diskspace.%2A.byte_avail.value%2C-3%2C-2%29
[18:35:31] 1/14
[18:35:33] See if anyone else from ops is around?
[18:35:46] how do you fill up 62GB???
[18:36:14] !log tools tools-redis is out of disk space
[18:36:23] oh, was it me?
[18:36:23] Heh
[18:36:56] gifti: what did you dooooooooooo :P
[18:37:14] i like to store millions of values in an array
[18:37:52] :<
[18:37:54] https://pbs.twimg.com/media/B5bZ12TCcAA60_a.jpg !!!
[18:38:13] gifti: is it one key? can we just delete it for now?
[18:38:22] yes, please
[18:38:30] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[18:38:39] i forgot to clean up
[18:39:01] what's the key?
[18:39:15] mom
[18:39:50] GET mom
[18:39:50] -LOADING Redis is loading the dataset in memory
[18:40:15] no, i mean, i need a sec to look it up ^^
[18:40:34] ohhhhh
[18:40:38] :P
[18:40:47] well if someone had stuff in a key named 'mom' it's gone now
[18:41:44] jfaT7tPpI+ixCZFQJwXZez60r74dKbkUjqVC+i89P7N=-urllist and others -*
[18:42:54] DEL jfaT7tPpI+ixCZFQJwXZez60r74dKbkUjqVC+i89P7N=-urllist
[18:42:54] -LOADING Redis is loading the dataset in memory
[18:43:41] yeah, that's what i'm getting, too
[18:44:33] Getting harassed at security again
[18:45:05] I'll replace tools-redis with a bigger instance so
[18:45:07] Soon
[18:49:20] I think we killed redis
[18:49:41] I can't connect to it anymore
[18:49:47] haha
[18:49:55] can we reboot it?
[18:50:30] Security dickbags
[18:50:42] hm, 01/14, that wasn't me
[18:54:55] i’m here
[18:57:12] legoktm: can you help?
[18:57:19] i have only one usable hand
[18:57:40] SuperFumingPanda: I'm not sure what I can do
[18:57:47] yeah stay on
[18:57:54] i’m on tools-redis
[18:58:46] okay
[18:59:13] hmm meh
[18:59:21] RANDOMKEY error: LOADING Redis is loading the dataset in memory
[18:59:31] haha
[19:02:14] legoktm: i could just flush everything :P
[19:03:29] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:05:26] legoktm: valhallasw`cloud thoughts on just flushing it?
[19:05:47] fine by me
[19:05:53] sorry, internet dq'd me
[19:07:46] SuperFumingPanda: it's not intended to be super persistent so I think a flush is ok
[19:09:58] !log tools tools-redis has too much data, can not handle. flushing everything (with a backup)
[19:11:35] !log tools moving aof file from /var/lib/redis to /home/yuvipanda. Have stopped redis server
[19:11:59] legoktm: can you relog these when the bot is up?
[19:15:13] !log waaay tooo slow to copy to nfs, just flushing db.
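
The -LOADING replies above mean the server is still replaying its dataset into memory and refuses commands until that finishes; once it is reachable, a targeted cleanup is gentler than flushing everything. A minimal sketch for finding the biggest collections (the host, port, and the urllist key are from the log; the rest is generic redis-py and an assumption about what to measure):

    import redis

    # Walk the keyspace and report the biggest collections, so a single
    # runaway key (like the ...-urllist one at [18:41:44]) can be DELeted
    # instead of flushing the whole instance.
    r = redis.StrictRedis(host='tools-redis', port=6379)

    sizers = {'list': r.llen, 'set': r.scard, 'hash': r.hlen, 'zset': r.zcard}
    sizes = []
    for key in r.scan_iter(count=1000):
        kind = r.type(key).decode()
        sizes.append((sizers[kind](key) if kind in sizers else 0, key))

    for size, key in sorted(sizes, reverse=True)[:10]:
        print(size, key)
    # Caveat: on the Redis of this era (pre-4.0, so no UNLINK/lazy freeing),
    # DEL of a huge key blocks the server while the value is reclaimed.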
[19:15:48] SuperFumingPanda: also fine by me
[19:15:53] !log restarted redis on tools-redis after flushing db
[19:19:34] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [0.0]
[19:25:22] RECOVERY - Free space - all mounts on tools-redis is OK: OK: All targets OK
[19:37:53] Tool-Labs, Labs-Team: Migrate tools-redis to a bigger instance - https://phabricator.wikimedia.org/T87107#984097 (Legoktm) NEW
[19:49:19] PROBLEM - Free space - all mounts on tools-exec-14 is CRITICAL: CRITICAL: tools.tools-exec-14.diskspace.root.byte_percentfree.value (<100.00%)
[20:06:29] 2015-01-17 20:04:29,345 Redis connection to tools-redis succeded
[20:06:29] 2015-01-17 20:04:29,351 Secsh channel 1 opened.
[20:06:29] 2015-01-17 20:05:18,183 Pushed to 0 clients
[20:06:35] legoktm: ^ let me restart grrrit-wm again
[20:10:39] legoktm: seems grrrit-wm is not actually subscribing
[20:10:46] fscking node.js bot :P
[20:11:52] :<<<<<<<<<<<<
[20:12:40] although
[20:12:48] I'm a bit confused by the '0 clients'. how does it know that
[20:13:31] ohhhh
[20:13:33] I think I know
[20:13:37] due to that flush
[20:13:43] the magic lua also flushed away
[20:15:06] no, it's even worse. The clients it sent to were stored in redis >_<
[20:15:56] or not. I'm super confused.
[20:17:57] valhallasw`cloud: there is a way to register the list of clients to push messages to
[20:18:24] which is a... zmq/redis/mysql hybrid
[20:18:28] you've created a monster!
[20:18:43] Wait I used zmq?
[20:18:53] I was a monster yes
[20:18:55] yeah. between registrar.py and subscriptions.py
[20:19:03] okay so the subscriptions are in mysql
[20:19:08] Wow I didn't know I was that bad
[20:19:10] I don't know how to get them to redis though
[20:19:16] lololol
[20:19:25] I wrote this years ago
[20:19:32] In my defence :p
[20:19:35] And it mostly worked
[20:19:42] Flight on runway now
[20:19:53] self.db.insert(key, msg['service'], msg['description'], msg['createdby'])
[20:19:54] self.redis.pipeline().sadd(clients_key, key).save().execute()
[20:20:03] ^ add key = 'add to both redis and mysql' :P
[20:20:09] Where is that
[20:20:25] in registrar.py which does the client registration
[20:20:27] Feel free to simplify if required
[20:20:29] :)
[20:20:31] Ah
[20:20:33] I see
[20:20:43] and nothing restores it from the db?
[20:20:55] working on it
[20:24:16] there's also something weird in that there's only a single key in the db
[20:24:24] although it was sending messages to five clients before
[20:24:25] w/e
[20:24:28] FlyingPanda: Did anyone kill the log bot?
[20:24:32] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:24:59] It died. valhallasw`cloud and legoktm have access
[20:25:22] ok let me see if I can revive it quickly
[20:25:47] !log Live, dammit, live!
[20:25:48] 4990548 0.43791 labs-logbo tools.morebo dr 10/21/2014 20:51:21 continuous@tools-exec-01.eqiad 1
[20:25:48] It's documented on Wikitech
[20:25:53] yeah and in the README
[20:26:37] valhallasw`cloud: We missed you today at the New Year's drinks
[20:26:59] multichill: oh dear :-p
[20:27:24] labslogbot queued again
[20:27:46] \o/
[20:27:57] !log tools.morebots kicked labs-morebots, seems alive again
[20:28:01] Logged the message, Master
[20:28:01] or not.
[20:28:05] :>
[20:29:26] !log gerrit-to-redis subscriptions are dead after yuvi flushed redis. Trying to recreate from mysql
[20:29:26] gerrit-to-redis is not a valid project.
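
A sketch of the restore valhallasw is working on at [20:20:55], inverting the registrar.py write path quoted at [20:19:53]: every registered client key goes into both a MySQL table and a Redis set, so the set can be rebuilt from the table. The table name, column name, database name, and CLIENTS_KEY value here are all guesses — the real ones live in registrar.py (the actual script landed as https://gerrit.wikimedia.org/r/185644):

    import os.path

    import pymysql
    import redis

    # Rebuild the Redis client set from MySQL after the flush.
    db = pymysql.connect(
        read_default_file=os.path.expanduser('~/.my.cnf'),
        db='gerrittoredis',  # hypothetical database name
    )
    r = redis.StrictRedis(host='tools-redis', port=6379)

    CLIENTS_KEY = 'clients'  # hypothetical; must match registrar.py

    with db.cursor() as cur:
        cur.execute('SELECT `key` FROM subscriptions')  # hypothetical schema
        pipe = r.pipeline()
        for (key,) in cur:
            pipe.sadd(CLIENTS_KEY, key)
        pipe.execute()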
[20:29:30] !log tools.gerrit-to-redis subscriptions are dead after yuvi flushed redis. Trying to recreate from mysql
[20:29:33] Logged the message, Master
[20:36:24] (CR) Merlijn van Deen: "fdfdsfadsa" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/184599 (owner: Merlijn van Deen)
[20:36:27] \o/
[20:36:29] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[20:36:30] !log Fixed!
[20:36:31] Message missing. Nothing logged.
[20:36:39] :<
[20:37:18] !log tools.delinker Delinked myself, I'm no longer a maintainer of this tool.
[20:37:20] Logged the message, Master
[20:38:18] valhallasw`cloud: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.delinker/SAL <- that worked
[20:38:30] yeah, just had to restart it
[20:41:50] god I hate gerrit
[20:45:12] ...but I hate Joe even more. Who uses that editor?
[20:45:26] Now save the file and leave joe, by typing ^KX where x is for "exit" YES BRILLIANT
[20:45:42] me!
[20:45:50] of course ^KX doesn't work, because... >_<
[20:46:53] (PS1) Merlijn van Deen: hacky script to dump mysql subscriptions to redis [labs/tools/gerrit-to-redis] - https://gerrit.wikimedia.org/r/185644
[21:02:20] valhallasw`cloud: Oh, why did you add me to the gerrit bot tool group? I forgot
[21:05:14] Hi
[21:05:20] Referred here by operations
[21:05:35] Who do I ask about setting up an OSM stack on labs?
[21:05:53] I had a rather specialist project in mind... to support content on Wikimedia projects.
[21:06:24] multichill: uuuuuuh
[21:06:33] multichill: gerrit-reviewer-bot?
[21:06:45] not sure :-p
[21:07:02] tools.gerrit-reviewer-bot
[21:07:20] I think it was so you could kick it if it died again
[21:07:26] back in the day when it still used ssh instead of email
[21:14:00] !log deployed labs-morebots with https://gerrit.wikimedia.org/r/#/c/180890/ applied (stored in ~/src/adminbot)
[21:14:00] deployed is not a valid project.
[21:14:05] !log tools.morebots deployed labs-morebots with https://gerrit.wikimedia.org/r/#/c/180890/ applied (stored in ~/src/adminbot)
[21:14:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.morebots/SAL, Master
[21:14:55] \o/
[21:21:30] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:57:32] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[22:20:24] Tool-Labs: WIWOSM not working in Wikipedias - https://phabricator.wikimedia.org/T87038#984199 (pere_prlpz) Now, WIWOSM is working again (in cawiki). The other two problems still remain.
[22:42:35] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:01:00] Tool-Labs: Install valgrind on bastions - https://phabricator.wikimedia.org/T87117#984224 (Giftpflanze) NEW
[23:58:34] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]