[00:01:09] (PS1) Platonides: Add an "authenticate" command for identifying with nickserv after connection [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/318229
[00:03:12] Tool-Labs-tools-stewardbots: StewardBot not logged into irc - https://phabricator.wikimedia.org/T149265#2747153 (Platonides) This should help in case it happens again by allowing any privileged user to reauthenticate it. https://gerrit.wikimedia.org/r/318229
[00:40:46] PROBLEM - Puppet run on tools-puppetmaster-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[00:57:05] !log tools.stewardbots Restart - it looks like this started during a NickServ outage, resulting in no authentication, resulting in IRC ops getting pinged by an anti-flood bot about this bot's behaviour
[00:57:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL, Master
[05:07:54] !log tools.stashbot Tried to switch main bot from OGE to k8s but pod ended in CrashLoopBackOff status with no log output that I could find
[05:08:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stashbot/SAL, Master
[06:32:28] PROBLEM - Puppet run on tools-flannel-etcd-03 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[07:11:46] (CR) Ricordisamoa: [C: 2] Support GET requests in get_json() and get_json_cached() [labs/tools/ptable] - https://gerrit.wikimedia.org/r/318054 (owner: Ricordisamoa)
[07:12:01] (Merged) jenkins-bot: Support GET requests in get_json() and get_json_cached() [labs/tools/ptable] - https://gerrit.wikimedia.org/r/318054 (owner: Ricordisamoa)
[07:12:26] RECOVERY - Puppet run on tools-flannel-etcd-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:02:40] Labs-Kubernetes, Wikisource, Community-Tech-Sprint: Make Google OCR API on Tool Labs work under Kubernetes - https://phabricator.wikimedia.org/T146311#2747662 (Samwilson) ``` 2016-10-27 06:26:57: 
(mod_fastcgi.c.2569) unexpected end-of-file (perhaps the fastcgi process died): pid: 9 socket: unix:/var/...
[09:17:52] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[09:34:25] RECOVERY - Host tools-secgroup-test-103 is UP: PING OK - Packet loss = 0%, RTA = 0.64 ms
[09:44:22] PROBLEM - Host tools-secgroup-test-103 is DOWN: CRITICAL - Host Unreachable (10.68.21.22)
[09:57:51] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:01:17] Labs-Kubernetes, Wikisource, Community-Tech-Sprint: Make Google OCR API on Tool Labs work under Kubernetes - https://phabricator.wikimedia.org/T146311#2747860 (Niharika) a:Samwilson
[10:13:10] RECOVERY - Host secgroup-lag-102 is UP: PING OK - Packet loss = 0%, RTA = 121.89 ms
[10:18:08] PROBLEM - Host secgroup-lag-102 is DOWN: CRITICAL - Host Unreachable (10.68.17.218)
[10:38:11] PROBLEM - Puppet staleness on tools-prometheus-01 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [43200.0]
[11:12:53] PROBLEM - Puppet staleness on tools-prometheus-02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [43200.0]
[11:38:30] Hi! Running a script on the labs instance dwl I get the error '35 SSL connect error. The SSL handshaking failed.' about once an hour. What is the reason?
[11:48:53] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[11:49:38] PROBLEM - Puppet run on tools-docker-registry-01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[12:04:40] RECOVERY - Puppet run on tools-docker-registry-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:23:51] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:26:40] Wikibugs: wikibugs - throttle output, don't get kicked for flooding - https://phabricator.wikimedia.org/T112032#2748088 (Samtar) Open>Resolved
[13:33:54] Labs, Tool-Labs, Developer-Relations, WMF-Legal: Provide an easy way for Tool Labs tools to expose their source code - https://phabricator.wikimedia.org/T102081#1355202 (Qgil) Would this task be a good topic for the #wikidev17 ? If so, the deadline to submit new proposals is next Monday, October...
[13:34:08] Labs, Tool-Labs, Developer-Relations, WMF-Legal: Make sure tools can be taken over after they are abandoned - https://phabricator.wikimedia.org/T102066#1354813 (Qgil) Would this task be a good topic for the #wikidev17 ? If so, the deadline to submit new proposals is next Monday, October 31: https...
[13:35:21] Labs, Community-Tech-Tool-Labs, Developer-Relations, wikitech.wikimedia.org, Epic: [EPIC] Make wikitech more friendly for the multiple audiences it supports - https://phabricator.wikimedia.org/T123425#2748274 (Qgil) Would this task be a good topic for the #wikidev17 ? If so, the deadline to s...
[13:35:35] Labs, Tool-Labs-tools-Other, Community-Tech-Tool-Labs, Developer-Relations: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#2748275 (Qgil) Would this task be a good topic for the #wikidev17 ? If so, the deadline to submit new propos...
[13:35:54] Labs, Wikimedia-Labs-General, Developer-Relations: Community-maintained projects on Labs are hard to track - https://phabricator.wikimedia.org/T64837#2748276 (Qgil) Would this task be a good topic for the #wikidev17 ? If so, the deadline to submit new proposals is next Monday, October 31: https://www...
[13:37:24] Labs, Tool-Labs, Community-Tech-Tool-Labs, Developer-Relations, and 4 others: Set up process / criteria for taking over abandoned tools - https://phabricator.wikimedia.org/T87730#2748277 (Qgil) Would this task be a good topic for the #wikidev17 ? If so, the deadline to submit new proposals is nex...
[13:47:57] RECOVERY - Host tools-docker-builder-01 is UP: PING OK - Packet loss = 0%, RTA = 0.76 ms
[13:50:50] RECOVERY - SSH on tools-webgrid-generic-1403 is OK: SSH OK - OpenSSH_6.9p1 Ubuntu-2~trusty1 (protocol 2.0)
[13:50:56] !log tools reboot dockerbuilder-01
[13:51:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[13:51:04] !log tools reboot tools-webgrid-generic-1403
[13:51:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[13:51:55] PROBLEM - Puppet staleness on tools-docker-builder-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [43200.0]
[14:01:03] RECOVERY - Puppet staleness on tools-webgrid-generic-1403 is OK: OK: Less than 1.00% above the threshold [3600.0]
[14:01:53] RECOVERY - Puppet staleness on tools-docker-builder-01 is OK: OK: Less than 1.00% above the threshold [3600.0]
[14:35:35] why doesn't grid work with my bot? it works fine from bastion
[14:35:38] PROBLEM - Puppet run on tools-docker-registry-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[14:43:25] Zppix: the most common answer is you need to specify trusty on the grid as precise is still the default but the bastions are trusty as well
[14:43:45] otherwise you'll need to be more descriptive
[14:43:55] trusty?
[14:46:13] PROBLEM - Host tools-exec-cyberbot is DOWN: CRITICAL - Host Unreachable (10.68.16.39)
[14:47:55] Zppix|Away: https://wiki.ubuntu.com/Releases
[14:48:29] chasemp: as of midday yesterday, trusty is the default :)
[14:48:50] ah ok right, I need to reverse my narrative
[14:49:03] !log tools.stashbot Shutdown OGE job, started k8s job
[14:49:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stashbot/SAL, Master
[14:50:51] bd808: https://graphite-labs.wikimedia.org/render/?width=586&height=308&_salt=1477579792.187&target=sumSeries(tools.tools-services-01.sge.hosts.tools*12*.job_count)&from=-3d
[14:50:54] interesting
[14:51:14] https://graphite-labs.wikimedia.org/render/?width=586&height=308&_salt=1477579792.187&target=sumSeries(tools.tools-services-01.sge.hosts.tools*12*.job_count)&target=sumSeries(tools.tools-services-01.sge.hosts.tools*14*.job_count)&from=-3d
[14:52:40] https://graphite-labs.wikimedia.org/render/?width=586&height=308&_salt=1477579792.187&target=cactiStyle(sumSeries(tools.tools-services-01.sge.hosts.tools*12*.job_count))&target=cactiStyle(sumSeries(tools.tools-services-01.sge.hosts.tools*14*.job_count))&from=-3d
[14:55:49] RECOVERY - Puppet run on tools-docker-registry-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:57:04] chasemp: :) changing the default is having some impact then. In a couple of weeks I'll start trying to figure out who is still using precise and sending them nice emails about switching.
[14:57:04] I'd really love to get everyone off well before our drop dead date
[15:09:01] bd808: cool, that seems like a good approach
[15:09:24] I'm kind of curious on the groupings for remaining precise things tbh, the why of it and if that sticks
[15:09:36] and in theory worst comes to worse we can surely containerize those outliers
[15:11:27] Some amount of it will just be long running jobs that have not restarted. 
Others will be things with cautious maintainers who haven't taken the time to test yet. I think there will actually be very few that are intrinsically tied to precise.
[15:12:10] but at some point we have to pull the plug on precise. Faidon already grumbled at me that my deprecation timeline ran too long. ;)
[15:16:29] Labs, Labs-Infrastructure, DBA, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2748504 (chasemp) @Marostegui I'm of the opinion at the moment that reworking the definer could be a more nuanced bit of work itself. Honestly, no idea...
[15:16:51] yeah, I think the long running jobs thing cuts at the majority
[15:17:36] If I'd been on the ball and had the switch done before the last kernel reboot...
[15:17:51] * bd808 shakes fist at the time lords
[15:24:02] Labs, Labs-Infrastructure, DBA, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2748518 (jcrespo) > Then the path forward I think is to keep both the VIEWMASTER and MAINTAINVIEWS users with SUPER privs and to use the MAINTAINVEWS us...
[15:25:41] Labs, Labs-Infrastructure, DBA, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2748524 (jcrespo) Also, there is a labsdbadmin user already existing, which was tasked to do maintenance (create users), maybe that should be the one us...
[15:27:33] Labs, Tool-Labs, Patch-For-Review, Wikimedia-Incident: Setup a simple service that pages when it is unreachable - https://phabricator.wikimedia.org/T143638#2748529 (madhuvishy) Open>Resolved
[15:31:02] Labs, Tool-Labs, Community-Tech-Tool-Labs, Developer-Relations, and 4 others: Set up process / criteria for taking over abandoned tools - https://phabricator.wikimedia.org/T87730#2748543 (bd808) >>! 
In T87730#2748277, @Qgil wrote: > Would this task be a good topic for the #wikidev17 ? If so, the...
[15:36:10] Labs, Labs-Infrastructure, DBA, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2748568 (chasemp) >>! In T148560#2748518, @jcrespo wrote: >> Then the path forward I think is to keep both the VIEWMASTER and MAINTAINVIEWS users with S...
[15:57:26] Labs-project-Wikistats: W3C wiki updates broken - https://phabricator.wikimedia.org/T149000#2748779 (Dzahn) p:Triage>Normal
[16:28:21] Labs-project-Wikistats: W3C wiki updates broken - https://phabricator.wikimedia.org/T149000#2739316 (hashar) The page shows they all have HTTP error 301, a redirect. Most probably because we list http and they have switched to https with the script not following redirects? :]
[17:03:40] Tool-Labs-tools-Pageviews: Investigation: Recursive category search in Massviews - https://phabricator.wikimedia.org/T149334#2749058 (MusikAnimal)
[17:31:13] Labs-Kubernetes, Wikisource, Community-Tech-Sprint: Make Google OCR API on Tool Labs work under Kubernetes - https://phabricator.wikimedia.org/T146311#2749203 (kaldari) Open>Resolved Nice work! @bd808: Is there some documentation somewhere about things that won't work under Kubernetes? If so...
[17:34:05] Are the reboots in labs instances related to this? https://www.theguardian.com/technology/2016/oct/21/dirty-cow-linux-vulnerability-found-after-nine-years
[17:34:14] Labs-Kubernetes, Wikisource, Community-Tech-Sprint: Make Google OCR API on Tool Labs work under Kubernetes - https://phabricator.wikimedia.org/T146311#2749224 (bd808) Having looked at the icecave/isolator library briefly, I'm not sure that it really works anywhere in a robust and stable manner. It do...
[17:36:55] what is trusty?
[17:39:41] Zppix, are you referring to https://en.wikipedia.org/wiki/Ubuntu_version_history#Ubuntu_14.04_LTS_.28Trusty_Tahr.29 ?
[17:39:56] it has to do with the grid
[17:40:55] yes, probably ubuntu precise 12.04 starting to be deprecated in favour of ubuntu trusty 14.04 ?
[17:41:52] you can see here soon it will stop receiving security updates: https://en.wikipedia.org/wiki/Ubuntu_version_history#Version_timeline
[17:42:35] well my bot doesn't seem to be able to run. it connects to irc but once issued a command in irc it disconnects. but when running bot via bastion it's working fine
[17:45:00] nevermind
[17:52:54] Zppix: what does your bot try to do when it receives a command? Is your source published somewhere that I can look at?
[17:53:17] bd808: you can check the error file in my tool acct (it should be a public dir)
[17:53:32] We have a lot of irc connected bots so it seems likely that the problem is somewhere in the implementation
[17:53:34] it's under project/zppixbot
[17:57:22] the "Fatal Python error: Couldn't create autoTLSkey mapping" was from running on precise (older Ubuntu version). The bastions are trusty (newer than precise)
[17:58:45] The last run in the err log shows that it started on trusty because I changed the default from precise to trusty yesterday for everyone.
[17:58:56] It doesn't seem to have the crash message
[18:00:19] i just tried trusty
[18:09:46] !log tools.morebots Stopping bots in #wikimedia-labs and #wikimedia-releng channels.
[18:09:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.morebots/SAL, Master
[18:12:41] !log tools.stashbot Restarting to take over wiki logging in #wikimedia-labs and #wikimedia-releng channels.
[18:13:27] !log tools.stashbot First test of wiki logging
[18:13:29] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stashbot/SAL
[18:14:28] !log deployment-prep Testing dual page wiki logging by stashbot.
[18:14:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:14:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[18:17:00] hmmm... that wasn't actually supposed to show both links
[18:36:09] !log deployment-prep Testing dual page wiki logging by stashbot. (second attempt)
[18:36:14] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[18:46:13] !log deployment-prep Testing dual page wiki logging by stashbot. (check #3)
[18:46:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[18:49:10] !log tools rebooting tools-webgrid-lighttpd-1401
[18:49:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[19:00:49] PROBLEM - Puppet staleness on tools-webgrid-lighttpd-1401 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [43200.0]
[19:05:50] RECOVERY - Puppet staleness on tools-webgrid-lighttpd-1401 is OK: OK: Less than 1.00% above the threshold [3600.0]
[19:10:41] Labs, Tool-Labs, Epic: Phase out precise instances from toollabs - https://phabricator.wikimedia.org/T94790#2749580 (bd808)
[19:10:43] Labs, Tool-Labs, User-bd808: Make webservice warn when run with `-l release=precise` - https://phabricator.wikimedia.org/T143283#2749576 (bd808) Open>Resolved a:bd808 This was done quite a while ago in {rOSTW5acbb62dc9d8ff6fc4f7d001fa9740323497a111}
[19:12:38] PROBLEM - Puppet run on tools-checker-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[19:15:18] Labs, Tool-Labs, Community-Tech-Tool-Labs, Epic, User-bd808: Remove support for precise OGE exec hosts - https://phabricator.wikimedia.org/T94792#2749591 (bd808)
[19:17:10] !log stashbot should tell me that I forgot to give a project/tool name
[19:17:11] Did you mean tools.stashbot instead of stashbot?
[19:17:18] !log and stashbot should tell me that I forgot to give a project/tool name
[19:18:03] !log invalid project
[19:37:19] PROBLEM - Puppet run on tools-docker-builder-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:47:39] RECOVERY - Puppet run on tools-checker-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:59:57] Tool-Labs-tools-Pageviews: Query stats.grok.se for data older than July 2015 - https://phabricator.wikimedia.org/T149358#2749796 (MusikAnimal)
[20:00:20] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Gilles was created, changed by Gilles link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Gilles edit summary: Created page with "{{Tools Access Request |Justification=Trying to make ori's perflogbot work |Completed=false |User Name=Gilles }}"
[20:24:59] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Gilles was modified, changed by BryanDavis link https://wikitech.wikimedia.org/w/index.php?diff=933902 edit summary:
[20:26:05] !log tools.perflogbot Added Gilles as maintainer
[20:26:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.perflogbot/SAL
[20:29:03] PROBLEM - Puppet run on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[20:38:36] PROBLEM - Host tools-secgroup-test-102 is DOWN: CRITICAL - Host Unreachable (10.68.21.170)
[20:39:09] Labs, Tool-Labs, Community-Tech-Tool-Labs, Research-and-Data, User-bd808: 2016 Tool Labs user survey - https://phabricator.wikimedia.org/T147336#2749998 (bd808)
[21:04:04] RECOVERY - Puppet run on tools-webgrid-generic-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:07:54] RECOVERY - Puppet staleness on tools-prometheus-02 is OK: OK: Less than 1.00% above the threshold [3600.0]
[21:09:43] !log tools upgrade prometheus on tools-prometheus0[12]
[21:09:46] Logged the message at 
https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[21:17:59] godog: nice!
[21:20:17] yuvipanda: aye! not sure why tools-prometheus-02 didn't like the upgrade yesterday, -01 was fine
[21:21:50] heh
[21:21:54] I've never actively used -02
[21:22:02] godog: also, we found a fun issue yesterday
[21:22:14] which is that a container was dying when it hit the 1Gi limit I'd set for it
[21:22:26] godog: but prometheus thinks it was always at 800M
[21:22:28] because
[21:22:39] code tries to malloc a large buffer, and dies
[21:22:43] but this happens in between scrapes
[21:22:45] so is missed
[21:22:57] I could find out this was happening from /var/log/messages on the host
[21:23:01] since that keeps track of oom kills
[21:23:09] RECOVERY - Puppet staleness on tools-prometheus-01 is OK: OK: Less than 1.00% above the threshold [3600.0]
[21:23:37] yuvipanda: fun indeed!
[21:23:47] no OOM/deaths exported by k8s ?
[21:26:03] godog: good question. I don't know if k8s reports it anywhere
[21:26:09] godog: but the container itself didn't die
[21:26:19] godog: the process (python) was killed by OOM killer
[21:28:34] yuvipanda: ah! misread what was going on, heh sort of same problem with thumbor (tracking OOMs)
[21:28:59] godog: is that why I heard a bit about mtail?
[21:29:44] it is yeah yuvipanda, finishing the code review for mtail is actually ~next on my TODO
[21:29:51] godog: nice
[21:30:00] godog: any particular reason for using it over logster?
[21:31:45] I just read about logster, I suppose already packaged for debian and prometheus support amongst the reasons
[21:35:42] "written in golang" if you want to stretch it!
[21:45:53] yuvipanda: I have a small IRC notif bot written in node that I am trying to launch on tool labs. 
It runs from the staging env, but when I submit it to run via 'jstart' node.js prints the following error message and core dumps:
[21:45:55] FATAL ERROR: v8::Context::New() V8 is no longer usable
[21:45:55] Aborted (core dumped)
[21:45:56] [2016-27-10T19:05] /usr/bin/nodejs exited with code 134. Respawning...
[21:46:09] have you seen that before?
[21:46:14] hey
[21:46:21] ori: try -mem 4G to jstart
[21:46:46] * ori tries
[21:49:31] yuvipanda: is node a mem hog in general?
[21:52:26] yuvipanda: it's working
[21:52:39] chasemp-tester: yeah
[21:52:39] there's no way it is using up 4 gigs of ram, tho!
[21:52:53] chasemp-tester: ori yeah, gridengine's memory counting is... 'weird'
[21:53:00] I'll admit to not entirely understanding it
[21:53:06] I vaguely recall it being detached from reality yeah
[21:53:20] ori: yeah, 4G is just a 'sane max' of sorts
[21:53:29] ori: for node / java
[21:53:36] python's fine with 512, so is php
[21:53:44] node might be ok with 1G actually
[21:53:52] but since this doesn't really match reality too much...
[21:53:57] it's fine-ish
[21:54:16] small blah as SGE sucks so much at hard and soft allocations (i.e. 
no concept) but yeah
[21:54:18] ori: if you want to, you can binary search your way down to something (try 2G, 1G, etc until it crashes again)
[21:54:20] one more reason (tm)
[21:54:25] yeah
[21:54:37] k8s has 'guarantees' (requests) and 'limits'
[21:54:41] former for scheduling purposes, latter for killing
[21:54:45] works pretty nicely
[21:55:19] for fun today I did search docker hub for SGE and poked at few of them, and then I mused on that for a few minutes over lunch
[21:55:26] :D
[21:55:36] I need to do another push maybe next week
[21:55:42] let's see if I can sharpen the trusty image today
[21:55:44] *this week
[21:56:10] I'm going to have to step afk for a bit now, i'll brb (switching locations, this place too rainy)
[21:56:36] hahaha
[21:56:37] from the man page
[21:56:40] "qstat -s h is an abbreviation for qstat -s huhohshdhjha"
[21:56:49] later on yuvipanda, I'm going to clean out a fish tank post-haste
[21:58:42] Tool-Labs-tools-Pageviews: Show smaller pie chart in addition to line/bar/radar chart - https://phabricator.wikimedia.org/T149374#2750346 (MusikAnimal)
[22:12:48] hi! I have a question regarding tool labs. I have a db on tools.labsdb. Is it backed up regularly? If so, how often? if not, what is the preferred way to configure backups?
[22:34:15] yuvipanda, ^
[22:48:27] Leloiandudu: there are no backups of tool databases.
[22:48:45] You could cron a mysqldump or similar for your tool's db
[22:49:35] in an ideal world the tool databases are just transient working data. The world is often not ideal however :/
[22:50:52] bd808, I heard that about the replica server, but is it also true for tools.labsdb? where am I supposed to store the persistent data then?
[22:52:08] tools.labsdb and/or your tool's $HOME on NFS are the only persistent storage options
[22:53:03] ok. tools.labsdb is where I store my data. I will configure the backups manually then...
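A minimal sketch of the "cron a mysqldump" suggestion from the exchange above, not an official Tool Labs recipe. It assumes the tool's database credentials live in `~/replica.my.cnf`, that the dump is written somewhere under the tool's `$HOME`, and that the database is called `s12345__mydata`; the database name, the backup directory, and all function names are hypothetical.

```python
import datetime
import os
import subprocess


def build_dump_command(database, host="tools.labsdb",
                       defaults_file="~/replica.my.cnf"):
    """Return the mysqldump argument list for one database.

    The credentials file path is an assumption; adjust it for your tool.
    """
    return [
        "mysqldump",
        # mysqldump requires --defaults-file to be the first option.
        "--defaults-file=" + os.path.expanduser(defaults_file),
        "--host=" + host,
        # Take a consistent snapshot of InnoDB tables without locking them.
        "--single-transaction",
        database,
    ]


def dump_path(database, backup_dir, now):
    """Build a timestamped file name such as mydb-20161027T2300.sql."""
    stamp = now.strftime("%Y%m%dT%H%M")
    return os.path.join(backup_dir, "{}-{}.sql".format(database, stamp))


def run_backup(database, backup_dir):
    """Dump the database into a new timestamped file and return its path."""
    os.makedirs(backup_dir, exist_ok=True)
    target = dump_path(database, backup_dir, datetime.datetime.now())
    with open(target, "wb") as out:
        # check=True raises CalledProcessError if mysqldump fails.
        subprocess.run(build_dump_command(database), stdout=out, check=True)
    return target
```

Calling `run_backup("s12345__mydata", os.path.expanduser("~/backups"))` from a daily cron job would leave one dump per run; pruning old dumps is left out of the sketch.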
[22:53:07] there has been some discussion about looking for ways to provide backups for tool databases but there is no solution at the moment
[22:53:10] thanks
[23:40:31] Labs-Kubernetes, Wikisource, Community-Tech-Sprint: Make Google OCR API on Tool Labs work under Kubernetes - https://phabricator.wikimedia.org/T146311#2750641 (Samwilson) @bd808 I quite agree! It's a ridiculous library. I mean, all I wanted was a simple thing to turn errors into exceptions (and so wa...