[00:00:27] Are old binlogs interesting on a non-replicated MariaDB server?
[00:07:42] Krinkle: Try again, please?
[00:08:01] Looks good now
[00:08:34] Poooh ...
[00:11:01] !log tools tools-db: Moved 4.2 GBytes of the oldest binlogs to /var/lib/mysql2/
[00:11:03] Logged the message, Master
[00:11:22] !log tools tools-db: and restarted mysqld
[00:11:24] Logged the message, Master
[00:14:33] (PS1) Gerrit Patch Uploader: I have no idea what I'm doing. [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109258
[00:14:36] (CR) Gerrit Patch Uploader: "This commit was uploaded using the Gerrit Patch Uploader [1]." [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109258 (owner: Gerrit Patch Uploader)
[00:16:23] (PS2) John F. Lewis: I have no idea what I'm doing. [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109258 (owner: Gerrit Patch Uploader)
[00:16:43] (CR) PiRSquared17: [C: 2 V: 1] Make changes to rcreader [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109258 (owner: Gerrit Patch Uploader)
[00:16:56] (CR) PiRSquared17: [V: -1] Make changes to rcreader [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109258 (owner: Gerrit Patch Uploader)
[00:17:04] (CR) PiRSquared17: [C: 1 V: 2] Make changes to rcreader [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109258 (owner: Gerrit Patch Uploader)
[00:17:11] (CR) PiRSquared17: [C: 2] Make changes to rcreader [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109258 (owner: Gerrit Patch Uploader)
[00:18:09] andrewbogott: Can extra storage (resize a volume or add another) be easily added to an instance after it is created?
[00:20:55] sorry for the flood
[00:21:01] i never used git or gerrit before much
[00:21:04] scfc_de: I think that resizing is possible, I haven't tested it much.
[00:22:13] scfc_de: I think I would need to do it by hand, and can't guarantee that the instance will survive :) So it might be better to just start a new larger-sized instance and migrate over.
[00:24:24] scfc_de: Of course there's shared project storage (via gluster or NFS) which should provide you with tons of space in any case...
[00:27:55] andrewbogott: It's about tools-db which after the eqiad move will find a new home anyway, so I don't want to risk wrecking it :-). But if there was an easy way to enlarge its 170 GByte volume, that would certainly be an alternative to waiting for the next full disk :-).
[00:28:35] Ah, I see. That's probably worth looking into if we're in danger of hitting the limit.
[00:29:14] We did just now, and I think we had the same problem a few days or weeks ago.
[00:32:52] andrewbogott: Is http://docs.openstack.org/admin-guide-cloud/content//managing-volumes.html what we're using?
[00:33:27] No, we only have local storage and shared storage. We aren't using block at all.
[00:33:43] So we would be resizing the whole instance -- that's why I'm nervous about it.
[00:34:46] andrewbogott: Do you have a pointer for the docs for that?
[00:35:54] scfc_de: doing it is quite trivial. It's only that I don't trust it.
[00:35:54] http://docs.openstack.org/user-guide/content/nova_cli_resize.html
[00:38:59] a) I'm not into experiments either :-). b) If we can only resize an instance to a "higher" image, that wouldn't help as tools-db is already the biggest size. So I think I'll wait for the move to eqiad.
[00:39:57] Yeah, I just thought of b) as I was looking at that page :(
[00:42:48] Thanks anyway.
[00:50:14] scfc_de: Actually, Ryan's attempts at resizing instances in the past have unfailingly ended in disaster.
[00:50:30] It might have gotten better since, but I wouldn't chance it.
[00:51:58] * Coren purges binlogs.
[00:52:14] Coren: My original idea would have been to add a separate volume and mount that (http://docs.openstack.org/user-guide/content/cli_manage_volumes.html). But that seems far from the beaten track on Labs.
[00:52:29] Coren: In /var/lib/mysql2 there's more.
[00:53:18] scfc_de: For the record, a good stopgap is to edit mariadb-bin.index to remove the last couple hundred entries (smallest index numbers), then just rm them.
[00:53:29] Should we decrease expire_logs_days?
[00:54:04] scfc_de: I've already done so twice, but we could probably stand another decrease as usage has increased a lot lately (and thus binlog rate increases)
[00:55:13] We're rolling 100M of binlogs roughly every 45min now.
[00:55:14] Do you know how binlogs interact with transactions? I. e., if a transaction runs for four days, and the binlogs are expired after three, could we shoot ourselves in the foot with that?
[00:56:00] Possibly, when doing so by hand, but setting expire_logs_days would be perfectly safe IIRC.
[00:56:49] Then +1 to -x :-).
[00:57:31] Also, in theory, transactions don't end up in binlogs until committed (I recall there are some partial exceptions with some engines in the presence of temporary tables, but those are edge cases)
[00:58:39] All the better.
[00:59:45] At first glance, we probably don't want to keep more than ~24h
[01:02:20] {{done}}
[01:02:30] That shouldn't bite us until we flee to eqiad at least.
[01:03:49] Wonderful. I can delete the old logs in /var/lib/mysql2 then as well?
[01:04:49] (Did you also restart mysqld?)
[01:05:59] scfc_de: No need; I've set global as well as changed the my.cnf.
[01:06:26] scfc_de: Yes, you can clean those logs up if you want.
[01:07:26] !log tools tools-db: Removed /var/lib/mysql2, set expire_logs_days to 1 day
[01:07:30] Logged the message, Master
[03:10:55] Has labs been experiencing issues for the last 5-6 hours? There are multiple bots that seem to be not running.
[03:13:28] Cyberbot, AnomieBOT, Helpmebot, maybe more are all not running
[03:25:30] T13|sleeps: tools-db was down and restarted. Don't know if the bots catch up automatically.
[03:36:05] How long ago was that?
[03:36:44] i'm giving you all she's got captain!
[03:37:06] (A.k.a: omg we are running out of resources. Flee! Flee to eqiad!)
[03:37:45] Could move the db to NFS! Then it would be much bigger and muuuuuuuch slower
[03:38:40] Coren, any chance we could start using the native eqiad db from within pmtpa labs? Would that be unmanageably slow?
[03:39:27] andrewbogott: People rely on tools-db to not have the labsdb latency atm as a workaround. I'd scuttle their efforts if I did that.
[03:39:43] ah, ok.
[03:39:44] * T13|sleeps hides the baby and runs into...
[03:40:11] And no, running a DB on NFS is on that long list of "Are you daft!?!" newbie mistakes.
[03:40:49] Did the fibers ever get fixed properly?
[03:40:56] They should be.
[03:41:15] Good good.
[03:41:52] * T13|sleeps is daft, but that's a discussion for another day with professionals in white coats...
[03:54:28] Could we make it an option, i. e. set it up as tools-db-v2 or so with a big sign: "Beta users only"? I think there are a number of tools who rely more on bulk than on latency. (If setting it up burns up man-power needed elsewhere = No.)
[08:17:23] (CR) Nikerabbit: Make changes to rcreader (2 comments) [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109258 (owner: Gerrit Patch Uploader)
[08:55:10] # I was doing a few mediawiki database queries for tewiki_p and found that the data is more than a few days old. Any guess as to what the problem is? Is replication lag too bad
[09:02:41] replag sometimes does get bad (general advice, idk about your problem in particular)
[09:09:42] Jasper_Deng: tewiki replication seems broken. revision count query gives 976399, while special:stats shows 10,10,990. Please advise how to get this fixed
[09:10:07] I can't do anything about it unfortunately
[09:10:15] legoktm would know
[09:24:04] Jasper_Deng: thanks. I sent an email to labs-l .
[09:36:19] andrewbogott: Coren : any idea which existing puppet role included on virt0 (role:nova:controller, ldap:role:server:labs and others) would be the appropriate place to put the robots.txt that ends up on wikitech?
[09:36:55] mutante: whichever one includes openstack manager.
[09:37:17] role::nova::manager ? ah ok, thanks!
[09:37:28] But, really, sometime soon I hope to have that wiki managed via the deployment system like other wikis. In that case would robots be puppetized, or… handled as part of the mw install?
[09:37:38] I guess maybe it's not a part of mw install anyway.
[09:37:49] andrewbogott: that question is on 6689:)
[09:37:58] i want to puppetize it
[09:38:06] because it's easier for _me_
[09:38:12] and we do it on all these other services
[09:38:21] but if others feel like the production PHP script is easier
[09:38:27] and want to do that, i'm also fine
[09:38:35] but i think the quicker fix is puppet for now:)
[09:38:41] and we can still do the other later
[09:38:45] sure, better than the status quo for sure.
[09:38:49] at least then there are no manual deploys
[09:39:00] which is what i just want to resolve quickly
[09:39:04] The current contents of wikitech robots.txt is a c/p from en wiki. Feel free to adjust it however you see fit.
[09:39:23] cool, will do!:)
[09:40:19] andrewbogott: oh, i had one more point:) "this way you can actually _test_ changes in robots.txt on labs":)
[09:40:27] and have code review
[09:40:30] true
[09:58:46] andrewbogott: gah, tabs:)
[09:59:26] and the lint stuff, be consistent with existing style or don't create lint warnings, heh
[09:59:51] i'll try to keep functional change separate from linting it..but i'm never really sure which is better:)
[10:00:07] sneaking in. vs mixing things :p
[11:44:26] @seen anomie
[11:44:26] T13|needsCoffee: Last time I saw anomie they were quitting the network with reason: Quit: ... N/A at 1/23/2014 10:57:14 PM (12h47m12s ago)
[11:46:56] @seenrx Cyberpower
[11:46:56] T13|needsCoffee: Last time I saw Cyberpower678 they were joining the channel, they are still in the channel #wmt at 1/24/2014 4:19:07 AM (7h27m48s ago) (multiple results were found: Cyberpower678, Cyberpower6780, Cyberpower678], Cyberpower6789)
[12:36:34] Hoi .. http://ultimategerardm.blogspot.nl/2014/01/six-million-english-labels-in-wikidata.html
[12:50:37] anyone around that can help me with a puppet issue on a new instance?
[12:51:25] heh, well, just an issue with a new instance :>
[12:52:52] addshore: technically afk , making food, but general advice really quick: first create new instance without selecting any puppet class, wait, and when it's done after a coffee try configuring it and using a role
[12:53:04] usually works much better from experience
[12:53:33] turns out a few reboots seem to have fixed it :) I hadn't added any classes yet but puppet seemed to keep getting stuck and not outputting anything to the syslog :P
[12:54:17] you can already ssh to it but it's still doing stuff and not done when it's really fresh
[12:54:28] you'll see in syslog it still installs packages and stuff
[12:54:38] once it's really done with that things should work much better
[12:55:04] i dont think you needed to reboot, but if it works now, cool
[12:55:06] bbl
[13:24:21] The original puppet run can sometimes take /forever/.
[13:24:27] l:>
[15:52:22] Coren, ping
[15:52:47] @seen petan
[15:52:47] Cyberpower678: Last time I saw petan they were quitting the network with reason: Ping timeout: 264 seconds N/A at 1/24/2014 1:44:03 PM (2h8m43s ago)
[15:52:47] Pong.
[15:53:06] Coren, is there an easy way to toss everything from qstat?
[15:55:27] Coren, I mean is there an easy way to remove all jobs from the grid with a single command or do I have to qdel every single job?
[15:56:50] Cyberpower678: hi CP, maybe qdel -u is your friend http://gridscheduler.sourceforge.net/htmlman/htmlman1/qdel.html
[15:56:51] I think qdel will allow some other means of selection, or qmod might be the tool. Honestly, I've never tried to do that except at the queue level.
[15:57:47] "qdel \*" works.
[15:58:09] hedonil, thanks. All jobs are shutting down.
[15:59:36] Cyberpower678: 'k
[16:02:23] Coren, Service Temporarily Unavailable
[16:02:23] The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.
[16:03:13] What service?
[16:03:21] Webservice.
[16:03:31] Oh, huh, qdel \* would have also shut down any webservice jobs you had.
[16:03:37] It /is/ one of your jobs.
[16:03:45] Urgh.
[16:03:52] /facepalm
[16:03:55] Thanks.
[16:04:06] Just a 'webservice start' will fire it back up. :-)
[16:04:34] I froze the terminal. :/
[16:05:18] Opinion time: I'm going to have alternatives to lighttpd (like tomcat) in the future. Would people find it simpler to manage as an option to webservice (like, 'webservice -tomcat start') or as a different command (like 'javaservice start')?
[16:05:37] In other words, which is less confusing?
[16:08:11] We should keep an open mind for other, creative (:-)) alternatives. Is "javaservice" well-defined? I. e., is there only one way to set up a Java server in a directory?
[16:13:41] Oh, and if we add options, /please/ "--tomcat". It's enough when SGE falls out of line :-).
[16:13:59] (+1 to scfc_de)
[16:14:04] actually
[16:14:09] you should have subcommands
[16:14:14] webservice tomcat start
[16:14:18] webservice lighty start
[16:14:30] --tomcat doesn't feel right. it isn't really an option or a parameter
[16:28:42] hi there
[16:29:29] i am trying to migrate something to the 'NewWeb' per-user method
[16:30:03] it seems that NewWeb somehow uses different python library versions, which creates problems
[16:30:09] yuvipanda: Well, I'd rather go --tomcat than tomcat if it came to that; but I see the parallel with 'service' I suppose. It really /is/ an option to how webservice works fundamentally, "reserve port, inform proxy, start daemon on grid, etc".
[16:30:53] JohannesK_WMDE: It shouldn't; the webgrid shares the same exec environment the webservers do nominally. If there are divergences, it's a bug. Do you know what package seems to not match?
[16:31:29] Coren: yes; i will post some info from a log file in a minute
[16:32:47] Coren: http://pastebin.com/3DYdkpwz
[16:33:34] Coren: but why?
[16:33:43] Coren: +1 on service analogy
[16:34:13] JohannesK_WMDE: It's a bug. 1.1 was built locally, I think things might have been configured before puppet added the local repos. Lemme see if it's just a necessary aptitude upgrade.
[16:35:40] Preparing to replace python-requests 0.8.2-1 (using .../python-requests_1.2.3-1_all.deb) ...
[16:35:59] Yep. ensure => present bites Labs again.
[16:36:00] Coren: thanks, that sounds good :)
[16:37:27] Coren: yes, works flawlessly now (and chunked transfer encoding seems to work as well)
[16:37:54] Coren, not to be pushy, but when is Eqiad scheduled?
[16:38:07] Out of curiosity of course.
[16:38:42] Cyberpower678: We're stalled waiting on a contractor that has fallen ill, but I'm still hoping for end of February, but it might end up being mid-March.
[16:41:00] Oooh my birthday is coming up.
[16:41:14] * Cyberpower678 is getting eqiad for his birthday. :D
[16:42:09] Heh.
[16:45:00] Coren: BTW, is there a 20000" picture of pmtpa/eqiad somewhere? I. e., x servers with CPU/disk, y disk arrays connected to the servers this way, etc.?
[16:47:40] what's "
[16:47:47] Feet.
[16:48:00] Well, no, inch.
[16:48:24] scfc_de: ... you know? I don't think there is. At least no public enumeration that I know of. You can probably piece it together from different places, but I don't know that anyone ever took the time to draw a pretty diagram.
[16:48:27] So '20000 * 12" picture'.
[16:48:36] 20000 inch * 4 arcminutes =~ 0.6 m
[16:49:00] 20000 ft * 4 arcminutes =~ 7 m
[16:49:10] I think the 0.6m view is what you want, actually.
[16:49:32] ;-)
[16:49:41] valhallasw: Depends on where you take it from; if you choose poorly you'll only see rack posts. :-)
[16:49:55] http://hostcabi.net/domain/wmflabs.org is amusing.
[16:51:02] http://hostcabi.net/domain/wmflabs.org why am I waiting on connect.facebook.net when I go there ??
[16:51:23] GerardM-: I think because they have little 'like' icons or somesuch.
[16:51:34] stupid
[16:51:35] GerardM-: My browser tends to hide most of that junk.
[16:51:37] now, interestingly, I could have written the above as 20000' * 4' *grin*
[16:51:51] (' is used both for ft and arcsec)
[16:52:01] arcmin*
[16:52:40] Their geolocation is teh suck.
[16:53:15] I found the money evaluations amusing mostly. We estimate this website generates about $256 USD of daily revenue. Hah.
[16:54:54] Coren: I would accept a (pretty) table as well :-).
[16:55:46] Hello! Is there somebody knowledgeable who could explain to me what the right server address (for dewiki_p) is to type in Putty and create a tunnel for an ODBC connector? I tried with: s5.labsdb.pmtpa.wmflabs.org and some more but it didn't work. Thanks a lot!
[16:56:19] Uh, wikimedia.de is only "worth" $7,708 USD. Though they have a lot more revenue, I believe.
[16:57:15] scfc_de: I think they derive this from a hypothetical "possible ad revenue from traffic" metric.
[16:57:29] iassen: It should be just 's5.labsdb'
[16:58:05] Thanks, Coren! Trying..
[16:58:49] iassen: But also, you're probably not going to like the result; ODBC is very verbose and things are likely to be ridiculously slow. You're probably better off running the queries within the labs and exporting the results.
[16:59:45] I am using R, so it would be handier to manage to connect it. It used to work well with the Toolserver
[17:00:41] iassen: No worries, I was just pointing it out but if it works for you all is well. :-)
[17:01:34] You might also prefer to use 'dewiki.labsdb' instead; that will point to the same place but is proof against the database changing slice (although that's honestly not all that likely)
[17:07:31] @Coren it seems dewiki.labsdb is not enough. Do you think I don't need the ending: wmflabs.org in putty?
[17:08:35] iassen: No, those are all local aliases, not true FQDNs. It really depends on how putty is transmitting the hostname to the other end when trying to set the tunnel up, something which is a bit obscure. You could use the IP but that's not a good long-term solution even though it'd work.
[17:09:11] Aha, I'll try..
[17:09:31] If putty is trying to resolve the hostname on your side, then that'd make things not work because those names are only visible from tool labs.
[17:10:46] And if s5.labsdb works, dewiki.labsdb should work as well - or both of them fail.
[17:31:26] Both work now as you said. I had a typo... Thanks a lot Coren and scfc_de! And have a good day!
[19:19:26] (CR) PiRSquared17: "(2 comments)" (2 comments) [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109258 (owner: Gerrit Patch Uploader)
[19:41:07] (PS1) John F. Lewis: Add zhwikivoyage to smallwikis [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109334
[19:59:53] (CR) PiRSquared17: [C: 2] Add zhwikivoyage to smallwikis [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109334 (owner: John F. Lewis)
[20:03:33] (CR) PiRSquared17: [C: -2 V: 2] Add zhwikivoyage to smallwikis [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109334 (owner: John F. Lewis)
[20:04:24] (CR) PiRSquared17: [C: 2] "is this how to do it?" [labs/tools/WMT] - https://gerrit.wikimedia.org/r/109334 (owner: John F. Lewis)
[23:47:48] Can someone help solve the broken replication of tewiki_p http://lists.wikimedia.org/pipermail/labs-l/2014-January/002018.html
[23:49:01] Arjunaraoc: Which query specifically shows the problem? From the thread it seems it's working fine.
[23:50:03] anomie: both queries have a problem, as the data with respect to edits on the last three pages is missed out
[23:50:33] I am referring to the query linked from the email.
[23:50:37] Arjunaraoc: Your first query returned the same results when I tried it on the live servers.
[23:50:56] anomie: strange!
[23:52:25] anomie: Can you look at the history and let me know why edits are not reflected in the results https://te.wikipedia.org/w/index.php?title=%E0%B0%AE%E0%B0%A6%E0%B0%B0%E0%B1%8D_%E0%B0%A5%E0%B1%86%E0%B0%B0%E0%B1%80%E0%B0%B8%E0%B0%BE&action=history
[23:54:33] Arjunaraoc: For one thing, page titles in the database have underscores ("_") rather than spaces.
[23:55:50] anomie: Thx. you are right. I missed them when I modified from an old script.
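Editor's note on the last exchange: a minimal sketch of the underscore point, assuming a script that builds its own title strings before querying the replica. The helper name `to_db_title` is hypothetical and not from the log; it only swaps spaces for underscores and does not handle namespace prefixes or first-letter capitalization.

```python
def to_db_title(title: str) -> str:
    """Return a title in the underscore form used for stored page titles.

    Hypothetical helper for illustration: it trims surrounding whitespace
    and replaces each space with an underscore, nothing more.
    """
    return title.strip().replace(" ", "_")


if __name__ == "__main__":
    # A title typed with spaces would not match a stored title with underscores.
    print(to_db_title("Main Page"))
```

Passing the converted value as a bound query parameter, rather than pasting it into the SQL text, is probably the safer way to use it.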