From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Abrahamsen
Newsgroups: gmane.emacs.help
Subject: Re: Emacs as a translator's tool
Date: Fri, 29 May 2020 11:22:55 -0700
Message-ID: <87tuzyr474.fsf@ericabrahamsen.net>
References: <871rn35lqc.fsf@mbork.pl> <87zh9r45ad.fsf@mbork.pl>
 <87h7vz2m5g.fsf@ebih.ebihd> <87zh9qr67n.fsf@ericabrahamsen.net>
 <3DBA2692-28A0-4AC3-B884-78763A9C7B16@traduction-libre.org>
In-Reply-To: <3DBA2692-28A0-4AC3-B884-78763A9C7B16@traduction-libre.org>
 (Jean-Christophe Helary's message of "Sat, 30 May 2020 02:58:07 +0900")
To: Jean-Christophe Helary
Cc: help-gnu-emacs, Emanuel Berg, Yuri Khan
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Mime-Version: 1.0
Content-Type: text/plain

Jean-Christophe Helary writes:

>> On May 30, 2020, at 2:39, Eric Abrahamsen wrote:
>
>> I've thought many times over the years about what I would really want an
>> Emacs-based translation environment to provide for me. I don't do
>> technical translation, so there's not a whole lot of value in
>> sentence-by-sentence correspondences.
>
> Most translation tools I know (or I've used professionally) rely on a
> segmentation scheme set by the user. If the user wants paragraph based
> segmentation, so be it. What people call "sentence" segmentation is
> actually a regex based system that takes into account various signs in
> the source language.

Okay, that's good to know. I guess I would just set it to split by
paragraph, but would also like manual control in some cases.
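
To make the two segmentation schemes concrete, here is a minimal sketch
in Python. It is not how any particular CAT tool does it: the function
names and the sentence-ending regex are assumptions made up for
illustration, and a real tool would let the user edit the rules per
source language.

```python
import re

# Hypothetical sentence rule: break after CJK end punctuation (no space
# needed), or after ., ! or ? when followed by whitespace.
SENTENCE_BREAK = re.compile(r'(?<=[。！？])|(?<=[.!?])\s+')

# Paragraph rule: one or more blank lines.
PARAGRAPH_BREAK = re.compile(r'\n\s*\n')


def split_paragraphs(text):
    """Return the non-empty paragraphs of TEXT, stripped of outer whitespace."""
    return [p.strip() for p in PARAGRAPH_BREAK.split(text) if p.strip()]


def split_sentences(paragraph):
    """Return the sentences of PARAGRAPH according to SENTENCE_BREAK."""
    return [s for s in SENTENCE_BREAK.split(paragraph) if s]


if __name__ == "__main__":
    sample = "First one. Second one!\n\nThird para."
    for para in split_paragraphs(sample):
        print(split_sentences(para))
```

Swapping `split_sentences` for the identity function gives plain
paragraph segmentation; manual control would need some extra marker in
the text, which this sketch leaves out.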

>> But as Yuri mentions it can be
>> very useful to keep track of how you've translated certain names, or
>> certain important terms, in different places throughout the text.
>> Basically I would want two things:
>>
>> 1. A way to keep track of location correspondences between the source
>>    text and translated text. CAT tool split the text up by sentence,
>
> (not true, see above)
>
>> but
>>    that's not very useful for fiction (particularly Chinese->English
>>    translation) because there's rarely a one-to-one correspondence.
>>    There /is/ a more reliable correspondence between paragraphs, though,
>>    and I'd like to know which paragraph equals which. The point would
>>    mostly be to find my place again when I start translating at the
>>    beginning of the day, and to implement a more useful follow-mode.
>
> I'm not sure I understand what you mean. What's the difficulty that you are facing ?
>
>> I
>>    imagined this would happen when the mode was turned on: it would run
>>    down the file and insert markers that would be used to find
>>    correspondences. Special characters could be inserted into the file
>>    to indicate that two paragraphs should be joined, or one paragraph
>>    split.
>
> What would be the use of such a marking ?

A follow-mode, as I mentioned above. And just finding my place. I do my
translation in two sibling Org sub-trees, original and translation,
displayed in two side-by-side windows. I don't want to mess with
two-column-mode or anything like that. I want to be able to go to the
bottom of the translation, run a command, and have the second window
display the corresponding original. If I realize I've done something
wrong a couple of chapters previously, and I skip back up to that
location in the translation, I want to run the same command to display
the corresponding spot in the original.

>> 2. Link terms in the translation to a glossary pulled from the original.
>>    This would be character names, places, special terms, etc. They might
>>    not always be translated the same way, but I need to know how I've
>>    handled them earlier in the document. Glossary terms would be
>>    highlighted in the source text, and when you came to the equivalent
>>    spot in the translation, you'd use a command like
>>    insert-translation-term that would prompt for the translation,
>>    offering completion on earlier translations, and then insert that
>>    term into the translated text with a link to the original in the
>>    glossary. There would also be two multi-occur commands: one that
>>    prompted for a translation and showed all the places in the source
>>    text where it came from, and another that did the opposite: prompted
>>    for an original glossary term and showed all the places in the
>>    translation where it was translated.
>
> Very nice ideas.

Maybe this will inspire me to write some code!

The nice thing about the glossary is that it wouldn't have to just be
vocabulary. You could just as easily use it for "every time the car
crash is referenced", or something like that. You'd just have to
manually mark the passage in the original, rather than automated marking
by text search.
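
As a rough illustration of the bookkeeping the glossary would need,
here is a small Python sketch. Nothing like it exists yet: the class,
its method names, and the use of a paragraph number as the "location"
are all assumptions for the sake of the example. A real version would
live in Emacs and use buffer positions or links instead.

```python
class Glossary:
    """Hypothetical store of how source terms were rendered, and where."""

    def __init__(self):
        # Each use is (source term, translation, paragraph number).
        self.uses = []

    def record(self, term, translation, paragraph):
        """Remember that TERM was rendered as TRANSLATION in PARAGRAPH."""
        self.uses.append((term, translation, paragraph))

    def earlier_translations(self, term):
        """Distinct renderings of TERM so far, in first-use order.

        These would be the completion candidates offered by a command
        like insert-translation-term.
        """
        seen = []
        for used_term, translation, _ in self.uses:
            if used_term == term and translation not in seen:
                seen.append(translation)
        return seen

    def where_translated(self, term):
        """(paragraph, translation) for every place TERM was translated."""
        return [(p, tr) for t, tr, p in self.uses if t == term]

    def where_from(self, translation):
        """(paragraph, source term) for every place TRANSLATION came from."""
        return [(p, t) for t, tr, p in self.uses if tr == translation]


if __name__ == "__main__":
    g = Glossary()
    g.record("老王", "Old Wang", 3)
    g.record("老王", "Lao Wang", 10)
    print(g.earlier_translations("老王"))
    print(g.where_from("Old Wang"))
```

The two `where_*` methods stand in for the two multi-occur commands.
Because an entry is just a label plus a location, the same structure
would work for manually marked passages such as "every time the car
crash is referenced".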