From: Emanuel Berg via Users list for the GNU Emacs text editor
Newsgroups: gmane.emacs.help
Subject: Re: Emacs as a translator's tool
Date: Mon, 08 Jun 2020 14:15:12 +0200
Message-ID: <87ftb5ycrz.fsf@ebih.ebihd>
To: help-gnu-emacs@gnu.org

> One thing tho, M. Helary said something about the
> chopping up of the input into segments, my
> intuition tells me they are shorter (the input more
> segmentized) than what you get with
> `forward-sentence' and `backward-sentence'. (My
> intuition also tells me backward-sentence is
> (forward-sentence -1) ...)
>
> Maybe `sentence-end' already has been configured
> somewhere to get the most restrictive definition,
> i.e., here, with the purpose of getting the
> shortest possible segments that still make sense...

Unless... unless the DB is really fine-tuned already.
Then we should write our own segmentation rules, so
that we get exactly the same segments as they (OmegaT
or whoever has it) use...
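
A minimal sketch of what "our own segmentation rules" could look like in Elisp. The names `my-segment-end-regexp' and `my-segments' are hypothetical, and the rule itself (also break after ":" and ";") is only an assumption for illustration, not what OmegaT or any existing DB actually uses. It searches with its own regexp instead of going through `sentence-end', so the normal sentence commands are left alone:

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical rule, NOT taken from OmegaT: a segment ends at one or
;; more of . ? ! ; : optionally followed by closing quotes or
;; brackets, and then whitespace or the end of the text.
(defvar my-segment-end-regexp
  "[.?!;:]+[]\"')]*\\(?:[ \t\n]+\\|\\'\\)"
  "Regexp matching the end of a translation segment (an assumption).")

(defun my-segments (string)
  "Return the segments of STRING as a list of strings.
Segments are delimited by `my-segment-end-regexp'; whitespace
around each segment is dropped."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((segments '()))
      ;; Skip leading whitespace; stop when nothing but whitespace is left.
      (while (progn (skip-chars-forward " \t\n")
                    (not (eobp)))
        (let ((beg (point)))
          ;; No further delimiter: the rest of the text is the last segment.
          (unless (re-search-forward my-segment-end-regexp nil t)
            (goto-char (point-max)))
          (push (save-excursion
                  ;; Leave out the whitespace that ended the segment.
                  (skip-chars-backward " \t\n")
                  (buffer-substring-no-properties beg (point)))
                segments)))
      (nreverse segments))))
```

E.g., (my-segments "One thing: it works. Does it? Yes") should give four segments where the default sentence commands would give three. Abbreviations like "e.g. " will be split as well, so real rules would need an exception list.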
--
underground experts united
http://user.it.uu.se/~embe8573
https://dataswamp.org/~incal