From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#24405: 24.5; Possibly ``forward-word`` doesn't respect ``word-combining-categories`` for word boundaries on changing between latin/phonetic scripts.
Date: Sat, 10 Sep 2016 13:05:09 +0300
Message-ID: <83lgz083ze.fsf@gnu.org>
References: <87mvjgupau.fsf@gavenkoa.example.com>
To: Oleksandr Gavenko
Cc: 24405@debbugs.gnu.org

tags 24405 + notabug
thanks

> From: Oleksandr Gavenko
> Date: Sat, 10 Sep 2016 11:33:45 +0300
>
> Evaluate following form by C-x C-e:
>
>   (let ((word-combining-categories '((?l . ?y) (?y . ?l) (?l . ?l)))
>         (word-separating-categories nil))
>     (forward-word))
>
>   HelloПривLLжɪəʊheləʊaiɪa
>
> My pointer stopped between ʊh.
>
> I have:
>
>   (aref char-script-table ?ʊ)   phonetic
>   (aref char-script-table ?h)   latin
>   (aref char-script-table ?ж)   cyrillic
>
>   (category-set-mnemonics (char-category-set ?ʊ))   ".Ljl"
>   (category-set-mnemonics (char-category-set ?h))   ".Lalr"
>
>   (category-docstring ?y)   "Cyrillic"
>   (category-docstring ?l)   "Latin"
>
> I expect that point moved to last character before new line.
>
> Seems that:
>
>   (?l . ?y) (?y . ?l)
>
> has effect because pointer moved across Cyrillic/Latin and Cyrillic/Phonetic
> scripts but refused to move through Latin/Phonetic scripts.
>
> If it is intended behavior how will I make Emacs to move across Latin/Phonetic
> scripts?

You can't do this for two characters that belong to different scripts
but have the same categories in their category sets.  Those two
characters both have the 'l' (Latin) category in their sets, so you
cannot force Emacs not to treat the position between them as a word
boundary.  For the same reason, including a cons cell whose members
are identical, such as (?l . ?l), has no effect.

This is the intended behavior, yes.
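
The shared-category situation described above can be checked directly. The following is only a minimal sketch, not part of the original reply: it assumes the two characters from the report (ʊ and h) and relies on the fact that a category set is a bool-vector indexed by category character, so `aref` can test membership.

```elisp
;; Sketch: do ʊ and h both carry the `l' (Latin) category?
;; The characters are the ones from the report; nothing here
;; changes any Emacs setting.
(let ((set1 (char-category-set ?ʊ))
      (set2 (char-category-set ?h)))
  (list
   ;; Human-readable mnemonics for each character's category set.
   (category-set-mnemonics set1)
   (category-set-mnemonics set2)
   ;; Non-nil means `l' is in both sets, which is the case
   ;; described in the explanation above.
   (and (aref set1 ?l) (aref set2 ?l))))
```

Evaluate it with C-x C-e. With the category sets reported above (".Ljl" and ".Lalr"), the last element should be t.
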
The word-combining-categories feature is designed to support specific
rare situations that mix Far Eastern scripts (e.g., use of Kanji
characters in Japanese text), not for arbitrary games with Latin and
European scripts.

May I ask why you need to consider the above a single word?  In what
situation(s) does that make sense?

Thanks.