From: Oleksandr Gavenko <gavenkoa@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 24405@debbugs.gnu.org
Subject: bug#24405: 24.5; Possibly ``forward-word`` doesn't respect ``word-combining-categories`` for word boundaries on changing between latin/phonetic scripts.
Date: Sun, 11 Sep 2016 14:57:33 +0300
Message-ID: <87r38qtzrm.fsf@gavenkoa.example.com>
In-Reply-To: <83h99n8y9e.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 10 Sep 2016 20:23:25 +0300")

On 2016-09-10, Eli Zaretskii wrote:

>> Another solution is to invent own:
>> 
>>   (define-category ?p "Phonetic")
>> 
>> and to add it to IPA characters:
>> 
>>   (mapc (lambda (ch) (modify-category-entry ch "p"))
>>         '(?ʌ ?ə ?ɜ ?ɒ ?ɛ ?θ ?ʊ ?ɪ ?ɔ ?ɑ ?ʃ ?ʧ ?ː ?ˈ ?ˌ ?ʒ ?ŋ))
>> 
>> so it becomes possible to use:
>> 
>>   (add-to-list 'word-combining-categories '(?p . ?l))
>>   (add-to-list 'word-combining-categories '(?l . ?p))
>
> That'd be my second best advice.  But I think regular expressions
> should provide a better and easier solution.

This works for me:

  (defconst my/ipa-chars (list ?ˈ ?ˌ ?ː ?ǁ ?ʲ ?θ ?ð ?ŋ ?ɡ ?ʒ ?ʃ ?ʧ ?ə ?ɜ ?ɛ ?ʌ ?ɒ ?ɔ ?ɑ ?æ ?ʊ ?ɪ))
  (define-category ?p "Phonetic")
  (mapc (lambda (ch)
          (cond
           ;; Phonetic characters: add category p and remove category l.
           ((eq (aref char-script-table ch) 'phonetic)
            (modify-category-entry ch ?p)
            (modify-category-entry ch ?l nil t))
           ;; (aref char-script-table ?ˌ) is 'latin but
           ;; (char-category-set ?ˌ) is ".j", so add category l explicitly.
           ((eq (aref char-script-table ch) 'latin)
            (modify-category-entry ch ?l))))
        my/ipa-chars)
  (add-to-list 'word-combining-categories '(?p . ?l))
  (add-to-list 'word-combining-categories '(?l . ?p))
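
A quick way to check the effect of the setup above is to see where
``forward-word`` stops in a scratch buffer. This is only a minimal sketch:
the helper name my/forward-word-end is hypothetical (not part of Emacs), and
what it returns for mixed Latin/IPA text depends on the Emacs version and on
the category tables in effect.

```elisp
;; Hypothetical helper, not part of Emacs: report where `forward-word'
;; stops when started at the beginning of TEXT.
(defun my/forward-word-end (text)
  "Return point after one `forward-word' from the start of TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (forward-word 1)
    (point)))

;; Example call; the first word has 8 characters, so a result of 9
;; would mean the whole word was crossed in one step:
;;   (my/forward-word-end "fəˈnɛtɪk alphabet")
```

A smaller result would mean ``forward-word`` still stops inside the word.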

But adding and removing categories feels too low-level. It also requires a
custom category, (define-category ?p "Phonetic"), that Emacs itself does not
define.

This looks easier to me:

  (mapc (lambda (ch)
          (aset char-script-table ch 'latin)
          (modify-syntax-entry ch "w"))
        my/ipa-chars)

But ``char-script-table`` is derived from the Unicode database, and some code
may depend on it...
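
One way to limit that risk might be to remember the original values, so the
override can be undone. The following is only a sketch under that assumption:
my/ipa-saved-scripts, my/ipa-override-scripts and my/ipa-restore-scripts are
hypothetical names, and it does not help code that reads the table while the
override is active.

```elisp
;; Hypothetical names throughout; none of this is part of Emacs.
(defvar my/ipa-saved-scripts nil
  "Alist of (CHAR . ORIGINAL-SCRIPT) recorded before overriding.")

(defun my/ipa-override-scripts (chars)
  "Mark each character in CHARS as `latin' in `char-script-table'.
The previous script of each character is saved for later restoring."
  (dolist (ch chars)
    ;; Save the original only once, so repeated calls do not replace
    ;; it with the already-overridden value.
    (unless (assq ch my/ipa-saved-scripts)
      (push (cons ch (aref char-script-table ch)) my/ipa-saved-scripts))
    (aset char-script-table ch 'latin)))

(defun my/ipa-restore-scripts ()
  "Undo `my/ipa-override-scripts' using the saved values."
  (dolist (entry my/ipa-saved-scripts)
    (aset char-script-table (car entry) (cdr entry)))
  (setq my/ipa-saved-scripts nil))
```

With this, (my/ipa-override-scripts my/ipa-chars) replaces the ``aset`` call
in the loop above, and (my/ipa-restore-scripts) puts the Unicode-derived
values back. The ``modify-syntax-entry`` call is left out here on purpose; it
is still needed separately.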

-- 
http://defun.work/






Thread overview: 6+ messages
2016-09-10  8:33 bug#24405: 24.5; Possibly ``forward-word`` doesn't respect ``word-combining-categories`` for word boundaries on changing between latin/phonetic scripts Oleksandr Gavenko
2016-09-10 10:05 ` Eli Zaretskii
2016-09-10 17:12   ` Oleksandr Gavenko
2016-09-10 17:23     ` Eli Zaretskii
2016-09-11 11:57       ` Oleksandr Gavenko [this message]
2019-09-29  4:33 ` Stefan Kangas
