From: "Drew Adams" <drew.adams@oracle.com>
To: "'martin rudalics'" <rudalics@gmx.at>
Cc: perin@panix.com, perin@acm.org, 13041@debbugs.gnu.org
Subject: bug#13041: 24.2; diacritic-fold-search
Date: Wed, 5 Dec 2012 07:38:10 -0800
Message-ID: <611DD154E83240D183A7B5B88691DC37@us.oracle.com>
In-Reply-To: <50BF1702.4020100@gmx.at>

> `ignore-diacritics' is misleading.  The variable would have 
> to be called `observe-decompositions' or something the like.


1. "Observe decompositions" doesn't mean anything to me.  The verb should
probably be more active - what does it mean to observe the char decompositions
here?

BTW, if we use "decomposition" in the name and description then we should
probably also use "char" - this is not about decomposing strings in some way
(whatever that might mean); it involves decomposing Unicode characters.


2. But my confusion over the name/description is in fact wrt function
`decomposed-string-lessp': I guess it's not 100% clear to me what it does.

Your doc string said "STRING1 is decomposition-less than STRING2", which
confuses me.  And it is a bit ambiguous wrt "-less":

 a. decomposition-less as in comparing the strings only after
    removing (some parts of) their decompositions (i.e., "-less"
    as in "sans")?

or

 b. -lessp as in `string<': a comparison ordering relation?

In the version of `decomposed-string-lessp' that I sent, I changed the doc
string to this: "decomposed STRING1 is less than decomposed STRING2".  But that
is no doubt incorrect (less correct than yours, if perhaps clearer).  In
particular, it says nothing about how we compare the two decompositions.

In practical (use) terms, this is typically about ignoring diacritics, keeping
only the "base" characters.  Something about that should at least be mentioned
in the doc, so that users know they can use this for that.

But IIUC this is not just about diacritics; it sometimes might not be about
diacritics at all; and diacritics present are sometimes not ignored.  E.g., the
ligature ﬃ (U+FB03) gets treated the same as the 3 chars f f i.  There are no
diacritics present in that case.
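
To make the two cases concrete, here is a minimal Python sketch (not the Emacs
Lisp code under discussion).  It assumes that "decomposition" means Unicode
compatibility decomposition (NFKD); the helper names `decompose' and
`strip_marks' are hypothetical, introduced only for illustration.

```python
import unicodedata


def decompose(s: str) -> str:
    """Return the Unicode compatibility decomposition (NFKD) of s."""
    return unicodedata.normalize("NFKD", s)


def strip_marks(s: str) -> str:
    """Decompose s, then drop combining marks, keeping the base characters."""
    return "".join(c for c in decompose(s) if not unicodedata.combining(c))


accented = "r\u00e9sum\u00e9"  # "résumé" with precomposed e-acute (U+00E9)
ligature = "o\ufb03ce"         # "office" spelled with the ffi ligature (U+FB03)

# A diacritic case: e-acute decomposes into "e" plus COMBINING ACUTE ACCENT.
print([hex(ord(c)) for c in decompose("\u00e9")])  # ['0x65', '0x301']
print(strip_marks(accented))                       # resume

# A non-diacritic case: the ligature decomposes into three plain letters.
print(decompose("\ufb03"))                         # ffi
print(strip_marks(ligature))                       # office
```

In the second case no combining mark is removed at all; the equivalence comes
entirely from the decomposition, which is why "ignore diacritics" describes
only part of the behavior.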

IIUC, we convert the two strings to their Unicode decompositions and then use
the Unicode char compatibility specs to compare the decompositions.  IOW, we
treat equivalent chars, as defined by Unicode, as the same.
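
As a rough illustration of that reading, the following Python sketch compares
strings by their folded forms.  It is an assumption about the intended
semantics, not the implementation posted in this thread: `fold',
`folded_string_lessp' and `folded_string_equal' are hypothetical names, and
dropping combining marks after NFKD is one possible interpretation of "treat
equivalent chars as the same".

```python
import unicodedata


def fold(s: str) -> str:
    """NFKD-decompose s and drop combining marks (assumed folding rule)."""
    decomposed = unicodedata.normalize("NFKD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def folded_string_lessp(s1: str, s2: str) -> bool:
    """True if folded s1 sorts before folded s2 (an ordering relation)."""
    return fold(s1) < fold(s2)


def folded_string_equal(s1: str, s2: str) -> bool:
    """True if s1 and s2 have the same folded form."""
    return fold(s1) == fold(s2)


words = ["zebra", "\u00e9cole", "eclair"]  # "école" with precomposed e-acute

# Plain code-point order puts "école" last, because U+00E9 > "z".
print(sorted(words) == ["eclair", "zebra", "\u00e9cole"])            # True

# Ordering by folded form puts it between "eclair" and "zebra".
print(sorted(words, key=fold) == ["eclair", "\u00e9cole", "zebra"])  # True

print(folded_string_lessp("\u00e9cole", "zebra"))   # True
print(folded_string_equal("o\ufb03ce", "office"))   # True
```

Here "-lessp" is clearly reading (b): an ordering relation like `string<',
applied after both arguments have been folded.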

Perhaps the name/description should speak in terms of Unicode char compatibility
or equivalence.  Perhaps a name like `string-less-compat-p'?  Or
`Unicode-equivalent-p'?  Or `string-equivalent-p'?

How would you characterize what the function does?  No doubt Eli can help here.
It is important to try to get the function name and description right from the
outset, if we can.  If the Unicode standard has some terminology that applies
here then perhaps we can/should leverage that.

Beyond the name and an accurate description, the doc should, as I say, at least
mention that you can use this to ignore diacritics (such as accents), as that
will be a common use case.
