unofficial mirror of help-gnu-emacs@gnu.org
From: Joost Kremers <joostkremers@yahoo.com>
To: help-gnu-emacs@gnu.org
Subject: Re: How to compare strings?
Date: 29 Apr 2007 23:06:12 GMT
Message-ID: <slrnf3a95i.6tf.joostkremers@j.kremers4.news.arnhem.chello.nl>
In-Reply-To: mailman.2702.1177885779.7795.help-gnu-emacs@gnu.org

Jesper Harder wrote:
> "Lennart Borgman (gmail)" <lennart.borgman@gmail.com> writes:
>
>> But I think there are completely different problems too. Does not some 
>> languages sort partly depending the phonetics instead of the spelling?
>
> Yes. In Danish 'aa' is alphabetized according to how it's
> pronounced. 
>
> If it is pronounced as two vowels (e.g. ekstraarbejde), it's
> alphabetized as two a's. If it is pronounced as one vowel
> (e.g. afrikaans) is alphabetized as å (the last letter in the Danish
> alphabet).

technically, this is not (if i understand things correctly; i don't speak
danish) a case of alphabetising according to pronunciation. when 'aa' is,
as you put it, pronounced as one vowel, it is technically a digraph, i.e. a
combination of two letters that indicates a single sound.

many languages have digraphs, e.g. english has th, ch, ph and ng, and quite
a few vowel combinations that are pronounced as one vowel (or diphthong);
dutch has quite a few vowel digraphs (with pronunciations that are somewhat
more regular than in english ;-), e.g. oe, eu, ui, au, ou, ei and ij.

in some languages, digraphs are treated as single letters for
alphabetisation. the 'aa' case in danish above is an example. sometimes,
digraphs present particularly interesting problems. in dutch dictionaries,
the digraph ij is treated as two letters, so words starting with ij appear
under i, but in phone books and the like, it's often treated as equivalent
to y, so that names starting with ij appear intermingled with y.
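
to make the two dutch conventions concrete, here is a minimal python
sketch. the function names and the sample names are made up for
illustration, and replacing every "ij" with "y" is a simplifying
assumption (it does not check whether a given i+j really is the digraph),
so take it as a rough picture rather than a real collation routine:

```python
def dictionary_key(word: str) -> str:
    """Dictionary convention: ij is simply i followed by j."""
    return word.lower()


def phonebook_key(name: str) -> str:
    """Phone-book convention: ij collates as if it were y.

    Assumption: every "ij" sequence is the digraph. A real implementation
    would need to know which i+j sequences are two separate letters.
    """
    return name.lower().replace("ij", "y")


# Hypothetical sample data, chosen only to show the difference.
names = ["Yildiz", "IJsselstein", "Yzerman", "Ibsen", "Jansen"]

dictionary_order = sorted(names, key=dictionary_key)
phonebook_order = sorted(names, key=phonebook_key)

print(dictionary_order)
# ['Ibsen', 'IJsselstein', 'Jansen', 'Yildiz', 'Yzerman']
print(phonebook_order)
# ['Ibsen', 'Jansen', 'Yildiz', 'IJsselstein', 'Yzerman']
```

in the first list the ij name sits under i; in the second it lands between
the y names.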

and then there's the case of nahuatl, which has a bunch of consonant
digraphs (ch, cu/uc, hu/uh, qu, tl, tz). dictionaries often (though not
always; there's no "standard" here) have separate sections for words
starting with these digraphs, but otherwise treat them as two separate
letters for alphabetisation within a section. (well, there's of course the
whole issue of roots vs. stems and the fact that cu/uc and hu/uh change
based on their position in the word, but let's not get into
that. ;-)
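
the nahuatl scheme (digraphs get their own sections, but count as two
letters inside a section) can be sketched as a two-part sort key. this is
only an illustration: the names are hypothetical, the word list is a
handful of sample words, and placing each digraph section directly after
the section of its first letter is my assumption, since as said there is
no standard:

```python
# Word-initial digraphs that get their own dictionary section.
INITIAL_DIGRAPHS = ("ch", "cu", "hu", "qu", "tl", "tz")


def section_of(word: str) -> str:
    """Return the section heading: an initial digraph, else the first letter."""
    for digraph in INITIAL_DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def sectioned_key(word: str) -> tuple[str, str]:
    """Sort by section first, then plain letter-by-letter within the section."""
    return (section_of(word), word)


words = ["cuauhtli", "tlalli", "calli", "tototl",
         "chilli", "tzontli", "cihuatl", "tepetl"]

plain_order = sorted(words)
sectioned_order = sorted(words, key=sectioned_key)

print(plain_order)
# ['calli', 'chilli', 'cihuatl', 'cuauhtli',
#  'tepetl', 'tlalli', 'tototl', 'tzontli']
print(sectioned_order)
# ['calli', 'cihuatl', 'chilli', 'cuauhtli',
#  'tepetl', 'tototl', 'tlalli', 'tzontli']
```

with the sectioned key, chilli moves behind cihuatl and tlalli behind
tototl, because ch and tl start their own sections.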


-- 
Joost Kremers                                      joostkremers@yahoo.com
Selbst in die Unterwelt dringt durch Spalten Licht
EN:SiS(9)

