From: Joost Kremers
Newsgroups: gmane.emacs.help
Subject: Re: How to compare strings?
Date: 29 Apr 2007 23:06:12 GMT
To: help-gnu-emacs@gnu.org

Jesper Harder wrote:
> "Lennart Borgman (gmail)" writes:
>
>> But I think there are completely different problems too. Does not some
>> languages sort partly depending the phonetics instead of the spelling?
>
> Yes. In Danish 'aa' is alphabetized according to how it's
> pronounced.
>
> If it is pronounced as two vowels (e.g. ekstraarbejde), it's
> alphabetized as two a's. If it is pronounced as one vowel
> (e.g. afrikaans) is alphabetized as å (the last letter in the Danish
> alphabet).

technically, this is not (if i understand things correctly, i don't
speak danish) a case of alphabetising according to pronunciation. when
'aa' is, as you put it, pronounced as one vowel, it is technically a
digraph, i.e. a combination of two letters that indicate a single sound.

many languages have digraphs, e.g. english has th, ch, ph and ng, and
quite a few vowel combinations that are pronounced as one vowel (or
diphthong); dutch has quite a few vowel digraphs (with pronunciations
that are somewhat more regular than in english ;-), e.g. oe, eu, ui, au,
ou, ei and ij.

in some languages, digraphs are treated as single letters for
alphabetisation. the 'aa' case in danish above is an example.

sometimes, digraphs present particularly interesting problems. in dutch
dictionaries, the digraph ij is treated as two letters, so words
starting with ij appear under i, but in phone books and the like, it's
often treated as equivalent to y, so that names starting with ij appear
intermingled with y.

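to make the dutch ij case concrete, here is a minimal python sketch of the two orderings. the function names and the sample names are made up for illustration, and the rules are simplified: a real collation would use locale tailoring rather than a plain string replace, and would have to deal with an i and a j that merely happen to be adjacent.

```python
# Minimal sketch: two sort keys for Dutch 'ij'.
# Both helper names are hypothetical; the rules are deliberately simplified.

def dictionary_key(word):
    """Dictionary order: 'ij' is two letters, so it sorts under 'i'."""
    return word.lower()


def phonebook_key(name):
    """Phone-book order: treat the digraph 'ij' as equivalent to 'y'."""
    return name.lower().replace("ij", "y")


# Made-up sample list, ASCII only.
names = ["Ypma", "IJsselstein", "Yildiz", "Ibsen", "IJzerman", "Jansen", "Yucel"]

dictionary_order = sorted(names, key=dictionary_key)
phonebook_order = sorted(names, key=phonebook_key)

print(dictionary_order)
print(phonebook_order)
```

with the first key the ij names land between Ibsen and Jansen; with the second they are mixed in among the names starting with y.
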
and then there's the case of nahuatl, which has a bunch of consonant
digraphs (ch, cu/uc, hu/uh, qu, tl, tz). dictionaries often (though not
always, there's no "standard" here) have separate sections for words
starting with these digraphs, but for the rest treat them as two
separate letters for alphabetisation within a section. (well, there's of
course the whole issue of roots vs. stems and the fact that cu/uc and
hu/uh change based on their position in the word they're in, but let's
not get into that. ;-)

-- 
Joost Kremers                                      joostkremers@yahoo.com
Selbst in die Unterwelt dringt durch Spalten Licht
EN:SiS(9)