From: Simon Josefsson
Newsgroups: gmane.emacs.devel
Subject: Re: More Cyrillic vs UTF-8
Date: Sat, 26 Apr 2003 23:47:21 +0200
References: <84he8lovtc.fsf@lucy.is.informatik.uni-duisburg.de> <841xzphrr4.fsf@lucy.is.informatik.uni-duisburg.de>
In-Reply-To: <841xzphrr4.fsf@lucy.is.informatik.uni-duisburg.de> (Kai Großjohann's message of "Sat, 26 Apr 2003 23:29:35 +0200")
Original-To: emacs-devel@gnu.org
User-Agent: Gnus/5.090019 (Oort Gnus v0.19) Emacs/21.3.50 (gnu/linux)

kai.grossjohann@gmx.net (Kai Großjohann) writes:

> Simon Josefsson writes:
>
>> kai.grossjohann@gmx.net (Kai Großjohann) writes:
>>
>>> Simon Josefsson writes:
>>>
>>>> Richard Stallman writes:
>>>>
>>>>> Mentioning this in PROBLEMS seems like a good idea to me, but a useful
>>>>> entry needs to be stated in terms of what behavior the user sees.
>>>>> This text doesn't explain the practical consequences; a user would say
>>>>> "so what does that mean for me?"
>>>>
>>>> Is this better?
>>>
>>> Can you say what characters you're talking about, instead of just the
>>> code points?  I guess that most people haven't memorized the Unicode
>>> table (yours truly included ;-).
>>
>> I agree, but I don't know which they are, and maybe the range includes
>> very many different kinds of characters.
>> And as new characters are
>> added all the time, I fear that both the list of supported characters
>> and the list of unsupported characters would be too long to be useful.
>> Hm.
>
> Well, isn't Unicode divided into blocks so that one can list the
> blocks?  Hm.  Oh!  See http://www.unicode.org/charts/ -- looks quite
> promising.  Searching for the code blocks there and then giving the
> names ought to be useful.  WDYT?

The compiled list is below.  Does it really help anyone to list all of
them?

Supported:

Basic Latin                           Optical Character Recognition
Latin-1 Supplement                    Enclosed Alphanumerics
Latin Extended-A                      Box Drawing
Latin Extended-B                      Block Elements
IPA Extensions                        Geometric Shapes
Spacing Modifier Letters              Miscellaneous Symbols
Combining Diacritical Marks           Dingbats
Greek                                 Miscellaneous Mathematical Symbols-A
Cyrillic                              Supplemental Arrows-A
Cyrillic Supplement                   Braille Patterns
Armenian                              Supplemental Arrows-B
Hebrew                                Miscellaneous Mathematical Symbols-B
Arabic                                Supplemental Mathematical Operators
Syriac                                CJK Radicals Supplement
Thaana                                Kangxi Radicals
Devanagari                            Ideographic Description Characters
Bengali                               CJK Symbols and Punctuation
Gurmukhi                              Hiragana
Gujarati                              Katakana
Oriya                                 Bopomofo
Tamil                                 Hangul Compatibility Jamo
Telugu                                Kanbun
Kannada                               Bopomofo Extended
Malayalam                             Enclosed CJK Letters and Months
Sinhala                               CJK Compatibility
Thai
Lao
Tibetan
Myanmar
Georgian
Hangul Jamo
Ethiopic
Cherokee                              Private Use Area
Unified Canadian Aboriginal Syllabic  CJK Compatibility Ideographs
Ogham                                 Alphabetic Presentation Forms
Runic                                 Arabic Presentation Forms-A
Tagalog                               Variation Selectors
Hanunoo                               Combining Half Marks
Buhid                                 CJK Compatibility Forms
Tagbanwa                              Small Form Variants
Khmer                                 Arabic Presentation Forms-B
Mongolian                             Halfwidth and Fullwidth Forms
Latin Extended Additional             Specials
Greek Extended
General Punctuation
Superscripts and Subscripts
Currency Symbols
Combining Marks for Symbols
Letterlike Symbols
Number Forms
Arrows
Mathematical Operators
Miscellaneous Technical
Control Pictures

Unsupported:

CJK Unified Ideographs Extension A (1.5MB)
CJK Unified Ideographs (5MB)
Yi Syllables
Yi Radicals
Hangul Syllables (7MB)
High Surrogates
Low Surrogates
Old Italic
Gothic
Deseret
Byzantine Musical Symbols
Musical Symbols
Mathematical Alphanumeric Symbols
CJK Unified Ideographs Extension B (13MB)
CJK Compatibility Ideographs Supplement
Tags
Supplementary Private Use Area-A
Supplementary Private Use Area-B
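
For anyone who would rather not look the names up by hand, here is a minimal Python sketch of the lookup discussed in this thread: mapping a character to the name of its Unicode block. It is not anything Emacs ships. The names `BLOCKS`, `block_name` and `is_supported` are hypothetical, and the table is only a five-row excerpt. The ranges are the ones published on the Unicode charts page, and the supported/unsupported flags are copied from the two lists in this message.

```python
from bisect import bisect_right

# Illustrative excerpt only: (first code point, last code point, block name,
# supported according to the lists in this message). A real table would
# carry every block from the Unicode charts.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin", True),
    (0x0080, 0x00FF, "Latin-1 Supplement", True),
    (0x0400, 0x04FF, "Cyrillic", True),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs", False),
    (0xAC00, 0xD7AF, "Hangul Syllables", False),
]

# Block start points, sorted, so a lookup is a binary search.
_STARTS = [start for start, _end, _name, _ok in BLOCKS]


def _find(char):
    """Return the table row containing char, or None if it is not listed."""
    cp = ord(char)
    i = bisect_right(_STARTS, cp) - 1
    if i >= 0 and cp <= BLOCKS[i][1]:
        return BLOCKS[i]
    return None


def block_name(char):
    """Return the block name for char, or None if the excerpt lacks it."""
    row = _find(char)
    return row[2] if row else None


def is_supported(char):
    """Return True/False per the lists above, or None if the block is unknown."""
    row = _find(char)
    return row[3] if row else None


if __name__ == "__main__":
    for ch in "a\u042f\u4e2d\ud55c":
        print(f"U+{ord(ch):04X}", block_name(ch), is_supported(ch))
```

With a complete table, a PROBLEMS entry could name the affected blocks instead of quoting bare code point ranges.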