From: "Stephen J. Turnbull"
Newsgroups: gmane.emacs.devel
Subject: Re: lisp/ChangeLog coding system
Date: 29 Apr 2002 20:28:55 +0900
Organization: The XEmacs Project
Message-ID: <87it6a3frc.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <86g01i8qoa.fsf@gerd.dnsq.org> <200204272241.g3RMfqI05559@aztec.santafe.edu> <6923-Sun28Apr2002212223+0300-eliz@is.elta.co.il> <87znzn48el.fsf@tleepslib.sk.tsukuba.ac.jp> <200204290155.g3T1tT814296@rum.cs.yale.edu>
In-Reply-To: <200204290155.g3T1tT814296@rum.cs.yale.edu>
To: "Stefan Monnier"
Cc: Eli Zaretskii, emacs-devel@gnu.org

>>>>> "Stefan" == Stefan Monnier writes:

    >> One aspect is making better guesses about desired coding
    >> systems.

    Stefan> I'm not sure what kind of improvements you're thinking
    Stefan> about.

Well, in the version (mid-January, maybe?) of GNU Emacs I have, when I
tried saving a buffer with mixed ascii, latin-1, and latin-2 in it, it
gave me an abominably long list of coding systems including mule
internal, all the -with-esc systems, and iso-2022-jp-2.  But all of
the characters used in the buffer are in ISO-8859-2, it's just Mule
making false distinctions.

At the very least, the defaults in Emacs should be to identify
identical characters (eg, those from the Latin-## subsets) and to
distinguish those where unification is controversial (the Han
ideographs).

    Stefan> non-MIME coding-systems should be in the "unlikely" list, tho.

There is no unique "the unlikely list".  For example, if I were
Croatian, I probably would want the buffer described above saved in
ISO-8859-2 without being asked, but a German would probably want to
save it in UTF-8 (or maybe ISO-2022-7 if she were an Emacs developer),
or be queried, defaulting to ISO-8859-2.  And some of the "universal"
coding systems (UTF-32, mule internal, all the -with-esc systems)
should probably not even be offered to most users; they should have to
ask for them by name.  But people with special needs should be able to
configure them for regular use.

And what's a "non-MIME coding system"?  AFAIK MIME has nothing to do
with coding systems except that the notation "the preferred MIME name"
is a useful convention.  But KOI8-R and all the Windows-125x sets are
MIME registered.
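
[Editorial illustration, not part of the original message.] The
selection policy argued for above (offer only coding systems that can
represent every character in the buffer, rank them by a per-user
preference list, and hide the "universal" ones unless asked for by
name) can be sketched roughly as follows. This is a minimal sketch in
Python rather than Emacs Lisp, and it is not the code of Emacs, XEmacs,
or the package under discussion. The function names, the KNOWN list,
and both preference lists are hypothetical. Python strings are already
Unicode, so characters shared between the Latin-N sets are unified by
construction here, which is the behaviour the message asks for.

```python
# Hypothetical sketch: every name below is illustrative and is not
# taken from Emacs, XEmacs, or the package discussed in this thread.

# Coding systems that are never offered unless requested by name.
HIDDEN_BY_DEFAULT = {"utf-32"}

# The coding systems this toy example knows about (Python codec names).
KNOWN = ["iso-8859-1", "iso-8859-2", "koi8-r", "cp1250", "utf-8", "utf-32"]


def can_encode(text, codec):
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def offer_coding_systems(text, preferred, known=KNOWN,
                         hidden=HIDDEN_BY_DEFAULT):
    """List the codecs to offer for text, the user's preferred ones first."""
    # Keep only codecs that are not hidden and can represent the text.
    safe = [c for c in known if c not in hidden and can_encode(text, c)]
    # User preferences come first, in the user's own order.
    ranked = [c for c in preferred if c in safe]
    # Remaining safe codecs follow in their default order.
    ranked += [c for c in safe if c not in ranked]
    return ranked


# A buffer mixing ASCII, a character found in both Latin-1 and Latin-2
# (e-acute), and a Latin-2-only character (c-caron).
text = "caf\u00e9 \u010daj"

# Assumed preference lists for the two users in the example above.
croatian = offer_coding_systems(text, ["iso-8859-2", "utf-8"])
german = offer_coding_systems(text, ["iso-8859-1", "utf-8", "iso-8859-2"])

print(croatian)  # ['iso-8859-2', 'utf-8', 'cp1250']
print(german)    # ['utf-8', 'iso-8859-2', 'cp1250']
```

The first element of each result is what would be used silently or
proposed as the default. A user with special needs would simply pass a
smaller hidden set, or an empty one.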

    Stefan> Looking at the README, I have the impression that most of
    Stefan> the functionality is already part of the Emacs CVS code
    Stefan> (mostly thanks to Dave's ucs-tables.el).  Someone should
    Stefan> try and figure out the details.

As for most functionality being in Emacs, yes, that's why I said I'd
help refactor; relative to ucs-tables.el the contribution is all UI.
My duplication[1] of ucs-tables is straightforward, not terribly
efficient code; all the meat is devoted to the question of "how do we
know which coding systems to offer the user".  Specifically I address
the issues of preferred unibyte systems and preferred universal
systems described above.

Footnotes:
[1]  XEmacs 21.5 has built-in support for Unicode.  The UCS tables are
loaded at startup from (a local copy of) the Unicode Consortium
tables, and an API is provided to reload if desirable.  The code
predates the release of Emacs 21, and so is different from
ucs-tables.el, unfortunately.  The duplicative parts are for 21.4.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
 My nostalgia for Icon makes me forget about any of the bad things.  I don't
have much nostalgia for Perl, so its faults I remember.  Scott Gilbert c.l.py