From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Unibyte characters, strings, and buffers
Date: Sun, 30 Mar 2014 11:01:12 +0200
Message-ID: <87a9c8njqf.fsf@fencepost.gnu.org>
To: emacs-devel@gnu.org

"Stephen J. Turnbull" writes:

> David Kastrup writes:
> > "Stephen J. Turnbull" writes:
> >
> > > It just requires a slightly more complex design, which would be
> > > appropriate for Emacsen (as compared to Python).
> >
> > If the "slightly more complexity" hits in unexpected places, it's going
> > to end up a liability.
> > Having more than one charset to work with if
> > characters themselves don't contain a charset specification is affecting
> > a load of stuff that can then conceivably work in more than one
> > way.
>
> I'm a little smarter than that.

Building on smartness is relying on a limited resource.  It's not always
easy to find wingmen (pun intended but unworkable).

> The design I have in mind would be transparent.

I don't think it gets much more transparent than "unibyte flag only
marks the valid Unicode-in-Emacs character range".  I'm for the range
0..255, Andreas for something like 0..127 U 4194176..4194303 which I
find cumbersome for little return.

> Maybe it wouldn't work; maybe it would be inefficient.  But one thing
> it wouldn't do is present a charset other than Unicode to Lisp.

Neither does the above.  Abolishing unibyte just means that
buffers/strings have only one possible character range.  That does not
really give any "transparency" per se from the Lisp level.  The
interesting level is the C level.  You need a byte stream representation
in C at some point anyway, and not being able to call this
representation either "string" or "buffer" may be neat in some manners
but will end up cumbersome in others.

-- 
David Kastrup
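
For illustration only, here is a minimal C sketch of how the two candidate
unibyte ranges mentioned above relate.  With the 0..255 range a byte is its
own character code; with 0..127 U 4194176..4194303 the upper 128 byte values
are shifted into the high block.  The function and constant names are
hypothetical (they are not Emacs's actual macros), and the constants are just
the decimal values quoted in the message.

```c
#include <assert.h>

/* Hypothetical names; values are the range bounds quoted in the message. */
enum { SPLIT_HIGH_FIRST = 4194176, SPLIT_HIGH_LAST = 4194303 };

/* Map a byte to a character code in 0..127 U 4194176..4194303.
   Bytes below 128 are unchanged; bytes 128..255 land in the high block. */
static int byte_to_split_range(unsigned char b)
{
    return b < 128 ? (int) b : SPLIT_HIGH_FIRST + (b - 128);
}

/* Inverse mapping.  Returns -1 for a code outside both sub-ranges. */
static int split_range_to_byte(int c)
{
    if (c >= 0 && c < 128)
        return c;
    if (c >= SPLIT_HIGH_FIRST && c <= SPLIT_HIGH_LAST)
        return c - SPLIT_HIGH_FIRST + 128;
    return -1;
}

/* Check that every byte value survives the round trip.
   Returns the number of byte values checked. */
static int check_round_trip(void)
{
    int checked = 0;
    for (int b = 0; b < 256; b++) {
        assert(split_range_to_byte(byte_to_split_range((unsigned char) b)) == b);
        checked++;
    }
    return checked;
}
```

Under the 0..255 proposal both conversion functions would be the identity,
which is the "cumbersome for little return" trade-off in a nutshell: the
split range costs a branch and an offset on each conversion.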