From: Tom Lord
Newsgroups: gmane.lisp.guile.devel
Subject: Re: Unicode and Guile
Date: Tue, 11 Nov 2003 17:40:28 -0800 (PST)
Message-ID: <200311120140.RAA26670@morrowfield.regexps.com>
In-reply-to: <87znf2ig46.fsf@zagadka.ping.de> (message from Marius Vollmer on Wed, 12 Nov 2003 01:29:29 +0100)
To: mvo@zagadka.de
Cc: guile-devel@gnu.org
List-Id: Developers list for Guile, the GNU extensibility library

> From: Marius Vollmer

> Tom Lord writes:

> >  ~ (grapheme=? g1 g2 [locale]) =>
> >  ~ (grapheme<? g1 g2 [locale])
> >  ~ (grapheme>? g1 g2 [locale])
> > [...]
> >  ~ (grapheme-ci=? g1 g2 [locale])
> >  ~ (grapheme-ci<? g1 g2 [locale])
> >  ~ (grapheme-ci>? g1 g2 [locale])
> >
> > The usual orderings.

> Is it a good idea to have an ordering among graphemes, or would it be
> better to only order texts, i.e., to allow for the context of a
> grapheme to determine the order?

I think it's a fine idea to order graphemes but, depending on the
locale, the ordering of texts is _not_ a lexical ordering grounded in
grapheme ordering.

It would be good to provide a locale, perhaps the default, in which
ordering of texts _is_ a lexical ordering grounded in (default)
grapheme order.

> >  ~ (make-text-marker text index) =>

> What about having _only_ markers and not allow integers as
> indices?

Seems excessive and arbitrary. How do I implement (Emacs')
GOTO-CHAR without standing on my head?

> Also, what about making TEXTs unmutable by default and instead let
> TEXT-REPLACE, etc return a new text object?

Given an implementation that can do that efficiently, I see no
obstacle to implementing a new type, META-TEXT?, which is mutable in
exactly the way that TEXT? is in my proposal.

That'd be ridiculously inconvenient though. So, make META-TEXT? the
same thing as TEXT?.

(I strongly suggest splay trees as an ideal implementation strategy
for TEXT?.
They would make _both_ mutating and functional REPLACE efficient.)

> > There is no essential difference between a grapheme and a text
> > object of length 1, and thus the proposal makes GRAPHEME? a
> > subtype of TYPE.

> Do we need the concept of grapheme at all, then?

Interesting question! And it ties in with your question about "why
not just markers and not integer indexes". I don't see a good way to
ground markers _without_ integer indexes.

Graphemes are a reasonable approximation of "what the user thinks of
as a character". What, for example, does DELETE-BACKWARD-CHAR delete
(at least by default) if not a grapheme? And in the non-default
cases, how does it analyze the TEXT? value to figure out what to do?

> > The proposal also makes it possible to pass strings everywhere that
> > text can be used. I think that's the more interesting direction:
> > just use text- and grapheme- procedures from now on except where you
> > _really_ want to refer to octets.

> Could we make strings/chars go away completely over time? For vectors
> of octets, there is u8vector? from SRFI-4.

I wouldn't object to seeing a complete unification of STRING? with
u8vector. I'm not so sure that the CHAR? type is particularly useful
in the long run -- it's rather culturally biased.

-t

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel
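
As a rough illustration of the point above about grounding markers in
integer indexes, here is a minimal Python sketch. It is not taken from
the proposal: the class names, the `replace` signature, and the policy
for adjusting markers on edits are all assumptions made for this
example, and graphemes are approximated by single Python characters.
A plain list stands in for the splay tree suggested in the message.

```python
class Text:
    """Toy mutable text: a list of graphemes (hypothetical stand-in for TEXT?)."""

    def __init__(self, graphemes):
        self.graphemes = list(graphemes)
        self.markers = []  # every marker attached to this text

    def replace(self, start, end, new):
        """Mutating replace of graphemes[start:end] with the sequence `new`."""
        new = list(new)
        self.graphemes[start:end] = new
        delta = len(new) - (end - start)
        for m in self.markers:
            if m.index >= end:
                m.index += delta   # marker sat at or after the edit: shift it
            elif m.index > start:
                m.index = start    # marker sat inside the replaced span: clamp


class Marker:
    """A position that follows edits, represented as a plain integer index."""

    def __init__(self, text, index):
        self.text = text
        self.index = index
        text.markers.append(self)


def make_text_marker(text, index):
    """Hypothetical analogue of (make-text-marker text index)."""
    return Marker(text, index)


t = Text("hello world")
m = make_text_marker(t, 6)   # marks the grapheme "w"
t.replace(0, 5, "hi")        # text is now "hi world"
print(m.index, t.graphemes[m.index])
```

In this sketch a marker is just an integer that the text keeps up to
date, so integer-indexed operations in the style of GOTO-CHAR and
marker-based ones can share a single position representation.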