From: Richard Stallman
Newsgroups: gmane.emacs.devel
Subject: Re: xml-parse-file and text properties
Date: Fri, 21 Jul 2006 00:46:41 -0400
References: <1153433461.32596.48.camel@turtle.as.arizona.edu>
Reply-To: rms@gnu.org
Cc: mah@everybody.org, emacs-devel@gnu.org
Original-To: JD Smith
Original-To: handa@m17n.org
In-reply-to: <1153433461.32596.48.camel@turtle.as.arizona.edu> (message from JD Smith on Thu, 20 Jul 2006 15:11:01 -0700)

    ;; Note that {buffer-substring,match-string}-no-properties were
    ;; formerly used in several places, but that removes composition info.

    but neither of us were clear on the meaning of the statement, or why
    retaining text properties in any XML parsed data would be desirable.

I think I see why.  Losing the composition info could mean that the
composed characters turn into other sequences of characters.  It
literally would change the text!

This is an ugly problem.  Many things want to get rid of most text
properties, but they don't want to forget about composition.

Logically speaking, composition is really part of the characters in
the text.  Using text properties to encode it is fundamentally
inconsistent.  We have been lucky so far, in that this inconsistency
has not caused a lot of problems -- but now our luck is running out.

I can see only two kinds of approaches:

1. Distinguish composition properties from others, and make functions
   like buffer-substring-no-properties preserve composition
   properties, even as they discard all other properties.

2. Change the representation of composition so it uses something
   other than text properties.

#2 would be a big maintenance trouble.  It would take us a long time
to get everything working again after such a change.  We certainly
should not install such a change now, and I hope we won't need to do
it ever.

Can #1 work?  Handa, please respond.
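
To make approach #1 concrete, here is a rough Lisp sketch of a substring function that discards every text property except `composition'.  The name `my-buffer-substring-keep-composition' is hypothetical, not an existing Emacs function, and the sketch assumes that composition is recorded only in the `composition' text property.  It illustrates the idea; it is not a proposed patch to buffer-substring-no-properties itself.

```elisp
;; Hypothetical helper (not part of Emacs): illustrates approach #1.
(defun my-buffer-substring-keep-composition (start end)
  "Return the text between START and END, keeping only `composition'.
All other text properties are discarded from the returned string."
  (let* ((str (buffer-substring start end)) ; copy with all properties
         (len (length str))
         (pos 0))
    ;; Walk the string one property run at a time.
    (while (< pos len)
      (let ((next (next-property-change pos str len))
            (comp (get-text-property pos 'composition str)))
        ;; Replace the run's property list with just the composition
        ;; property, or with nothing if the run is not composed.
        (set-text-properties pos next
                             (and comp (list 'composition comp))
                             str)
        (setq pos next)))
    str))
```

A caller that now uses buffer-substring-no-properties could try this helper instead; for instance, after `(compose-region 1 3)' in a buffer whose text also carries a `face' property, the returned string would still have `composition' on its first two characters but no `face'.  Whether other functions, such as match-string-no-properties, could be changed the same way is exactly the question for Handa.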