From: Oliver Scholz
Newsgroups: gmane.emacs.devel
Subject: Re: enriched-mode and switching major modes.
Date: Wed, 22 Sep 2004 12:35:15 +0200
In-Reply-To: (Richard Stallman's message of "Tue, 21 Sep 2004 14:30:53 -0400")

I split my answer to this mail in order to address some issues
separately.

Richard Stallman writes:

[nested blocks]
> What does it *mean* to copy a character from inside environment
> `larum' which is inside environment `lirum' and insert it somewhere
> else?  What should that character look like in its new location?

Two things could make sense here:

* Copy the properties of the /immediate/ containing block.

* Ignore the block formatting properties and copy only normal text
  properties.

I definitely prefer the second one. I think it would be the Right
Thing. If I copy text from an H1 paragraph and insert it into an H2
paragraph, then it should get all character formatting properties that
are specified at the paragraph level from the H2 environment. But if
the text has /additional/ character formatting properties specified,
for example because it contains some italic words, those should be
preserved.

[...]
>

Some meaningless heading

>
> The element maps directly to text properties, of course. But the
> h1 element both demands that its contents be rendered as a paragraph
> (a block) /and/ specifies certain character formatting properties for
> the whole of it, e.g. a large bold font.
>
> When encoding a buffer, I need to identify the whole paragraph as
> being of the type "h1". I.e. I have to distinguish it from:
>
>

Some meaningless heading

>
> Why do you have to distinguish them?

It is about preserving the user's intent. Word processors, as well as
the file formats used in word processing, typically provide several
ways to apply character formatting properties to text:

* paragraph formatting stylesheets
  - RTF: \sN
  - HTML: block elements like h1, h2 ...

* character formatting stylesheets
  - RTF: \csN
  - HTML: inline elements like em

* direct specification of character formatting properties
  - RTF: \fN, \fsN, \b ...
  - HTML: i, b, font ...

The first two provide a layer of indirection which makes it possible
to record the user's /semantic/ intent for the document text. Some
users (well, /I/ for example) would prefer /not/ to work with direct
specification of formatting properties at all.

It is a matter of which intent the user has expressed. Did she specify
"I want this to be a top level headline" or did she specify "I want
this to be large, bold text"? The difference will show up when the
document is transferred to another rendering device, or when the user
changes her mind and changes the stylesheet for "level 1 headlines".

We have to preserve that intent of the user in the data structure.
That's why I introduced the concept of the abstract document and
distinguished it from the appearance. The abstract document is the
aggregation of the user's intent.

Specifying only the appearance ("This should be large, bold text") is
considered bad practice in word processing. Some users do it this way;
but many, at least most people /I/ know, prefer stylesheets. If Emacs
failed to preserve the semantic intent, it would get a very bad
reputation as a word processor. Even worse, we would have to expect
that sophisticated users would recommend /not/ using Emacs for
document exchange.

This must not happen. Emacs has the potential to be much better than
any existing word processor; I would be very sad if it became worse.
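To make the difference between the two kinds of intent concrete, here
is a minimal sketch in Python. It is not a proposal for the actual
Emacs data structure; the names `Run`, `Paragraph`, `effective_format`
and the `stylesheets` table are hypothetical and exist only for this
illustration. It assumes that a paragraph may name a stylesheet and
that direct formatting on a run overrides the stylesheet.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Run:
    """A stretch of text plus any formatting the user applied directly."""
    text: str
    direct: Dict[str, object] = field(default_factory=dict)


@dataclass
class Paragraph:
    """A block; `style` names a paragraph stylesheet such as "h1", or None."""
    style: Optional[str]
    runs: List[Run]


def effective_format(stylesheets, paragraph, run):
    """Resolve appearance: stylesheet defaults first, direct formatting on top."""
    fmt = dict(stylesheets.get(paragraph.style, {}))
    fmt.update(run.direct)
    return fmt


# Hypothetical stylesheet table for this example.
stylesheets = {"h1": {"size": 24, "bold": True}}

# "I want this to be a top level headline."
semantic = Paragraph("h1", [Run("Some heading")])
# "I want this to be large, bold text."
direct = Paragraph(None, [Run("Some heading", {"size": 24, "bold": True})])

before = [effective_format(stylesheets, p, p.runs[0]) for p in (semantic, direct)]

# The user changes her mind about level 1 headlines.
stylesheets["h1"] = {"size": 18, "bold": True}

after = [effective_format(stylesheets, p, p.runs[0]) for p in (semantic, direct)]

print(before[0] == before[1])  # both paragraphs look the same at first
print(after[0], after[1])      # only the "h1" paragraph follows the change
```

Both paragraphs look identical until the stylesheet changes; after
that only the one that recorded "h1" follows. An encoder that sees
nothing but the resolved appearance cannot tell the two apart.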
> Why wouldn't it work simply to put these properties on the whole
> text of the paragraph?  What aspect would work differently as a
> result of doing one or the other, and why is it better if the
> properties are attached to paragraphs?

When encoding the document, I have to determine the type of a
paragraph, so that the encoded document file preserves the user's
semantic intent. I have to get that information from somewhere.

If we can guarantee that text properties affecting the paragraph
/always/ cover the whole text of a paragraph, then this is o.k. When
encoding, I first identify the paragraph; then I look at the text
property.

Kim has hinted at some ways of guaranteeing this. Offhand I believe
that this would work for non-nested paragraphs (blocks). I dislike
that approach, though, partly because I don't trust its robustness,
partly because it does not scale to handle nested blocks.

This whole affair is partly a UI problem. The functions that encode
the document must be able to determine unambiguously the type of a
paragraph, as well as its other features, from the data structure. But
the user must also get feedback on how her actions affected the
abstract document (as expressed in said data structure):

> We have to deal with the case that a user deletes the hard newline (if
> you evaluate the code above: just hit backspace). Is the resulting
> paragraph of type `h1' or of type `h2'?
>
> Why ask the question?  Why not just accept that it's a paragraph
> of partly h1 text and partly h2 text?

In HTML there is no such thing as a paragraph that is partly H1 and
partly H2 text. What you suggest would result in this:

    <h1>lirum larum</h1>

    <h2>lirum larum</h2>

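How an encoder arrives at two blocks can be sketched as follows. This
is only an illustration in Python, assuming the block type is stored
as a property on each run of text; `encode_paragraph` and the
(text, block_type) tuples are made-up names, not existing Emacs code.

```python
from html import escape
from itertools import groupby


def encode_paragraph(runs):
    """Encode (text, block_type) runs; each change of block type starts a new element."""
    elements = []
    for block_type, group in groupby(runs, key=lambda run: run[1]):
        text = "".join(escape(t) for t, _ in group)
        elements.append(f"<{block_type}>{text}</{block_type}>")
    return "\n".join(elements)


# After the hard newline is deleted, the buffer shows one paragraph,
# but the text properties still carry two block types.
merged = [("lirum larum", "h1"), ("lirum larum", "h2")]
encoded = encode_paragraph(merged)
print(encoded)
```

The encoder has no choice but to emit one element per block type,
so what looked like one paragraph in the buffer is written as an h1
element followed by an h2 element.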
Any user agent (web browser, another word processor) would render this
as two paragraphs (blocks). But the user in Emacs saw it as a single
paragraph when she saved that document. Due to the commands she has
issued (maybe accidentally), the data structure treats it as two
separate paragraphs and encodes it accordingly when writing to the
file; but the user does not get any visual feedback on this. She will
be surprised. If she knows that things like this can happen, she may
feel the urge to examine the encoded document file before she
transfers it to somebody else. Eventually she might even stop using
the word processing facilities and edit the raw HTML from the
beginning; or use another word processor.

Of course, treating "h1" and "h2" always as character formatting types
only would avoid the "one paragraph that suddenly becomes two
paragraphs" effect:

lirum larumlirum larum

But then we fail again to preserve any semantic intent.

    Oliver
-- 
Oliver Scholz               Jour de la Révolution de l'Année 212 de la Révolution
Ostendstr. 61               Liberté, Egalité, Fraternité!
60314 Frankfurt a. M.