From: Oliver Scholz
Newsgroups: gmane.emacs.devel
Subject: Re: enriched-mode and switching major modes.
Date: Tue, 14 Sep 2004 16:41:25 +0200
Cc: boris@gnu.org, emacs-devel@gnu.org, alex@emacswiki.org
To: rms@gnu.org
In-Reply-To: (Richard Stallman's message of "Mon, 13 Sep 2004 19:04:31 -0400")

Richard Stallman writes:

>     Word
>     processors assign properties to paragraphs, including defaults for
>     character styles (like the font, the weight etc.); and they support
>     style sheets for that.
>
> Why can't we do something like that in Emacs using text properties?
> We could perhaps have a text property on the whole paragraph
> that indirects to a list of default properties, and then have other
> overriding properties on specific characters in the paragraph.
>
> Aside from data format, what would be the difference between a "style
> sheet" and that list of default properties?
>
>     It is very, very hairy to keep paragraphs,
>     their properties and their representation in an Emacs buffer in sync,
>     not to talk about style sheets.  In fact I do think that getting a WP
>     UI right in Emacs is currently impossible.

The impossible part is tables (which I consider to be important).
table.el is a very nifty package for tables in documents that are
basically text/plain; but think of a table where the cells of a row
contain text with different character and paragraph formatting
properties: for instance, column 1 has text with a height of 24 pt,
column 2 only 12 pt, both in a proportional font and both having the
same line spacing.

The hairy part is whitespace formatting.  The problems arise from the
fact that I can't tell Emacs: "Display this text from position POS1 to
POS2 as a paragraph with a left margin of 20 pt and a right margin of
40 pt with 20 pt above and below -- *without* adding any character to
the buffer."

I am going to expand a little bit on these difficulties below with a
practical example.  Parts of it are probably solvable; I am not yet
sure how reliable those solutions are.

> Since you're saying something negative, I think you should fill in the
> argument for this conclusion.  What methods have you considered?
>
>     Indeed, I believe that in the long run Emacs' display engine should
>     support a real block model.
>
> Could you explain more clearly what you mean by that?

I actually meant "box model".  I am thinking of something like the one
specified by CSS 2 or CSS 3 (draft).

In short: a box model is an abstract way to specify the formatting of
a piece of character data on screen.  Emacs' text properties (those
affecting the display of text) could be regarded as "inline boxes" in
the terminology of that model, because they do not force the text to
which they apply to be displayed as a block (a "paragraph").  Block
boxes are missing.

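A minimal sketch of that "inline" behaviour, using only the standard
text-property functions.  The function name `os-inline-box-demo' is
made up for this illustration; the point is that a display-affecting
property changes how a stretch of text looks without turning it into
a block of its own.

```elisp
;; Sketch only: `os-inline-box-demo' is a hypothetical name.
(defun os-inline-box-demo ()
  "Return (TEXT LINE-COUNT PROPERTY-RUN) for a sample buffer."
  (with-temp-buffer
    (insert "Example text. Example paragraph. Example text.")
    ;; Enlarge and embolden the middle sentence only.
    (put-text-property 15 33 'face '(:height 2.0 :weight bold))
    (list
     ;; The character data is unchanged: no newline was added ...
     (buffer-substring-no-properties (point-min) (point-max))
     ;; ... so the buffer still consists of a single line.
     (count-lines (point-min) (point-max))
     ;; The property covers exactly the sentence it was put on.
     (buffer-substring-no-properties
      15 (next-single-property-change 15 'face)))))
```

The `face' property here plays the role of an inline box: the middle
sentence is restyled, but it stays on the same line as its neighbours.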
CSS's block box model specifies margins (between borders and
surrounding boxes), borders, padding (between borders and content
area) and content area as the four components of a block box.  In a
picture:

+ - - - - - - - - - - - - - - - - - - - - - - - +
                  Margin (Top)
|   +--------------------------------------+    |
    |            Padding (Top)             |
|   |   + - - - - - - - - - - - - - - - +  |    |
    |PL |                               |PR|
|ML |             Content Area             | MR |
    |   |                               |  |
|   |   + - - - - - - - - - - - - - - - +  |    |
    |           Padding (Bottom)           |
|   +--------------------------------------+    |
                 Margin (Bottom)
+ - - - - - - - - - - - - - - - - - - - - - - - +

If Emacs' display engine supported this, e.g. as a `block' text
property, then I could write:

    (progn
      (switch-to-buffer (generate-new-buffer "*tmp*"))
      (insert "Example text. Example paragraph. Example text.")
      ;; `block' is the hypothetical property; positions 15 to 33
      ;; cover the sentence "Example paragraph."
      (put-text-property 15 33 'block
                         '(:margin (4 1 1 1)
                           :border nil
                           :padding nil)))

And then the text "Example paragraph." would get displayed as a
paragraph of its own with a left margin of four canonical character
units etc.  No inserting of newline characters or of spaces for the
left margin is involved here.

Other box types of CSS include `list-item' for numbered or bulleted
lists, and various table boxes for specifying tables.

I am not bound to this particular model of CSS.  But I do think that
in the long run Emacs' display engine should support a visual
formatting model that is equally powerful.  The reason is that I
envision Emacs-the-Word-Processor as an XML-centric application.  Even
non-XML formats like RTF should be parsed into a data structure that
is an instance of the XML infoset (DOM or SXML, probably), so that
users have a nice API for writing extensions to that WP in Emacs Lisp.

So much for the answer to your question what I mean by "box model".
Now for the more concrete problem of implementing WP functionality for
Emacs with its current capabilities.

The difficult part here is the relation of data structure ("the
document"), visual appearance ("the formatting") and user interface.

With text/plain their relation is so simple that we hardly distinguish
them at all.  The visual appearance is determined by control
characters like space and newline, which are part of the document
(i.e. part of the data structure).  The user interface is also simple:
to change the (whitespace) formatting, we just insert spaces and
newlines where appropriate, which in turn become part of the data
structure.

To some extent this also works for text/enriched.  But it stops
working for more elaborate, more widely used and -- IMNSHO -- more
interesting document types and document formats.

Consider the following RTF document:

{\rtf1\ansi\deff0
{\fonttbl{\f0\froman Times;}{\f1\fswiss Helvetica;}}
{\stylesheet{\s1\f0\fs24\snext1 Standard;}
{\s2\keepn\f1\sb400\sa200\fs48 Headline;}
{\s3\sbasedon1\i\sb100\sa100\fs20\lin709 Motto;}}
{\*\listtable
{\list\listtemplateid1
{\listlevel\levelnfc23\leveljc0\levelstartat1\levelfollow2
{\leveltext\'01\u8226 ?;}}
\listid1}}
{\listoverridetable{\listoverride\listid1\listoverridecount0\ls0}}

{\s2 Lirum larum (A Headline)}
{\par\pard\s3 "Marriage is the chief cause of divorce."}
\par\pard\plain\s1 This is just ordinary {\fs48 paragraph}
text. Nothing special here.
\par\pard\plain\ls0\ilvl0 This is a list item. It contains two subitems:
\par\pard\plain\ls0\ilvl1 One and
\par\pard\plain\ls0\ilvl1 Two.
\par\pard\plain\ls0\ilvl0 This is another list item.}

A short explanation: Braces group stuff together.  Everything up to
and including the "{\listoverridetable ...}" group is header
information.  The \fonttbl group specifies the fonts to use in the
document.  Each font definition starts with \fN, where N is a decimal
number which is used to refer to that font.  The \stylesheet group
defines stylesheets.  Here I only define paragraph stylesheets, whose
definitions start with \sN.
I define three paragraph styles here: "Standard", "Headline" and
"Motto".  For the "Headline" style, for example, this specifies that a
"Headline" paragraph should use the font "Helvetica" (\f1) with a
height of 24 pt (\fs48, in half-points), that it should be preceded by
20 pt of vertical whitespace (\sb400 -- the units are "twips", 1/20 pt)
and followed by 10 pt of vertical whitespace (\sa200).

The rest of the header is important for bulleted or numbered lists; I
won't go into details here, because that is a black art which I have
not yet fully mastered myself.

In the document itself \par starts a new paragraph and \sN refers to a
stylesheet.  \lsN\ilvlN is for list items, again.

A text/plain approximation to the whitespace formatting of the
document (e.g. how it would be rendered on a tty) could look like
this:

---------------------- Start Document --------------------------------
Lirum larum (A headline)

     "Marriage is the chief cause of divorce."

This is just ordinary paragraph text.  Nothing special here.

  * This is a list item.  It contains two subitems:

      1. One and

      2. Two

  * This is another list item.
---------------------- End Document ----------------------------------

If Emacs' display engine supported a block model, we would just tell
the display engine how to render the paragraphs.  Not a single newline
character and no space between paragraphs would be part of the
character data.  I.e.
`(buffer-substring-no-properties (point-min) (point-max))' would
return:

"Lirum larum (A headline)\"Marriage is the chief cause of divorce.\"\
This is just ordinary paragraph text. Nothing special here. This is\
 a list item. It contains two subitems:One and Two This is another \
list item."

(Note that the bullets and the numbers of the lists are not part of
the character data, either.)

Without a block model supported by the display engine, we have to fake
it by inserting newline characters and spaces (probably with a
`display' property) where appropriate.
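The faking described above can be sketched roughly as follows.  This
is only an illustration under stated assumptions: the marker property
`wp-synthetic' and all three function names are invented for this
example and are not part of Emacs.  Formatting-only whitespace is
inserted with the marker, and the document's real character data is
recovered by skipping every marked run.

```elisp
;; Sketch only: `wp-synthetic' is a made-up marker property.

(defun wp-sketch-insert-synthetic (string &optional display)
  "Insert STRING at point, marked as formatting-only whitespace.
Optional DISPLAY is a `display' property value such as (space :width 5)."
  (insert (apply #'propertize string 'wp-synthetic t
                 (and display (list 'display display)))))

(defun wp-sketch-document-text ()
  "Return the buffer text with all `wp-synthetic' characters left out."
  (let ((pos (point-min))
        (chunks '()))
    (while (< pos (point-max))
      ;; Find the end of the current run of equal `wp-synthetic' values.
      (let ((next (next-single-property-change
                   pos 'wp-synthetic nil (point-max))))
        ;; Keep only runs that the user actually typed.
        (unless (get-text-property pos 'wp-synthetic)
          (push (buffer-substring-no-properties pos next) chunks))
        (setq pos next)))
    (apply #'concat (nreverse chunks))))

(defun wp-sketch-demo ()
  "Return (RENDERED DOCUMENT) for a two-paragraph example."
  (with-temp-buffer
    (insert "Lirum larum (A headline)")
    ;; Vertical space after the headline, for display only.
    (wp-sketch-insert-synthetic "\n\n")
    ;; One space stretched to a five-column indentation.
    (wp-sketch-insert-synthetic " " '(space :width 5))
    (insert "This is just ordinary paragraph text.")
    (list (buffer-substring-no-properties (point-min) (point-max))
          (wp-sketch-document-text))))
```

Evaluating `(wp-sketch-demo)' gives the buffer text as it is rendered,
with the two newlines and the stretched space, next to the document
text without them.  The marker is what lets a later pass tell
programmatic whitespace from whitespace the user entered.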
In this case we would have to make sure that the UI is right.  For
instance, a user must not be able to insert characters in a place
where "no character data are".  For instance, here:

Lirum larum (A Headline)
-!-
     "Marriage is the chief cause of divorce."

Or here:

  * This is a list item.  It contains two subitems:
 -!-
      1. One and

      2. Two.

The UI in typical word processors simply prevents the cursor from
moving to these places.  If the cursor is after "subitems:" and the
user hits <right>, the cursor moves to just before "One".  To get the
same effect in Emacs we would have to make everything from the newline
after "subitems:" up to "1." intangible.

For this we need a specialised fill function.  If we store the
paragraph properties in a text property, then this fill function would

 1) Determine how far the paragraph extends; this could be, for
    instance, all text with an `eq' paragraph text property.

 2) Remove every newline or space character that was inserted
    programmatically by any previous filling.  Those newlines and
    spaces were not entered by the user and she does not want them to
    be part of her document.  They were added to the buffer only for
    visual rendering.

 3) Determine the whitespace formatting properties of the paragraph.
    They may be specified via a stylesheet or directly or both (a
    direct specification overrides the defaults of a style sheet).

 4) Add newline characters (word wrapping) and spaces (indentation)
    where appropriate to get a visual approximation to the paragraph
    properties determined in step 3).  Those programmatically added
    spaces and newlines should probably be marked with a text property
    in order to make them distinguishable in step 2) from spaces that
    were entered by the user.

So far I have only talked about vertical and horizontal whitespace.
Character formatting information is another issue.  Take for example
this part from the RTF above:

    \par\pard\plain\s1 This is just ordinary {\fs48 paragraph} text.

\s1 says: use paragraph stylesheet #1: Font: Times (\f0); font-height:
12 pt (\fs24).  But this default for the paragraph is overridden by
\fs48 for the single word "paragraph": it is meant to be displayed
with a font height of 24 pt.  However, this overrides only the height;
all other properties of the stylesheet still apply.

I guess this is best solved by letting font-lock look at the paragraph
properties, resolve all style information and then put a corresponding
anonymous face on the `face' property.

Large parts of a WP may be possible in this or similar ways.  Tables,
borders (and border styles), embedded vector graphics and multiple
column text are probably not feasible; but with the exception of
tables they are IMO not /that/ important for now.

However, about one thing I am positive: there is absolutely no room
for a minor mode here.  That's why I say that enriched-mode (as a
minor mode) is a dead end.

    Oliver
-- 
Oliver Scholz               29 Fructidor an 212 de la Révolution
Ostendstr. 61               Liberté, Egalité, Fraternité!
60314 Frankfurt a. M.