From mboxrd@z Thu Jan 1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Ken Raeburn
Newsgroups: gmane.emacs.devel
Subject: Re: please consider emacs-unicode for pervasive changes
Date: Thu, 18 Jul 2002 14:39:39 -0400
Sender: emacs-devel-admin@gnu.org
Message-ID:
References:
NNTP-Posting-Host: localhost.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-Trace: main.gmane.org 1027017625 26159 127.0.0.1 (18 Jul 2002 18:40:25 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Thu, 18 Jul 2002 18:40:25 +0000 (UTC)
Cc: emacs-devel@gnu.org
Return-path:
Original-Received: from quimby.gnus.org ([80.91.224.244]) by main.gmane.org with esmtp (Exim 3.33 #1 (Debian)) id 17VGCM-0006nn-00 for ; Thu, 18 Jul 2002 20:40:22 +0200
Original-Received: from fencepost.gnu.org ([199.232.76.164]) by quimby.gnus.org with esmtp (Exim 3.12 #1 (Debian)) id 17VGON-0004ot-00 for ; Thu, 18 Jul 2002 20:52:47 +0200
Original-Received: from localhost ([127.0.0.1] helo=fencepost.gnu.org) by fencepost.gnu.org with esmtp (Exim 3.35 #1 (Debian)) id 17VGCJ-0002bA-00; Thu, 18 Jul 2002 14:40:19 -0400
Original-Received: from 208-59-178-90.c3-0.smr-ubr1.sbo-smr.ma.cable.rcn.com ([208.59.178.90] helo=raeburn.org) by fencepost.gnu.org with esmtp (Exim 3.35 #1 (Debian)) id 17VGBj-0002ab-00 for ; Thu, 18 Jul 2002 14:39:43 -0400
Original-Received: from kal-el.raeburn.org (mail@kal-el.raeburn.org [18.101.0.230]) by raeburn.org (8.11.3/8.11.3) with ESMTP id g6IIdef03300; Thu, 18 Jul 2002 14:39:40 -0400 (EDT)
Original-Received: from raeburn by kal-el.raeburn.org with local (Exim 3.35 #1 (Debian)) id 17VGBf-0008Gx-00; Thu, 18 Jul 2002 14:39:40 -0400
Original-To: Dave Love
In-Reply-To: (Dave Love's message of "18 Jul 2002 17:02:59 +0100")
Original-Lines: 70
User-Agent: Gnus/5.090005 (Oort Gnus v0.05) Emacs/21.1.50 (i686-pc-linux-gnu)
Errors-To: emacs-devel-admin@gnu.org
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.0.11
Precedence: bulk
List-Help:
List-Post:
List-Subscribe: ,
List-Id: Emacs development discussions.
List-Unsubscribe: ,
List-Archive:
Xref: main.gmane.org gmane.emacs.devel:5870
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:5870

--=-=-=

Dave Love writes:

> There have been various pervasive changes recently which will cause
> grief merging the unicode branch eventually.  Could people doing this
> for Mule-related files (including display) please be prepared to
> modify the `emacs-unicode' branch too.  I don't know if handa would
> like to be consulted beforehand, but I assume he'd appreciate having
> it done now, with less problems later on.

My string-macro changes, while fairly pervasive, are not as tough to
make as they might appear.  You might be surprised by what can be
accomplished with a set of regular expressions that correspond to
various styles of C expressions :-).  I've attached the Lisp code I
use; scmcvt-string and scmcvt-string-2 are the interesting bits.
There's always a little cleanup needed afterwards (a few cases that
aren't matched; restoring lisp.h macros that become defined to expand
to themselves because the definition matched the pattern) but it does
95% of the work.

Other changes I'm working on right now are to reduce the uses of SDATA
that aren't read-only, to make it easier to identify places for
inserting write-barrier code; that's a slow and manual process, and
while by no means as pervasive, still twice as painful if I have to do
it on a branch as well.  Though aside from a few cases where SDATA is
used to write to a string, it'll mostly consist of adding "const" to
char pointers in places, and eventually adding a const cast to the
SDATA macro; is that pervasive enough that you want it brought over,
or small enough to merge later?
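A minimal sketch of the const-cast idea described above, using a mock string type rather than the real lisp.h definitions. The names `mock_string`, `mock_string_store`, `count_byte` and `barrier_hits` are hypothetical and exist only for illustration; only the name `SDATA` comes from the message, and its definition here is an assumption about what "adding a const cast" could look like.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-in for a Lisp string; NOT the real lisp.h layout.  */
struct mock_string
{
  size_t size;
  unsigned char *data;
};

/* Read-only accessor (assumed form).  With the const cast, any code
   that still writes through SDATA fails to compile, so the remaining
   writers are easy to find.  */
#define SDATA(s) ((const unsigned char *) (s)->data)

/* Hypothetical counter standing in for write-barrier bookkeeping.  */
static int barrier_hits = 0;

/* Hypothetical single entry point for writes; a real write barrier
   would be inserted where the counter is bumped.  */
static void
mock_string_store (struct mock_string *s, size_t i, unsigned char c)
{
  assert (i < s->size);
  barrier_hits++;
  s->data[i] = c;
}

/* A typical reader only needs a const pointer.  */
static size_t
count_byte (const struct mock_string *s, unsigned char c)
{
  const unsigned char *p = SDATA (s);
  size_t n = 0;
  for (size_t i = 0; i < s->size; i++)
    if (p[i] == c)
      n++;
  return n;
}
```

With this shape, readers such as `count_byte` need nothing beyond "const" on their pointers, while every write is funnelled through one function where barrier code could later be added.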
As soon as any big changes that need to be applied to the branch
depend on any small changes that weren't applied to the branch, we get
a maintenance hassle, with someone who didn't write those small
changes trying to bring them and anything they might depend on over to
the branch.  (The dependencies may be obvious, but if they're of the
sort "simplify macro X because all code relying on side-effect Y has
been changed over the course of a couple months to not rely on it",
they may be hard to find.)  It may be better just to say *everything*
goes into the branch as well as the trunk.  Then again, is anything
else likely to be as problematic as taking string handling in two
different directions on different branches?

> (I guess the same would go
> for other active branches, if there are any.)

I've assumed that when I start work on a Guile branch, I'd be
responsible for dealing with merges in both directions and all the
coordination that implies.  ("We", not "I"; I really hope to get some
help in this work.)  That's also why I am taking the approach I am --
essentially, I *am* merging changes to the trunk that I made in my
divergent source tree started long ago.

It would be helpful for automatic tools or other useful techniques to
be made available, but I wouldn't want to demand that everyone making
big changes on the trunk also be required to know which branches are
"active" and how their changes might have to be applied differently to
those branches, and rewrite their changes to suit.

If you get around to merging in some big changes to the trunk to
change the character-data handling in ways that better support the
Unicode changes -- or perhaps the completed set of Unicode changes --
would you want to be required to merge them onto a Guile branch as
well?

I realize the Unicode work, which we've been talking about as the "big
thing" for Emacs 22, is probably in a special category, and maybe it
makes sense to ask for parallel development in this case and not
others.

Ken

--=-=-=
Content-Disposition: attachment
Content-Description: c expr rewrite code

;; Replace every match of REGEXP in the buffer with TO-STRING, without
;; the messages that `replace-regexp' prints and without moving point.
(defun quiet-replace-regexp (regexp to-string)
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward regexp nil t)
      (replace-match to-string nil nil))))

(defun qrep-car-cdr (base)
  (quiet-replace-regexp (concat "XCONS ?(\\(" base "\\))->car")
                        "XCAR (\\1)")
  (quiet-replace-regexp (concat "XCONS ?(\\(" base "\\))->cdr")
                        "XCDR (\\1)")
  )

(defun qrep-float (base)
  (quiet-replace-regexp (concat "XFLOAT ?(\\(" base "\\))->data")
                        "XFLOAT_DATA (\\1)")
  )

;; All of these must accept only paren-balanced C expressions.  No
;; wildcard matching here...
(defvar c-exprs nil "")
(setq c-exprs
      '(
        ;; no leading whitespace!
        "[-*a-z_A-Z0-9.][-*a-z_A-Z0-9.]*"
        "[-*a-z_A-Z0-9.][-*a-z_A-Z0-9.]*-> *[*a-z_A-Z0-9.]+"
        ;; a(b), zero or more trailing ->s
        "[a-z_A-Z0-9][a-z_A-Z0-9 ]*([-a-z_A-Z0-9> .]*)[-a-z_>]*"
        ;; a(b(c)), trailing ->s at end and after b(c)
        "[a-z_A-Z0-9][a-z_A-Z0-9 ]*([a-z_A-Z0-9 ]+([a-z_A-Z0-9 ]*)[-a-z_>]*)[-a-z_>]*"
        ;; a(b(c(d)))
        "[a-z_A-Z0-9][a-z_A-Z0-9 ]*([a-z_A-Z0-9 ]+([a-z_A-Z0-9 ]+([a-z_A-Z0-9 ]*)))"
        ;; something subscripted - a[b]
        "[a-z_A-Z0-9][a-z_A-Z0-9 ]*\\[[a-z_A-Z0-9 ]+\\]"
        ;; a(b[c])
        "[a-z_A-Z0-9][a-z_A-Z0-9 ]*([a-z_A-Z0-9 ]+\\[[a-z_A-Z0-9 ]+\\])"
        ;; GET_TRANSLATION_TABLE macro defn - subscript in the arg
        ;; a(b)->c[(d)]
        "[a-z_A-Z0-9][a-z_A-Z0-9 ]*([a-z_A-Z0-9 ]*)->[a-z_A-Z0-9]+\\[([a-z_A-Z0-9 ]+)\\]"
        ;; xfaces uses a->b[c]
        "[a-z_A-Z0-9][a-z_A-Z0-9 ]*->[a-z_A-Z0-9]+\\[[a-z_A-Z0-9 ]+\\]"
        ;; (x)
        "([-*a-z_A-Z0-9.][-*a-z_A-Z0-9.]*)"
        ;; f(a->b[c])
        "[a-z_A-Z0-9][a-z_A-Z0-9 ]*([a-z_A-Z0-9 ]*->[a-z_A-Z0-9]+\\[[a-z_A-Z0-9 ]+\\])"
        ;; ( ! f(a) \n ? b \n : c ), where a b c can contain ->
        "(!?[a-z_A-Z0-9][a-z_A-Z0-9 ]*([->a-z_A-Z0-9 ]*)[ \n\t]*\\? [->a-z_A-Z0-9 ]*[ \n\t]*: [->a-z_A-Z0-9 ]*)"
        ;; pure numbers (as extra macro args, of course, not variables)
        "-?[0-9][0-9]*"
        ))

;; All of the above joined into one alternation.
(defvar c-all-exprs nil "")
(setq c-all-exprs
      (apply 'concat
             (car c-exprs)
             (mapcar (lambda (x) (concat "\\|" x))
                     (cdr c-exprs))))

(defun scmcvt-car-and-cdr ()
  (interactive)
  (mapcar 'qrep-car-cdr c-exprs)
  ;; be careful -- this can change the definition of XFLOAT_DATA itself
  (qrep-float "[-*a-z_A-Z0-9>]+")
  (qrep-float "[-*a-z_A-Z0-9>]+\\[[a-z]+\\]")
  )

;; Call FUN on the name of each .c and .h file in the current directory.
(defun map-over-files (fun)
  (let ((names (directory-files "." nil "\\.[ch]$")))
    (mapcar fun names)))

(defun get-fn-value (f)
  (if (symbolp f)
      (get-fn-value (symbol-function f))
    f))

(defun apply-and-save-wrapper (fun)
  (let ((x (get-fn-value fun)))
    (eval `(lambda (fn) (apply-and-save ,x fn)))))

;; Visit file FN, run FUN in its buffer, and save only if FUN changed it.
(defun apply-and-save (fun fn)
  (if (file-regular-p fn)
      (progn
        (message "Working on %s..." fn)
        (find-file fn)
        (goto-char (point-min))
        (funcall fun)
        (if (buffer-modified-p nil)
            (save-buffer))
        (kill-buffer nil)
        (message "Working on %s...done" fn)
        )))

(defun map-edit-files (fun)
  (let ((enable-local-variables nil))
    (map-over-files (apply-and-save-wrapper fun)))
  nil
  )

(defun qrep-string (base)
  (quiet-replace-regexp (concat "SMBP ?(\\(" base "\\))")
                        "STRING_MULTIBYTE (\\1)")
  ;; do size_byte before size, since the latter is a substring of the
  ;; former and would match
  (quiet-replace-regexp (concat "XSTRING ?(\\(" base "\\))->size_byte")
                        "STRING_SIZE_BYTE (\\1)")
  (quiet-replace-regexp (concat "XSTRING ?(\\(" base "\\))->size")
                        "SCHARS (\\1)")
  ;; put the size_byte references back; they were renamed above only
  ;; to keep them out of the way of the ->size pattern
  (quiet-replace-regexp (concat "STRING_SIZE_BYTE ?(\\(" base "\\))")
                        "XSTRING (\\1)->size_byte")
  ;; other fields
  (quiet-replace-regexp (concat "XSTRING ?(\\(" base "\\))->intervals")
                        "STRING_INTERVALS (\\1)")
  (quiet-replace-regexp (concat "XSTRING ?(\\(" base "\\))-> *data")
                        "SDATA (\\1)")
  (quiet-replace-regexp (concat "STRING_BYTES (XSTRING ?(\\(" base "\\)))")
                        "SBYTES (\\1)")
  (quiet-replace-regexp (concat "XSETSTRING (\\(" base "\\),[\n\t ]*XSTRING (\\(" base "\\)))")
                        "\\1 = \\2")
  (quiet-replace-regexp (concat "SET_STRING_BYTES (XSTRING ?(\\(" base "\\)), -1)")
                        "STRING_SET_UNIBYTE (\\1)")
  (quiet-replace-regexp (concat "SDATA (\\(" base "\\)) *\\[\\(" base "\\)\\]")
                        "SREF (\\1, \\2)")
  )

(defun qrep-string-2 (base)
  (quiet-replace-regexp (concat "SET_STRING_BYTES (XSTRING ?(\\(" base "\\)), *\\((" base ")\\))")
                        "STRING_SET_BYTES (\\1, \\2)")
  (quiet-replace-regexp (concat "STRING_SET_BYTES (\\(" base "\\), -1)")
                        "STRING_SET_UNIBYTE (\\1)")
  )

(defun scmcvt-string ()
  (interactive)
  (mapcar 'qrep-string c-exprs))

(defun scmcvt-string-2 ()
  (interactive)
  (mapcar 'qrep-string-2 c-exprs))

(defun scmcvt-all ()
  (scmcvt-car-and-cdr)
  (scmcvt-string)
  )

(if nil
    (progn
      ;; run these forms one at a time
      (map-edit-files 'scmcvt-car-and-cdr)
      (map-edit-files 'scmcvt-string)
      (map-edit-files 'scmcvt-string-2)
      (map-edit-files 'scmcvt-all)
      ))

--=-=-=--
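
To make the effect of the string rewrites concrete, here is a small C sketch of the same loop before and after conversion. The struct, the `Lisp_Object` typedef and the macro bodies are mock definitions assumed for illustration, not the real lisp.h ones; only the macro names (`XSTRING`, `SCHARS`, `SDATA`, `SREF`) come from the attached code. The "after" form is what the `qrep-string` patterns appear to produce for simple arguments such as `s` and `i`; more complex expressions may not match and would be left for manual cleanup.

```c
#include <assert.h>
#include <stddef.h>

/* Mock string object; NOT the real lisp.h layout.  */
struct mock_string
{
  ptrdiff_t size;
  ptrdiff_t size_byte;
  unsigned char *data;
};

/* Mock: a Lisp_Object here is just a pointer to the mock string.  */
typedef struct mock_string *Lisp_Object;

/* Assumed macro bodies, written only so the sketch compiles.  */
#define XSTRING(obj) (obj)
#define SCHARS(obj) (XSTRING (obj)->size)
#define SDATA(obj) (XSTRING (obj)->data)
#define SREF(obj, i) (SDATA (obj)[i])

/* Before: fields are reached directly through XSTRING.  */
static int
sum_bytes_old (Lisp_Object s)
{
  int sum = 0;
  ptrdiff_t i;
  for (i = 0; i < XSTRING (s)->size; i++)
    sum += XSTRING (s)->data[i];
  return sum;
}

/* After: "XSTRING (s)->size" became "SCHARS (s)", and
   "XSTRING (s)->data[i]" became "SDATA (s)[i]" and then
   "SREF (s, i)".  */
static int
sum_bytes_new (Lisp_Object s)
{
  int sum = 0;
  ptrdiff_t i;
  for (i = 0; i < SCHARS (s); i++)
    sum += SREF (s, i);
  return sum;
}
```

Both functions compute the same result against the mock definitions; the converted one no longer mentions the struct fields, which is what lets the representation behind the macros change later.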