unofficial mirror of emacs-devel@gnu.org 
* [Fwd: Re: more on mumamo/php mode]
@ 2007-07-22 10:50 Lennart Borgman (gmail)
  2007-07-24 18:37 ` Claus
  0 siblings, 1 reply; 5+ messages in thread
From: Lennart Borgman (gmail) @ 2007-07-22 10:50 UTC
  To: Emacs Devel

I got this report some days ago. Could someone who understands what
utf-translate-cjk-mode does please comment on this?


-------- Original Message --------
Subject: Re: more on mumamo/php mode
Date: Thu, 19 Jul 2007 16:47:21 +0200
From: Claus <claus.klingberg@gmail.com>
To: Lennart Borgman (gmail) <lennart.borgman@gmail.com>

Hi Lennart,

a status update and a question on nxhtml-mode:

1. Somebody reported an error in nxml-mode during validation/parsing
("Invalid regexp: Range striding over charsets") with Emacs 22 on the
nxml mailing list (see e.g.
http://osdir.com/ml/emacs.nxml.general/2006-01/msg00039.html). Shortly
afterwards, Darkman posted a patch that seemed to work (at least for me):

http://drkm-lib.sourceforge.net/nxml/xsd-regexp.el.2006-01-26.patch


* Re: [Fwd: Re: more on mumamo/php mode]
  2007-07-22 10:50 [Fwd: Re: more on mumamo/php mode] Lennart Borgman (gmail)
@ 2007-07-24 18:37 ` Claus
  2007-07-24 20:02   ` Stefan Monnier
  0 siblings, 1 reply; 5+ messages in thread
From: Claus @ 2007-07-24 18:37 UTC
  To: Emacs Devel; +Cc: Lennart Borgman (gmail)

Hello,

all I could find about this issue was the explanation in the thread
linked from my message. The description of the cause goes like this:

"I found why this error happens. In the recent CVS Emacs, as
utf-translate-cjk-mode is on by default, some Unicode characters will
be decoded into one of CJK character set. But, as
xsdre-range-list-to-char-alternative (xsd-regexp.el) simply do
something like this to generate a character range: (format "%c-%c"
(decode-char 'ucs FROM) (decode-char 'ucs TO)) the resulting regexp
causes the above error."

The patch below supposedly fixes this issue (and it works for me).
However, I guess we need someone to verify this, since the original
maintainer is currently not available. Could someone look into this
and say whether it is the right solution?

--- xsd-regexp.el.orig	2006-01-27 00:31:24.000000000 +0100
+++ xsd-regexp.el	2006-01-27 00:32:20.328529600 +0100
@@ -290,7 +290,8 @@
 (defun xsdre-compile-single-char (ch)
   (if (memq ch '(?. ?* ?+ ?? ?\[ ?\] ?^ ?$ ?\\))
       (string ?\\ ch)
-    (string (decode-char 'ucs ch))))
+    (let ((utf-translate-cjk-mode nil))
+      (string (decode-char 'ucs ch)))))

 (defun xsdre-char-class-to-range-list (cc)
   "Return a range-list for a symbolic char-class."
@@ -403,7 +404,8 @@
       (setq range-list (cdr range-list)))
     (setq chars
 	  (mapcar (lambda (c)
-		    (decode-char 'ucs c))
+                    (let ((utf-translate-cjk-mode nil))
+                      (decode-char 'ucs c)))
 		  chars))
     (when caret
       (setq chars (cons ?^ chars)))

Thanks,
Claus





* Re: [Fwd: Re: more on mumamo/php mode]
  2007-07-24 18:37 ` Claus
@ 2007-07-24 20:02   ` Stefan Monnier
  2007-07-24 22:12     ` Lennart Borgman (gmail)
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Monnier @ 2007-07-24 20:02 UTC
  To: Claus; +Cc: Lennart Borgman (gmail), Emacs Devel

> "I found why this error happens. In the recent CVS Emacs, as
> utf-translate-cjk-mode is on by default, some Unicode characters will
> be decoded into one of CJK character set. But, as
> xsdre-range-list-to-char-alternative (xsd-regexp.el) simply do
> something like this to generate a character range: (format "%c-%c"
> (decode-char 'ucs FROM) (decode-char 'ucs TO)) the resulting regexp
> causes the above error."

What it should do instead is to loop over all chars between FROM and TO and
add them one by one.  Character-ranges in regexps work rather poorly in
such cases.

> The patch below supposedly fixes this issue (and it works for
> me).

It works around the problem, but will most likely break the use of those
CJK chars.


        Stefan


* Re: [Fwd: Re: more on mumamo/php mode]
  2007-07-24 20:02   ` Stefan Monnier
@ 2007-07-24 22:12     ` Lennart Borgman (gmail)
  2007-07-25  1:01       ` Stefan Monnier
  0 siblings, 1 reply; 5+ messages in thread
From: Lennart Borgman (gmail) @ 2007-07-24 22:12 UTC
  To: Stefan Monnier; +Cc: Claus, Emacs Devel


There are too many things I do not understand here. Could you, or
someone else who understands this, please write the needed code?


* Re: [Fwd: Re: more on mumamo/php mode]
  2007-07-24 22:12     ` Lennart Borgman (gmail)
@ 2007-07-25  1:01       ` Stefan Monnier
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Monnier @ 2007-07-25  1:01 UTC
  To: Lennart Borgman (gmail); +Cc: Claus, Emacs Devel



> There are too many things I do not understand here. Could you, or
> someone else who understands this, please write the needed code?

As I said earlier, you just need to enumerate the chars individually rather
than use char-ranges in regexps.  I.e. you need to replace things like "a-d"
with "abcd".  This may result in absurdly large regexps, which may be
a problem in itself, but in that case the only better solution would be to
restructure the code so as not to use regexps, or to use category classes or
something like that.


        Stefan
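
For illustration, here is a minimal sketch of the enumeration Stefan
describes. It is not code from the thread or from xsd-regexp.el: the
helper name my-xsdre-enumerate-range is hypothetical, and the sketch
assumes FROM and TO are Unicode code points. It decodes each code point
on its own instead of emitting a "FROM-TO" range, so the two endpoints
no longer have to land in the same charset.

```elisp
;; Hypothetical helper for illustration; not part of xsd-regexp.el.
(defun my-xsdre-enumerate-range (from to)
  "Return a string of each decodable character from FROM to TO.
FROM and TO are Unicode code points, inclusive."
  (let ((chars nil)
        (c from))
    (while (<= c to)
      ;; `decode-char' returns nil for code points it cannot decode.
      (let ((ch (decode-char 'ucs c)))
        (when ch
          (setq chars (cons ch chars))))
      (setq c (1+ c)))
    ;; `concat' accepts a list of characters and returns a string.
    (concat (nreverse chars))))

;; Build a bracket expression from the enumerated characters.
(concat "[" (my-xsdre-enumerate-range ?a ?d) "]")
```

The last form evaluates to "[abcd]", which is the "a-d" to "abcd"
replacement mentioned above. The sketch does not handle characters that
are special inside a bracket expression (such as "]", "^" and "-"), and
for wide ranges the resulting string can become very large, as Stefan
warns.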
