unofficial mirror of emacs-devel@gnu.org 
* Re: Your Emacs changes
       [not found] <E1D7MTJ-0002Jj-BJ@fencepost.gnu.org>
@ 2005-04-26 12:07 ` Arne Jørgensen
  2005-04-28 11:58   ` Thien-Thi Nguyen
  0 siblings, 1 reply; 13+ messages in thread
From: Arne Jørgensen @ 2005-04-26 12:07 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 923 bytes --]

Richard Stallman <rms@gnu.org> wrote March 4, 2005:

> We can now install them.  Could you send me the latest version of your
> changes, with change log entries?

First of all: pardon me for answering the above e-mail on this list. I
have tried to answer it privately several times since I received it,
without success (I am used to having problems with the gnu.org mail
server). So now I'm trying to post to the list via Gmane instead.

The mentioned changes were the coding system recognition for LaTeX
files.

If it will not be committed to Emacs 22 (which I would understand), I
will probably post it to gnu.emacs.sources.

This is a ChangeLog entry for the change:

2005-04-26  Arne Jørgensen  <arne@arnested.dk>

	* international/latexenc.el: New file. Guess correct coding system
	of LaTeX files.

	* international/mule-conf.el (file-coding-system-alist): Use it.

Kind regards,
-- 
Arne Jørgensen <http://arnested.dk/>


[-- Attachment #2: lisp/international/latexenc.el --]
[-- Type: application/emacs-lisp, Size: 7287 bytes --]

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: Patch for lisp/international/mule-conf.el --]
[-- Type: text/x-patch, Size: 706 bytes --]

Index: lisp/international/mule-conf.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/international/mule-conf.el,v
retrieving revision 1.78
diff -u -p -r1.78 mule-conf.el
--- lisp/international/mule-conf.el	20 Apr 2005 14:31:24 -0000	1.78
+++ lisp/international/mule-conf.el	26 Apr 2005 11:47:26 -0000
@@ -501,6 +501,7 @@ for decoding and encoding files, process
 	("\\(\\`\\|/\\)loaddefs.el\\'" . (raw-text . raw-text-unix))
 	("\\.tar\\'" . (no-conversion . no-conversion))
 	( "\\.po[tx]?\\'\\|\\.po\\." . po-find-file-coding-system)
+	("\\.tex\\|\\.ltx\\|\\.dtx\\|\\.drv\\'" . latexenc-find-file-coding-system)
 	("" . (undecided . nil))))
 
 \f

[-- Attachment #4: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Your Emacs changes
  2005-04-26 12:07 ` Your Emacs changes Arne Jørgensen
@ 2005-04-28 11:58   ` Thien-Thi Nguyen
  2005-04-29 12:11     ` latexenc-find-file-coding-system is slow. (was: Your Emacs changes) Lute Kamstra
  0 siblings, 1 reply; 13+ messages in thread
From: Thien-Thi Nguyen @ 2005-04-28 11:58 UTC (permalink / raw)
  Cc: emacs-devel

Arne Jørgensen <arne@arnested.dk> writes:

> Richard Stallman <rms@gnu.org> wrote March 4, 2005:
> 
> > We can now install them.  Could you send me the
> > latest version of your changes, with change log
> > entries?
> 
> So now I'm trying to post to list

i have installed the files w/ the suggested names;
only modification was to re-indent latexenc.el.

thi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* latexenc-find-file-coding-system is slow. (was: Your Emacs changes)
  2005-04-28 11:58   ` Thien-Thi Nguyen
@ 2005-04-29 12:11     ` Lute Kamstra
  2005-04-29 14:57       ` latexenc-find-file-coding-system is slow Arne Jørgensen
                         ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Lute Kamstra @ 2005-04-29 12:11 UTC (permalink / raw)
  Cc: emacs-devel, Arne Jørgensen

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=utf-8, Size: 2121 bytes --]

Thien-Thi Nguyen <ttn@gnu.org> writes:

> Arne Jørgensen <arne@arnested.dk> writes:
>
>> Richard Stallman <rms@gnu.org> wrote March 4, 2005:
>> 
>> > We can now install them.  Could you send me the
>> > latest version of your changes, with change log
>> > entries?
>> 
>> So now I'm trying to post to list
>
> i have installed the files w/ the suggested names;
> only modification was to re-indent latexenc.el.

Since this change, opening a 117k .texi file takes seconds.  It used
to take a fraction of a second.  I did a debug-on-quit during the wait
a couple of times and that consistently gave me one of these two
backtraces:

Debugger entered--Lisp error: (quit)
  re-search-forward("^[^%$]*\\inputencoding{\\(.*\\)}" nil t)
  latexenc-find-file-coding-system((insert-file-contents "/soft/careful/emacs/lispref/modes.texi" t 0 117151 nil))
  insert-file-contents("/soft/careful/emacs/lispref/modes.texi" t)
  byte-code("Â.Ã	Â\".)‡" [inhibit-read-only filename t insert-file-contents] 3)
  find-file-noselect-1(#<buffer modes.texi> "/soft/careful/emacs/lispref/modes.texi" nil nil "/soft/careful/emacs/lispref/modes.texi" (185464 775))
  find-file-noselect("/soft/careful/emacs/lispref/modes.texi" nil nil t)
  find-file("/soft/careful/emacs/lispref/modes.texi" t)
  call-interactively(find-file)

Debugger entered--Lisp error: (quit)
  re-search-forward("^[^%$]*\\usepackage\\[\\(.*\\)\\]{inputenc}" nil t)
  latexenc-find-file-coding-system((insert-file-contents "/soft/careful/emacs/lispref/modes.texi" t 0 117151 nil))
  insert-file-contents("/soft/careful/emacs/lispref/modes.texi" t)
  byte-code("Â.Ã	Â\".)‡" [inhibit-read-only filename t insert-file-contents] 3)
  find-file-noselect-1(#<buffer modes.texi<3>> "/soft/careful/emacs/lispref/modes.texi" nil nil "/soft/careful/emacs/lispref/modes.texi" (185464 775))
  find-file-noselect("/soft/careful/emacs/lispref/modes.texi" nil nil t)
  find-file("/soft/careful/emacs/lispref/modes.texi" t)
  call-interactively(find-file)

I guess the re-searching in latexenc-find-file-coding-system needs to
be improved.

Lute.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow.
  2005-04-29 12:11     ` latexenc-find-file-coding-system is slow. (was: Your Emacs changes) Lute Kamstra
@ 2005-04-29 14:57       ` Arne Jørgensen
  2005-04-29 15:48         ` Lute Kamstra
  2005-04-29 16:13       ` Stefan Monnier
                         ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Arne Jørgensen @ 2005-04-29 14:57 UTC (permalink / raw)
  Cc: Lute Kamstra

Lute Kamstra <Lute.Kamstra.lists@xs4all.nl> writes:

> Thien-Thi Nguyen <ttn@gnu.org> writes:
>
>> Arne Jørgensen <arne@arnested.dk> writes:
>>
>>> Richard Stallman <rms@gnu.org> wrote March 4, 2005:
>>> 
>>> > We can now install them.  Could you send me the
>>> > latest version of your changes, with change log
>>> > entries?
>>> 
>>> So now I'm trying to post to list
>>
>> i have installed the files w/ the suggested names;
>> only modification was to re-indent latexenc.el.
>
> Since this change, opening a 117k .texi file takes seconds. 

First of all latexenc-find-file-coding-system shouldn't search .texi
files. I just tested it and in my emacs it is not called on .texi
files, but there could be something wrong with the entry in
file-coding-system-alist:

 ("\\.tex\\|\\.ltx\\|\\.dtx\\|\\.drv\\'" . latexenc-find-file-coding-system)

Does that entry match .texi files?

Secondly the problem will of course still be there on large .tex files
etc. which latexenc-find-file-coding-system is supposed to search. See
below.

But my guess is that .tex files normally don't grow as large as .texi
files. YMMV.

> It used
> to take a fraction of a second.  I did a debug-on-quit during the wait
> a couple of times and that consistently gave me one of these two
> backtraces:

[...]

> I guess the re-searching in latexenc-find-file-coding-system needs to
> be improved.

latexenc-find-file-coding-system re-searches all of the file for an
\inputencoding{...} command and, if none is found, re-searches all of
the file for a \usepackage[...]{inputenc} command.

The two re-search-forward's could of course be limited to only search
the first n positions of the buffer. The problem, though, is finding
decent defaults for these limits.

It would be possible to introduce some variables for these limits and
let people customize them for their individual needs/tastes.

Kind regards,
-- 
Arne Jørgensen <http://arnested.dk/>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow.
  2005-04-29 14:57       ` latexenc-find-file-coding-system is slow Arne Jørgensen
@ 2005-04-29 15:48         ` Lute Kamstra
  0 siblings, 0 replies; 13+ messages in thread
From: Lute Kamstra @ 2005-04-29 15:48 UTC (permalink / raw)
  Cc: emacs-devel

Arne Jørgensen <arne@arnested.dk> writes:

> Lute Kamstra <Lute.Kamstra.lists@xs4all.nl> writes:

[...]

>> Since this change, opening a 117k .texi file takes seconds. 
>
> First of all latexenc-find-file-coding-system shouldn't search .texi
> files. I just tested it and in my emacs it is not called on .texi
> files, but there could be something wrong with the entry in
> file-coding-system-alist:
>
>  ("\\.tex\\|\\.ltx\\|\\.dtx\\|\\.drv\\'" . latexenc-find-file-coding-system)
>
> Does that entry match .texi files?

Yup, you probably mean "\\.\\(tex\\|ltx\\|dtx\\|drv\\)\\'".  I'll fix that.
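
For illustration, a small check of why the original entry also fires on
.texi files: `\\|' has the lowest precedence in Emacs regexps, so the
trailing `\\'' anchors only the last alternative. The names below are
made up for this sketch and are not part of latexenc.el or mule-conf.el.

```elisp
;; Hypothetical names, for illustration only.
(defconst my-latexenc-old-filename-regexp
  "\\.tex\\|\\.ltx\\|\\.dtx\\|\\.drv\\'"
  "Regexp as first installed; only the last alternative is end-anchored.")

(defconst my-latexenc-new-filename-regexp
  "\\.\\(tex\\|ltx\\|dtx\\|drv\\)\\'"
  "Grouped regexp; every listed extension must end the file name.")

(defun my-latexenc-filename-match-p (regexp filename)
  "Return t if REGEXP matches FILENAME, compared case-sensitively."
  (let ((case-fold-search nil))
    (and (string-match regexp filename) t)))

;; The old regexp finds ".tex" inside "modes.texi"; the grouped one does not.
(my-latexenc-filename-match-p my-latexenc-old-filename-regexp "modes.texi") ; t
(my-latexenc-filename-match-p my-latexenc-new-filename-regexp "modes.texi") ; nil
(my-latexenc-filename-match-p my-latexenc-new-filename-regexp "paper.tex")  ; t
```

The grouping is what makes the end-of-string anchor apply to all four
extensions.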

> Secondly the problem will of course still be there on large .tex files
> etc. which latexenc-find-file-coding-system is supposed to search. See
> below.
>
> But my guess is that .tex files normally don't grow as large as .texi
> files. YMMV.

My LaTeX files are typically 100k, sometimes even 200k.

>> It used to take a fraction of a second.  I did a debug-on-quit
>> during the wait a couple of times and that consistently gave me one
>> of these two backtraces:
>
> [...]
>
>> I guess the re-searching in latexenc-find-file-coding-system needs to
>> be improved.
>
> latexenc-find-file-coding-system re-searches all of the file for an
> \inputencoding{...} command and, if none is found, re-searches all of
> the file for a \usepackage[...]{inputenc} command.
>
> The two re-search-forward's could of course be limited to only search
> the first n positions of the buffer. The problem, though, is finding
> decent defaults for these limits.
>
> It would be possible to introduce some variables for these limits and
> let people customize them for their individual needs/tastes.

That seems like a good approach.

There is no need to search beyond a \begin{document}, is there?  Maybe
that can be used as well: first search the first N characters for
\begin{document} and then search backward for \inputencoding{...} and
\usepackage[...]{inputenc}.  You probably have to test this on a
number of files to see if it really helps.

Would you like to work on this?

Another idea is to change the regexps and search only for
\inputencoding{...} and \usepackage[...]{inputenc} at the start of a
line.  That's where they typically are.  This will speed up the search
dramatically.  If you agree with it, I'll apply this fix for now.

Lute.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow.
  2005-04-29 12:11     ` latexenc-find-file-coding-system is slow. (was: Your Emacs changes) Lute Kamstra
  2005-04-29 14:57       ` latexenc-find-file-coding-system is slow Arne Jørgensen
@ 2005-04-29 16:13       ` Stefan Monnier
       [not found]         ` <877jil8rz5.fsf@arnested.dk>
  2005-04-30  8:08       ` David Kastrup
                         ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Stefan Monnier @ 2005-04-29 16:13 UTC (permalink / raw)
  Cc: Thien-Thi Nguyen, Arne Jørgensen, emacs-devel

>   re-search-forward("^[^%$]*\\usepackage\\[\\(.*\\)\\]{inputenc}" nil t)
                              ^^
shouldn't this be \\\\ ?

> I guess the re-searching in latexenc-find-file-coding-system needs to
> be improved.

I see two obvious ways to speed it up:
- use (re-search-forward "\\\\usepackage\\[\\(.*\\)\\]{inputenc}")
  and once it matched, check if it's inside a comment.  This should be
  *much* faster because of how the regexp-engine works (basically,
  it will backtrack much less).  The search as it is coded now could very
  well fail with "regexp stack overflow".
- don't search through the whole buffer but only through the first part (10K
  or so) of it.
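
A rough sketch of how the two suggestions above might be combined. The
function name and the LIMIT argument are assumptions made for this
example, and the comment check is deliberately simple (it treats any `%'
earlier on the line as a comment start, including an escaped `\%'). This
is not the code that was installed in latexenc.el.

```elisp
(defun my-latexenc-find-usepackage-option (&optional limit)
  "Return the option of an uncommented \\usepackage[...]{inputenc}, or nil.
Search from the start of the buffer; if LIMIT is non-nil, look at no
more than the first LIMIT characters.  Hypothetical helper."
  (save-excursion
    (goto-char (point-min))
    (let ((bound (and limit (min (point-max) (+ (point-min) limit))))
          (case-fold-search nil)
          result)
      ;; No "^[^%...]*" prefix here: the regexp starts with a literal
      ;; backslash, so the matcher has little to backtrack over.
      (while (and (not result)
                  (re-search-forward
                   "\\\\usepackage\\[\\([^]\n]*\\)\\]{inputenc}" bound t))
        (let ((option (match-string 1))
              (start (match-beginning 0)))
          ;; Only now check whether a % precedes the match on its line.
          (unless (save-excursion
                    (goto-char start)
                    (skip-chars-backward "^%\n")
                    (eq (char-before) ?%))
            (setq result option))))
      result)))
```

Called as (my-latexenc-find-usepackage-option 10000), it would look only
at roughly the first 10K of the buffer.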

> Since this change, opening a 117k .texi file takes seconds.  It used

The filename regexp was broken, I've adjusted it so it doesn't get triggered
for .texi files.


        Stefan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow.
       [not found]         ` <877jil8rz5.fsf@arnested.dk>
@ 2005-04-29 17:19           ` Lute Kamstra
  0 siblings, 0 replies; 13+ messages in thread
From: Lute Kamstra @ 2005-04-29 17:19 UTC (permalink / raw)
  Cc: Thien-Thi Nguyen, Stefan Monnier, emacs-devel

Arne Jørgensen <arne@arnested.dk> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:

[...]

>> - use (re-search-forward "\\\\usepackage\\[\\(.*\\)\\]{inputenc}")
>>   and once it matched, check if it's inside a comment.  This should be
>>   *much* faster because of how the regexp-engine works (basically,
>>   it will backtrack much less).  The search as it is coded now could very
>>   well fail with "regexp stack overflow".

[...]

> Lute Kamstra <Lute.Kamstra.lists@xs4all.nl> writes:

[...]

>> Another idea is to change the regexps and search only for
>> \inputencoding{...} and \usepackage[...]{inputenc} at the start of a
>> line.  That's where they typically are.  This will speed up the search
>> dramatically.
>
> That would probably be an ok compromise (between what is actually
> legal LaTeX and what is normally used). But if the other approaches can
> do the trick, I would prefer to avoid this.

You can avoid it by using Stefan's suggestion above.  It is probably
faster than my idea, but a bit more work to implement.

Lute.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow.
  2005-04-29 12:11     ` latexenc-find-file-coding-system is slow. (was: Your Emacs changes) Lute Kamstra
  2005-04-29 14:57       ` latexenc-find-file-coding-system is slow Arne Jørgensen
  2005-04-29 16:13       ` Stefan Monnier
@ 2005-04-30  8:08       ` David Kastrup
  2005-05-01 11:08         ` Lute Kamstra
  2005-05-01 12:07       ` latexenc-find-file-coding-system is slow. (was: Your Emacs changes) Richard Stallman
       [not found]       ` <871x8m96p6.fsf@arnested.dk>
  4 siblings, 1 reply; 13+ messages in thread
From: David Kastrup @ 2005-04-30  8:08 UTC (permalink / raw)
  Cc: Thien-Thi Nguyen, Arne Jørgensen, emacs-devel

Lute Kamstra <Lute.Kamstra.lists@xs4all.nl> writes:

> Thien-Thi Nguyen <ttn@gnu.org> writes:
>
>> Arne Jørgensen <arne@arnested.dk> writes:
>>
>>> Richard Stallman <rms@gnu.org> wrote March 4, 2005:
>>> 
>>> > We can now install them.  Could you send me the
>>> > latest version of your changes, with change log
>>> > entries?
>>> 
>>> So now I'm trying to post to list
>>
>> i have installed the files w/ the suggested names;
>> only modification was to re-indent latexenc.el.
>
> Since this change, opening a 117k .texi file takes seconds.  It used
> to take a fraction of a second.  I did a debug-on-quit during the wait
> a couple of times and that consistently gave me one of these two
> backtraces:
>
> Debugger entered--Lisp error: (quit)
>   re-search-forward("^[^%$]*\\inputencoding{\\(.*\\)}" nil t)
>
> Debugger entered--Lisp error: (quit)
>   re-search-forward("^[^%$]*\\usepackage\\[\\(.*\\)\\]{inputenc}" nil t)
>
> I guess the re-searching in latexenc-find-file-coding-system needs to
> be improved.

It simply needs to get limited: if the string is not in the first 3k
or so, don't look further for it.

However, if a 117k file already takes seconds, this also is a sign
that the regular expressions are faulty (even searching the whole file
should not take seconds at this size).  And it certainly is a bug that
the inverted character range contains $ instead of \n: $ is not
special in the expression.
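
To make the point about `$' concrete, here is a tiny check. The helper
name is invented for this sketch. Inside a bracket expression `$' is an
ordinary character, so "[^%$]" also matches newlines and the prefix
"^[^%$]*" can run across many lines, which plausibly adds to the
backtracking.

```elisp
(defun my-prefix-match-start (prefix-regexp text)
  "Return the index in TEXT where PREFIX-REGEXP followed by \"X\" matches.
Hypothetical helper used only to compare the two prefixes."
  (let ((case-fold-search nil))
    (string-match (concat prefix-regexp "X") text)))

;; With "$" in the class, the prefix swallows the two newlines and the
;; match starts at the very beginning of the text.
(my-prefix-match-start "^[^%$]*" "a\nb\nX")   ; 0

;; With "\n" in the class, the prefix stays on one line, so the match
;; can only start at the beginning of the line that contains X.
(my-prefix-match-start "^[^%\n]*" "a\nb\nX")  ; 4
```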

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow.
  2005-04-30  8:08       ` David Kastrup
@ 2005-05-01 11:08         ` Lute Kamstra
  0 siblings, 0 replies; 13+ messages in thread
From: Lute Kamstra @ 2005-05-01 11:08 UTC (permalink / raw)
  Cc: Thien-Thi Nguyen, Arne Jørgensen, Stefan Monnier,
	emacs-devel

David Kastrup <dak@gnu.org> writes:

[...]

> However, if a 117k file already takes seconds, this also is a sign
> that the regular expressions are faulty (even searching the whole file
> should not take seconds at this size).  And it certainly is a bug that
> the inverted character range contains $ instead of \n: $ is not
> special in the expression.

For now, I've installed the fix below.

Lute.


*** lisp/international/latexenc.el      28 Apr 2005 20:58:55 -0000      1.2
--- lisp/international/latexenc.el      1 May 2005 10:49:43 -0000
***************
*** 121,128 ****
          ;; try to find the coding system in this file
          (goto-char (point-min))
          (if (or
!              (re-search-forward "^[^%$]*\\inputencoding{\\(.*\\)}" nil t)
!              (re-search-forward "^[^%$]*\\usepackage\\[\\(.*\\)\\]{inputenc}" nil t))
              (let* ((match (match-string 1))
                     (sym (intern match)))
                (when (latexenc-inputenc-to-coding-system match)
--- 121,128 ----
          ;; try to find the coding system in this file
          (goto-char (point-min))
          (if (or
!              (re-search-forward "^[^%\n]*\\\\inputencoding{\\(.*\\)}" nil t)
!              (re-search-forward "^[^%\n]*\\\\usepackage\\[\\(.*\\)\\]{inputenc}" nil t))
              (let* ((match (match-string 1))
                     (sym (intern match)))
                (when (latexenc-inputenc-to-coding-system match)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow. (was: Your Emacs changes)
  2005-04-29 12:11     ` latexenc-find-file-coding-system is slow. (was: Your Emacs changes) Lute Kamstra
                         ` (2 preceding siblings ...)
  2005-04-30  8:08       ` David Kastrup
@ 2005-05-01 12:07       ` Richard Stallman
       [not found]       ` <871x8m96p6.fsf@arnested.dk>
  4 siblings, 0 replies; 13+ messages in thread
From: Richard Stallman @ 2005-05-01 12:07 UTC (permalink / raw)
  Cc: ttn, arne, emacs-devel

    I guess the re-searching in latexenc-find-file-coding-system needs to
    be improved.

It would be faster to search for the strings "\\inputencoding{" and
"\\usepackage\\[" using search-forward, and each time it finds an
occurrence, do string-match at the beginning of the line to see if the
line is a real match.  It would need to do this loop once for each
string.  Such a loop could easily be 100 times faster than the call to
re-search-forward.

   (while (and (not found) (search-forward string nil t))
     (save-excursion
       (beginning-of-line)
       (if (looking-at entire-regexp)
           (setq found (point))))
     (forward-line 1))

There's another way to speed this up, if there is some rule about
where in the file those constructs should occur.  For instance, if the
rule is that they should occur before the first \foo, and \foo is
commonly found in LaTeX files, it could be faster to search first for
\foo, then use the position of the first \foo as a bound in the other
searches.
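
As a sketch of the bounded variant, assume \begin{document} plays the
role of \foo (as suggested earlier in this thread) and fall back to a
fixed character limit when it is absent. The function names and the
default of 10000 characters are assumptions made for this example, not
what latexenc.el does.

```elisp
(defun my-latexenc-preamble-bound (&optional max-chars)
  "Return the position of the first \\begin{document}, or a fallback bound.
Only the first MAX-CHARS characters (default 10000) are examined.
Hypothetical helper."
  (save-excursion
    (goto-char (point-min))
    (let ((limit (min (point-max) (+ (point-min) (or max-chars 10000)))))
      (if (search-forward "\\begin{document}" limit t)
          (match-beginning 0)
        limit))))

(defun my-latexenc-inputencoding (&optional max-chars)
  "Return the argument of an uncommented \\inputencoding{...}, or nil.
The search stops at the bound from `my-latexenc-preamble-bound'."
  (save-excursion
    (goto-char (point-min))
    (let ((bound (my-latexenc-preamble-bound max-chars))
          (case-fold-search nil)
          found)
      ;; Cheap literal search first; confirm with a regexp on that line.
      ;; The (<= (point) bound) test keeps `forward-line' from leaving
      ;; point past the bound, which `search-forward' would reject.
      (while (and (not found)
                  (<= (point) bound)
                  (search-forward "\\inputencoding{" bound t))
        (save-excursion
          (beginning-of-line)
          (when (looking-at "[^%\n]*\\\\inputencoding{\\([^}\n]*\\)}")
            (setq found (match-string 1))))
        (forward-line 1))
      found)))
```

The literal search does most of the scanning, and the regexp is only
tried on lines that already contain the string.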

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow.
       [not found]       ` <871x8m96p6.fsf@arnested.dk>
@ 2005-05-11 11:48         ` Lute Kamstra
  2005-05-11 17:21           ` Arne Jørgensen
  0 siblings, 1 reply; 13+ messages in thread
From: Lute Kamstra @ 2005-05-11 11:48 UTC (permalink / raw)
  Cc: Thien-Thi Nguyen, Stefan Monnier, Richard Stallman, emacs-devel

[For some reason, I didn't see your message on emacs-devel.]

Arne Jørgensen <arne@arnested.dk> writes:

> Lute Kamstra <Lute.Kamstra.lists@xs4all.nl> writes:
>
>> I guess the re-searching in latexenc-find-file-coding-system needs to
>> be improved.
>
> I have made a try at speeding up the search for input encoding.

Thanks.

> What it does now is search for "inputenc" with `search-forward' until
> the first match also matches (with `looking-at') an
> \inputencoding{...} or \usepackage[...]{inputenc} (that is not inside
> a comment).

That should speed things up.

> For now I have not added a limit for how far to search for this
> because I still see no way to decide this for all common cases. And
> then I would like to hear if the change has any effect on the speed as
> it is.

It's probably fast enough now.

> I limited the search for a TeX-master/tex-main-file in the local
> variables section though. The limit is borrowed from
> `hack-local-variables'.

That should work.

[...]

I think there are some minor problems with the details of the
implementation:

> Index: lisp/international/latexenc.el
> ===================================================================
> RCS file: /cvsroot/emacs/emacs/lisp/international/latexenc.el,v
> retrieving revision 1.3
> diff -u -p -r1.3 latexenc.el
> --- lisp/international/latexenc.el	1 May 2005 11:01:49 -0000	1.3
> +++ lisp/international/latexenc.el	4 May 2005 18:12:23 -0000
> @@ -120,24 +120,33 @@ coding system names is determined from `
>        (save-excursion
>          ;; try to find the coding system in this file
>          (goto-char (point-min))
> -        (if (or
> -             (re-search-forward "^[^%\n]*\\\\inputencoding{\\(.*\\)}" nil t)
> -             (re-search-forward "^[^%\n]*\\\\usepackage\\[\\(.*\\)\\]{inputenc}" nil t))
> -            (let* ((match (match-string 1))
> -                   (sym (intern match)))
> -              (when (latexenc-inputenc-to-coding-system match)
> -                (setq sym (latexenc-inputenc-to-coding-system match))
> -                (when (coding-system-p sym)
> -		  sym
> -                  (if (and (require 'code-pages nil t) (coding-system-p sym))
> -                      sym
> -                    'undecided))))
> +	(if (catch 'cs
> +	      (let ((case-fold-search nil))
> +		(while (search-forward "inputenc" nil t)
> +		  (goto-char (match-beginning 0))
> +		  (beginning-of-line)
> +		  (if (or (looking-at "[^%\n]*\\\\usepackage\\[\\(.*\\)\\]{\\(.*,\\)?inputenc\\(,.*\\)?}")

That also matches something like:

\usepackage[opt]{package} % don't use {package,inputenc}

> +			  (looking-at "[^%\n]*\\\\inputencoding{\\(.*\\)}"))
> +		      (throw 'cs (match-string 1))

Why throw (match-string 1) instead of t?

> +		    (goto-char (match-end 0))))))
> +	    (let* ((match (match-string 1))
> +		   (sym (intern match)))
> +	      (when (latexenc-inputenc-to-coding-system match)
> +		(setq sym (latexenc-inputenc-to-coding-system match)))
> +	      (when (coding-system-p sym)
> +		sym
> +		(if (and (require 'code-pages nil t) (coding-system-p sym))
> +		    sym
> +		  'undecided)))
>            ;; else try to find it in the master/main file
> -          (let (latexenc-main-file)
> +          (let (latexenc-main-file
> +		bound)
>              ;; is there a TeX-master or tex-main-file in the local variable section
>              (unless latexenc-dont-use-TeX-master-flag
>                (goto-char (point-max))
> -              (when (re-search-backward "^%+ *\\(TeX-master\\|tex-main-file\\): *\"\\(.+\\)\"" nil t)
> +	      (search-backward "\n\^L" (max (- (point-max) 3000) (point-min)) 'move)
> +	      (setq bound (search-forward "Local Variables:" nil t))
> +              (when (re-search-forward "^%+ *\\(TeX-master\\|tex-main-file\\): *\"\\(.+\\)\"" nil t)
>                  (let ((file (concat (file-name-directory (nth 1 arg-list)) (match-string 2))))
>                    (if (file-exists-p file)
>                        (setq latexenc-main-file file)

You don't seem to use the variable bound.

Could you also write a ChangeLog entry for your patch?

Lute.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow.
  2005-05-11 11:48         ` latexenc-find-file-coding-system is slow Lute Kamstra
@ 2005-05-11 17:21           ` Arne Jørgensen
  2005-05-11 23:07             ` Lute Kamstra
  0 siblings, 1 reply; 13+ messages in thread
From: Arne Jørgensen @ 2005-05-11 17:21 UTC (permalink / raw)
  Cc: Thien-Thi Nguyen, Stefan Monnier, Lute Kamstra, Richard Stallman

[-- Attachment #1: Type: text/plain, Size: 3661 bytes --]

Lute Kamstra <Lute.Kamstra.lists@xs4all.nl> writes:

> [For some reason, I didn't see your message on emacs-devel.]

Neither did I. It got lost again, as most of my mail to @gnu.org
addresses does. This is posted through Gmane instead.

> Arne Jørgensen <arne@arnested.dk> writes:

[...]

> I think there are some minor problems with the details of the
> implementation:
>
>> Index: lisp/international/latexenc.el
>> ===================================================================
>> RCS file: /cvsroot/emacs/emacs/lisp/international/latexenc.el,v
>> retrieving revision 1.3
>> diff -u -p -r1.3 latexenc.el
>> --- lisp/international/latexenc.el	1 May 2005 11:01:49 -0000	1.3
>> +++ lisp/international/latexenc.el	4 May 2005 18:12:23 -0000
>> @@ -120,24 +120,33 @@ coding system names is determined from `
>>        (save-excursion
>>          ;; try to find the coding system in this file
>>          (goto-char (point-min))
>> -        (if (or
>> -             (re-search-forward "^[^%\n]*\\\\inputencoding{\\(.*\\)}" nil t)
>> -             (re-search-forward "^[^%\n]*\\\\usepackage\\[\\(.*\\)\\]{inputenc}" nil t))
>> -            (let* ((match (match-string 1))
>> -                   (sym (intern match)))
>> -              (when (latexenc-inputenc-to-coding-system match)
>> -                (setq sym (latexenc-inputenc-to-coding-system match))
>> -                (when (coding-system-p sym)
>> -		  sym
>> -                  (if (and (require 'code-pages nil t) (coding-system-p sym))
>> -                      sym
>> -                    'undecided))))
>> +	(if (catch 'cs
>> +	      (let ((case-fold-search nil))
>> +		(while (search-forward "inputenc" nil t)
>> +		  (goto-char (match-beginning 0))
>> +		  (beginning-of-line)
>> +		  (if (or (looking-at "[^%\n]*\\\\usepackage\\[\\(.*\\)\\]{\\(.*,\\)?inputenc\\(,.*\\)?}")
>
> That also matches something like:
>
> \usepackage[opt]{package} % don't use {package,inputenc}

Right. It should be fixed now.
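
A small comparison of the two regexps on that line, as an illustration.
The constant and function names are invented for this sketch; the
regexps themselves are copied from the two versions of the patch.

```elisp
(defconst my-latexenc-usepackage-regexp-old
  "[^%\n]*\\\\usepackage\\[\\(.*\\)\\]{\\(.*,\\)?inputenc\\(,.*\\)?}"
  "Regexp from the first version of the patch.")

(defconst my-latexenc-usepackage-regexp-new
  "[^%\n]*\\\\usepackage\\[\\([^]]*\\)\\]{\\([^}]*,\\)?inputenc\\(,[^}]*\\)?}"
  "Regexp from the revised patch.")

(defun my-looking-at-line-p (regexp line)
  "Return t if REGEXP matches at the start of LINE, as `looking-at' would."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    (let ((case-fold-search nil))
      (and (looking-at regexp) t))))

(defconst my-latexenc-tricky-line
  "\\usepackage[opt]{package} % don't use {package,inputenc}"
  "The problematic line from the review.")

;; The ".*" groups run past the closing brace and into the comment.
(my-looking-at-line-p my-latexenc-usepackage-regexp-old my-latexenc-tricky-line) ; t
;; "[^}]*" cannot cross the closing brace, so the comment is not reached.
(my-looking-at-line-p my-latexenc-usepackage-regexp-new my-latexenc-tricky-line) ; nil
;; A plain declaration is still recognized.
(my-looking-at-line-p my-latexenc-usepackage-regexp-new
                      "\\usepackage[latin1]{inputenc}")                          ; t
```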

>> +			  (looking-at "[^%\n]*\\\\inputencoding{\\(.*\\)}"))
>> +		      (throw 'cs (match-string 1))
>
> Why throw (match-string 1) instead of t?

You're right.

>> +		    (goto-char (match-end 0))))))
>> +	    (let* ((match (match-string 1))
>> +		   (sym (intern match)))
>> +	      (when (latexenc-inputenc-to-coding-system match)
>> +		(setq sym (latexenc-inputenc-to-coding-system match)))
>> +	      (when (coding-system-p sym)
>> +		sym
>> +		(if (and (require 'code-pages nil t) (coding-system-p sym))
>> +		    sym
>> +		  'undecided)))
>>            ;; else try to find it in the master/main file
>> -          (let (latexenc-main-file)
>> +          (let (latexenc-main-file
>> +		bound)
>>              ;; is there a TeX-master or tex-main-file in the local variable section
>>              (unless latexenc-dont-use-TeX-master-flag
>>                (goto-char (point-max))
>> -              (when (re-search-backward "^%+ *\\(TeX-master\\|tex-main-file\\): *\"\\(.+\\)\"" nil t)
>> +	      (search-backward "\n\^L" (max (- (point-max) 3000) (point-min)) 'move)
>> +	      (setq bound (search-forward "Local Variables:" nil t))
>> +              (when (re-search-forward "^%+ *\\(TeX-master\\|tex-main-file\\): *\"\\(.+\\)\"" nil t)
>>                  (let ((file (concat (file-name-directory (nth 1 arg-list)) (match-string 2))))
>>                    (if (file-exists-p file)
>>                        (setq latexenc-main-file file)
>
> You don't seem to use the variable bound.

No. It's gone now. (I was probably tired).

> Could you also write a ChangeLog entry for your patch?

Done.

New patch attached.

Thanks and kind regards,
-- 
Arne Jørgensen <http://arnested.dk/>


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: latexenc.patch --]
[-- Type: text/x-patch, Size: 3415 bytes --]

Index: lisp/ChangeLog
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/ChangeLog,v
retrieving revision 1.7484
diff -u -p -r1.7484 ChangeLog
--- lisp/ChangeLog	11 May 2005 16:42:40 -0000	1.7484
+++ lisp/ChangeLog	11 May 2005 17:14:03 -0000
@@ -1,3 +1,11 @@
+2005-05-11  Arne Jørgensen  <arne@arnested.dk>
+
+	* international/latexenc.el (latexenc-find-file-coding-system):
+	Avoid `re-search-forward' when looking for input encoding because
+	of speed and safety. Better regexp's for recognizing input
+	encoding. Limit a search for TeX-master/tex-main-file to the local
+	variable section.
+
 2005-05-11  Stefan Monnier  <monnier@iro.umontreal.ca>
 
 	* files.el (executable-find): Move from executable.el. Use locate-file.
Index: lisp/international/latexenc.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/international/latexenc.el,v
retrieving revision 1.3
diff -u -p -r1.3 latexenc.el
--- lisp/international/latexenc.el	1 May 2005 11:01:49 -0000	1.3
+++ lisp/international/latexenc.el	11 May 2005 17:14:03 -0000
@@ -120,24 +120,32 @@ coding system names is determined from `
       (save-excursion
         ;; try to find the coding system in this file
         (goto-char (point-min))
-        (if (or
-             (re-search-forward "^[^%\n]*\\\\inputencoding{\\(.*\\)}" nil t)
-             (re-search-forward "^[^%\n]*\\\\usepackage\\[\\(.*\\)\\]{inputenc}" nil t))
-            (let* ((match (match-string 1))
-                   (sym (intern match)))
-              (when (latexenc-inputenc-to-coding-system match)
-                (setq sym (latexenc-inputenc-to-coding-system match))
-                (when (coding-system-p sym)
-		  sym
-                  (if (and (require 'code-pages nil t) (coding-system-p sym))
-                      sym
-                    'undecided))))
+	(if (catch 'cs
+	      (let ((case-fold-search nil))
+		(while (search-forward "inputenc" nil t)
+		  (goto-char (match-beginning 0))
+		  (beginning-of-line)
+		  (if (or (looking-at "[^%\n]*\\\\usepackage\\[\\([^]]*\\)\\]{\\([^}]*,\\)?inputenc\\(,[^}]*\\)?}")
+			  (looking-at "[^%\n]*\\\\inputencoding{\\([^}]*\\)}"))
+		      (throw 'cs t)
+		    (goto-char (match-end 0))))))
+	    (let* ((match (match-string 1))
+		   (sym (intern match)))
+	      (when (latexenc-inputenc-to-coding-system match)
+		(setq sym (latexenc-inputenc-to-coding-system match)))
+	      (when (coding-system-p sym)
+		sym
+		(if (and (require 'code-pages nil t) (coding-system-p sym))
+		    sym
+		  'undecided)))
           ;; else try to find it in the master/main file
           (let (latexenc-main-file)
             ;; is there a TeX-master or tex-main-file in the local variable section
             (unless latexenc-dont-use-TeX-master-flag
               (goto-char (point-max))
-              (when (re-search-backward "^%+ *\\(TeX-master\\|tex-main-file\\): *\"\\(.+\\)\"" nil t)
+	      (search-backward "\n\^L" (max (- (point-max) 3000) (point-min)) 'move)
+	      (search-forward "Local Variables:" nil t)
+              (when (re-search-forward "^%+ *\\(TeX-master\\|tex-main-file\\): *\"\\(.+\\)\"" nil t)
                 (let ((file (concat (file-name-directory (nth 1 arg-list)) (match-string 2))))
                   (if (file-exists-p file)
                       (setq latexenc-main-file file)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: latexenc-find-file-coding-system is slow.
  2005-05-11 17:21           ` Arne Jørgensen
@ 2005-05-11 23:07             ` Lute Kamstra
  0 siblings, 0 replies; 13+ messages in thread
From: Lute Kamstra @ 2005-05-11 23:07 UTC (permalink / raw)
  Cc: Thien-Thi Nguyen, Stefan Monnier, Richard Stallman, emacs-devel

Arne Jørgensen <arne@arnested.dk> writes:

[...]

> New patch attached.

Committed.

Thanks for your work,

  Lute.

^ permalink raw reply	[flat|nested] 13+ messages in thread
