* case conversion by replace-match
From: Roland Winkler @ 2003-05-16 14:16 UTC


This bug report will be sent to the Free Software Foundation,
not to your local site managers!
Please write in English, because the Emacs maintainers do not have
translators to read other languages for them.

Your bug report will be posted to the bug-gnu-emacs@gnu.org mailing list,
and to the gnu.emacs.bug news group.

In GNU Emacs 21.2.1 (i386-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2002-04-09 on tfkp12
configured using `configure  --prefix=/nfs/common --libexecdir=/nfs/common/lib --bindir=/nfs/common/lib/emacs/21.2/bin/i686-Linux --mandir=/nfs/common/share/man --infodir=/nfs/common/share/info --with-gcc --with-pop --with-x --with-x-toolkit=athena i386-pc-linux'
Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: POSIX
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US
  locale-coding-system: iso-latin-1
  default-enable-multibyte-characters: nil

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:


Start a fresh emacs --no-init-file

define the following function

(defun foo ()
  (interactive)
  (let (case-fold-search)
    (while (search-forward "=FC" nil t)
      (replace-match (string 252) nil t))))

`(string 252)' gives a lowercase umlaut-u (iso-latin-1)

However, when foo is run in a buffer containing the string "=FC",
this string will be replaced with an uppercase umlaut-U.

PS In mime-encoded mails "=FC" represents a lowercase umlaut-u.
PPS Same problem with GNU Emacs 21.3.50.2.


* Re: case conversion by replace-match
From: Stefan Monnier @ 2003-05-16 19:47 UTC


>       (replace-match (string 252) nil t))))

You might want to pass t rather than nil as second arg.


        Stefan


* Re: case conversion by replace-match
From: Andreas Schwab @ 2003-05-16 21:07 UTC
  Cc: bug-gnu-emacs

Roland Winkler <roland.winkler@physik.uni-erlangen.de> writes:

|> Start a fresh emacs --no-init-file
|> 
|> define the following function
|> 
|> (defun foo ()
|>   (interactive)
|>   (let (case-fold-search)
|>     (while (search-forward "=FC" nil t)
|>       (replace-match (string 252) nil t))))
|> 
|> `(string 252)' gives a lowercase umlaut-u (iso-latin-1)
|> 
|> However, when foo is run in a buffer containing the string "=FC",
|> this string will be replaced with an uppercase umlaut-U.

Exactly as documented.  If you don't want this, pass a non-nil second
argument to replace-match:

    If second arg FIXEDCASE is non-nil, do not alter case of replacement text.
    Otherwise maybe capitalize the whole text, or maybe just word initials,
    based on the replaced text.
    If the replaced text has only capital letters
    and has at least one multiletter word, convert NEWTEXT to all caps.
    Otherwise if all words are capitalized in the replaced text,
    capitalize each word in NEWTEXT.
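
As a minimal sketch of that advice, here is the reported function with
FIXEDCASE set to t.  The name `my-decode-umlaut-u' is hypothetical and not
from the thread; the only change from the reported code is the second
argument to replace-match.

```elisp
;; Sketch only: same loop as the reported `foo', but with FIXEDCASE = t
;; so that replace-match inserts NEWTEXT exactly as given.
(defun my-decode-umlaut-u ()
  "Replace each \"=FC\" after point with character 252, keeping its case."
  (interactive)
  (let ((case-fold-search nil))         ; match \"=FC\" case-sensitively
    (while (search-forward "=FC" nil t)
      ;; Second arg t: do not alter the case of the replacement.
      ;; Third arg t: treat the replacement literally.
      (replace-match (string 252) t t))))
```

Like the original, it searches forward from point, so move to the start of
the buffer first if the whole buffer should be converted.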

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: case conversion by replace-match
From: Kevin Rodgers @ 2003-05-16 21:32 UTC


Andreas Schwab wrote:

> Roland Winkler <roland.winkler@physik.uni-erlangen.de> writes:
> 
> |> Start a fresh emacs --no-init-file
> |> 
> |> define the following function
> |> 
> |> (defun foo ()
> |>   (interactive)
> |>   (let (case-fold-search)
> |>     (while (search-forward "=FC" nil t)
> |>       (replace-match (string 252) nil t))))
> |> 
> |> `(string 252)' gives a lowercase umlaut-u (iso-latin-1)
> |> 
> |> However, when foo is run in a buffer containing the string "=FC",
> |> this string will be replaced with an uppercase umlaut-U.
> 
> Exactly as documented.


That's what I thought at first.  Then I thought the "=" obviously means that
not all the characters in the replaced text are capital letters, so NEWTEXT
should not be uppercased.  But then I thought the "=" divides the replaced
text into 2 words, an empty word and an uppercase word; so if the empty word is
ignored and an uppercase word is considered to be capitalized, then (each word
in) the replacement text should be capitalized.

But really, that's all just a rationalization to support the observed behavior.
The user should specify FIXEDCASE as t if he/she knows that NEWTEXT is
case-precise.  And Emacs should not consider "=FC" to be a sequence of
capitalized words.

> If you don't want this, pass a non-nil second
> argument to replace-match:
> 
>     If second arg FIXEDCASE is non-nil, do not alter case of replacement text.
>     Otherwise maybe capitalize the whole text, or maybe just word initials,
>     based on the replaced text.
>     If the replaced text has only capital letters
>     and has at least one multiletter word, convert NEWTEXT to all caps.
>     Otherwise if all words are capitalized in the replaced text,
>     capitalize each word in NEWTEXT.


-- 
Kevin Rodgers <kevin.rodgers@ihs.com>


* Re: case conversion by replace-match
From: Andreas Schwab @ 2003-05-16 21:51 UTC
  Cc: gnu-emacs-bug

Kevin Rodgers <ihs_4664@yahoo.com> writes:

|> Andreas Schwab wrote:
|> 
|> > Roland Winkler <roland.winkler@physik.uni-erlangen.de> writes:
|> > |> Start a fresh emacs --no-init-file
|> > |>
|> > |> define the following function
|> > |>
|> > |> (defun foo ()
|> > |>   (interactive)
|> > |>   (let (case-fold-search)
|> > |>     (while (search-forward "=FC" nil t)
|> > |>       (replace-match (string 252) nil t))))
|> > |>
|> > |> `(string 252)' gives a lowercase umlaut-u (iso-latin-1)
|> > |>
|> > |> However, when foo is run in a buffer containing the string "=FC",
|> > |> this string will be replaced with an uppercase umlaut-U.
|> >
|> > Exactly as documented.
|> 
|> 
|> That's what I thought at first.  Then I thought the "=" obviously means that
|> not all the characters in the replaced text are capital letters, so NEWTEXT

Since "=" is not a letter, it is ignored.  Otherwise punctuation marks would
make case-replace useless.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: case conversion by replace-match
From: Richard Stallman @ 2003-05-17 13:50 UTC
  Cc: bug-gnu-emacs

    However, when foo is run in a buffer containing the string "=FC",
    this string will be replaced with an uppercase umlaut-U.

This is a feature: when replacing an uppercase word, replace-match
converts the replacement to uppercase.  If you don't want that feature,
pass t for the second argument to replace-match.

    PS In mime-encoded mails "=FC" represents a lowercase umlaut-u.

replace-match only cares that the letters are upper case.
It does not know you intend them to stand for something else.

By the way, perhaps you want to use mail-unquote-printable-region
to do this decoding.
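
A rough sketch of how that function might be called on a string follows.
The wrapper name `my-unquote-printable-string' is hypothetical;
`mail-unquote-printable-region' itself lives in mail-utils.el.  How
non-ASCII bytes such as "=FC" come out depends on the buffer's multibyte
setting and on the function's optional UNIBYTE argument, so only an ASCII
case is shown.

```elisp
(require 'mail-utils)

;; Sketch only: decode quoted-printable escapes in STRING by running
;; `mail-unquote-printable-region' over a temporary buffer.
(defun my-unquote-printable-string (string)
  "Return STRING with quoted-printable escapes such as \"=3D\" decoded."
  (with-temp-buffer
    (insert string)
    (mail-unquote-printable-region (point-min) (point-max))
    (buffer-string)))

;; Example call: "=3D" is the quoted-printable encoding of "=".
(my-unquote-printable-string "a=3Db")
```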


* Re: case conversion by replace-match
From: Roland Winkler @ 2003-05-17 20:09 UTC
  Cc: Richard Stallman

Richard Stallman <rms@gnu.org> writes:

>     However, when foo is run in a buffer containing the string "=FC",
>     this string will be replaced with an uppercase umlaut-U.
> 
> This is a feature: when replacing an uppercase word, replace-match
> converts the replacement to uppercase.  If you don't want that feature,
> pass t for the second argument to replace-match.

Thanks a lot. Passing t for the second argument to replace-match
gives me what I want.

I was just surprised that "=E4" and "=F6" gave me lowercase umlaut-a
and umlaut-o as desired. But replacing "=FC" with umlaut-u did not
work.

>     PS In mime-encoded mails "=FC" represents a lowercase umlaut-u.
> 
> replace-match only cares that the letters are upper case.
> It does not know you intend them to stand for something else.

But then I would expect that "=E4" (umlaut-a) and "=F6" (umlaut-o)
should give rise to a case conversion, too. Or am I once again
missing something?

> By the way, perhaps you want to use mail-unquote-printable-region
> to do this decoding.

The code is supposed to handle various conversions for umlauts:
"7 bit", iso-latin, TeX, German LaTeX, html, mime, etc.

Roland


* Re: case conversion by replace-match
From: Kevin Rodgers @ 2003-05-19 20:24 UTC


Roland Winkler wrote:

> Richard Stallman <rms@gnu.org> writes:
>>replace-match only cares that the letters are upper case.
>>It does not know you intend them to stand for something else.
> 
> But then I would expect that "=E4" (umlaut-a) and "=F6" (umlaut-o)
> should give rise to a case conversion, too. Or am I once again
> missing something?

No, your expectation is reasonable.  But Emacs is inconsistent in how
non-letters in REGEXP are treated.
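
One way to compare the inputs side by side is a small helper along these
lines.  The name `my-replace-match-result' is hypothetical, and no results
are listed for the FIXEDCASE = nil case because, as this thread shows, they
depend on the Emacs version.

```elisp
;; Sketch only: return what a buffer containing OLD looks like after
;; OLD is replaced with NEW by `replace-match'.
(defun my-replace-match-result (old new &optional fixedcase)
  "Replace OLD with NEW in a scratch buffer and return the buffer text.
FIXEDCASE is passed through as the second argument of `replace-match'."
  (with-temp-buffer
    (insert old)
    (goto-char (point-min))
    (let ((case-fold-search nil))
      (when (search-forward old nil t)
        ;; Third arg t: NEW is inserted literally.
        (replace-match new fixedcase t)))
    (buffer-string)))

;; Collect (INPUT . RESULT) pairs with case conversion enabled.
(mapcar (lambda (old)
          (cons old (my-replace-match-result old (string 252))))
        '("=FC" "=E4" "=F6"))
```

With FIXEDCASE non-nil the helper returns NEW unchanged for every input,
which is the behaviour the original report wanted.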


BTW, this inconsistency was not present in Emacs 19.34.

-- 
Kevin Rodgers <kevin.rodgers@ihs.com>


* Re: case conversion by replace-match
From: Richard Stallman @ 2003-05-21  1:55 UTC
  Cc: bug-gnu-emacs

      But Emacs is inconsistent in how
    non-letters in REGEXP are treated.

    BTW, this inconsistency was not present in Emacs 19.34.

What people expect in case propagation is inconsistent; it depends on
semantics that only an AI could understand.  So there is no chance
we could make this always work.

Consistent implementations tend to be wrong very often.  I made
changes in order to produce results that people expect in more cases.
(Sorry I don't remember which cases--it was too long ago.)
