* bug#5131: Subject: 23.1; interaction of transpose-regions with markers and multibyte chars
From: schochet @ 2009-12-06 2:22 UTC
To: bug-gnu-emacs
From: schochet@post.tau.ac.il
To: bug-gnu-emacs@gnu.org
Subject: 23.1; interaction of transpose-regions with markers and
multibyte chars
--text follows this line--
Repeated use of the function transpose-regions on regions defined by
markers sometimes yields unexpected results when those regions contain
multibyte characters. In some cases the text obtained after running
transpose-regions even includes characters that were not present before.
The function reverse-all given below is designed to reverse the order
of the characters in a specified region. However, I obtain the following
results:
input region: abcd      output region: dcba                as expected
input region: ÷bcd      output region: d÷bc                expected: dcb÷
input region: ÷ab"äé    output region has CJK ideograph    expected: éä"ba÷
To reproduce this bug, copy the text below, beginning with the line
starting with a semicolon, to a file, visit it in Emacs, and evaluate
the indicated Lisp expressions by typing \C-j at the end of the
indicated lines. Note that the Lisp expressions set markers to specific
locations, so the file should begin precisely where indicated. The
first character after the space following "case 1:" should be at
position 64 in the file. If for some reason it is not, the values
assigned to the variable start should be adjusted.
The file below also contains an alternative function reverse-all2,
which differs from reverse-all only in using variables instead of markers.
The function reverse-all2 yields the expected results in all the above cases.
This bug does not depend on my .emacs file, since I have reproduced it with
a blank .emacs file.
Please let me know if you need any more information.
Steve Schochet
;-*- mode: lisp-interaction; coding: utf-8-unix -*-

; case 1: abcd was: abcd
; case 2: ÷bcd was: ÷bcd
; case 3: ÷ab"äé was: ÷ab"äé
(progn (defvar start nil) (defvar len nil)) ;do \C-j here
; Using markers to move multi-byte characters may cause problems
(progn (setq begm (make-marker)) (setq endm (make-marker))) ;do \C-j here
(defun reverse-all ()
  (set-marker begm start)
  (set-marker endm (+ start (1- len)))
  (while (> endm begm)
    (progn (transpose-regions begm (1+ begm) endm (1+ endm) t)
           (set-marker begm (1+ begm))
           (set-marker endm (1- endm))))) ;do \C-j here
;case1
(progn (setq start 64) (setq len 4) (reverse-all)) ;do \C-j here
;case2
(progn (setq start 94) (setq len 4) (reverse-all)) ;do \C-j here
;case3
(progn (setq start 124) (setq len 6) (reverse-all)) ;do \C-j here
; Using variables instead of markers works
(progn (defvar begv nil) (defvar endv nil))
(defun reverse-all2 ()
  (setq begv start)
  (setq endv (+ start (1- len)))
  (while (> endv begv)
    (progn (transpose-regions begv (1+ begv) endv (1+ endv) t)
           (setq begv (1+ begv))
           (setq endv (1- endv)))))
;case1
(progn (setq start 64) (setq len 4) (reverse-all2))
;case2
(progn (setq start 94) (setq len 4) (reverse-all2))
;case3
(progn (setq start 124) (setq len 6) (reverse-all2))
; end of attached file
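The attached file depends on exact buffer positions (64, 94, 124). For
anyone who prefers a recipe that does not, the following is a minimal
sketch of the same two loops run inside a temporary buffer. The names
my-reverse-with-markers and my-reverse-with-integers are hypothetical
and were not part of the report; the sketch assumes a multibyte
temporary buffer, which is the default.

```emacs-lisp
;; Hypothetical helpers for illustration; not part of the original report.

(defun my-reverse-with-markers (string)
  "Reverse STRING in a temporary buffer, tracking both ends with markers.
Mirrors `reverse-all' from the report, but computes the positions itself."
  (with-temp-buffer
    (insert string)
    (let ((beg (copy-marker (point-min)))
          (end (copy-marker (max (point-min) (1- (point-max))))))
      (while (> end beg)
        ;; Swap the single characters at BEG and END.  The final `t'
        ;; is the LEAVE-MARKERS argument this report is about.
        (transpose-regions beg (1+ beg) end (1+ end) t)
        (set-marker beg (1+ beg))
        (set-marker end (1- end)))
      (buffer-string))))

(defun my-reverse-with-integers (string)
  "Like `my-reverse-with-markers', but track the ends with plain integers.
Mirrors `reverse-all2' from the report."
  (with-temp-buffer
    (insert string)
    (let ((beg (point-min))
          (end (max (point-min) (1- (point-max)))))
      (while (> end beg)
        (transpose-regions beg (1+ beg) end (1+ end) t)
        (setq beg (1+ beg)
              end (1- end)))
      (buffer-string))))
```

Calling both functions on the same multibyte string, for example
"÷bcd", and comparing the two results should show whether a given
Emacs build is affected: the report above says the integer variant
gives the expected reversal while the marker variant does not.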
In GNU Emacs 23.1.1 (i586-suse-linux-gnu, GTK+ Version 2.18.1)
of 2009-10-24 on build16
Windowing system distributor `The X.Org Foundation', version 11.0.10605000
configured using `configure '--with-pop' '--without-hesiod'
'--with-kerberos' '--with-kerberos5' '--with-xim' '--prefix=/usr'
'--mandir=/usr/share/man' '--infodir=/usr/share/info'
'--datadir=/usr/share' '--localstatedir=/var'
'--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--with-x'
'--with-sound' '--with-sync-input' '--with-xpm' '--with-jpeg'
'--with-tiff' '--with-gif' '--with-png' '--with-rsvg' '--with-dbus'
'--without-gpm' '--with-x-toolkit=gtk' '--x-includes=/usr/include'
'--x-libraries=/usr/lib:/usr/share/X11' '--with-xft' '--with-libotf'
'--with-m17n-flt' '--build=i586-suse-linux'
'build_alias=i586-suse-linux' 'CC=gcc' 'CFLAGS=-fomit-frame-pointer
-fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
-funwind-tables -fasynchronous-unwind-tables -g -D_GNU_SOURCE
-std=gnu89 -pipe -Wno-pointer-sign -Wno-unused-variable
-Wno-unused-label -Wno-unprototyped-calls
-DSYSTEM_PURESIZE_EXTRA=55000 -DSITELOAD_PURESIZE_EXTRA=10000 '
'LDFLAGS=-Wl,-O2 -Wl,--hash-size=65521''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: @im=local
locale-coding-system: utf-8-unix
default-enable-multibyte-characters: t
Major mode: Lisp Interaction
Minor modes in effect:
show-paren-mode: t
tooltip-mode: t
tool-bar-mode: t
mouse-wheel-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
global-auto-composition-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
C-x 1 <down-mouse-1> <mouse-1> C-j <down-mouse-1> <mouse-1>
C-j <down-mouse-1> <mouse-1> C-j <down-mouse-1> <mouse-1>
C-j <down-mouse-1> <mouse-1> C-j <down> <down> <down>
<down> <down> <down-mouse-1> <mouse-1> C-j C-x C-s
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
<help-echo> <help-echo> <menu-bar> <help-menu> <se
nd-emacs-bug-report>
Recent messages:
Loading /usr/share/emacs/site-lisp/nxml-mode/rng-auto.el (source)...done
For information about GNU Emacs and the GNU system, type C-h C-a.
Invalid image size (see `max-image-size') [9 times]
Saving file /home/schochet/try/files/reverse-out.el...
Wrote /home/schochet/try/files/reverse-out.el
From: Andrew Hyatt @ 2016-07-17 4:13 UTC
To: schochet; +Cc: 5131
Sorry for the late reply. I can reproduce the problem of unexpected
characters when transposing using markers in Emacs 25.
From: npostavs @ 2016-07-17 4:46 UTC
To: Andrew Hyatt; +Cc: schochet, 5131
Andrew Hyatt <ahyatt@gmail.com> writes:
> Sorry for the late reply. I can reproduce the problem of unexpected
> characters when transposing using markers in Emacs 25.
>
> schochet@post.tau.ac.il writes:
>
>> ;-*- mode: lisp-interaction; coding: utf-8-unix -*-
>>
>> ; case 1: abcd was: abcd
>> ; case 2: ÷bcd was: ÷bcd
>> ; case 3: ÷ab"äé was: ÷ab"äé
>>
>> (progn (defvar start nil) (defvar len nil)) ;do \C-j here
>>
>> ; Using markers to move multi-byte characters may cause problems
>>
>> (progn (setq begm (make-marker)) (setq endm (make-marker))) ;do \C-j here
>>
>> (defun reverse-all ()
>> (set-marker begm start)
>> (set-marker endm (+ start (1- len)))
>> (while (> endm begm)
>> (progn (transpose-regions begm (1+ begm) endm (1+ endm) t)
>> (set-marker begm (1+ begm))
>> (set-marker endm (1- endm))))) ;do \C-j here
>>
>> ;case1
>> (progn (setq start 64) (setq len 4) (reverse-all)) ;do \C-j here
>>
>> ;case2
>> (progn (setq start 94) (setq len 4) (reverse-all)) ;do \C-j here
>>
>> ;case3
>> (progn (setq start 124) (setq len 6) (reverse-all)) ;do \C-j here
With the latest emacs-25 branch, after evaluating up to case3 here, I
get an abort. Here is the backtrace:
(gdb) bt
#0 0x00007ffff1218d59 in raise () from /usr/lib/libpthread.so.0
#1 0x00000000005738c4 in terminate_due_to_signal (sig=6, backtrace_limit=2147483647) at emacs.c:381
#2 0x0000000000600d84 in die (msg=0x6f4140 "IT_BYTEPOS (*it) == CHAR_TO_BYTE (IT_CHARPOS (*it))", file=0x6f1ff0 "xdisp.c",
    line=7442) at alloc.c:7223
#3 0x0000000000452c1c in set_iterator_to_next (it=0x7fffffff90f0, reseat_p=true) at xdisp.c:7442
#4 0x00000000004832b4 in display_line (it=0x7fffffff90f0) at xdisp.c:20997
#5 0x00000000004793af in try_window_id (w=0x13fc690) at xdisp.c:18413
#6 0x000000000046fd44 in redisplay_window (window=20956821, just_this_one_p=true) at xdisp.c:16573
#7 0x0000000000467ad2 in redisplay_window_1 (window=20956821) at xdisp.c:14454
#8 0x0000000000621077 in internal_condition_case_1 (bfun=0x467a90 <redisplay_window_1>, arg=20956821, handlers=14478067,
    hfun=0x467a0a <redisplay_window_error>) at eval.c:1333
#9 0x0000000000466cbc in redisplay_internal () at xdisp.c:14079
#10 0x00000000004640c2 in redisplay () at xdisp.c:13214
#11 0x000000000057b647 in read_char (commandflag=1, map=17541507, prev_event=0, used_mouse_menu=0x7fffffffe42f, end_time=0x0)
    at keyboard.c:2477
#12 0x000000000058b90f in read_key_sequence (keybuf=0x7fffffffe5e0, bufsize=30, prompt=0, dont_downcase_last=false,
    can_return_switch_frame=true, fix_current_buffer=true, prevent_redisplay=false) at keyboard.c:9063
#13 0x000000000057854d in command_loop_1 () at keyboard.c:1365
#14 0x0000000000620fdd in internal_condition_case (bfun=0x57810b <command_loop_1>, handlers=19056, hfun=0x577779 <cmd_error>)
    at eval.c:1309
#15 0x0000000000577d38 in command_loop_2 (ignore=0) at keyboard.c:1107
#16 0x000000000062056f in internal_catch (tag=45840, func=0x577d0f <command_loop_2>, arg=0) at eval.c:1074
#17 0x0000000000577cda in command_loop () at keyboard.c:1086
#18 0x0000000000577251 in recursive_edit_1 () at keyboard.c:692
#19 0x000000000057745d in Frecursive_edit () at keyboard.c:763
#20 0x00000000005751f5 in main (argc=3, argv=0x7fffffffea78) at emacs.c:1606
Lisp Backtrace:
"redisplay_internal (C function)" (0x0)
In GNU Emacs 25.0.95.21 (x86_64-unknown-linux-gnu, X toolkit)
of 2016-07-11 built on zony
Repository revision: d1300340cbd44abe79ef71a57ae1488479f76b0d
Windowing system distributor 'The X.Org Foundation', version 11.0.11803000
Configured using:
'configure --cache-file=../debug-config.cache 'CFLAGS=-O0 -g3
-march=native' --enable-checking MAKEINFO=makeinfo-4.13a
--with-x-toolkit=lucid --without-toolkit-scroll-bars --with-gif=no
--with-jpeg=no'
From: Eli Zaretskii @ 2016-07-19 16:05 UTC
To: npostavs; +Cc: ahyatt, schochet, 5131-done
> From: npostavs@users.sourceforge.net
> Date: Sun, 17 Jul 2016 00:46:34 -0400
> Cc: schochet@post.tau.ac.il, 5131@debbugs.gnu.org
>
> Andrew Hyatt <ahyatt@gmail.com> writes:
>
> > Sorry for the late reply. I can reproduce the problem of unexpected
> > characters when transposing using markers in Emacs 25.
> >
> > schochet@post.tau.ac.il writes:
> >
> >> ;-*- mode: lisp-interaction; coding: utf-8-unix -*-
> >>
> >> ; case 1: abcd was: abcd
> >> ; case 2: ÷bcd was: ÷bcd
> >> ; case 3: ÷ab"äé was: ÷ab"äé
> >>
> >> (progn (defvar start nil) (defvar len nil)) ;do \C-j here
> >>
> >> ; Using markers to move multi-byte characters may cause problems
> >>
> >> (progn (setq begm (make-marker)) (setq endm (make-marker))) ;do \C-j here
> >>
> >> (defun reverse-all ()
> >> (set-marker begm start)
> >> (set-marker endm (+ start (1- len)))
> >> (while (> endm begm)
> >> (progn (transpose-regions begm (1+ begm) endm (1+ endm) t)
> >> (set-marker begm (1+ begm))
> >> (set-marker endm (1- endm))))) ;do \C-j here
> >>
> >> ;case1
> >> (progn (setq start 64) (setq len 4) (reverse-all)) ;do \C-j here
> >>
> >> ;case2
> >> (progn (setq start 94) (setq len 4) (reverse-all)) ;do \C-j here
> >>
> >> ;case3
> >> (progn (setq start 124) (setq len 6) (reverse-all)) ;do \C-j here
>
> With the latest emacs-25 branch after evaluating up to case3 here, I get
> an abort, here is the backtrace:
>
> (gdb) bt
> #0 0x00007ffff1218d59 in raise () from /usr/lib/libpthread.so.0
> #1 0x00000000005738c4 in terminate_due_to_signal (sig=6, backtrace_limit=2147483647) at emacs.c:381
> #2 0x0000000000600d84 in die (msg=0x6f4140 "IT_BYTEPOS (*it) == CHAR_TO_BYTE (IT_CHARPOS (*it))", file=0x6f1ff0 "xdisp.c",
> line=7442) at alloc.c:7223
> #3 0x0000000000452c1c in set_iterator_to_next (it=0x7fffffff90f0, reseat_p=true) at xdisp.c:7442
That's because your build was configured with --enable-checking, while
Andrew's probably wasn't. This recipe leaves some markers with invalid
bytepos values, so any code that calls CHAR_TO_BYTE is likely to crash
or trigger assertion violations.
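To make the bytepos remark concrete, here is a small sketch of how
character positions and byte positions diverge once a buffer holds a
multibyte character. It uses the standard function position-bytes;
the helper name my-char-and-byte-positions is hypothetical, and the
sketch assumes a multibyte temporary buffer (the default).

```emacs-lisp
;; Hypothetical helper for illustration; not taken from the Emacs sources.
(defun my-char-and-byte-positions (string)
  "Return a list of (CHARPOS . BYTEPOS) pairs, one per character of STRING."
  (with-temp-buffer
    (insert string)
    (let (pairs)
      (dolist (pos (number-sequence (point-min) (1- (point-max))))
        (push (cons pos (position-bytes pos)) pairs))
      (nreverse pairs))))

;; "÷" occupies two bytes in the buffer, so everything after it is shifted:
(my-char-and-byte-positions "÷bcd") ; => ((1 . 1) (2 . 3) (3 . 4) (4 . 5))
(my-char-and-byte-positions "d÷bc") ; => ((1 . 1) (2 . 2) (3 . 4) (4 . 5))
```

A marker at character position 2 corresponds to byte position 3 in
"÷bcd" but to byte position 2 in "d÷bc". If the text changes from the
first form to the second and the marker keeps its old byte position,
its charpos and bytepos no longer agree, which is the kind of
inconsistency the assertion in the backtrace checks for.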
It feels strange to fix bugs that were introduced 18 years ago; I
guess almost no one invokes transpose-regions with last argument
non-nil.
Fixed on the master branch. I'm closing the bug; feel free to reopen
if there are any leftovers.
Thanks.