From mboxrd@z Thu Jan 1 00:00:00 1970
From: schochet@post.tau.ac.il
Newsgroups: gmane.emacs.bugs
Subject: bug#5131: Subject: 23.1; interaction of transpose-regions with markers and multibyte chars
Date: Sun, 06 Dec 2009 04:22:06 +0200
Message-ID: <20091206042206.10974kro2g12qlhq@webmail.tau.ac.il>
Reply-To: schochet@post.tau.ac.il, 5131@emacsbugs.donarmstrong.com
To: bug-gnu-emacs@gnu.org

Repeated use of the function transpose-regions on regions defined by
markers sometimes yields unexpected results when those regions contain
multibyte characters. In some cases the text obtained after running
transpose-regions even includes characters that were not present
before.

The function reverse-all given below is designed to reverse the order
of the characters in a specified region. However, I obtain the
following results:

input region:  abcd
output region: dcba (as expected)

input region:  ÷bcd
output region: d÷bc
expected:      dcb÷

input region:  ÷ab"äé
output region: contains a CJK ideograph
expected:      éä"ba÷

To reproduce this bug, simply copy to a file the text below, beginning
with the line starting with a semicolon, visit it in Emacs, and
evaluate the indicated Lisp expressions by entering \C-j at the end of
the indicated lines. Note that the Lisp expressions set markers to
specific locations, so the file should begin precisely where
indicated. The first character after the space after the text
"case 1:" should be at position 64 in the file. If for some reason it
is not, the values given to the variable start should be adjusted.
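
Because the test cases depend on absolute buffer positions, it may be
easier to compute each value of start than to count characters by
hand. The following is only a sketch: the function name my-case-start
is hypothetical and does not appear in the report. It returns the
position of the first non-blank character after a given label,
searching from the beginning of the current buffer.

```emacs-lisp
;; Hypothetical helper, not part of the original report.
(defun my-case-start (label)
  "Return the position of the first non-blank character after LABEL.
Search from the start of the current buffer; return nil if LABEL
does not occur."
  (save-excursion
    (goto-char (point-min))
    (when (search-forward label nil t)
      ;; Skip the spaces or tabs that separate the label from the text.
      (skip-chars-forward " \t")
      (point))))
```

Evaluating (my-case-start "case 2:") in the test file, for example,
should give the value to use for start in case 2, provided the label
first occurs on the comment line for that case.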

The file below also contains an alternative function reverse-all2,
which differs from reverse-all only in using variables instead of
markers. The function reverse-all2 yields the expected results in all
the above cases.

This bug does not depend on my .emacs file, since I have reproduced it
with a blank .emacs file.

Please let me know if you need any more information.

Steve Schochet

;-*- mode: lisp-interaction; coding: utf-8-unix -*-
; case 1: abcd was: abcd
; case 2: ÷bcd was: ÷bcd
; case 3: ÷ab"äé was: ÷ab"äé
(progn (defvar start nil) (defvar len nil)) ;do \C-j here
; Using markers to move multi-byte characters may cause problems
(progn (setq begm (make-marker)) (setq endm (make-marker))) ;do \C-j here
(defun reverse-all ()
  (set-marker begm start)
  (set-marker endm (+ start (1- len)))
  (while (> endm begm)
    (progn
      (transpose-regions begm (1+ begm) endm (1+ endm) t)
      (set-marker begm (1+ begm))
      (set-marker endm (1- endm))))) ;do \C-j here
;case1
(progn (setq start 64) (setq len 4) (reverse-all)) ;do \C-j here
;case2
(progn (setq start 94) (setq len 4) (reverse-all)) ;do \C-j here
;case3
(progn (setq start 124) (setq len 6) (reverse-all)) ;do \C-j here
; Using variables instead of markers works
(progn (defvar begv nil) (defvar endv nil))
(defun reverse-all2 ()
  (setq begv start)
  (setq endv (+ start (1- len)))
  (while (> endv begv)
    (progn
      (transpose-regions begv (1+ begv) endv (1+ endv) t)
      (setq begv (1+ begv))
      (setq endv (1- endv)))))
;case1
(progn (setq start 64) (setq len 4) (reverse-all2))
;case2
(progn (setq start 94) (setq len 4) (reverse-all2))
;case3
(progn (setq start 124) (setq len 6) (reverse-all2))
; end of attached file

In GNU Emacs 23.1.1 (i586-suse-linux-gnu, GTK+ Version 2.18.1)
 of 2009-10-24 on build16
Windowing system distributor `The X.Org Foundation', version 11.0.10605000
configured using `configure '--with-pop' '--without-hesiod'
  '--with-kerberos' '--with-kerberos5' '--with-xim' '--prefix=/usr'
  '--mandir=/usr/share/man' '--infodir=/usr/share/info'
  '--datadir=/usr/share' '--localstatedir=/var'
  '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--with-x'
  '--with-sound' '--with-sync-input' '--with-xpm' '--with-jpeg'
  '--with-tiff' '--with-gif' '--with-png' '--with-rsvg' '--with-dbus'
  '--without-gpm' '--with-x-toolkit=gtk' '--x-includes=/usr/include'
  '--x-libraries=/usr/lib:/usr/share/X11' '--with-xft' '--with-libotf'
  '--with-m17n-flt' '--build=i586-suse-linux'
  'build_alias=i586-suse-linux' 'CC=gcc' 'CFLAGS=-fomit-frame-pointer
  -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
  -funwind-tables -fasynchronous-unwind-tables -g -D_GNU_SOURCE
  -std=gnu89 -pipe -Wno-pointer-sign -Wno-unused-variable
  -Wno-unused-label -Wno-unprototyped-calls
  -DSYSTEM_PURESIZE_EXTRA=55000 -DSITELOAD_PURESIZE_EXTRA=10000 '
  'LDFLAGS=-Wl,-O2 -Wl,--hash-size=65521''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: @im=local
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  show-paren-mode: t
  tooltip-mode: t
  tool-bar-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
C-x 1 C-j C-j C-j C-j C-j C-j C-x C-s

Recent messages:
Loading /usr/share/emacs/site-lisp/nxml-mode/rng-auto.el (source)...done
For information about GNU Emacs and the GNU system, type C-h C-a.
Invalid image size (see `max-image-size') [9 times]
Saving file /home/schochet/try/files/reverse-out.el...
Wrote /home/schochet/try/files/reverse-out.el
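
The recipe above depends on fixed positions in a saved file. As a
possible alternative, the same marker-based loop can be run in a
temporary buffer, so that no position has to be counted by hand. This
is a minimal sketch and not part of the original report: the function
name my-reverse-with-markers is hypothetical, and the markers are
local rather than the global begm and endm. As in reverse-all, the
fifth argument to transpose-regions (LEAVE-MARKERS) is t.

```emacs-lisp
;; Hypothetical helper, not part of the original report.
(defun my-reverse-with-markers (string)
  "Return STRING with its characters reversed via `transpose-regions'.
Positions are tracked with markers, as in `reverse-all'."
  (with-temp-buffer
    (set-buffer-multibyte t)
    (insert string)
    (let ((begm (copy-marker (point-min)))
          (endm (copy-marker (1- (point-max)))))
      ;; Swap the outermost pair of characters, then move both
      ;; markers one character toward the middle.
      (while (> endm begm)
        (transpose-regions begm (1+ begm) endm (1+ endm) t)
        (set-marker begm (1+ begm))
        (set-marker endm (1- endm)))
      ;; Detach the markers before the temporary buffer goes away.
      (set-marker begm nil)
      (set-marker endm nil)
      (buffer-string))))
```

Evaluating (my-reverse-with-markers "abcd") should return "dcba".
Calling it with the strings from cases 2 and 3 would show whether a
given Emacs build still produces the unexpected output described in
this report.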