From: Milan Stanojević <mstanojevic@janestreet.com>
Newsgroups: gmane.emacs.bugs
Subject: bug#31837: 26.1; replace-buffer-contents doesn't work if buffer has multibyte characters
Date: Thu, 14 Jun 2018 17:34:27 -0400
Message-ID: <7dm4li5jbmk.fsf@janestreet.com>
To: 31837@debbugs.gnu.org

Here is a small recipe that illustrates the bug.

recipe.el
---------
;; Any command-line argument switches on the multibyte case.
(setq use-multibyte (< 0 (length argv)))
(switch-to-buffer "file1")
(when use-multibyte
  (insert-char (char-from-name "SMILE")))
(insert "1234")
(switch-to-buffer "file2")
(when use-multibyte
  (insert-char (char-from-name "SMILE")))
(insert "5678")
;; Replace the contents of "file2" with those of "file1" and print the result.
(replace-buffer-contents "file1")
(princ (buffer-substring-no-properties (point-min) (point-max)))
(princ "\n")
-------------

Running the recipe as a script:

$ emacs -Q --script /tmp/recipe.el
1234
$ emacs -Q --script /tmp/recipe.el multibyte
⌣5234

In the first run, with only ASCII characters, everything works as
expected. In the second run, with multibyte characters, the function
didn't replace '5' with '1' as expected.

I looked at the code, and there appears to be an obvious bug in the
function buffer_chars_equal in editfns.c. It calls
BUF_FETCH_CHAR_AS_MULTIBYTE passing *character* positions, but the
macro expects *byte* positions. (It would be nice if char and byte
positions could be distinguished with types, but I'm not sure that is
possible in C.) The simple fix is to replace
BUF_FETCH_CHAR_AS_MULTIBYTE with STRING_CHAR (BUF_CHAR_ADDRESS (buf,
pos)), and this seems to work.

I'm not sure about the performance of the above fix, though, because
accessing an arbitrary character position in a buffer is not a
constant-time operation. If the diffing algorithm accesses buffer
positions in a more or less localized manner, it may make sense to
move point inside buffer_chars_equal so that the conversion from char
position to byte position is fast. It probably doesn't matter for
small files.

Emacs info:

In GNU Emacs 26.1 (build 4, x86_64-pc-linux-gnu, X toolkit, Xaw scroll bars)
Windowing system distributor 'The X.Org Foundation', version 11.0.11905000
System Description: Linux

Configured using:
 'configure --with-x-toolkit=lucid --without-gpm --without-gconf
 --without-selinux --without-imagemagick --with-modules --with-gif=no'
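
To make the character-position vs. byte-position mismatch described above concrete, here is a minimal standalone sketch in C. It is not Emacs code: the names utf8_len, char_to_byte, fetch_char and sample are hypothetical, the text is modelled as plain UTF-8 rather than the Emacs internal representation, and positions are 0-based, unlike Emacs buffer positions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not Emacs internals: a small UTF-8 model of
   the difference between character positions and byte positions.
   Positions are 0-based here (an assumption of this sketch).  */

/* UTF-8 bytes of the buffer contents "⌣5678"; U+2323 SMILE takes
   three bytes.  The literal is split so that "\xA3" does not absorb
   the following digits into the hex escape.  */
static const unsigned char sample[] = "\xE2\x8C\xA3" "5678";

/* Number of bytes in the UTF-8 sequence that starts with LEAD.  */
static size_t
utf8_len (unsigned char lead)
{
  if (lead < 0x80) return 1;
  if ((lead & 0xE0) == 0xC0) return 2;
  if ((lead & 0xF0) == 0xE0) return 3;
  return 4;
}

/* Convert a character index to a byte offset by scanning from the
   start of TEXT.  The scan is linear in CHARPOS, which is why access
   by character position is not constant time.  */
static size_t
char_to_byte (const unsigned char *text, size_t charpos)
{
  size_t bytepos = 0;
  while (charpos-- > 0)
    {
      assert (text[bytepos] != '\0');  /* do not run past the end */
      bytepos += utf8_len (text[bytepos]);
    }
  return bytepos;
}

/* Decode the code point whose first byte is at BYTEPOS.  BYTEPOS must
   be a byte offset that starts a character, not a character index.  */
static unsigned
fetch_char (const unsigned char *text, size_t bytepos)
{
  unsigned char b = text[bytepos];
  size_t n = utf8_len (b);
  unsigned c = n == 1 ? b : b & (0xFFu >> (n + 1));
  for (size_t i = 1; i < n; i++)
    c = (c << 6) | (text[bytepos + i] & 0x3F);
  return c;
}
```

In this model, char_to_byte (sample, 1) yields byte offset 3, which is where '5' is stored. Passing the character index 1 directly as a byte offset instead lands on 0x8C, a continuation byte in the middle of the SMILE character, so a comparison made at that offset is not looking at '5' at all.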