From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Sat, 27 Mar 2021 16:54:14 +0300
Message-ID: <83pmzkog6x.fsf@gnu.org>
References: <871rc0u3v4.fsf@gnu.org>
In-Reply-To: <871rc0u3v4.fsf@gnu.org> (message from handa on Sat, 27 Mar 2021 22:23:59 +0900)
To: handa
Cc: gregory@heytings.org, 46933@debbugs.gnu.org

> From: handa
> Cc: gregory@heytings.org, 46933@debbugs.gnu.org
> Date: Sat, 27 Mar 2021 22:23:59 +0900
>
> How about something like this method:
> 1. Encode the buffer text one line at a time until we get a longer byte
>    sequence than BYTE.
> 2. Delete the result of encoding the last line above.
> 3. Provided that the above last line has chars C1 C2 ... Cn,
>    encode characters C1...Cn, C1...Cn-1, C1...Cn-2 until we get a shorter
>    byte sequence than BYTE.
>
> The first step may be optimized by encoding multiple lines instead of
> a single line.

Even if we do optimize, this would be very slow, I think.  And what if
the buffer has no newlines?

In any case, the problem is not with encoding, the problem is with
decoding.  Encoding doesn't have this problem because we always encode
more than enough (we use the value of BYTE as the count of
_characters_ to encode, so for ISO-2022 encoding it is usually much
more than needed).  By contrast, when decoding, we decode exactly
BYTE+1 bytes, which then hits the problem if that offset is inside a
shift sequence.
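
To make the decoding issue concrete, here is a small Python sketch.  It
is only an illustration, not the Emacs code: it assumes Python's
built-in "iso2022_jp" codec as a stand-in for an ISO-2022 coding
system, and the helper name chars_in_byte_prefix is made up for this
example.  It counts how many characters can be fully decoded from a
byte prefix, and shows that a prefix ending inside an escape (shift)
sequence or inside a two-byte character contributes no new character.

```python
import codecs


def chars_in_byte_prefix(data: bytes, nbytes: int,
                         encoding: str = "iso2022_jp") -> int:
    """Count the characters fully decoded from the first NBYTES of DATA.

    Hypothetical helper for illustration only.  An incremental decoder
    is used so that a prefix ending inside an escape sequence or inside
    a multibyte character is buffered instead of raising an error.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    # final=False: trailing incomplete bytes are kept pending, not decoded.
    return len(decoder.decode(data[:nbytes], final=False))


text = "日本語"
encoded = text.encode("iso2022_jp")

# One count per possible byte offset, from 0 up to the full length.
counts = [chars_in_byte_prefix(encoded, n) for n in range(len(encoded) + 1)]
print(encoded)
print(counts)
```

The list of counts stays flat across the bytes of each escape sequence
and advances only once both bytes of a character are available, which
is why a fixed byte offset that lands inside a shift sequence cannot be
mapped to a character position by decoding exactly that many bytes.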