From mboxrd@z Thu Jan 1 00:00:00 1970
From: handa
Newsgroups: gmane.emacs.bugs
Subject: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Sun, 04 Apr 2021 01:12:06 +0900
Message-ID: <87im53ny95.fsf@gnu.org>
References: <9cff0f8894f167925251@heytings.org>
In-Reply-To: <83zgyif2aq.fsf@gnu.org> (message from Eli Zaretskii on Thu, 01 Apr 2021 18:32:45 +0300)
Mime-Version: 1.0
Content-Type: text/plain
Cc: handa@gnu.org, gregory@heytings.org, 46933@debbugs.gnu.org
To: Eli Zaretskii

In article <83zgyif2aq.fsf@gnu.org>, Eli Zaretskii writes:

> Leaving the
> :pre-write/:post-read-conversion use case aside, do we
> have some means of find where ISO-2022 shift-in/out sequence begins
> and ends, so that we never try to decode a partial sequence (and
> produce "characters" that are not really in the original buffer)?
> If not, where can I find the description of every kind of such
> sequences, i.e. sequences that modify the decoder state without
> producing any characters?

The official definition is in the standard ISO/IEC 2022, but this
wiki page seems more concise:

    https://en.wikipedia.org/wiki/ISO/IEC_2022

Emacs implements all the control sequences listed in the sections
"Shift functions", "Character set designations", and "Interaction
with other coding systems".

> > By the way, what is the intention of filepos-to-bufferpos? Why that
> > function was introduce?

> The original (and so far the only) use case was an Info manual
> separated into several files, where the tag table at the end of the
> main file specifies offsets in bytes. See the function
> Info-find-node-2 in info.el.

Since filepos-to-bufferpos accepts the optional argument
CODING-SYSTEM, I had assumed that the BYTE argument is:

    a byte position in a file that will be created by encoding the
    current buffer by CODING-SYSTEM

But it seems that the usage in Info-find-node-2 is:

    a byte position in an existing file that may not be created by
    Emacs

These two can differ. The method I described in my previous mail
works only in the former case, and the current implementation of
filepos-to-bufferpos appears to have the same limitation, because it
obtains the byte sequence with encode-coding-region.

For the latter case, perhaps something like the following code would
work.

;; Return the buffer position corresponding to the byte position
;; FILEPOS in FILE provided that FILE is decoded by CODING-SYSTEM.
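(defun temp (file filepos coding-system)
  (with-temp-buffer
    ;; Work on the raw bytes of FILE in a unibyte buffer.
    (set-buffer-multibyte nil)
    (insert-file-contents-literally file)
    ;; FULL is the decoding of the whole file, returned as a string.
    (let ((full (decode-coding-region 1 (point-max) coding-system t))
          partial)
      ;; Decode the first FILEPOS bytes.  If the result is not a
      ;; prefix of FULL, FILEPOS falls inside a multibyte or control
      ;; sequence, so extend it one byte at a time until it is.
      (while (and (setq partial (decode-coding-region 1 (1+ filepos)
                                                      coding-system t))
                  (not (eq (compare-strings full 0 (length partial)
                                            partial 0 (length partial))
                           t)))
        (setq filepos (1+ filepos)))
      (1+ (length partial)))))

If it is too slow, there are a few ways to make it faster.

---
K. Handa
handa@gnu.org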