From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#61726: [PATCH] Eglot: Support positionEncoding capability
Date: Thu, 23 Feb 2023 12:39:09 +0200
Message-ID: <83cz60r7hu.fsf@gnu.org>
References: <87a614g628.fsf@gmail.com>
In-Reply-To: <87a614g628.fsf@gmail.com> (message from Augusto Stoffel on Thu, 23 Feb 2023 09:05:35 +0100)
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 61726@debbugs.gnu.org, joaotavora@gmail.com
To: Augusto Stoffel
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

> Cc: João Távora
> From: Augusto Stoffel
> Date: Thu, 23 Feb 2023 09:05:35 +0100
>
> There is a new LSP capability allowing the server and client to agree on
> a way to count character offsets.  What do you think of the attached
> patch?
>
> It expresses Eglot's preferences as counting character offsets, then
> byte offsets, then the UTF-16 nonsense, in that order.
>
> I would also suggest preparing the stage to eventually make
> `eglot-current-column-function' and `eglot-move-to-column-function'
> obsolete.  For that, I suggest renaming
>
>   - eglot-current-column -> eglot--current-column-utf-32
>   - eglot-lsp-abiding-column -> eglot--current-columns-utf-16
>   - eglot-move-to-column -> eglot--move-to-columns-utf-32
>   - eglot-move-to-lsp-abiding-column -> eglot--move-to-columns-utf-16
>
> and then making the old names obsolete aliases of the new names.

Please tell more about this, as I don't think I have a clear enough
idea of the issues and the implications for Emacs.

> +(defun eglot--current-column-utf-8 ()
> +  "Calculate current column, counting bytes."
> +  (- (position-bytes (point)) (position-bytes (line-beginning-position))))

This is subtly incorrect: position-bytes doesn't count UTF-8 bytes, it
counts the bytes in the internal representation Emacs uses for buffer
and string text.  The differences are minor and subtle, but not
negligible.

>  (defun eglot-move-to-column (column)
> -  "Move to COLUMN without closely following the LSP spec."
> +  "Move to COLUMN, counting Unicode codepoints."
>    ;; We cannot use `move-to-column' here, because it moves to *visual*
>    ;; columns, which can be different from LSP columns in case of
>    ;; `whitespace-mode', `prettify-symbols-mode', etc.  (github#296,
> @@ -1490,8 +1505,14 @@ eglot-move-to-column
>      (goto-char (min (+ (line-beginning-position) column)
>                      (line-end-position))))
>
> +(defun eglot--move-to-column-utf-8 (column)
> +  "Move to COLUMN, regarded as a byte offset."
> +  (goto-char (min (byte-to-position
> +                   (+ (position-bytes (line-beginning-position)) column))
> +                  (line-end-position))))
> +
>  (defun eglot-move-to-lsp-abiding-column (column)
> -  "Move to COLUMN abiding by the LSP spec."
> +  "Move to COLUMN, counting UTF-16 code units as in the original LSP spec."
>    (save-restriction
>      (cl-loop
>       with lbp = (line-beginning-position)
> @@ -1515,14 +1536,20 @@ eglot--lsp-position-to-point
>                       (forward-line (min most-positive-fixnum
>                                          (plist-get pos-plist :line)))
>        (unless (eobp) ;; if line was excessive leave point at eob
> -        (let ((tab-width 1)
> +        (let ((movefn (or eglot-move-to-column-function
> +                          (pcase (plist-get (eglot--capabilities (eglot-current-server))
> +                                            :positionEncoding)
> +                            ("utf-32" #'eglot-move-to-column)
> +                            ("utf-8" #'eglot--move-to-column-utf-8)
> +                            (_ #'eglot-move-to-lsp-abiding-column))))
> +              (tab-width 1)
>                (col (plist-get pos-plist :character)))
>            (unless (wholenump col)
>              (eglot--warn
>               "Caution: LSP server sent invalid character position %s. Using 0 instead."
>               col)
>              (setq col 0))
> -          (funcall eglot-move-to-column-function col)))
> +          (funcall movefn col)))
>        (if marker (copy-marker (point-marker)) (point)))))

What does this stuff do with double-width or zero-width characters?
Emacs takes character-width into consideration when it counts columns,
but it is unclear to me what LSP servers do in those cases.
Likewise with characters that are composed on display.

So I think this mess needs to be carefully and elaborately discussed
before we decide how to implement it correctly.

Thanks.
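
To illustrate the point about position-bytes, here is a minimal sketch
of one way to count real UTF-8 bytes: encode the text explicitly and
measure the result.  The name `my/utf-8-column' is hypothetical and is
not part of Eglot or of the patch; this is only an assumption about how
such a helper could look, not a proposal for the final implementation.

```elisp
;; Sketch only.  `my/utf-8-column' is a hypothetical helper, not part
;; of Eglot or of the patch under discussion.
(defun my/utf-8-column ()
  "Return the number of UTF-8 bytes between line start and point.
The text is encoded explicitly instead of relying on `position-bytes',
which measures the internal representation of buffer text."
  (length (encode-coding-string
           (buffer-substring-no-properties (line-beginning-position) (point))
           'utf-8-unix)))

;; Usage: measure the same position in three different ways.
(with-temp-buffer
  (insert "a\u00e9\u20ac")              ; "a", e-acute, euro sign
  (list :codepoints (- (point) (line-beginning-position))
        :utf-8-bytes (my/utf-8-column)
        :internal-bytes (- (position-bytes (point))
                           (position-bytes (line-beginning-position)))))
```

For ordinary Unicode text the last two counts should agree; they can
diverge for text the internal representation stores differently from
UTF-8, such as raw bytes.  Encoding on every call is slower than
position-bytes, so this trades speed for correctness.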