From: Eli Zaretskii <eliz@gnu.org>
To: Augusto Stoffel <arstoffel@gmail.com>
Cc: 61726@debbugs.gnu.org, joaotavora@gmail.com
Subject: bug#61726: [PATCH] Eglot: Support positionEncoding capability
Date: Thu, 23 Feb 2023 12:39:09 +0200
Message-ID: <83cz60r7hu.fsf@gnu.org>
In-Reply-To: <87a614g628.fsf@gmail.com> (message from Augusto Stoffel on Thu, 23 Feb 2023 09:05:35 +0100)
> Cc: João Távora <joaotavora@gmail.com>
> From: Augusto Stoffel <arstoffel@gmail.com>
> Date: Thu, 23 Feb 2023 09:05:35 +0100
>
> There is a new LSP capability allowing the server and client to agree on
> a way to count character offsets. What do you think of the attached
> patch?
>
> It expresses Eglot's preferences as counting character offsets, then
> byte offsets, then the UTF-16 nonsense, in that order.
>
> I would also suggest preparing the stage to eventually make
> `eglot-current-column-function' and `eglot-move-to-column-function'
> obsolete. For that, I suggest renaming
>
> - eglot-current-column -> eglot--current-column-utf-32
> - eglot-lsp-abiding-column -> eglot--current-columns-utf-16
> - eglot-move-to-column -> eglot--move-to-columns-utf-32
> - eglot-move-to-lsp-abiding-column -> eglot--move-to-columns-utf-16
>
> and then making the old names obsolete aliases of the new names.

Please tell us more about this, as I don't think I have a clear enough
idea of the issues and the implications for Emacs.

> +(defun eglot--current-column-utf-8 ()
> + "Calculate current column, counting bytes."
> + (- (position-bytes (point)) (position-bytes (line-beginning-position))))

This is subtly incorrect: position-bytes doesn't count UTF-8 bytes, it
counts the bytes in the internal representation Emacs uses for buffer
and string text. The differences are minor and subtle, but not
negligible.
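
To illustrate the distinction, here is a minimal sketch (not part of the
patch; `my-current-column-utf-8' is a hypothetical name) that counts
genuine UTF-8 bytes by encoding the text between line beginning and
point, rather than subtracting internal byte positions. It relies on
`encode-coding-region' returning the encoded text as a unibyte string
when its DESTINATION argument is t:

```elisp
;; Hypothetical helper, not from the patch under review.
;; Encodes the text from line beginning to point as UTF-8 and counts
;; the resulting bytes, instead of using `position-bytes', which
;; measures Emacs's internal representation.
(defun my-current-column-utf-8 ()
  "Return the number of UTF-8 bytes between line beginning and point."
  (length (encode-coding-region (line-beginning-position) (point)
                                'utf-8-unix t)))
```

For text made only of valid Unicode characters the two approaches
should agree; they can diverge, for example, for raw bytes, which the
internal representation stores as two-byte sequences.
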
> (defun eglot-move-to-column (column)
> - "Move to COLUMN without closely following the LSP spec."
> + "Move to COLUMN, counting Unicode codepoints."
> ;; We cannot use `move-to-column' here, because it moves to *visual*
> ;; columns, which can be different from LSP columns in case of
> ;; `whitespace-mode', `prettify-symbols-mode', etc. (github#296,
> @@ -1490,8 +1505,14 @@ eglot-move-to-column
> (goto-char (min (+ (line-beginning-position) column)
> (line-end-position))))
>
> +(defun eglot--move-to-column-utf-8 (column)
> + "Move to COLUMN, regarded as a byte offset."
> + (goto-char (min (byte-to-position
> + (+ (position-bytes (line-beginning-position)) column))
> + (line-end-position))))
> +
> (defun eglot-move-to-lsp-abiding-column (column)
> - "Move to COLUMN abiding by the LSP spec."
> + "Move to COLUMN, counting UTF-16 code units as in the original LSP spec."
> (save-restriction
> (cl-loop
> with lbp = (line-beginning-position)
> @@ -1515,14 +1536,20 @@ eglot--lsp-position-to-point
> (forward-line (min most-positive-fixnum
> (plist-get pos-plist :line)))
> (unless (eobp) ;; if line was excessive leave point at eob
> - (let ((tab-width 1)
> + (let ((movefn (or eglot-move-to-column-function
> + (pcase (plist-get (eglot--capabilities (eglot-current-server))
> + :positionEncoding)
> + ("utf-32" #'eglot-move-to-column)
> + ("utf-8" #'eglot--move-to-column-utf-8)
> + (_ #'eglot-move-to-lsp-abiding-column))))
> + (tab-width 1)
> (col (plist-get pos-plist :character)))
> (unless (wholenump col)
> (eglot--warn
> "Caution: LSP server sent invalid character position %s. Using 0 instead."
> col)
> (setq col 0))
> - (funcall eglot-move-to-column-function col)))
> + (funcall movefn col)))
> (if marker (copy-marker (point-marker)) (point)))))

What does this stuff do with double-width or zero-width characters?
Emacs takes character-width into consideration when it counts columns,
but it is unclear to me what LSP servers do in those cases.
Likewise with characters that are composed on display.
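
To make the question concrete, here is a small sketch (the function
name `my-offset-report' is hypothetical, and it says nothing about what
any particular LSP server does) that reports the different ways the
same text can be measured: codepoints, UTF-16 code units, UTF-8 bytes,
and display columns as computed by `string-width':

```elisp
;; Hypothetical helper for illustration only.
(defun my-offset-report (string)
  "Return a plist of the ways STRING's length can be counted."
  (list
   ;; Number of characters (codepoints) in STRING.
   :codepoints (length string)
   ;; `utf-16le' emits no byte-order mark, so bytes / 2 = code units.
   :utf-16-units (/ (length (encode-coding-string string 'utf-16le)) 2)
   ;; Encoded strings are unibyte, so `length' counts bytes.
   :utf-8-bytes (length (encode-coding-string string 'utf-8))
   ;; Columns on display; depends on the character width tables.
   :display-columns (string-width string)))
```

For instance, (my-offset-report "a\u4e2d\U0001F600") yields 3
codepoints, 4 UTF-16 code units and 8 UTF-8 bytes, while the display
column count depends on the width Emacs assigns to each character.
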

So I think this mess needs to be carefully and elaborately discussed
before we decide how to implement it correctly.

Thanks.