From: Augusto Stoffel <arstoffel@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 61726@debbugs.gnu.org, joaotavora@gmail.com
Subject: bug#61726: [PATCH] Eglot: Support positionEncoding capability
Date: Fri, 24 Feb 2023 10:15:48 +0100 [thread overview]
Message-ID: <87356vbf0b.fsf@gmail.com> (raw)
In-Reply-To: <83r0ufo3uc.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 24 Feb 2023 10:38:35 +0200")
On Fri, 24 Feb 2023 at 10:38, Eli Zaretskii wrote:
>> From: Augusto Stoffel <arstoffel@gmail.com>
>> Cc: joaotavora@gmail.com, 61726@debbugs.gnu.org
>> Date: Fri, 24 Feb 2023 08:18:30 +0100
>>
>> On Fri, 24 Feb 2023 at 08:43, Eli Zaretskii wrote:
>>
>> > It does? then please humor me by walking me through the code and the
>> > patch to show how that would work after applying the patch.
>>
>> + :general
>> + (list
>> + :positionEncodings ["utf-32" "utf-8" "utf-16"])
>> :experimental eglot--{})))
>
> Is "UTF-32" an LSP thing and terminology? Because I'd prefer a
> different name if we can. At least for our internal nomenclature,
> let's use "codepoint" or "character" instead.
Yes, this is how the LSP spec refers to the 3 offset counting methods.
>> -(defun eglot-current-column () (- (point) (line-beginning-position)))
>> +(defun eglot-current-column ()
>> + "Calculate current column, counting Unicode codepoints."
>> + (- (point) (line-beginning-position)))
>
> Can we please take this opportunity to get rid of the confusing
> "column" terminology? As became evident from this discussion, we are
> not talking columns here, we are talking offsets in characters from
> BOL. So something like "pos" or "linepos" or "line-offset" should be
> better.
>
> João, are you okay with such a sweeping change in all of eglot.el?
I like linepos, if João is fine with not making the absolute minimal
amount of changes to the code.
>> +(defun eglot--current-column-utf-8 ()
>> + "Calculate current column, counting bytes."
>> + (- (position-bytes (point)) (position-bytes (line-beginning-position))))
>
> As discussed, position-bytes is incorrect. You should instead do
> something like
>
> (length (encode-coding-string
> (buffer-substring-no-properties (point)
> (line-beginning-position))
> 'utf-8-unix t))
But it is incorrect only if the buffer contains characters outside of
the Unicode range, right? If that happens, we already lost, because a
few steps later we will serialize the buffer text as JSON to send it to
the server:
(progn
(insert ?x (max-char) ?y)
(json-serialize (buffer-substring-no-properties (pos-bol)
(pos-eol))))
⇒ Debugger entered--Lisp error: (wrong-type-argument utf-8-string-p " (json-serialize (buffer-substring-no-properties (...")
> Also, for 100% reliable results, we should bind
> inhibit-field-text-motion to t when calling line-beginning-position.
We should rather be using pos-bol, no? But how do we keep compatibility
with older Emacsen?
>> +(defun eglot--move-to-column-utf-8 (column)
>> + "Move to COLUMN, regarded as a byte offset."
>> + (goto-char (min (byte-to-position
>> + (+ (position-bytes (line-beginning-position)) column))
>> + (line-end-position))))
>
> Likewise here.
>
>> @@ -1515,14 +1536,20 @@ eglot--lsp-position-to-point
>> (forward-line (min most-positive-fixnum
>> (plist-get pos-plist :line)))
>> (unless (eobp) ;; if line was excessive leave point at eob
>> - (let ((tab-width 1)
>> + (let ((movefn (or eglot-move-to-column-function
>> + (pcase (plist-get (eglot--capabilities (eglot-current-server))
>> + :positionEncoding)
>> + ("utf-32" #'eglot-move-to-column)
>> + ("utf-8" #'eglot--move-to-column-utf-8)
>> + (_ #'eglot-move-to-lsp-abiding-column))))
>> + (tab-width 1)
> ^^^^^^^^^^^
> This last part shouldn't be necessary: we should move by characters,
> not by columns. Why is it necessary?
Maybe João can clarify, but I'm pretty sure this is there to support the
UTF-16 way of counting offsets, so this ideally should move to
eglot-move-to-lsp-abiding-column.
next prev parent reply other threads:[~2023-02-24 9:15 UTC|newest]
Thread overview: 82+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-23 8:05 bug#61726: [PATCH] Eglot: Support positionEncoding capability Augusto Stoffel
2023-02-23 10:39 ` Eli Zaretskii
2023-02-23 11:32 ` João Távora
2023-02-23 12:04 ` Augusto Stoffel
2023-02-23 12:24 ` João Távora
2023-02-23 11:46 ` Augusto Stoffel
2023-02-23 12:54 ` Eli Zaretskii
2023-02-23 13:31 ` Augusto Stoffel
2023-02-23 15:04 ` Eli Zaretskii
2023-02-23 18:52 ` Augusto Stoffel
2023-02-23 19:20 ` Eli Zaretskii
2023-02-23 19:28 ` João Távora
2023-02-23 19:52 ` Augusto Stoffel
2023-02-24 6:43 ` Eli Zaretskii
2023-02-24 7:18 ` Augusto Stoffel
2023-02-24 8:38 ` Eli Zaretskii
2023-02-24 9:15 ` Augusto Stoffel [this message]
2023-02-24 10:20 ` João Távora
2023-02-24 11:01 ` Augusto Stoffel
2023-02-24 11:18 ` João Távora
2023-02-24 11:47 ` Augusto Stoffel
2023-02-24 12:05 ` João Távora
2023-02-24 12:14 ` Augusto Stoffel
2023-02-24 11:38 ` Eli Zaretskii
2023-02-24 11:55 ` João Távora
2023-02-24 11:27 ` Eli Zaretskii
2023-02-24 11:43 ` João Távora
2023-02-24 11:57 ` Eli Zaretskii
2023-02-24 12:09 ` João Távora
2023-02-24 12:18 ` Eli Zaretskii
2023-02-24 12:31 ` Augusto Stoffel
2023-02-24 12:01 ` Augusto Stoffel
2023-02-24 12:16 ` Eli Zaretskii
2023-02-24 12:35 ` Augusto Stoffel
2023-02-24 12:55 ` João Távora
2023-02-24 13:34 ` Eli Zaretskii
2023-02-24 13:45 ` João Távora
2023-02-24 13:51 ` Eli Zaretskii
2023-02-24 14:45 ` Augusto Stoffel
2023-02-24 15:19 ` Eli Zaretskii
2023-02-24 15:52 ` Augusto Stoffel
2023-02-24 16:01 ` Eli Zaretskii
2023-02-24 16:39 ` Augusto Stoffel
2023-02-24 17:07 ` Eli Zaretskii
2023-02-24 18:08 ` Augusto Stoffel
2023-02-24 18:55 ` João Távora
2023-02-25 10:58 ` Eli Zaretskii
2023-03-05 10:26 ` Augusto Stoffel
2023-02-25 10:57 ` Eli Zaretskii
2023-02-25 11:29 ` Augusto Stoffel
2023-02-25 13:47 ` Eli Zaretskii
2023-02-25 14:14 ` Augusto Stoffel
2023-02-25 16:26 ` Eli Zaretskii
2023-02-25 18:10 ` Augusto Stoffel
2023-02-25 22:15 ` João Távora
2023-02-25 22:13 ` João Távora
2023-02-25 22:34 ` Augusto Stoffel
2023-02-25 23:16 ` João Távora
2023-02-25 23:57 ` Augusto Stoffel
2023-02-26 6:03 ` Eli Zaretskii
2023-02-26 10:33 ` João Távora
2023-02-26 13:13 ` João Távora
2023-02-26 13:16 ` Eli Zaretskii
2023-02-26 13:25 ` Eli Zaretskii
2023-02-26 14:17 ` João Távora
2023-02-26 14:50 ` Eli Zaretskii
2023-02-26 15:15 ` João Távora
2023-02-26 15:37 ` Eli Zaretskii
2023-02-27 11:15 ` João Távora
2023-02-26 5:31 ` Eli Zaretskii
2023-02-26 10:38 ` João Távora
2023-02-24 14:54 ` Augusto Stoffel
2023-02-24 15:23 ` Eli Zaretskii
2023-02-24 15:56 ` Augusto Stoffel
2023-02-24 17:02 ` Eli Zaretskii
2023-02-24 16:34 ` João Távora
2023-02-24 17:06 ` Eli Zaretskii
2023-02-23 11:37 ` João Távora
2023-02-23 17:01 ` Felician Nemeth
2023-02-23 17:11 ` João Távora
2023-02-23 18:42 ` Augusto Stoffel
2023-02-27 10:11 ` Felician Nemeth
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87356vbf0b.fsf@gmail.com \
--to=arstoffel@gmail.com \
--cc=61726@debbugs.gnu.org \
--cc=eliz@gnu.org \
--cc=joaotavora@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).