* bug#66825: last-coding-system-used in basic-save-buffer
@ 2023-10-29 18:40 Juri Linkov
2023-10-29 19:27 ` Eli Zaretskii
0 siblings, 1 reply; 7+ messages in thread
From: Juri Linkov @ 2023-10-29 18:40 UTC (permalink / raw)
To: 66825
[Creating a separate bug report from bug#66317 since
sometimes the bug occurs even when project-mode-line is nil]
>> Ok, here is 100% reproducible minimal test case:
>> 0. emacs-30 -Q
>> 1. eval:
>> (progn
>> (require 'project)
>> (setq project-mode-line t)
>> (setq set-message-functions '(set-multi-message)))
>> 2. in a temporary directory: M-! git init RET
>> 3. C-x C-f .dir-locals.el RET
>> 4. insert: ((fundamental-mode . ((mode . flyspell))))
>> 5. C-x C-s
>> At this point even buffer-file-coding-system of .dir-locals.el
>> is changed to 't' (raw-text-unix). The same happens when saving
>> any file in that project.
>> The problem doesn't occur when flyspell-mode is enabled from
>> file-local variables, only from dir-locals.
>
> I can repro. But it's as weird a bug as they come.
>
> I guess it's a combination of using flyspell-mode and editing
> .dir-locals.el? Or have you seen other buffers' b-f-c-s changed this way
> too?
>
> If not, it might have something to do with flyspell-mode's use of
> sit-for?.. But I'm only saying that because it's the only feature of this
> mode that I'm regularly reminded of.
>
> I tried using a variable watcher:
>
> (add-variable-watcher
> 'buffer-file-coding-system
> (lambda (_sym value op where)
> (message "%s %s %s" value op where)
> (if (eq 'raw-text-unix value) (backtrace))
> ))
>
> but it just prints
>
> prefer-utf-8-unix set *temp*-925453 [2 times]
> raw-text-unix set .dir-locals.el
> backtrace()
> (if (eq 'raw-text-unix value) (backtrace))
> (closure (t) (_sym value op where) (message "%s %s %s" value op where)
> (if (eq 'raw-text-unix value) (backtrace)))(buffer-file-coding-system
> raw-text-unix set #<buffer .dir-locals.el>)
> basic-save-buffer(t)
> save-buffer(1)
> funcall-interactively(save-buffer 1)
> call-interactively(save-buffer nil nil)
> command-execute(save-buffer)
>
> OTOH, the bug is very reliable to reproduce: add the aforementioned line to
> dir-locals and save -> the coding system changes to raw-text. Delete the
> line and save -> and it's prefer-utf-8 again.
Indeed, the problem is in basic-save-buffer on the following line:
(setq buffer-file-coding-system last-coding-system-used)
It's hard to guess why this code relies on the value that
can be changed by other functions during saving the buffer.
For example,
(progn
(setq last-coding-system-used 'prefer-utf-8-unix)
(project-name (project-current))
(message "%S" last-coding-system-used))
prints "raw-text-unix" because it enables 'flyspell-mode'
that calls:
(defun ispell-buffer-local-parsing ()
(ispell-send-string "!\n")
where 'process-send-string' changes 'last-coding-system-used'
to "raw-text-unix" in:
send_process (Lisp_Object proc, const char *buf, ptrdiff_t len,
Lisp_Object object)
{
...
Vlast_coding_system_used = CODING_ID_NAME (coding->id);
A possible workaround would be to protect the value of
last-coding-system-used in 'project-mode-line-format':
(defun project-mode-line-format ()
"Compose the project mode-line."
(when-let ((project (project-current)))
(let (last-coding-system-used)
(concat
" "
(propertize
(project-name project)
'face project-mode-line-face
'mouse-face 'mode-line-highlight
'help-echo "mouse-1: Project menu"
'local-map project-mode-line-map)))))
However, I noticed that occasionally this bug occurs even
when this function is not used. So the proper fix needed
in 'basic-save-buffer', but I don't know if it's intended
that some function should change 'last-coding-system-used'
during saving the buffer.
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#66825: last-coding-system-used in basic-save-buffer
2023-10-29 18:40 bug#66825: last-coding-system-used in basic-save-buffer Juri Linkov
@ 2023-10-29 19:27 ` Eli Zaretskii
2023-10-30 7:56 ` Juri Linkov
0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2023-10-29 19:27 UTC (permalink / raw)
To: Juri Linkov; +Cc: 66825
> From: Juri Linkov <juri@linkov.net>
> Date: Sun, 29 Oct 2023 20:40:32 +0200
>
> Indeed, the problem is in basic-save-buffer on the following line:
>
> (setq buffer-file-coding-system last-coding-system-used)
>
> It's hard to guess why this code relies on the value that
> can be changed by other functions during saving the buffer.
Because this is the protocol: functions that determine the encoding of
buffer text dynamically set this variable, so it could later be used
to reflect the detection in buffer-file-coding-system.
> For example,
>
> (progn
> (setq last-coding-system-used 'prefer-utf-8-unix)
> (project-name (project-current))
> (message "%S" last-coding-system-used))
>
> prints "raw-text-unix" because it enables 'flyspell-mode'
> that calls:
>
> (defun ispell-buffer-local-parsing ()
> (ispell-send-string "!\n")
>
> where 'process-send-string' changes 'last-coding-system-used'
> to "raw-text-unix" in:
>
> send_process (Lisp_Object proc, const char *buf, ptrdiff_t len,
> Lisp_Object object)
> {
> ...
> Vlast_coding_system_used = CODING_ID_NAME (coding->id);
>
> A possible workaround would be to protect the value of
> last-coding-system-used in 'project-mode-line-format':
That's not a workaround, that's what we should do in such cases.
Alternatively, the "important" setting of last-coding-system-used
should be done later, after the inner functions already returned.
> However, I noticed that occasionally this bug occurs even
> when this function is not used. So the proper fix needed
> in 'basic-save-buffer', but I don't know if it's intended
> that some function should change 'last-coding-system-used'
> during saving the buffer.
Yes, it's intended: saving the buffer can change
buffer-file-coding-system if it detects characters which cannot be
encoded by the original buffer-file-coding-system.
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#66825: last-coding-system-used in basic-save-buffer
2023-10-29 19:27 ` Eli Zaretskii
@ 2023-10-30 7:56 ` Juri Linkov
2023-10-30 12:15 ` Eli Zaretskii
0 siblings, 1 reply; 7+ messages in thread
From: Juri Linkov @ 2023-10-30 7:56 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 66825
>> Indeed, the problem is in basic-save-buffer on the following line:
>>
>> (setq buffer-file-coding-system last-coding-system-used)
>>
>> It's hard to guess why this code relies on the value that
>> can be changed by other functions during saving the buffer.
>
> Because this is the protocol: functions that determine the encoding of
> buffer text dynamically set this variable, so it could later be used
> to reflect the detection in buffer-file-coding-system.
Still it seems there is some flaw in the design when some
functions change 'last-coding-system-used' intentionally,
but some can change it unintentionally that causes this bug.
>> A possible workaround would be to protect the value of
>> last-coding-system-used in 'project-mode-line-format':
>
> That's not a workaround, that's what we should do in such cases.
> Alternatively, the "important" setting of last-coding-system-used
> should be done later, after the inner functions already returned.
I don't understand this alternative. The mode line updating
that uses 'project-mode-line-format' that unintentionally
changes 'last-coding-system-used' is called from this line
in 'basic-save-buffer-2':
(write-region nil nil
buffer-file-name nil t buffer-file-truename)
because this call in 'write_region' updates the mode line:
message_with_string ((NUMBERP (append)
? "Updated %s"
: ! NILP (append)
? "Added to %s"
: "Wrote %s"),
visit_file, 1);
>> However, I noticed that occasionally this bug occurs even
>> when this function is not used. So the proper fix needed
>> in 'basic-save-buffer', but I don't know if it's intended
>> that some function should change 'last-coding-system-used'
>> during saving the buffer.
>
> Yes, it's intended: saving the buffer can change
> buffer-file-coding-system if it detects characters which cannot be
> encoded by the original buffer-file-coding-system.
I see this detection happens in the same 'write_region':
/* Decide the coding-system to encode the data with.
We used to make this choice before calling build_annotations, but that
leads to problems when a write-annotate-function takes care of
unsavable chars (as was the case with X-Symbol). */
Vlast_coding_system_used
= choose_write_coding_system (start, end, filename,
append, visit, lockname, &coding);
This means the value 'last-coding-system-used' should be preserved
afterwards. Therefore I don't see how this could be fixed in
'basic-save-buffer' and 'write-region'.
Ok, then let's fix this in 'project-mode-line-format':
diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
index bb44cfefa54..da89036160a 100644
--- a/lisp/progmodes/project.el
+++ b/lisp/progmodes/project.el
@@ -2074,14 +2121,17 @@ project-mode-line-format
(defun project-mode-line-format ()
"Compose the project mode-line."
(when-let ((project (project-current)))
- (concat
- " "
- (propertize
- (project-name project)
- 'face project-mode-line-face
- 'mouse-face 'mode-line-highlight
- 'help-echo "mouse-1: Project menu"
- 'local-map project-mode-line-map))))
+ ;; Preserve the original value of 'last-coding-system-used'
+ ;; that can break 'basic-save-buffer' (bug#66825)
+ (let ((last-coding-system-used nil))
+ (concat
+ " "
+ (propertize
+ (project-name project)
+ 'face project-mode-line-face
+ 'mouse-face 'mode-line-highlight
+ 'help-echo "mouse-1: Project menu"
+ 'local-map project-mode-line-map)))))
(provide 'project)
;;; project.el ends here
^ permalink raw reply related [flat|nested] 7+ messages in thread
* bug#66825: last-coding-system-used in basic-save-buffer
2023-10-30 7:56 ` Juri Linkov
@ 2023-10-30 12:15 ` Eli Zaretskii
2023-10-30 17:20 ` Juri Linkov
0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2023-10-30 12:15 UTC (permalink / raw)
To: Juri Linkov; +Cc: 66825
> From: Juri Linkov <juri@linkov.net>
> Cc: 66825@debbugs.gnu.org
> Date: Mon, 30 Oct 2023 09:56:27 +0200
>
> I don't understand this alternative. The mode line updating
> that uses 'project-mode-line-format' that unintentionally
> changes 'last-coding-system-used' is called from this line
> in 'basic-save-buffer-2':
>
> (write-region nil nil
> buffer-file-name nil t buffer-file-truename)
>
> because this call in 'write_region' updates the mode line:
>
> message_with_string ((NUMBERP (append)
> ? "Updated %s"
> : ! NILP (append)
> ? "Added to %s"
> : "Wrote %s"),
> visit_file, 1);
How does message_with_string update the mode line?
And why does last-coding-system-used get set to raw-text-unix in this
scenario anyway?
> + ;; Preserve the original value of 'last-coding-system-used'
> + ;; that can break 'basic-save-buffer' (bug#66825)
> + (let ((last-coding-system-used nil))
> + (concat
> + " "
> + (propertize
> + (project-name project)
> + 'face project-mode-line-face
> + 'mouse-face 'mode-line-highlight
> + 'help-echo "mouse-1: Project menu"
> + 'local-map project-mode-line-map)))))
I'm confused how this avoids the problem, probably because I don't
understand the answers to the two questions above.
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#66825: last-coding-system-used in basic-save-buffer
2023-10-30 12:15 ` Eli Zaretskii
@ 2023-10-30 17:20 ` Juri Linkov
2023-10-30 17:45 ` Eli Zaretskii
0 siblings, 1 reply; 7+ messages in thread
From: Juri Linkov @ 2023-10-30 17:20 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 66825
>> I don't understand this alternative. The mode line updating
>> that uses 'project-mode-line-format' that unintentionally
>> changes 'last-coding-system-used' is called from this line
>> in 'basic-save-buffer-2':
>>
>> (write-region nil nil
>> buffer-file-name nil t buffer-file-truename)
>>
>> because this call in 'write_region' updates the mode line:
>>
>> message_with_string ((NUMBERP (append)
>> ? "Updated %s"
>> : ! NILP (append)
>> ? "Added to %s"
>> : "Wrote %s"),
>> visit_file, 1);
>
> How does message_with_string update the mode line?
I'm not know, some deeper function needs to update the mode line
when the multi-line message resizes the echo area.
> And why does last-coding-system-used get set to raw-text-unix in this
> scenario anyway?
Because send_process needs to set it to raw-text-unix for ispell:
send_process (Lisp_Object proc, const char *buf, ptrdiff_t len, Lisp_Object object)
{
Vlast_coding_system_used = CODING_ID_NAME (coding->id);
Here is a backtrace:
send_process
process-send-string
ispell-send-string
ispell-buffer-local-parsing
ispell-accept-buffer-local-defs
flyspell-accept-buffer-local-defs
flyspell--mode-on
flyspell-mode
hack-one-local-variable(mode flyspell)
hack-local-variables-apply
hack-dir-local-variables-non-file-buffer
project--value-in-dir
project-name
project-mode-line-format
eval((project-mode-line-format))
write-region
basic-save-buffer-2
basic-save-buffer-1
basic-save-buffer
save-buffer
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#66825: last-coding-system-used in basic-save-buffer
2023-10-30 17:20 ` Juri Linkov
@ 2023-10-30 17:45 ` Eli Zaretskii
2023-10-31 7:23 ` Juri Linkov
0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2023-10-30 17:45 UTC (permalink / raw)
To: Juri Linkov; +Cc: 66825
> From: Juri Linkov <juri@linkov.net>
> Cc: 66825@debbugs.gnu.org
> Date: Mon, 30 Oct 2023 19:20:19 +0200
>
> >> message_with_string ((NUMBERP (append)
> >> ? "Updated %s"
> >> : ! NILP (append)
> >> ? "Added to %s"
> >> : "Wrote %s"),
> >> visit_file, 1);
> >
> > How does message_with_string update the mode line?
>
> I'm not know, some deeper function needs to update the mode line
> when the multi-line message resizes the echo area.
I think it's because message_with_string eventually calls redisplay,
and redisplay updates the mode line as part of its job.
> > And why does last-coding-system-used get set to raw-text-unix in this
> > scenario anyway?
>
> Because send_process needs to set it to raw-text-unix for ispell:
>
> send_process (Lisp_Object proc, const char *buf, ptrdiff_t len, Lisp_Object object)
> {
> Vlast_coding_system_used = CODING_ID_NAME (coding->id);
I think this happens because the string sent to the speller is a
plain-ASCII string, and those are almost always unibyte strings.
So I think the local binding of last-coding-system-used around the
call to project-mode-line is TRT, it just needs a better comment to
explain why it's needed.
I think this should also teach us a lesson: calling arbitrary complex
code from mode-line's :eval forms is in general risky business and
should be avoided as much as possible.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-10-31 7:23 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-29 18:40 bug#66825: last-coding-system-used in basic-save-buffer Juri Linkov
2023-10-29 19:27 ` Eli Zaretskii
2023-10-30 7:56 ` Juri Linkov
2023-10-30 12:15 ` Eli Zaretskii
2023-10-30 17:20 ` Juri Linkov
2023-10-30 17:45 ` Eli Zaretskii
2023-10-31 7:23 ` Juri Linkov
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).