unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#48043: UTF-8 magic comment is unwelcome with recent Ruby versions
@ 2021-04-26 18:28 Peter Oliver
  2021-04-26 21:04 ` Dmitry Gutov
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Oliver @ 2021-04-26 18:28 UTC (permalink / raw)
  To: 48043

[-- Attachment #1: Type: text/plain, Size: 758 bytes --]

When saving a ruby-mode buffer, if the content is not plain ASCII, then the default behaviour is to add a specially-formatted comment that indicates the encoding to the Ruby interpreter.  E.g.,

# coding: utf-8

However, since Ruby 2.0, released in 2013, the default source encoding for Ruby has been UTF-8.  Consequently, users of other editors tend to omit this comment from UTF-8 files.  When you edit such a file with Emacs, the comment is added on save and you end up with a messy diff.

Two patches are attached to address this:

- The first patch adds a new choice to ruby-insert-encoding-magic-comment, unless-utf8, which causes the magic comment not to be inserted if the encoding is UTF-8.

- The second patch, perhaps more controversially, makes this the default.
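
As a rough illustration of the first patch from a user's point of view, the snippet below shows how the proposed choice might be selected in an init file.  It is only a sketch: the value `unless-utf8' is not a recognised choice unless the first attached patch is applied.

```elisp
;; Sketch only: `unless-utf8' is the choice proposed in the first patch;
;; it is not a recognised value of this option without that patch.
(setq ruby-insert-encoding-magic-comment 'unless-utf8)
```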

-- 
Peter Oliver

[-- Attachment #2: Type: text/plain, Size: 5736 bytes --]

From c753f7216b3acedb57117e7e86ae139d9d9a9b98 Mon Sep 17 00:00:00 2001
From: Peter Oliver <git@mavit.org.uk>
Date: Mon, 26 Apr 2021 17:17:28 +0100
Subject: [PATCH 1/2] New choice for ruby-insert-encoding-magic-comment,
 unless-utf8

With this setting, when saving, a comment describing the file encoding will
not be added if it is UTF-8.

UTF-8 is the default encoding for Ruby 2.0 and newer.
---
 etc/NEWS                               |  4 ++
 lisp/progmodes/ruby-mode.el            | 15 ++++++-
 test/lisp/progmodes/ruby-mode-tests.el | 54 ++++++++++++++++++++++++++
 3 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 9bf232ac02..8c1325cb09 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -501,6 +501,10 @@ documented.
 SMIE is now always enabled and 'ruby-use-smie' only controls whether
 indentation is done using SMIE or with the old ad-hoc code.
 
+*** 'ruby-insert-encoding-magic-comment' has a new choice, 'unless-utf8.
+With this setting, when saving, a comment describing the file encoding will
+not be added if it is UTF-8.
+
 ** Icomplete
 
 +++
diff --git a/lisp/progmodes/ruby-mode.el b/lisp/progmodes/ruby-mode.el
index 84ac8fdb28..822e1e1c14 100644
--- a/lisp/progmodes/ruby-mode.el
+++ b/lisp/progmodes/ruby-mode.el
@@ -304,9 +304,17 @@ ruby-insert-encoding-magic-comment
 The encoding will be auto-detected.  The format of the encoding comment
 is customizable via `ruby-encoding-magic-comment-style'.
 
+When set to `unless-utf8', a comment will always be added unless
+the encoding is ASCII or UTF-8.
+
 When set to `always-utf8' an utf-8 comment will always be added,
 even if it's not required."
-  :type 'boolean :group 'ruby)
+  :type '(choice
+          (const :tag "On" t)
+          (const :tag "On, always UTF-8" always-utf8)
+          (const :tag "On unless UTF-8" unless-utf8)
+          (const :tag "Off" nil))
+  :group 'ruby)
 
 (defcustom ruby-encoding-magic-comment-style 'ruby
   "The style of the magic encoding comment to use."
@@ -789,7 +797,10 @@ ruby-mode-set-encoding
       (when (ruby--encoding-comment-required-p)
         (goto-char (point-min))
         (let ((coding-system (ruby--detect-encoding)))
-          (when coding-system
+          (when (and coding-system
+                     (if (eq ruby-insert-encoding-magic-comment 'unless-utf8)
+                         (not (string= coding-system "utf-8"))
+                       t))
             (if (looking-at "^#!") (beginning-of-line 2))
             (cond ((looking-at "\\s *#.*\\(en\\)?coding\\s *:\\s *\\([-a-z0-9_]*\\)")
                    ;; update existing encoding comment if necessary
diff --git a/test/lisp/progmodes/ruby-mode-tests.el b/test/lisp/progmodes/ruby-mode-tests.el
index 42a011c8bc..ead8a99eb4 100644
--- a/test/lisp/progmodes/ruby-mode-tests.el
+++ b/test/lisp/progmodes/ruby-mode-tests.el
@@ -32,6 +32,12 @@ ruby-with-temp-buffer
      (ruby-mode)
      ,@body))
 
+(defmacro ruby-with-temp-file (contents &rest body)
+  `(ruby-with-temp-buffer ,contents
+     (set-visited-file-name "ruby-mode-tests")
+     ,@body
+     (delete-file buffer-file-name)))
+
 (defun ruby-should-indent (content column)
   "Assert indentation COLUMN on the last line of CONTENT."
   (ruby-with-temp-buffer content
@@ -844,6 +850,54 @@ ruby--insert-coding-comment-custom-style
       (ruby--insert-coding-comment "utf-8")
       (should (string= "# encoding: utf-8\n\n" (buffer-string))))))
 
+(ert-deftest ruby--set-encoding-when-ascii ()
+  (ruby-with-temp-file "ascii"
+    (let ((ruby-encoding-magic-comment-style 'ruby)
+          (ruby-insert-encoding-magic-comment t))
+      (setq save-buffer-coding-system 'us-ascii)
+      (ruby-mode-set-encoding)
+      (should (string= "ascii" (buffer-string))))))
+
+(ert-deftest ruby--set-encoding-always-utf8-when-ascii ()
+  (ruby-with-temp-file "ascii"
+    (let ((ruby-encoding-magic-comment-style 'ruby)
+          (ruby-insert-encoding-magic-comment 'always-utf8))
+      (setq save-buffer-coding-system 'us-ascii)
+      (ruby-mode-set-encoding)
+      (should (string= "# coding: utf-8\nascii" (buffer-string))))))
+
+(ert-deftest ruby--set-encoding-when-utf8 ()
+  (ruby-with-temp-file "💎"
+    (let ((ruby-encoding-magic-comment-style 'ruby)
+          (ruby-insert-encoding-magic-comment t))
+      (setq save-buffer-coding-system 'utf-8)
+      (ruby-mode-set-encoding)
+      (should (string= "# coding: utf-8\n💎" (buffer-string))))))
+
+(ert-deftest ruby--set-encoding-off ()
+  (ruby-with-temp-file "💎"
+    (let ((ruby-encoding-magic-comment-style 'ruby)
+          (ruby-insert-encoding-magic-comment nil))
+      (setq save-buffer-coding-system 'utf-8)
+      (ruby-mode-set-encoding)
+      (should (string= "💎" (buffer-string))))))
+
+(ert-deftest ruby--set-encoding-unless-utf8-when-utf8 ()
+  (ruby-with-temp-file "💎"
+    (let ((ruby-encoding-magic-comment-style 'ruby)
+          (ruby-insert-encoding-magic-comment 'unless-utf8))
+      (setq save-buffer-coding-system 'utf-8)
+      (ruby-mode-set-encoding)
+      (should (string= "💎" (buffer-string))))))
+
+(ert-deftest ruby--set-encoding-unless-utf8-when-latin-15 ()
+  (ruby-with-temp-file "Ⓡ"
+    (let ((ruby-encoding-magic-comment-style 'ruby)
+          (ruby-insert-encoding-magic-comment 'unless-utf8))
+      (setq save-buffer-coding-system 'iso-8859-15)
+      (ruby-mode-set-encoding)
+      (should (string= "# coding: iso-8859-15\nⓇ" (buffer-string))))))
+
 (ert-deftest ruby--indent/converted-from-manual-test ()
   :tags '(:expensive-test)
   ;; Converted from manual test.
-- 
2.26.3


[-- Attachment #3: Type: text/plain, Size: 2156 bytes --]

From 0c8032dae23d07d5086bd3fe8bb7cd33ac654bd6 Mon Sep 17 00:00:00 2001
From: Peter Oliver <git@mavit.org.uk>
Date: Mon, 26 Apr 2021 17:24:59 +0100
Subject: [PATCH 2/2] Default ruby-insert-encoding-magic-comment to
 'unless-utf8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Most users don’t require a magic comment if their files are encoded
using UTF-8, since that has been the default since Ruby 2.0.
---
 etc/NEWS                    | 4 ++++
 lisp/progmodes/ruby-mode.el | 5 +++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 8c1325cb09..961a3a76b1 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -505,6 +505,10 @@ indentation is done using SMIE or with the old ad-hoc code.
 With this setting, when saving, a comment describing the file encoding will
 not be added if it is UTF-8.
 
+*** 'ruby-insert-encoding-magic-comment' defaults to 'unless-utf8.
+Most users don’t require a magic comment if their files are encoded
+using UTF-8, since that has been the default since Ruby 2.0.
+
 ** Icomplete
 
 +++
diff --git a/lisp/progmodes/ruby-mode.el b/lisp/progmodes/ruby-mode.el
index 822e1e1c14..3b04d3d83e 100644
--- a/lisp/progmodes/ruby-mode.el
+++ b/lisp/progmodes/ruby-mode.el
@@ -299,7 +299,7 @@ ruby-encoding-map
 explicitly declared in magic comment."
   :type '(repeat (cons (symbol :tag "From") (symbol :tag "To"))))
 
-(defcustom ruby-insert-encoding-magic-comment t
+(defcustom ruby-insert-encoding-magic-comment 'unless-utf8
   "Insert a magic Ruby encoding comment upon save if this is non-nil.
 The encoding will be auto-detected.  The format of the encoding comment
 is customizable via `ruby-encoding-magic-comment-style'.
@@ -314,7 +314,8 @@ ruby-insert-encoding-magic-comment
           (const :tag "On, always UTF-8" always-utf8)
           (const :tag "On unless UTF-8" unless-utf8)
           (const :tag "Off" nil))
-  :group 'ruby)
+  :group 'ruby
+  :version "28.1")
 
 (defcustom ruby-encoding-magic-comment-style 'ruby
   "The style of the magic encoding comment to use."
-- 
2.26.3



* bug#48043: UTF-8 magic comment is unwelcome with recent Ruby versions
  2021-04-26 18:28 bug#48043: UTF-8 magic comment is unwelcome with recent Ruby versions Peter Oliver
@ 2021-04-26 21:04 ` Dmitry Gutov
  2021-04-27 15:29   ` Peter Oliver
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Gutov @ 2021-04-26 21:04 UTC (permalink / raw)
  To: Peter Oliver, 48043

Hi!

On 26.04.2021 21:28, Peter Oliver wrote:
> When saving a ruby-mode buffer, if the content is not plain ASCII, then 
> the default behaviour is to add a specially-formatted comment that 
> indicates the encoding to the Ruby interpreter.  E.g.,
> 
> # coding: utf-8
> 
> However, since Ruby 2.0 released in 2013, the default encoding for Ruby 
> has been UTF-8.  Consequently, users of other editors tend not to 
> include this comment when using UTF-8.  When you edit such a file with 
> Emacs, you end up with a messy diff.
> 
> Two patches are attached to address this:
> 
> - The first patch adds a new choice to 
> ruby-insert-encoding-magic-comment, unless-utf8, which causes the magic 
> comment not to be inserted if the encoding is UTF-8.
> 
> - The second patch, perhaps more controversially, makes this the default.

Both changes make sense to me.

However, I looked at the existing code and found an earlier change that was 
already meant to make this more customizable, but it had a minor bug.

Please try out the following patch:

diff --git a/lisp/progmodes/ruby-mode.el b/lisp/progmodes/ruby-mode.el
index 84ac8fdb28..35772827ce 100644
--- a/lisp/progmodes/ruby-mode.el
+++ b/lisp/progmodes/ruby-mode.el
@@ -291,6 +291,7 @@ ruby-deep-indent-paren-style

  (defcustom ruby-encoding-map
    '((us-ascii       . nil)       ;; Do not put coding: us-ascii
+    (utf-8          . nil)       ;; Default since Ruby 2.0
      (shift-jis      . cp932)     ;; Emacs charset name of Shift_JIS
      (shift_jis      . cp932)     ;; MIME charset name of Shift_JIS
      (japanese-cp932 . cp932))    ;; Emacs charset name of CP932
@@ -760,7 +761,7 @@ ruby--insert-coding-comment

  (defun ruby--detect-encoding ()
    (if (eq ruby-insert-encoding-magic-comment 'always-utf8)
-      "utf-8"
+      'utf-8
      (let ((coding-system
             (or save-buffer-coding-system
                 buffer-file-coding-system)))
@@ -769,12 +770,11 @@ ruby--detect-encoding
                  (or (coding-system-get coding-system 'mime-charset)
                      (coding-system-change-eol-conversion coding-system nil))))
        (if coding-system
-          (symbol-name
-           (if ruby-use-encoding-map
-               (let ((elt (assq coding-system ruby-encoding-map)))
-                 (if elt (cdr elt) coding-system))
-             coding-system))
-        "ascii-8bit"))))
+          (if ruby-use-encoding-map
+              (let ((elt (assq coding-system ruby-encoding-map)))
+                (if elt (cdr elt) coding-system))
+            coding-system)
+        'ascii-8bit))))

  (defun ruby--encoding-comment-required-p ()
    (or (eq ruby-insert-encoding-magic-comment 'always-utf8)
@@ -796,7 +796,7 @@ ruby-mode-set-encoding
                     (unless (string= (match-string 2) coding-system)
                       (goto-char (match-beginning 2))
                       (delete-region (point) (match-end 2))
-                     (insert coding-system)))
+                     (insert (symbol-name coding-system))))
                    ((looking-at "\\s *#.*coding\\s *[:=]"))
                    (t (when ruby-insert-encoding-magic-comment
                         (ruby--insert-coding-comment coding-system))))
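
One plausible reading of why the patch above stops converting the result to a string: a map entry whose value is nil would otherwise become the string "nil", which is non-nil and so passes a `when' guard.  The sketch below illustrates this with a trimmed-down alist; the `my/' names are hypothetical stand-ins and are not part of ruby-mode.

```elisp
;; Hypothetical, shortened copy of the alist shape used in the patch.
(defvar my/encoding-map
  '((us-ascii  . nil)
    (utf-8     . nil)
    (shift-jis . cp932)))

(defun my/lookup-as-string (coding-system)
  "Old shape: map CODING-SYSTEM, then convert the result to a string."
  (let ((elt (assq coding-system my/encoding-map)))
    (symbol-name (if elt (cdr elt) coding-system))))

(defun my/lookup-as-symbol (coding-system)
  "New shape: map CODING-SYSTEM and return the symbol (or nil) unchanged."
  (let ((elt (assq coding-system my/encoding-map)))
    (if elt (cdr elt) coding-system)))

(my/lookup-as-string 'utf-8)      ; the string "nil", which is truthy
(my/lookup-as-symbol 'utf-8)      ; nil, so a `when' guard skips the comment
(my/lookup-as-symbol 'shift-jis)  ; cp932
```

Returning the symbol and calling `symbol-name' only at the point of insertion keeps the nil entries meaningful.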








* bug#48043: UTF-8 magic comment is unwelcome with recent Ruby versions
  2021-04-26 21:04 ` Dmitry Gutov
@ 2021-04-27 15:29   ` Peter Oliver
  2021-04-28  2:23     ` Dmitry Gutov
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Oliver @ 2021-04-27 15:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 48043

[-- Attachment #1: Type: text/plain, Size: 810 bytes --]

On Tue, 27 Apr 2021, Dmitry Gutov wrote:

> On 26.04.2021 21:28, Peter Oliver wrote:
>
>>  Two patches are attached to address this:
>>
>>  - The first patch adds a new choice to ruby-insert-encoding-magic-comment,
>>  unless-utf8, which causes the magic comment not to be inserted if the
>>  encoding is UTF-8.
>>
>>  - The second patch, perhaps more controversially, makes this the default.
>
> Both changes make sense to me.
>
> However, I've looked at the existing code and found a prior change which 
> intended for this to be more customizable already, yet had a minor bug.
>
> Please try out the following patch:

That works for me, and I think is more straightforward than my approach.  Thanks.

Attached is an additional patch that adapts the tests from my patch to work with yours.

-- 
Peter Oliver

[-- Attachment #2: Type: text/plain, Size: 2500 bytes --]

From 675c08cee899444f33113b806d6709b569c44790 Mon Sep 17 00:00:00 2001
From: Peter Oliver <git@mavit.org.uk>
Date: Tue, 27 Apr 2021 16:24:58 +0100
Subject: [PATCH] Test ruby-mode-set-encoding with a few different encodings
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Follows on from Dmitry Gutov’s patch in
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=48043#8.
---
 test/lisp/progmodes/ruby-mode-tests.el | 30 ++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/test/lisp/progmodes/ruby-mode-tests.el b/test/lisp/progmodes/ruby-mode-tests.el
index 42a011c8bc..fec7d86a95 100644
--- a/test/lisp/progmodes/ruby-mode-tests.el
+++ b/test/lisp/progmodes/ruby-mode-tests.el
@@ -32,6 +32,12 @@ ruby-with-temp-buffer
      (ruby-mode)
      ,@body))
 
+(defmacro ruby-with-temp-file (contents &rest body)
+  `(ruby-with-temp-buffer ,contents
+     (set-visited-file-name "ruby-mode-tests")
+     ,@body
+     (delete-file buffer-file-name)))
+
 (defun ruby-should-indent (content column)
   "Assert indentation COLUMN on the last line of CONTENT."
   (ruby-with-temp-buffer content
@@ -844,6 +850,30 @@ ruby--insert-coding-comment-custom-style
       (ruby--insert-coding-comment "utf-8")
       (should (string= "# encoding: utf-8\n\n" (buffer-string))))))
 
+(ert-deftest ruby--set-encoding-when-ascii ()
+  (ruby-with-temp-file "ascii"
+    (let ((ruby-encoding-magic-comment-style 'ruby)
+          (ruby-insert-encoding-magic-comment t))
+      (setq save-buffer-coding-system 'us-ascii)
+      (ruby-mode-set-encoding)
+      (should (string= "ascii" (buffer-string))))))
+
+(ert-deftest ruby--set-encoding-when-utf8 ()
+  (ruby-with-temp-file "💎"
+    (let ((ruby-encoding-magic-comment-style 'ruby)
+          (ruby-insert-encoding-magic-comment t))
+      (setq save-buffer-coding-system 'utf-8)
+      (ruby-mode-set-encoding)
+      (should (string= "💎" (buffer-string))))))
+
+(ert-deftest ruby--set-encoding-when-latin-15 ()
+  (ruby-with-temp-file "Ⓡ"
+    (let ((ruby-encoding-magic-comment-style 'ruby)
+          (ruby-insert-encoding-magic-comment t))
+      (setq save-buffer-coding-system 'iso-8859-15)
+      (ruby-mode-set-encoding)
+      (should (string= "# coding: iso-8859-15\nⓇ" (buffer-string))))))
+
 (ert-deftest ruby--indent/converted-from-manual-test ()
   :tags '(:expensive-test)
   ;; Converted from manual test.
-- 
2.26.3



* bug#48043: UTF-8 magic comment is unwelcome with recent Ruby versions
  2021-04-27 15:29   ` Peter Oliver
@ 2021-04-28  2:23     ` Dmitry Gutov
  2021-04-28 11:59       ` Peter Oliver
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Gutov @ 2021-04-28  2:23 UTC (permalink / raw)
  To: Peter Oliver; +Cc: 48043-done

Version: 28.1

On 27.04.2021 18:29, Peter Oliver wrote:

> That works for me, and I think is more straightforward than my 
> approach.  Thanks.
> 
> Attached is an additional patch which adapts the tests added in my patch 
> for your patch.

Thanks! I've pushed the change and the tests to master.
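
For readers who still want the comment written for UTF-8 files after this change, a minimal sketch follows.  It assumes the pushed change matches the patch earlier in this thread, i.e. that `ruby-encoding-map' now contains a (utf-8 . nil) entry; removing that entry is one possible way to opt out, not a documented recommendation.

```elisp
;; Sketch: drop the (utf-8 . nil) entry so that UTF-8 is no longer mapped
;; to "no comment".  `copy-alist' is used because `assq-delete-all'
;; modifies its argument destructively.
(with-eval-after-load 'ruby-mode
  (setq ruby-encoding-map
        (assq-delete-all 'utf-8 (copy-alist ruby-encoding-map))))

;; Alternative taken from the docstring quoted in this thread: always add
;; a utf-8 comment, even when it is not required.
;; (setq ruby-insert-encoding-magic-comment 'always-utf8)
```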

Please note that since (AFAICT) you don't have FSF copyright assignment 
on file this exhausts the allowed limit for code contributions to Emacs.

Would you like us to send you the assignment form, so that the next 
patch could be accepted without reservation?






* bug#48043: UTF-8 magic comment is unwelcome with recent Ruby versions
  2021-04-28  2:23     ` Dmitry Gutov
@ 2021-04-28 11:59       ` Peter Oliver
  2021-04-28 12:28         ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Oliver @ 2021-04-28 11:59 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 48043-done

On Wed, 28 Apr 2021, Dmitry Gutov wrote:

> Please note that since (AFAICT) you don't have FSF copyright assignment on 
> file this exhausts the allowed limit for code contributions to Emacs.
>
> Would you like us to send you the assignment form, so that the next patch 
> could be accepted without reservation?

Yes please.

-- 
Peter Oliver






* bug#48043: UTF-8 magic comment is unwelcome with recent Ruby versions
  2021-04-28 11:59       ` Peter Oliver
@ 2021-04-28 12:28         ` Eli Zaretskii
  0 siblings, 0 replies; 6+ messages in thread
From: Eli Zaretskii @ 2021-04-28 12:28 UTC (permalink / raw)
  To: Peter Oliver; +Cc: 48043-done, dgutov

> Date: Wed, 28 Apr 2021 12:59:32 +0100 (BST)
> From: Peter Oliver <p.d.oliver@mavit.org.uk>
> Cc: 48043-done@debbugs.gnu.org
> 
> > Would you like us to send you the assignment form, so that the next patch 
> > could be accepted without reservation?
> 
> Yes please.

Thanks, form sent off-list.




