unofficial mirror of emacs-devel@gnu.org 
* Coding systems vietnamese-vscii and vietnamese-tcvn
@ 2023-07-21 11:26 Ulrich Mueller
  2023-07-21 12:34 ` Eli Zaretskii
  0 siblings, 1 reply; 7+ messages in thread
From: Ulrich Mueller @ 2023-07-21 11:26 UTC
  To: emacs-devel

language/vietnamese.el defines coding systems vietnamese-vscii
(with charset vscii) and vietnamese-tcvn (with charset tcvn-5712).
However, mule-conf.el defines the two charsets as aliases [1]:

   (define-charset-alias 'tcvn-5712 'vscii)

Indeed, a file containing bytes 0x00 to 0xff decodes to the same buffer
contents, regardless of whether its coding is specified as vscii or as
tcvn.
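
One way to reproduce this check without a test file is sketched below.
This is only an illustration, not the procedure used in the message
above: the helper name my-vscii-tcvn-decode-same-p is hypothetical, and
the sketch assumes that both coding system names are defined in the
running Emacs. It decodes a unibyte string holding every byte value
once under each name and compares the results.

```elisp
;; Hypothetical helper (not part of Emacs): decode all 256 byte values
;; under both coding system names and compare the decoded strings.
(defun my-vscii-tcvn-decode-same-p ()
  "Return non-nil if bytes 0x00..0xff decode identically as vscii and tcvn."
  (let* ((bytes (apply #'unibyte-string (number-sequence 0 255)))
         (as-vscii (decode-coding-string bytes 'vietnamese-vscii))
         (as-tcvn (decode-coding-string bytes 'vietnamese-tcvn)))
    ;; `string=' ignores text properties, so only the characters count.
    (string= as-vscii as-tcvn)))

(my-vscii-tcvn-decode-same-p)
```

If the observation above holds, the last form should evaluate to
non-nil.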

Wikipedia also seems to say that these are only different names for
the same encoding: https://en.wikipedia.org/wiki/VSCII

Should coding system vietnamese-tcvn be an alias for vietnamese-vscii
(or the other way around)?


[1] I tried to trace the history of this definition, but I gave up at
merge(?) commit 8f924df7df01, neither of whose parents has the line.
Apparently the conversion from CVS isn't perfect.




* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
  2023-07-21 11:26 Coding systems vietnamese-vscii and vietnamese-tcvn Ulrich Mueller
@ 2023-07-21 12:34 ` Eli Zaretskii
  2023-07-21 12:52   ` Ulrich Mueller
  0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2023-07-21 12:34 UTC
  To: Ulrich Mueller; +Cc: emacs-devel

> From: Ulrich Mueller <ulm@gentoo.org>
> Date: Fri, 21 Jul 2023 13:26:02 +0200
> 
> language/vietnamese.el defines coding systems vietnamese-vscii
> (with charset vscii) and vietnamese-tcvn (with charset tcvn-5712).
> However, mule-conf.el defines the two charsets as aliases [1]:
> 
>    (define-charset-alias 'tcvn-5712 'vscii)
> 
> Indeed, a file containing bytes 0x00 to 0xff decodes to the same buffer
> contents, regardless of whether its coding is specified as vscii or as
> tcvn.
> 
> Wikipedia also seems to say that these are only different names for
> the same encoding: https://en.wikipedia.org/wiki/VSCII
> 
> Should coding system vietnamese-tcvn be an alias for vietnamese-vscii
> (or the other way around)?

Why is the fact that we have two separate coding systems a problem in
this case?




* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
  2023-07-21 12:34 ` Eli Zaretskii
@ 2023-07-21 12:52   ` Ulrich Mueller
  2023-07-21 13:06     ` Eli Zaretskii
  0 siblings, 1 reply; 7+ messages in thread
From: Ulrich Mueller @ 2023-07-21 12:52 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

>>>>> On Fri, 21 Jul 2023, Eli Zaretskii wrote:

> Why is the fact that we have two separate coding systems a problem in
> this case?

Presumably it works fine as-is, but is there any benefit of having such
duplication?




* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
  2023-07-21 12:52   ` Ulrich Mueller
@ 2023-07-21 13:06     ` Eli Zaretskii
  2023-07-28 18:03       ` Ulrich Mueller
  0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2023-07-21 13:06 UTC
  To: Ulrich Mueller; +Cc: emacs-devel

> From: Ulrich Mueller <ulm@gentoo.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 21 Jul 2023 14:52:55 +0200
> 
> >>>>> On Fri, 21 Jul 2023, Eli Zaretskii wrote:
> 
> > Why is the fact that we have two separate coding systems a problem in
> > this case?
> 
> Presumably it works fine as-is, but is there any benefit of having such
> duplication?

I cannot see any clear benefits either way.

Maybe we need to ask Vietnamese users.  The comments in
lisp/language/vietnamese.el say that vscii is deprecated.




* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
  2023-07-21 13:06     ` Eli Zaretskii
@ 2023-07-28 18:03       ` Ulrich Mueller
  2023-07-28 18:53         ` Eli Zaretskii
  0 siblings, 1 reply; 7+ messages in thread
From: Ulrich Mueller @ 2023-07-28 18:03 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

>>>>> On Fri, 21 Jul 2023, Eli Zaretskii wrote:

>> From: Ulrich Mueller <ulm@gentoo.org>
>> Cc: emacs-devel@gnu.org
>> Date: Fri, 21 Jul 2023 14:52:55 +0200
>> 
>> >>>>> On Fri, 21 Jul 2023, Eli Zaretskii wrote:
>> 
>> > Why is the fact that we have two separate coding systems a problem in
>> > this case?
>> 
>> Presumably it works fine as-is, but is there any benefit of having such
>> duplication?

> I cannot see any clear benefits either way.

> Maybe we need to ask Vietnamese users.  The comments in
> lisp/language/vietnamese.el say that vscii is deprecated.

I have asked a Vietnamese speaker. Paraphrasing their answer: VSCII is
the encoding described in standard TCVN 5712:1993. The terms can be used
interchangeably; there is no reason to have separate character tables.

So the statement in the Vietnamese language info that "VSCII is
deprecated in favor of TCVN-5712" doesn't appear to be correct; the two
terms are synonyms for the same encoding.

I also looked up the original standard. TCVN 5712:1993 defines two
encodings which it names VN1 (aka VSCII-1) and VN2 (aka VSCII-2).

VSCII-2 (VN2) defines these code points:
- 0x00 to 0x7f are identical to ASCII,
- 0x80 to 0x9f are the C1 controls,
- 0xa0 to 0xff contain 96 non-ASCII characters.

VSCII-1 (VN1) is different from VSCII-2 in that it also replaces code
points 0x01-0x02, 0x04-0x06, 0x11-0x17, and 0x80-0x9f by 44 additional
non-ASCII characters, for a total of 140 non-ASCII characters.
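
The counts quoted above can be re-derived from the listed ranges. The
following is only an arithmetic sketch; the helper name
my-vscii-counts is hypothetical and the ranges are copied from the
description above, not taken from the text of the standard.

```elisp
;; Hypothetical helper: recompute the character counts from the code
;; point ranges given in the message.
(defun my-vscii-counts ()
  "Return (VN1-EXTRA VN2-NON-ASCII VN1-TOTAL) for the ranges in the message."
  (let* ((vn1-extra-ranges '((#x01 . #x02) (#x04 . #x06)
                             (#x11 . #x17) (#x80 . #x9f)))
         ;; Each range is inclusive, hence the 1+.
         (extra (apply #'+
                       (mapcar (lambda (r) (1+ (- (cdr r) (car r))))
                               vn1-extra-ranges)))
         ;; 0xa0..0xff, shared by VN1 and VN2.
         (vn2-non-ascii (1+ (- #xff #xa0))))
    (list extra vn2-non-ascii (+ extra vn2-non-ascii))))

(my-vscii-counts)
;; => (44 96 140)
```

The four ranges contribute 2 + 3 + 7 + 32 = 44 code points, and
44 + 96 = 140.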

There is also an updated standard TCVN 5712:1999 which mentions only one
encoding identical to VSCII-1, i.e. VSCII-2 is no longer part of this
later version of the standard.

So I suggest defining only the coding system vietnamese-vscii and making
vietnamese-tcvn an alias of it. If this is acceptable, I can prepare a
patch for lisp/language/vietnamese.el.




* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
  2023-07-28 18:03       ` Ulrich Mueller
@ 2023-07-28 18:53         ` Eli Zaretskii
  2023-07-29  5:19           ` Ulrich Müller
  0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2023-07-28 18:53 UTC
  To: Ulrich Mueller; +Cc: emacs-devel

> From: Ulrich Mueller <ulm@gentoo.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 28 Jul 2023 20:03:10 +0200
> 
> So I suggest defining only the coding system vietnamese-vscii and making
> vietnamese-tcvn an alias of it. If this is acceptable, I can prepare a
> patch for lisp/language/vietnamese.el.

Please do, and thanks.  But please include in the patch a NEWS entry,
and please make the change backward-compatible, so that files written
in either of the 2 encodings we supported before and stating the
encoding in their file-local variables will still be correctly decoded
after the change.
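
As a rough illustration of what backward compatibility means here: a
file whose first line carries a cookie such as
"-*- coding: vietnamese-tcvn -*-" can only be decoded if that name
still resolves to a coding system. The sketch below is one possible
sanity check, not a test from the Emacs sources; the helper name
my-coding-system-same-base-p is hypothetical, and the second result
depends on the Emacs version being inspected.

```elisp
;; Hypothetical helper: non-nil if A and B are both valid coding
;; systems and resolve to the same base coding system.
(defun my-coding-system-same-base-p (a b)
  "Return non-nil if coding systems A and B share one base coding system."
  (and (coding-system-p a)
       (coding-system-p b)
       (eq (coding-system-base a) (coding-system-base b))))

;; Every name that a file-local variable might mention should still be
;; a valid coding system.
(mapcar #'coding-system-p
        '(vietnamese-vscii vscii vietnamese-tcvn tcvn tcvn-5712))

;; Should be non-nil once vietnamese-tcvn is an alias of
;; vietnamese-vscii; with two separate definitions it would be nil.
(my-coding-system-same-base-p 'vietnamese-tcvn 'vietnamese-vscii)
```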




* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
  2023-07-28 18:53         ` Eli Zaretskii
@ 2023-07-29  5:19           ` Ulrich Müller
  0 siblings, 0 replies; 7+ messages in thread
From: Ulrich Müller @ 2023-07-29  5:19 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

>>>>> On Fri, 28 Jul 2023, Eli Zaretskii wrote:

>> From: Ulrich Mueller <ulm@gentoo.org>
>> Cc: emacs-devel@gnu.org
>> Date: Fri, 28 Jul 2023 20:03:10 +0200
>> 
>> So I suggest defining only the coding system vietnamese-vscii and making
>> vietnamese-tcvn an alias of it. If this is acceptable, I can prepare a
>> patch for lisp/language/vietnamese.el.

> Please do, and thanks.  But please include in the patch a NEWS entry,
> and please make the change backward-compatible, so that files written
> in either of the 2 encodings we supported before and stating the
> encoding in their file-local variables will still be correctly decoded
> after the change.

From 393d39dc4961309cdc7f6e71ceaeab7a4942d868 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ulrich=20M=C3=BCller?= <ulm@gentoo.org>
Date: Fri, 28 Jul 2023 23:04:43 +0200
Subject: [PATCH] Drop duplicate vietnamese-tcvn coding system

* lisp/language/vietnamese.el (vietnamese-vscii): Update docstring.
(vietnamese-tcvn, tcvn, tcvn-5712): Make them aliases of
vietnamese-vscii.
("Vietnamese"): Drop vietnamese-tcvn from coding-system values.
Update docstring.

* etc/NEWS: Announce this change.
---
 etc/NEWS                    |  9 +++++++++
 lisp/language/vietnamese.el | 32 +++++++++++++-------------------
 2 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 39b4a35930a..7b521f3e6fe 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -665,6 +665,15 @@ previous behavior of showing 'U' in the mode line for 'koi8-u':
 
     (coding-system-put 'koi8-u :mnemonic ?U)
 
+---
+** 'vietnamese-tcvn' is now a coding system alias for 'vietnamese-vscii'.
+VSCII-1 and TCVN-5712 are different names for the same character
+encoding.  Therefore, the duplicate coding system definition has been
+dropped in favor of an alias.
+
+The mode-line mnemonic for 'vietnamese-vscii' and its aliases is the
+lowercase letter 'v'.
+
 +++
 ** Infinities and NaNs no longer act as symbols on non-IEEE platforms.
 On old platforms like the VAX that do not support IEEE floating-point,
diff --git a/lisp/language/vietnamese.el b/lisp/language/vietnamese.el
index bd0b3c5ae3e..1589173b207 100644
--- a/lisp/language/vietnamese.el
+++ b/lisp/language/vietnamese.el
@@ -28,8 +28,8 @@
 
 ;;; Commentary:
 
-;; For Vietnamese, the character sets VISCII, VSCII and TCVN-5712 are
-;; supported.
+;; For Vietnamese, the character sets VISCII, VSCII-1 (TCVN-5712),
+;; VIQR and windows-1258 are supported.
 
 ;;; Code:
 
@@ -44,13 +44,16 @@ 'vietnamese-viscii
 (define-coding-system-alias 'viscii 'vietnamese-viscii)
 
 (define-coding-system 'vietnamese-vscii
-  "8-bit encoding for Vietnamese VSCII-1."
+  "8-bit encoding for Vietnamese VSCII-1 (TCVN-5712)."
   :coding-type 'charset
   :mnemonic ?v
   :charset-list '(vscii)
   :suitable-for-file-name t)
 
 (define-coding-system-alias 'vscii 'vietnamese-vscii)
+(define-coding-system-alias 'vietnamese-tcvn 'vietnamese-vscii)
+(define-coding-system-alias 'tcvn 'vietnamese-vscii)
+(define-coding-system-alias 'tcvn-5712 'vietnamese-vscii)
 
 ;; (make-coding-system
 ;;  'vietnamese-vps 4 ?p
@@ -74,7 +77,7 @@ 'viqr
 (set-language-info-alist
  "Vietnamese" '((charset viscii)
 		(coding-system vietnamese-viscii vietnamese-vscii
-			       vietnamese-tcvn vietnamese-viqr windows-1258)
+			       vietnamese-viqr windows-1258)
 		(nonascii-translation . viscii)
 		(coding-priority vietnamese-viscii)
 		(input-method . "vietnamese-viqr")
@@ -83,12 +86,12 @@ 'viqr
 		(sample-text . "Vietnamese (Tiếng Việt)	Chào bạn")
 		(documentation . "\
 For Vietnamese, Emacs uses special charsets internally.
-They can be decoded from and encoded to VISCII, VSCII, TCVN-5712, VIQR
-and windows-1258.  VSCII is deprecated in favor of TCVN-5712.  The
-Current setting gives higher priority to the coding system VISCII than
-TCVN-5712.  If you prefer TCVN-5712, please do: (prefer-coding-system
-'vietnamese-tcvn).  There are two Vietnamese input methods: VIQR and
-Telex, VIQR is the default setting.")))
+They can be decoded from and encoded to VISCII, VSCII-1 (TCVN-5712),
+VIQR and windows-1258.  The current setting gives higher priority
+to the coding system VISCII than VSCII-1.  If you prefer VSCII-1,
+please do: (prefer-coding-system 'vietnamese-vscii).  There are
+two Vietnamese input methods: VIQR and Telex, VIQR is the default
+setting.")))
 
 (define-coding-system 'windows-1258
   "windows-1258 encoding for Vietnamese (MIME: WINDOWS-1258)"
@@ -98,15 +101,6 @@ 'windows-1258
   :mime-charset 'windows-1258)
 (define-coding-system-alias 'cp1258 'windows-1258)
 
-(define-coding-system 'vietnamese-tcvn
-  "8-bit encoding for Vietnamese TCVN-5712"
-  :coding-type 'charset
-  :mnemonic ?t
-  :charset-list '(tcvn-5712)
-  :suitable-for-file-name t)
-(define-coding-system-alias 'tcvn 'vietnamese-tcvn)
-(define-coding-system-alias 'tcvn-5712 'vietnamese-tcvn)
-
 (provide 'vietnamese)
 
 ;;; vietnamese.el ends here
-- 
2.41.0




