* Coding systems vietnamese-vscii and vietnamese-tcvn
From: Ulrich Mueller @ 2023-07-21 11:26 UTC
  To: emacs-devel

language/vietnamese.el defines coding systems vietnamese-vscii
(with charset vscii) and vietnamese-tcvn (with charset tcvn-5712).
However, mule-conf.el defines the two charsets as aliases [1]:

   (define-charset-alias 'tcvn-5712 'vscii)

Indeed, a file containing bytes 0x00 to 0xff decodes to the same buffer
contents, regardless of whether its coding is specified as vscii or as
tcvn.

Wikipedia also seems to say that these are only different names for
the same encoding: https://en.wikipedia.org/wiki/VSCII

Should coding system vietnamese-tcvn be an alias for vietnamese-vscii
(or the other way around)?

[1] I tried to trace the history of this definition, but I gave up at
    merge(?) commit 8f924df7df01, neither of whose parents has the line.
    Apparently conversion from CVS isn't perfect.

* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
From: Eli Zaretskii @ 2023-07-21 12:34 UTC
  To: Ulrich Mueller; +Cc: emacs-devel

> From: Ulrich Mueller <ulm@gentoo.org>
> Date: Fri, 21 Jul 2023 13:26:02 +0200
>
> language/vietnamese.el defines coding systems vietnamese-vscii
> (with charset vscii) and vietnamese-tcvn (with charset tcvn-5712).
> However, mule-conf.el defines the two charsets as aliases [1]:
>
>    (define-charset-alias 'tcvn-5712 'vscii)
>
> Indeed, a file containing bytes 0x00 to 0xff decodes to the same buffer
> contents, regardless of whether its coding is specified as vscii or as
> tcvn.
>
> Wikipedia also seems to say that these are only different names for
> the same encoding: https://en.wikipedia.org/wiki/VSCII
>
> Should coding system vietnamese-tcvn be an alias for vietnamese-vscii
> (or the other way around)?

Why is the fact that we have two separate coding systems a problem in
this case?

* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
From: Ulrich Mueller @ 2023-07-21 12:52 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

>>>>> On Fri, 21 Jul 2023, Eli Zaretskii wrote:

> Why is the fact that we have two separate coding systems a problem in
> this case?

Presumably it works fine as-is, but is there any benefit of having such
duplication?

* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
From: Eli Zaretskii @ 2023-07-21 13:06 UTC
  To: Ulrich Mueller; +Cc: emacs-devel

> From: Ulrich Mueller <ulm@gentoo.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 21 Jul 2023 14:52:55 +0200
>
> >>>>> On Fri, 21 Jul 2023, Eli Zaretskii wrote:
>
> > Why is the fact that we have two separate coding systems a problem in
> > this case?
>
> Presumably it works fine as-is, but is there any benefit of having such
> duplication?

I cannot see any clear benefits either way.

Maybe we need to ask Vietnamese users. The comments in
lisp/language/vietnamese.el say that vscii is deprecated.

* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
From: Ulrich Mueller @ 2023-07-28 18:03 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

>>>>> On Fri, 21 Jul 2023, Eli Zaretskii wrote:

>> From: Ulrich Mueller <ulm@gentoo.org>
>> Cc: emacs-devel@gnu.org
>> Date: Fri, 21 Jul 2023 14:52:55 +0200
>>
>> >>>>> On Fri, 21 Jul 2023, Eli Zaretskii wrote:
>>
>> > Why is the fact that we have two separate coding systems a problem in
>> > this case?
>>
>> Presumably it works fine as-is, but is there any benefit of having such
>> duplication?

> I cannot see any clear benefits either way.

> Maybe we need to ask Vietnamese users. The comments in
> lisp/language/vietnamese.el say that vscii is deprecated.

I have asked a Vietnamese speaker. Paraphrasing their answer: VSCII is
the encoding described in standard TCVN 5712:1993. The terms can be
used interchangeably; there is no reason to have separate character
tables.

So the statement in the Vietnamese language info that "VSCII is
deprecated in favor of TCVN-5712" doesn't appear to be correct; the two
terms are synonyms for the same encoding.

I also looked up the original standard. TCVN 5712:1993 defines two
encodings which it names VN1 (aka VSCII-1) and VN2 (aka VSCII-2).

VSCII-2 (VN2) defines these code points:
- 0x00 to 0x7f are identical to ASCII,
- 0x80 to 0x9f are the C1 controls,
- 0xa0 to 0xff contain 96 non-ASCII characters.

VSCII-1 (VN1) is different from VSCII-2 in that it also replaces code
points 0x01-0x02, 0x04-0x06, 0x11-0x17, and 0x80-0x9f by 44 additional
non-ASCII characters, for a total of 140 non-ASCII characters.

There is also an updated standard TCVN 5712:1999 which mentions only
one encoding identical to VSCII-1, i.e. VSCII-2 is no longer part of
this later version of the standard.

So I suggest defining only coding system vietnamese-vscii, and making
vietnamese-tcvn an alias of it. If this is acceptable, I can prepare a
patch for lisp/language/vietnamese.el.
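
The character counts given in the message above (44 additional
characters, 140 in total) can be re-derived from the ranges it lists.
What follows is only a minimal Python sketch under that assumption: the
ranges are copied from the message, the variable names are illustrative,
and nothing here consults Emacs's charset tables or the text of
TCVN 5712.

```python
# Sketch: recompute the VSCII-1 / VSCII-2 non-ASCII character counts
# from the code-point ranges quoted in the message (all inclusive).

# VSCII-2 (VN2): 0xa0..0xff hold the non-ASCII characters.
vn2_non_ascii = set(range(0xA0, 0xFF + 1))

# VSCII-1 (VN1) additionally replaces these control-code ranges.
vn1_extra_ranges = [(0x01, 0x02), (0x04, 0x06), (0x11, 0x17), (0x80, 0x9F)]
vn1_extra = {cp for lo, hi in vn1_extra_ranges for cp in range(lo, hi + 1)}

# The extra positions do not overlap 0xa0..0xff, so the union is a plain sum.
vn1_non_ascii = vn2_non_ascii | vn1_extra

print(len(vn2_non_ascii))  # 96
print(len(vn1_extra))      # 44
print(len(vn1_non_ascii))  # 140
```

The 44 extra positions break down as 2 + 3 + 7 + 32, which matches the
figures quoted for VSCII-1.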

* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
From: Eli Zaretskii @ 2023-07-28 18:53 UTC
  To: Ulrich Mueller; +Cc: emacs-devel

> From: Ulrich Mueller <ulm@gentoo.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 28 Jul 2023 20:03:10 +0200
>
> So I suggest defining only coding system vietnamese-vscii, and making
> vietnamese-tcvn an alias of it. If this is acceptable, I can prepare a
> patch for lisp/language/vietnamese.el.

Please do, and thanks. But please include in the patch a NEWS entry,
and please make the change backward-compatible, so that files written
in either of the 2 encodings we supported before and stating the
encoding in their file-local variables will still be correctly decoded
after the change.

* Re: Coding systems vietnamese-vscii and vietnamese-tcvn
From: Ulrich Müller @ 2023-07-29 5:19 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

>>>>> On Fri, 28 Jul 2023, Eli Zaretskii wrote:

>> From: Ulrich Mueller <ulm@gentoo.org>
>> Cc: emacs-devel@gnu.org
>> Date: Fri, 28 Jul 2023 20:03:10 +0200
>>
>> So I suggest defining only coding system vietnamese-vscii, and making
>> vietnamese-tcvn an alias of it. If this is acceptable, I can prepare a
>> patch for lisp/language/vietnamese.el.

> Please do, and thanks. But please include in the patch a NEWS entry,
> and please make the change backward-compatible, so that files written
> in either of the 2 encodings we supported before and stating the
> encoding in their file-local variables will still be correctly decoded
> after the change.

From 393d39dc4961309cdc7f6e71ceaeab7a4942d868 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ulrich=20M=C3=BCller?= <ulm@gentoo.org>
Date: Fri, 28 Jul 2023 23:04:43 +0200
Subject: [PATCH] Drop duplicate vietnamese-tcvn coding system

* lisp/language/vietnamese.el (vietnamese-vscii): Update docstring.
(vietnamese-tcvn, tcvn, tcvn-5712): Make them aliases of
vietnamese-vscii.
("Vietnamese"): Drop vietnamese-tcvn from coding-system values.
Update docstring.
* etc/NEWS: Announce this change.
---
 etc/NEWS                    |  9 +++++++++
 lisp/language/vietnamese.el | 32 +++++++++++++-------------------
 2 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 39b4a35930a..7b521f3e6fe 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -665,6 +665,15 @@ previous behavior of showing 'U' in the mode line for 'koi8-u':
 
      (coding-system-put 'koi8-u :mnemonic ?U)
 
+---
+** 'vietnamese-tcvn' is now a coding system alias for 'vietnamese-vscii'.
+VSCII-1 and TCVN-5712 are different names for the same character
+encoding.  Therefore, the duplicate coding system definition has been
+dropped in favor of an alias.
+
+The mode-line mnemonic for 'vietnamese-vscii' and its aliases is the
+lowercase letter 'v'.
+
 +++
 ** Infinities and NaNs no longer act as symbols on non-IEEE platforms.
 On old platforms like the VAX that do not support IEEE floating-point,
diff --git a/lisp/language/vietnamese.el b/lisp/language/vietnamese.el
index bd0b3c5ae3e..1589173b207 100644
--- a/lisp/language/vietnamese.el
+++ b/lisp/language/vietnamese.el
@@ -28,8 +28,8 @@
 
 ;;; Commentary:
 
-;; For Vietnamese, the character sets VISCII, VSCII and TCVN-5712 are
-;; supported.
+;; For Vietnamese, the character sets VISCII, VSCII-1 (TCVN-5712),
+;; VIQR and windows-1258 are supported.
 
 ;;; Code:
 
@@ -44,13 +44,16 @@ 'vietnamese-viscii
 (define-coding-system-alias 'viscii 'vietnamese-viscii)
 
 (define-coding-system 'vietnamese-vscii
-  "8-bit encoding for Vietnamese VSCII-1."
+  "8-bit encoding for Vietnamese VSCII-1 (TCVN-5712)."
   :coding-type 'charset
   :mnemonic ?v
   :charset-list '(vscii)
   :suitable-for-file-name t)
 
 (define-coding-system-alias 'vscii 'vietnamese-vscii)
+(define-coding-system-alias 'vietnamese-tcvn 'vietnamese-vscii)
+(define-coding-system-alias 'tcvn 'vietnamese-vscii)
+(define-coding-system-alias 'tcvn-5712 'vietnamese-vscii)
 
 ;; (make-coding-system
 ;;  'vietnamese-vps 4 ?p
@@ -74,7 +77,7 @@ 'viqr
 (set-language-info-alist
  "Vietnamese" '((charset viscii)
                 (coding-system vietnamese-viscii vietnamese-vscii
-                               vietnamese-tcvn vietnamese-viqr windows-1258)
+                               vietnamese-viqr windows-1258)
                 (nonascii-translation . viscii)
                 (coding-priority vietnamese-viscii)
                 (input-method . "vietnamese-viqr")
@@ -83,12 +86,12 @@ 'viqr
                 (sample-text . "Vietnamese (Tiếng Việt) Chào bạn")
                 (documentation . "\
 For Vietnamese, Emacs uses special charsets internally.
-They can be decoded from and encoded to VISCII, VSCII, TCVN-5712, VIQR
-and windows-1258.  VSCII is deprecated in favor of TCVN-5712.  The
-Current setting gives higher priority to the coding system VISCII than
-TCVN-5712.  If you prefer TCVN-5712, please do: (prefer-coding-system
-'vietnamese-tcvn).  There are two Vietnamese input methods: VIQR and
-Telex, VIQR is the default setting.")))
+They can be decoded from and encoded to VISCII, VSCII-1 (TCVN-5712),
+VIQR and windows-1258.  The current setting gives higher priority
+to the coding system VISCII than VSCII-1.  If you prefer VSCII-1,
+please do: (prefer-coding-system 'vietnamese-vscii).  There are
+two Vietnamese input methods: VIQR and Telex, VIQR is the default
+setting.")))
 
 (define-coding-system 'windows-1258
   "windows-1258 encoding for Vietnamese (MIME: WINDOWS-1258)"
@@ -98,15 +101,6 @@ 'windows-1258
   :mime-charset 'windows-1258)
 (define-coding-system-alias 'cp1258 'windows-1258)
 
-(define-coding-system 'vietnamese-tcvn
-  "8-bit encoding for Vietnamese TCVN-5712"
-  :coding-type 'charset
-  :mnemonic ?t
-  :charset-list '(tcvn-5712)
-  :suitable-for-file-name t)
-(define-coding-system-alias 'tcvn 'vietnamese-tcvn)
-(define-coding-system-alias 'tcvn-5712 'vietnamese-tcvn)
-
 (provide 'vietnamese)
 
 ;;; vietnamese.el ends here
-- 
2.41.0