* Unicode and text editors
@ 2024-12-08 13:16 Heime via Users list for the GNU Emacs text editor
2024-12-08 13:29 ` Jean Louis
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Heime via Users list for the GNU Emacs text editor @ 2024-12-08 13:16 UTC (permalink / raw)
To: Heime via Users list for the GNU Emacs text editor
I am using Unicode characters in Emacs. What happens when people load the
file in a different text editor? Will the characters be illegible?
* Re: Unicode and text editors
2024-12-08 13:16 Unicode and text editors Heime via Users list for the GNU Emacs text editor
@ 2024-12-08 13:29 ` Jean Louis
2024-12-08 13:31 ` Basile Starynkevitch
2024-12-08 16:06 ` Stefan Monnier via Users list for the GNU Emacs text editor
2 siblings, 0 replies; 16+ messages in thread
From: Jean Louis @ 2024-12-08 13:29 UTC (permalink / raw)
To: Heime; +Cc: Heime via Users list for the GNU Emacs text editor
* Heime via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org> [2024-12-08 16:18]:
>
>
> I am using unicode characters in emacs. What happens when people load the
> file in a different text editor? Will the characters be illegible?
So far, the editors I have inspected have all accepted Unicode.
Some terminal editors, such as Zile (an Emacs clone), did not; I wish it did.
Nearly all graphical editors I know of accept Unicode; the few that do not are older and rarely used.
--
Jean Louis
* Re: Unicode and text editors
2024-12-08 13:16 Unicode and text editors Heime via Users list for the GNU Emacs text editor
2024-12-08 13:29 ` Jean Louis
@ 2024-12-08 13:31 ` Basile Starynkevitch
2024-12-08 14:02 ` Heime via Users list for the GNU Emacs text editor
2024-12-08 16:06 ` Stefan Monnier via Users list for the GNU Emacs text editor
2 siblings, 1 reply; 16+ messages in thread
From: Basile Starynkevitch @ 2024-12-08 13:31 UTC (permalink / raw)
To: Heime, Heime via Users list for the GNU Emacs text editor
On Sun, 2024-12-08 at 13:16 +0000, Heime via Users list for the GNU
Emacs text editor wrote:
>
>
> I am using unicode characters in emacs. What happens when people
> load the
> file in a different text editor? Will the characters be illegible?
>
I guess you mean Unicode characters in UTF-8 encoding. I will refer
to the people mentioned in your question as colleagues (but they could
be friends or customers or students or authorities or managers). "Your
computer" means the computer you are using (probably under Linux) for
GNU Emacs. "Their computer" or "the other computer" is the one used by
the colleague.
I see several possible issues.
The other computer doesn't have the font required to display some
character (such as a Cyrillic letter, or §).
The other computer (or your colleague) doesn't know that the file is
UTF-8 encoded.
The other computer doesn't have any editor.
The other computer has an editor which does not understand UTF-8
encoding.
The other computer has an editor requiring UTF-16 encoding.
The file has been corrupted during transmission.
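To make the encoding-mismatch cases above concrete, here is a minimal Python sketch of what a reader might see if UTF-8 bytes were interpreted under a different assumption. The sample string, the helper name view_as, and the choice of Latin-1 and ASCII as the "wrong" encodings are assumptions made for illustration; real editors differ in how they guess and in how they show undecodable bytes.

```python
def view_as(data: bytes, assumed_encoding: str) -> str:
    """Decode bytes the way a tool assuming `assumed_encoding` might.

    Hypothetical helper: undecodable bytes become U+FFFD, which is one
    common way (not the only one) of displaying them.
    """
    return data.decode(assumed_encoding, errors="replace")


text = "Faïencerie §"
utf8_bytes = text.encode("utf-8")  # the bytes that end up on disk

print(view_as(utf8_bytes, "utf-8"))    # round-trips unchanged
print(view_as(utf8_bytes, "latin-1"))  # each non-ASCII character becomes two
print(view_as(utf8_bytes, "ascii"))    # non-ASCII bytes become U+FFFD
```

The ASCII letters survive in all three cases; only the non-ASCII characters are affected, which is why such files often remain partly readable.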
Regards
NB my open source project is
https://github.com/RefPerSys/RefPerSys (GPLv3+ inference engine)
--
Basile STARYNKEVITCH <basile@starynkevitch.net>
8 rue de la Faïencerie
92340 Bourg-la-Reine, France
http://starynkevitch.net/Basile & https://github.com/bstarynk
* Re: Unicode and text editors
2024-12-08 13:31 ` Basile Starynkevitch
@ 2024-12-08 14:02 ` Heime via Users list for the GNU Emacs text editor
2024-12-08 15:15 ` W. Greenhouse via Users list for the GNU Emacs text editor
0 siblings, 1 reply; 16+ messages in thread
From: Heime via Users list for the GNU Emacs text editor @ 2024-12-08 14:02 UTC (permalink / raw)
To: Basile Starynkevitch; +Cc: Heime via Users list for the GNU Emacs text editor
On Monday, December 9th, 2024 at 1:31 AM, Basile Starynkevitch <basile@starynkevitch.net> wrote:
> On Sun, 2024-12-08 at 13:16 +0000, Heime via Users list for the GNU
> Emacs text editor wrote:
>
> > I am using unicode characters in emacs. What happens when people
> > load the
> > file in a different text editor? Will the characters be illegible?
>
>
>
> I guess you mean Unicode characters with UTF-8 encoding.
Correct
> I will refer
> to the people mentioned in your question as colleagues (but they could
> be friends or customers or students or authorities or managers). Your
> computer means the computer you are using (probably under Linux) for
> GNU emacs. Their computer or the other computer is the one used by the
> colleague.
>
> I see several possible issues.
>
> The other computer doesn't have the font required to display some
> character (such as a Cyrillic letter, or §).
>
> The other computer (or your colleague) doesn't know that the file is
> UTF-8 encoded.
>
> The other computer doesn't have any editor.
>
> The other computer has an editor which does not understand UTF-8
> encoding.
>
> The other computer has an editor requiring UTF-16 encoding.
Is this becoming the norm? What about Emacs?
Does Emacs encourage the use of Unicode characters (UTF-8) in code comments
and documentation?
> The file has been corrupted during transmission.
>
> Regards
>
> NB my open source project is
> https://github.com/RefPerSys/RefPerSys (GPLv3+ inference engine)
>
> --
> Basile STARYNKEVITCH basile@starynkevitch.net
>
> 8 rue de la Faïencerie
> 92340 Bourg-la-Reine, France
> http://starynkevitch.net/Basile & https://github.com/bstarynk
* Re: Unicode and text editors
2024-12-08 14:02 ` Heime via Users list for the GNU Emacs text editor
@ 2024-12-08 15:15 ` W. Greenhouse via Users list for the GNU Emacs text editor
2024-12-08 16:32 ` Eli Zaretskii
0 siblings, 1 reply; 16+ messages in thread
From: W. Greenhouse via Users list for the GNU Emacs text editor @ 2024-12-08 15:15 UTC (permalink / raw)
To: help-gnu-emacs
Heime via Users list for the GNU Emacs text editor
<help-gnu-emacs@gnu.org> writes:
>
> Is this becoming the norm? What about emacs?
>
> Does emacs encourage use of unicode characters (UTF-8) in code comments
> and documentation?
>
UTF-8 represents Emacs' internal text encoding and also its default
preference for new files. When visiting existing files it attempts to
detect the encoding, and maintain it while editing.
* Re: Unicode and text editors
2024-12-08 13:16 Unicode and text editors Heime via Users list for the GNU Emacs text editor
2024-12-08 13:29 ` Jean Louis
2024-12-08 13:31 ` Basile Starynkevitch
@ 2024-12-08 16:06 ` Stefan Monnier via Users list for the GNU Emacs text editor
2 siblings, 0 replies; 16+ messages in thread
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2024-12-08 16:06 UTC (permalink / raw)
To: help-gnu-emacs
> I am using unicode characters in Emacs. What happens when people load
> the file in a different text editor? Will the characters
> be illegible?
Nowadays, the use of non-ASCII characters via the UTF-8 encoding is
sufficiently widespread that if another editor displays them in an
illegible way, it's usually considered a limitation of that
other editor (though it may also depend on the availability of fonts,
which are usually specific not to an editor but to the overall system).
Stefan
* Re: Unicode and text editors
2024-12-08 15:15 ` W. Greenhouse via Users list for the GNU Emacs text editor
@ 2024-12-08 16:32 ` Eli Zaretskii
2024-12-08 17:54 ` Heime via Users list for the GNU Emacs text editor
0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2024-12-08 16:32 UTC (permalink / raw)
To: help-gnu-emacs
> Date: Sun, 08 Dec 2024 15:15:31 +0000
> From: "W. Greenhouse" via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>
>
> Heime via Users list for the GNU Emacs text editor
> <help-gnu-emacs@gnu.org> writes:
>
> >
> > Is this becoming the norm? What about emacs?
> >
> > Does emacs encourage use of unicode characters (UTF-8) in code comments
> > and documentation?
> >
>
> UTF-8 represents Emacs' internal text encoding
No, the internal encoding is different (it's a superset of UTF-8).
> and also its default preference for new files.
That is only true if the user's locale has UTF-8 as its codeset.
> When visiting existing files it attempts to
> detect the encoding, and maintain it while editing.
That is correct.
* Re: Unicode and text editors
2024-12-08 16:32 ` Eli Zaretskii
@ 2024-12-08 17:54 ` Heime via Users list for the GNU Emacs text editor
2024-12-08 18:44 ` Eli Zaretskii
0 siblings, 1 reply; 16+ messages in thread
From: Heime via Users list for the GNU Emacs text editor @ 2024-12-08 17:54 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: help-gnu-emacs
On Monday, December 9th, 2024 at 4:32 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Sun, 08 Dec 2024 15:15:31 +0000
> > From: "W. Greenhouse" via Users list for the GNU Emacs text editor help-gnu-emacs@gnu.org
> >
> > Heime via Users list for the GNU Emacs text editor
> > help-gnu-emacs@gnu.org writes:
> >
> > > Is this becoming the norm? What about emacs?
> > >
> > > Does emacs encourage use of unicode characters (UTF-8) in code comments
> > > and documentation?
> >
> > UTF-8 represents Emacs' internal text encoding
>
>
> No, the internal encoding is different (it's a superset of UTF-8).
>
> > and also its default preference for new files.
>
>
> That is only true if the user's locale has UTF-8 as its codeset.
>
> > When visiting existing files it attempts to
> > detect the encoding, and maintain it while editing.
>
> That is correct.
Does this mean that we should not worry when we use UTF-8? Would
UTF-* encodings be supported and detected, with the appropriate symbols
showing as they should?
* Re: Unicode and text editors
2024-12-08 17:54 ` Heime via Users list for the GNU Emacs text editor
@ 2024-12-08 18:44 ` Eli Zaretskii
2024-12-08 18:55 ` Heime via Users list for the GNU Emacs text editor
0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2024-12-08 18:44 UTC (permalink / raw)
To: help-gnu-emacs
> Date: Sun, 08 Dec 2024 17:54:43 +0000
> From: Heime <heimeborgia@protonmail.com>
> Cc: help-gnu-emacs@gnu.org
>
> > > When visiting existing files it attempts to
> > > detect the encoding, and maintain it while editing.
> >
> > That is correct.
>
> Does this mean that we should not worry when we use UTF-8? Would
> UTF-* be supported, and detected. With the appropriate symbols showing
> as they should.
I was talking about Emacs. Your original question was about other
editors. I'm not familiar with other modern editors enough to tell
whether you should or should not worry. In particular, I don't know
how they recognize UTF-8, and am not sure that Emacs coding cookies
are supported by them.
* Re: Unicode and text editors
2024-12-08 18:44 ` Eli Zaretskii
@ 2024-12-08 18:55 ` Heime via Users list for the GNU Emacs text editor
2024-12-08 19:07 ` Eli Zaretskii
0 siblings, 1 reply; 16+ messages in thread
From: Heime via Users list for the GNU Emacs text editor @ 2024-12-08 18:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: help-gnu-emacs
On Monday, December 9th, 2024 at 6:44 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Sun, 08 Dec 2024 17:54:43 +0000
> > From: Heime heimeborgia@protonmail.com
> > Cc: help-gnu-emacs@gnu.org
> >
> > > > When visiting existing files it attempts to
> > > > detect the encoding, and maintain it while editing.
> > >
> > > That is correct.
> >
> > Does this mean that we should not worry when we use UTF-8? Would
> > UTF-* be supported, and detected. With the appropriate symbols showing
> > as they should.
>
>
> I was talking about Emacs. Your original question was about other
> editors. I'm not familiar with other modern editors enough to tell
> whether you should or should not worry. In particular, I don't know
> how they recognize UTF-8, and am not sure that Emacs coding cookies
> are supported by them.
Is it reasonable to expect that UTF-* characters are recognised and supported?
What is your school of thought regarding emacs?
* Re: Unicode and text editors
2024-12-08 18:55 ` Heime via Users list for the GNU Emacs text editor
@ 2024-12-08 19:07 ` Eli Zaretskii
2024-12-08 19:39 ` Heime via Users list for the GNU Emacs text editor
0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2024-12-08 19:07 UTC (permalink / raw)
To: help-gnu-emacs
> Date: Sun, 08 Dec 2024 18:55:52 +0000
> From: Heime <heimeborgia@protonmail.com>
> Cc: help-gnu-emacs@gnu.org
>
> > I was talking about Emacs. Your original question was about other
> > editors. I'm not familiar with other modern editors enough to tell
> > whether you should or should not worry. In particular, I don't know
> > how they recognize UTF-8, and am not sure that Emacs coding cookies
> > are supported by them.
>
> Is it reasonable to expect that UTF-* characters are recognised and supported?
Reasonable? Yes. But the problem is not trivial: Emacs itself does not
always recognize UTF-8 reliably, so there could be problems.
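One reason detection cannot be fully reliable: a file carries only bytes, and the same bytes can be valid in more than one encoding. The following Python sketch is a generic heuristic written for illustration; it is not how Emacs detects encodings, and the function name guess_encoding and its return labels are assumptions.

```python
def guess_encoding(data: bytes) -> str:
    """Return a best guess at the encoding of `data`.

    Hypothetical heuristic for illustration only.
    """
    # A UTF-8 byte-order mark is a strong hint, but many files lack one.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"
    # Pure ASCII is also valid UTF-8, Latin-1, and many other encodings.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # Strict UTF-8 decoding rejects most text written in legacy encodings.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"


print(guess_encoding(b"plain text"))
print(guess_encoding("caf\u00e9".encode("utf-8")))
print(guess_encoding("caf\u00e9".encode("latin-1")))
```

For example, the two bytes C3 A9 pass the UTF-8 check and decode to "é", yet they are equally valid Latin-1 for "Ã©", so a guess of "utf-8" is a likelihood, not a certainty.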
> What is your school of thought regarding emacs?
I don't understand the question, sorry.
* Re: Unicode and text editors
2024-12-08 19:07 ` Eli Zaretskii
@ 2024-12-08 19:39 ` Heime via Users list for the GNU Emacs text editor
2024-12-08 20:43 ` Eli Zaretskii
2024-12-09 3:35 ` Stefan Monnier via Users list for the GNU Emacs text editor
0 siblings, 2 replies; 16+ messages in thread
From: Heime via Users list for the GNU Emacs text editor @ 2024-12-08 19:39 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: help-gnu-emacs
On Monday, December 9th, 2024 at 7:07 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Sun, 08 Dec 2024 18:55:52 +0000
> > From: Heime heimeborgia@protonmail.com
> > Cc: help-gnu-emacs@gnu.org
> >
> > > I was talking about Emacs. Your original question was about other
> > > editors. I'm not familiar with other modern editors enough to tell
> > > whether you should or should not worry. In particular, I don't know
> > > how they recognize UTF-8, and am not sure that Emacs coding cookies
> > > are supported by them.
> >
> > Is it reasonable to expect that UTF-* characters are recognised and supported?
>
>
> Reasonable? yes. But the problem is not trivial: Emacs itself not
> always recognizes UTF-8 reliably. So there could be problems.
>
> > What is your school of thought regarding emacs?
>
>
> I don't understand the question, sorry.
Should we be comfortable using UTF characters in source code when using Emacs?
When there are problems, do they usually get looked into and fixed if possible?
* Re: Unicode and text editors
2024-12-08 19:39 ` Heime via Users list for the GNU Emacs text editor
@ 2024-12-08 20:43 ` Eli Zaretskii
2024-12-08 20:48 ` Heime via Users list for the GNU Emacs text editor
2024-12-09 3:35 ` Stefan Monnier via Users list for the GNU Emacs text editor
1 sibling, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2024-12-08 20:43 UTC (permalink / raw)
To: help-gnu-emacs
> Date: Sun, 08 Dec 2024 19:39:55 +0000
> From: Heime <heimeborgia@protonmail.com>
> Cc: help-gnu-emacs@gnu.org
>
>
> On Monday, December 9th, 2024 at 7:07 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > > Date: Sun, 08 Dec 2024 18:55:52 +0000
> > > From: Heime heimeborgia@protonmail.com
> > > Cc: help-gnu-emacs@gnu.org
> > >
> > > > I was talking about Emacs. Your original question was about other
> > > > editors. I'm not familiar with other modern editors enough to tell
> > > > whether you should or should not worry. In particular, I don't know
> > > > how they recognize UTF-8, and am not sure that Emacs coding cookies
> > > > are supported by them.
> > >
> > > Is it reasonable to expect that UTF-* characters are recognised and supported?
> >
> >
> > Reasonable? yes. But the problem is not trivial: Emacs itself not
> > always recognizes UTF-8 reliably. So there could be problems.
> >
> > > What is your school of thought regarding emacs?
> >
> >
> > I don't understand the question, sorry.
>
> Should we comfortable use UTF Characters in source code when using Emacs?
> When there are problems, do they usually get looked into and fixed if possible?
If you are talking about Emacs, then my recommendation is to always
use a coding: cookie. Then it's 100% reliable when Emacs reads the
file.
* Re: Unicode and text editors
2024-12-08 20:43 ` Eli Zaretskii
@ 2024-12-08 20:48 ` Heime via Users list for the GNU Emacs text editor
2024-12-09 3:24 ` Eli Zaretskii
0 siblings, 1 reply; 16+ messages in thread
From: Heime via Users list for the GNU Emacs text editor @ 2024-12-08 20:48 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: help-gnu-emacs
On Monday, December 9th, 2024 at 8:43 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Sun, 08 Dec 2024 19:39:55 +0000
> > From: Heime heimeborgia@protonmail.com
> > Cc: help-gnu-emacs@gnu.org
> >
> > On Monday, December 9th, 2024 at 7:07 AM, Eli Zaretskii eliz@gnu.org wrote:
> >
> > > > Date: Sun, 08 Dec 2024 18:55:52 +0000
> > > > From: Heime heimeborgia@protonmail.com
> > > > Cc: help-gnu-emacs@gnu.org
> > > >
> > > > > I was talking about Emacs. Your original question was about other
> > > > > editors. I'm not familiar with other modern editors enough to tell
> > > > > whether you should or should not worry. In particular, I don't know
> > > > > how they recognize UTF-8, and am not sure that Emacs coding cookies
> > > > > are supported by them.
> > > >
> > > > Is it reasonable to expect that UTF-* characters are recognised and supported?
> > >
> > > Reasonable? yes. But the problem is not trivial: Emacs itself not
> > > always recognizes UTF-8 reliably. So there could be problems.
> > >
> > > > What is your school of thought regarding emacs?
> > >
> > > I don't understand the question, sorry.
> >
> > Should we comfortable use UTF Characters in source code when using Emacs?
> > When there are problems, do they usually get looked into and fixed if possible?
>
>
> If you are talking about Emacs, then my recommendation is to always
> use a coding: cookie. Then it's 100% reliable when Emacs reads the
> file.
What would one have to do exactly?
* Re: Unicode and text editors
2024-12-08 20:48 ` Heime via Users list for the GNU Emacs text editor
@ 2024-12-09 3:24 ` Eli Zaretskii
0 siblings, 0 replies; 16+ messages in thread
From: Eli Zaretskii @ 2024-12-09 3:24 UTC (permalink / raw)
To: help-gnu-emacs
> Date: Sun, 08 Dec 2024 20:48:08 +0000
> From: Heime <heimeborgia@protonmail.com>
> Cc: help-gnu-emacs@gnu.org
>
> > If you are talking about Emacs, then my recommendation is to always
> > use a coding: cookie. Then it's 100% reliable when Emacs reads the
> > file.
>
> What would one have to do exactly?
Add a "coding: utf-8" cookie in the first line of the file.
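For illustration, such a cookie is conventionally written between -*- markers inside a comment in the file's own language, so the first line of an Emacs Lisp file might read:

    ;; -*- coding: utf-8 -*-

Whether a tool other than Emacs honors this line is up to that tool. As a rough sketch, a script could look for such a cookie as follows in Python; the regular expression and the name declared_coding are hypothetical and simpler than what Emacs itself accepts.

```python
import re
from typing import Optional

# Hypothetical pattern: a "coding: NAME" entry between -*- markers.
COOKIE_RE = re.compile(r"-\*-.*?\bcoding:\s*([-\w.]+).*?-\*-")


def declared_coding(first_line: str) -> Optional[str]:
    """Return the coding system named in a first-line cookie, if any."""
    match = COOKIE_RE.search(first_line)
    return match.group(1) if match else None


print(declared_coding(";; -*- coding: utf-8 -*-"))
print(declared_coding("# -*- mode: python; coding: utf-8; -*-"))
print(declared_coding("# an ordinary comment"))
```

The sketch only inspects a single line that the caller supplies; reading that line from a file, and deciding what to do when no cookie is present, is left to the caller.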
* Re: Unicode and text editors
2024-12-08 19:39 ` Heime via Users list for the GNU Emacs text editor
2024-12-08 20:43 ` Eli Zaretskii
@ 2024-12-09 3:35 ` Stefan Monnier via Users list for the GNU Emacs text editor
1 sibling, 0 replies; 16+ messages in thread
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2024-12-09 3:35 UTC (permalink / raw)
To: help-gnu-emacs
> Should we be comfortable using UTF characters in source code when
> using Emacs?
In source code, the main issue is not whether another editor will
display it properly but whether the other tools that use the file will
handle it properly.
Usually this will depend on the programming language, where the
definition of the language usually clarifies which kinds of encodings
and/or charsets are allowed (and where, since the rule can be different
in different parts, such as inside comments or strings).
E.g. Emacs Lisp uses utf-8 (since Emacs-24.4) and supports the use of
basically any Unicode characters in source code, such as inside
identifiers.
[ BTW, "UTF characters" is kind of meaningless. You'll want to learn to
distinguish, on the one hand, characters and, on the other hand, the
encodings that can be used to represent them as sequences of bytes. ]
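The distinction can be seen by encoding one character several ways. In this Python sketch the character § (U+00A7, which appears earlier in the thread) stays the same while its byte representation changes with the encoding; the helper name byte_forms and the list of encodings are choices made for illustration.

```python
def byte_forms(ch: str,
               encodings=("utf-8", "utf-16-le", "utf-32-le", "latin-1")) -> dict:
    """Map each encoding name to the hex bytes it uses for `ch`."""
    return {enc: ch.encode(enc).hex(" ") for enc in encodings}


ch = "§"
print(f"character: {ch}  code point: U+{ord(ch):04X}")
for enc, hexbytes in byte_forms(ch).items():
    print(f"{enc:10} {hexbytes}")
```

One character, one code point, four different byte sequences: the "UTF" part names how the bytes are laid out, not which characters are in the text.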
Stefan