* smtpmail and ~/.authinfo
@ 2011-08-20 10:26 Eli Zaretskii
2011-08-21 4:39 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2011-08-20 10:26 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: emacs-devel
I switched today to using ~/.authinfo with smtpmail in Emacs 24 for
the first time, and immediately hit a snag: sending mail failed with
an error message from the SMTP server claiming that my login
credentials were incorrect.
It turned out that ~/.authinfo _must_ have Unix EOLs, or else sending
mail with smtpmail does not work. This happens because auth-source-search
is called from smtpmail inside a form that let-binds
coding-system-for-read to `binary'. That binding is there for reasons
that have nothing to do with auth-source-search, and a cursory search
finds no similar bindings in other users of auth-source-search.
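To illustrate the failure mode (a Python sketch, not the smtpmail code;
the parser and the sample entry are hypothetical): when a CRLF file is
read with no EOL conversion, the carriage return stays attached to the
last token on the line, so the password that gets sent is no longer the
one the user typed.

```python
# Hypothetical one-line netrc entry saved with DOS (CRLF) line endings.
CRLF_FILE = b"machine smtp.example.org login eli password secret\r\n"

def parse_binary(data: bytes) -> dict:
    """Split on LF and spaces only, as a reader with no EOL conversion would."""
    line = data.split(b"\n")[0]
    tokens = line.split(b" ")
    # Pair up keyword/value tokens: machine X login Y password Z.
    return dict(zip(tokens[0::2], tokens[1::2]))

entry = parse_binary(CRLF_FILE)
print(entry[b"password"])  # the value carries a trailing carriage return
```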
It should be easy to fix this, but I need to know what can be in Netrc
files to do this correctly. Can these files include non-ASCII
characters, or do all fields in these files have to be strict 7-bit
ASCII? If non-ASCII characters are allowed, then are there any
limitations on the charsets that can be used in Netrc files, or can
they be anything at all in any valid encoding? Also, is there any
need to do something special with non-ASCII characters (if they are
allowed) when communicating with the SMTP server, like encode them in
some particular way?
Given the answers to these questions, fixing the above problem should
be as simple as adding a few lines to smtpmail-via-smtp.
TIA
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-08-20 10:26 smtpmail and ~/.authinfo Eli Zaretskii
@ 2011-08-21 4:39 ` Lars Magne Ingebrigtsen
2011-08-21 6:12 ` Eli Zaretskii
0 siblings, 1 reply; 45+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-08-21 4:39 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
> It turned out that ~/.authinfo _must_ have Unix EOLs, or else sending
> mail with smtpmail does not work. This happens because auth-source-search
> is called from smtpmail inside a form that let-binds
> coding-system-for-read to `binary'. That binding is there for reasons
> that have nothing to do with auth-source-search, and a cursory search
> finds no similar bindings in other users of auth-source-search.
Yes, that sounds like an accident. Perhaps that let binding should be
narrowed dramatically? It's bad practise to bind variables like that
over non-relevant function calls.
> It should be easy to fix this, but I need to know what can be in Netrc
> files to do this correctly. Can these files include non-ASCII
> characters, or do all fields in these files have to be strict 7-bit
> ASCII?
There can basically be anything in the files, I think, and the encoding
is local. But it's unusual to put non-ASCII into the file for most
protocols, since so many protocols developed their auth schemes before
anybody had considered the problem of coding systems.
> Also, is there any need to do something special with non-ASCII
> characters (if they are allowed) when communicating with the SMTP
> server, like encode them in some particular way?
It... varies. :-) SMTP allows using several AUTH methods, and I'm
actually not sure whether any of them actually specify what charset to
use. DIGEST-MD5 does, I think? But smtpmail.el doesn't support it,
anyway.
I think AUTH PLAIN, for instance, is essentially a binary
thing, where you're allowed to use any blob of bytes as user name and
password. Except NULs.
This is just from memory, so if somebody knows better, please correct
me...
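As a rough sketch of why NUL is the one excluded byte (Python, written
from the general shape of the PLAIN mechanism rather than from
smtpmail.el; the function name and sample credentials are made up): the
client joins authorization identity, user name, and password with NUL
bytes and base64-encodes the result, so a NUL inside a field would be
indistinguishable from a separator. The sketch treats the fields as
opaque bytes, as described above, and does not check what the
specification says about charsets.

```python
import base64

def auth_plain_blob(user: bytes, password: bytes, authzid: bytes = b"") -> bytes:
    """Build the base64 payload for AUTH PLAIN from raw bytes."""
    for field in (authzid, user, password):
        if b"\x00" in field:
            raise ValueError("NUL is the field separator and cannot appear in a field")
    return base64.b64encode(authzid + b"\x00" + user + b"\x00" + password)

# Non-ASCII bytes pass through untouched; only NUL is rejected here.
blob = auth_plain_blob(b"eli", "h\u00e9llo".encode("iso-8859-15"))
print(blob.decode("ascii"))
```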
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
* Re: smtpmail and ~/.authinfo
2011-08-21 4:39 ` Lars Magne Ingebrigtsen
@ 2011-08-21 6:12 ` Eli Zaretskii
2011-08-21 19:25 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2011-08-21 6:12 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: emacs-devel
> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Cc: emacs-devel@gnu.org
> Date: Sun, 21 Aug 2011 06:39:12 +0200
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > It turned out that ~/.authinfo _must_ have Unix EOLs, or else sending
> > mail with smtpmail does not work. This happens because auth-source-search
> > is called from smtpmail inside a form that let-binds
> > coding-system-for-read to `binary'. That binding is there for reasons
> > that have nothing to do with auth-source-search, and a cursory search
> > finds no similar bindings in other users of auth-source-search.
>
> Yes, that sounds like an accident. Perhaps that let binding should be
> narrowed dramatically?
You should know: you put it there ;-)
The log message for revision 104742, where these bindings were
introduced, doesn't say much. Can you tell why you needed them (for
Windows, no less)?
> > It should be easy to fix this, but I need to know what can be in Netrc
> > files to do this correctly. Can these files include non-ASCII
> > characters, or do all fields in these files have to be strict 7-bit
> > ASCII?
>
> There can basically be anything in the files, I think, and the encoding
> is local. But it's unusual to put non-ASCII into the file for most
> protocols, since so many protocols developed their auth schemes before
> anybody had considered the problem of coding systems.
>
> > Also, is there any need to do something special with non-ASCII
> > characters (if they are allowed) when communicating with the SMTP
> > server, like encode them in some particular way?
>
> It... varies. :-) SMTP allows using several AUTH methods, and I'm
> actually not sure whether any of them actually specify what charset to
> use. DIGEST-MD5 does, I think? But smtpmail.el doesn't support it,
> anyway.
>
> I think AUTH PLAIN, for instance, is essentially a binary
> thing, where you're allowed to use any blob of bytes as user name and
> password. Except NULs.
This tells me that TRT is to bind coding-system-for-read to raw-text
for auth-source-search to do its thing. But I'm still uncertain what
should be the binding in the rest of smtpmail-via-smtp.
* Re: smtpmail and ~/.authinfo
2011-08-21 6:12 ` Eli Zaretskii
@ 2011-08-21 19:25 ` Lars Magne Ingebrigtsen
2011-08-21 19:59 ` Eli Zaretskii
0 siblings, 1 reply; 45+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-08-21 19:25 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
> You should know: you put it there ;-)
>
> The log message for revision 104742, where these bindings were
> introduced, doesn't say much. Can you tell why you needed them (for
> Windows, no less)?
I don't see any Windows special-casing there?
Anyway, they're for the `open-network-stream' call. I've now wrapped
them closer around that call, which should probably fix the problem.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
* Re: smtpmail and ~/.authinfo
2011-08-21 19:25 ` Lars Magne Ingebrigtsen
@ 2011-08-21 19:59 ` Eli Zaretskii
2011-08-21 20:17 ` Lars Magne Ingebrigtsen
2011-09-25 12:33 ` Ted Zlatanov
0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2011-08-21 19:59 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: emacs-devel
> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Cc: emacs-devel@gnu.org
> Date: Sun, 21 Aug 2011 21:25:55 +0200
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > You should know: you put it there ;-)
> >
> > The log message for revision 104742, where these bindings were
> > introduced, doesn't say much. Can you tell why you needed them (for
> > Windows, no less)?
>
> I don't see any Windows special-casing there?
I was quoting your ChangeLog entry:
2011-06-27 Lars Magne Ingebrigtsen <larsi@gnus.org>
* mail/smtpmail.el (smtpmail-via-smtp): Bind coding-system-for-*
to binary to possibly avoid line encoding issues on Windows (among
other things).
Btw, to _avoid_ line encoding issues on Windows, one should NOT bind
coding-system-for-read to `binary', because that binding brings the
CR-LF EOLs right into Emacs buffers.
> Anyway, they're for the `open-network-stream' call. I've now wrapped
> them closer around that call, which should probably fix the problem.
Only partially. Since you say netrc files can have non-ASCII
characters, we should bind coding-system-for-read to raw-text when
calling auth-source-search. I will take care of that.
Thanks.
* Re: smtpmail and ~/.authinfo
2011-08-21 19:59 ` Eli Zaretskii
@ 2011-08-21 20:17 ` Lars Magne Ingebrigtsen
2011-08-22 5:35 ` Eli Zaretskii
2011-09-25 12:33 ` Ted Zlatanov
1 sibling, 1 reply; 45+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-08-21 20:17 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
> I was quoting your ChangeLog entry:
>
> 2011-06-27 Lars Magne Ingebrigtsen <larsi@gnus.org>
>
> * mail/smtpmail.el (smtpmail-via-smtp): Bind coding-system-for-*
> to binary to possibly avoid line encoding issues on Windows (among
> other things).
>
> Btw, to _avoid_ line encoding issues on Windows, one should NOT bind
> coding-system-for-read to `binary', because that binding brings the
> CR-LF EOLs right into Emacs buffers.
Yes, and that's what we want, since SMTP uses CRLF as the line ending.
And now I remember what the problem was -- under Windows it would
infloop looking for CRLF, and never getting it, since Emacs helpfully
auto-translated CRLF to newline under Windows.
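A small Python sketch of that failure (a hypothetical helper, not the
smtpmail.el code): a reader that waits for CRLF to mark the end of a
reply line never sees a complete line once something upstream has
already turned CRLF into a bare newline.

```python
from typing import Optional

def complete_reply_line(buffer: bytes) -> Optional[bytes]:
    """Return the first CRLF-terminated line, or None if none has arrived yet."""
    end = buffer.find(b"\r\n")
    return None if end < 0 else buffer[:end]

raw = b"250 OK\r\n"                         # bytes as the server sent them
translated = raw.replace(b"\r\n", b"\n")    # after automatic EOL conversion

print(complete_reply_line(raw))         # a full line is found
print(complete_reply_line(translated))  # None: the caller would keep waiting
```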
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
* Re: smtpmail and ~/.authinfo
2011-08-21 20:17 ` Lars Magne Ingebrigtsen
@ 2011-08-22 5:35 ` Eli Zaretskii
2011-09-10 19:01 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2011-08-22 5:35 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: emacs-devel
> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Cc: emacs-devel@gnu.org
> Date: Sun, 21 Aug 2011 22:17:44 +0200
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > I was quoting your ChangeLog entry:
> >
> > 2011-06-27 Lars Magne Ingebrigtsen <larsi@gnus.org>
> >
> > * mail/smtpmail.el (smtpmail-via-smtp): Bind coding-system-for-*
> > to binary to possibly avoid line encoding issues on Windows (among
> > other things).
> >
> > Btw, to _avoid_ line encoding issues on Windows, one should NOT bind
> > coding-system-for-read to `binary', because that binding brings the
> > CR-LF EOLs right into Emacs buffers.
>
> Yes, and that's what we want, since SMTP uses CRLF as the line ending.
> And now I remember what the problem was -- under Windows it would
> infloop looking for CRLF, and never getting it, since Emacs helpfully
> auto-translated CRLF to newline under Windows.
??? The same auto-translation happens on Unix as well. Are you saying
this problem happened only on Windows?
* Re: smtpmail and ~/.authinfo
2011-08-22 5:35 ` Eli Zaretskii
@ 2011-09-10 19:01 ` Lars Magne Ingebrigtsen
0 siblings, 0 replies; 45+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-09-10 19:01 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
>> Yes, and that's what we want, since SMTP uses CRLF as the line ending.
>> And now I remember what the problem was -- under Windows it would
>> infloop looking for CRLF, and never getting it, since Emacs helpfully
>> auto-translated CRLF to newline under Windows.
>
> ??? The same auto-translation happens on Unix as well. Are you saying
> this problem happened only on Windows?
Yup.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
* Re: smtpmail and ~/.authinfo
2011-08-21 19:59 ` Eli Zaretskii
2011-08-21 20:17 ` Lars Magne Ingebrigtsen
@ 2011-09-25 12:33 ` Ted Zlatanov
2011-09-25 12:48 ` Eli Zaretskii
1 sibling, 1 reply; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-25 12:33 UTC (permalink / raw)
To: emacs-devel
On Sun, 21 Aug 2011 22:59:03 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
EZ> Since [Lars says] netrc files can have non-ASCII characters, we
EZ> should bind coding-system-for-read to raw-text when calling
EZ> auth-source-search. I will take care of that.
Thank you. This was set to binary by historical accident as you
guessed.
The Emacs authinfo/netrc format, incidentally, is evolved from the
original because we use the Lisp reader to consume tokens. So for
instance we can handle quoted strings, which do not work in other
consumers of netrc-style files, notably libcurl and thus curl and Git.
Thus the Emacs format is backwards compatible but older netrc consumers
can't necessarily read our tokens, so I think it's OK that we go further
and explicitly allow Unicode characters through UTF-8. Would it make
sense, then, to explicitly use utf-8 or auto-guess for the encoding
instead of raw-text? There is no standard that says it should be UTF-8
but that would be the cleanest compatibility path to allow older
consumers still using ASCII to read our netrc files.
I don't know the Emacs reading/writing coding systems well so any
suggestions or ideas you have are most welcome.
Thanks
Ted
* Re: smtpmail and ~/.authinfo
2011-09-25 12:33 ` Ted Zlatanov
@ 2011-09-25 12:48 ` Eli Zaretskii
2011-09-25 13:21 ` Ted Zlatanov
0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-25 12:48 UTC (permalink / raw)
To: emacs-devel
> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Sun, 25 Sep 2011 07:33:20 -0500
> Reply-To: emacs-devel@gnu.org
>
> Thus the Emacs format is backwards compatible but older netrc consumers
> can't necessarily read our tokens, so I think it's OK that we go further
> and explicitly allow Unicode characters through UTF-8. Would it make
> sense, then, to explicitly use utf-8 or auto-guess for the encoding
> instead of raw-text?
Only if either (a) we encode the responses we send to the SMTP server
during handshake, or (b) SMTP servers support UTF-8 encoding in the
strings they expect to receive.
Lars said "encoding is local", which suggests that neither of the above
is true. raw-text leaves the byte stream unchanged, and only converts
the EOL, so a netrc file encoded in some locale-specific way has a
better chance with SMTP servers from the same locale.
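Roughly, and only as a Python analogy (the helper name is invented, and
Emacs's actual raw-text decoding handles more EOL cases than this): the
bytes of each field survive unchanged while the line ending is
normalized.

```python
def decode_like_raw_text(data: bytes) -> bytes:
    """Normalize CRLF to LF and leave every other byte alone."""
    return data.replace(b"\r\n", b"\n")

# Password bytes in some locale-specific encoding (here ISO-8859-15).
latin9_line = b"password " + "h\u00e9llo".encode("iso-8859-15") + b"\r\n"
decoded = decode_like_raw_text(latin9_line)
print(decoded)  # the 0xE9 byte is intact, the CR is gone
```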
IOW, to answer your question, someone who knows more than I do about
communications with SMTP servers should tell us how, if at all,
non-ASCII characters are supposed to be handled when communicating
with the server.
* Re: smtpmail and ~/.authinfo
2011-09-25 12:48 ` Eli Zaretskii
@ 2011-09-25 13:21 ` Ted Zlatanov
2011-09-25 17:08 ` Eli Zaretskii
2011-09-26 18:04 ` Lars Magne Ingebrigtsen
0 siblings, 2 replies; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-25 13:21 UTC (permalink / raw)
To: emacs-devel
On Sun, 25 Sep 2011 08:48:07 -0400 Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Sun, 25 Sep 2011 07:33:20 -0500
>> Reply-To: emacs-devel@gnu.org
>>
>> Thus the Emacs format is backwards compatible but older netrc consumers
>> can't necessarily read our tokens, so I think it's OK that we go further
>> and explicitly allow Unicode characters through UTF-8. Would it make
>> sense, then, to explicitly use utf-8 or auto-guess for the encoding
>> instead of raw-text?
EZ> Only if either (a) we encode the responses we send to the SMTP server
EZ> during handshake, or (b) SMTP servers support UTF-8 encoding in the
EZ> strings they expect to receive.
EZ> Lars said "encoding is local", which suggests that neither of the above
EZ> is true. raw-text leaves the byte stream unchanged, and only converts
EZ> the EOL, so a netrc file encoded in some locale-specific way has a
EZ> better chance with SMTP servers from the same locale.
EZ> IOW, to answer your question, someone who knows more than I do about
EZ> communications with SMTP servers should tell us how, if at all,
EZ> non-ASCII characters are supposed to be handled when communicating
EZ> with the server.
I don't think the SMTP interaction should be the critical factor
here. The SMTP library should deal with invalid (for SMTP) characters
on its side; many other libraries and protocols use `auth-source-search'
that can handle non-ASCII characters. In other words, let's not limit
the capabilities of `auth-source-search' just because one of the users
can't handle non-ASCII.
I think authinfo/netrc files should be portable and support Unicode in a
way that enables other (older or new!) software to use them too. IMHO
enforcing UTF-8 encoding is the best way to achieve that.
Ted
* Re: smtpmail and ~/.authinfo
2011-09-25 13:21 ` Ted Zlatanov
@ 2011-09-25 17:08 ` Eli Zaretskii
2011-09-26 14:41 ` Ted Zlatanov
2011-09-26 18:04 ` Lars Magne Ingebrigtsen
1 sibling, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-25 17:08 UTC (permalink / raw)
To: emacs-devel
> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Sun, 25 Sep 2011 08:21:36 -0500
>
> I don't think the SMTP interaction should be the critical factor
> here. The SMTP library should deal with invalid (for SMTP) characters
> on its side; many other libraries and protocols use `auth-source-search'
> that can handle non-ASCII characters. In other words, let's not limit
> the capabilities of `auth-source-search' just because one of the users
> can't handle non-ASCII.
>
> I think authinfo/netrc files should be portable and support Unicode in a
> way that enables other (older or new!) software to use them too. IMHO
> enforcing UTF-8 encoding is the best way to achieve that.
Fine with me, but then Someone™ should simultaneously modify smtpmail
(and perhaps also other users of authinfo) to DTRT when communicating
with the SMTP server, whatever "TRT" may mean in this case. Do one,
but not the other, and we will have a bug waiting to happen on our
hands.
* Re: smtpmail and ~/.authinfo
2011-09-25 17:08 ` Eli Zaretskii
@ 2011-09-26 14:41 ` Ted Zlatanov
2011-09-26 16:18 ` Eli Zaretskii
0 siblings, 1 reply; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-26 14:41 UTC (permalink / raw)
To: emacs-devel
On Sun, 25 Sep 2011 20:08:30 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Sun, 25 Sep 2011 08:21:36 -0500
>>
>> I don't think the SMTP interaction should be the critical factor
>> here. The SMTP library should deal with invalid (for SMTP) characters
>> on its side; many other libraries and protocols use `auth-source-search'
>> that can handle non-ASCII characters. In other words, let's not limit
>> the capabilities of `auth-source-search' just because one of the users
>> can't handle non-ASCII.
>>
>> I think authinfo/netrc files should be portable and support Unicode in a
>> way that enables other (older or new!) software to use them too. IMHO
>> enforcing UTF-8 encoding is the best way to achieve that.
EZ> Fine with me, but then Someone™ should simultaneously modify smtpmail
EZ> (and perhaps also other users of authinfo) to DTRT when communicating
EZ> with the SMTP server, whatever "TRT" may mean in this case. Do one,
EZ> but not the other, and we will have a bug waiting to happen on our
EZ> hands.
I have a pretty good handle on the `auth-source-search' users in the
Emacs space. More importantly, this makes no difference on the API
user's side. With raw-text they also get potentially unsafe characters,
right? We're just going to enforce UTF-8 as the non-ASCII encoding and
in the case of ASCII data UTF-8 is the same as unencoded.
Could you help me (or point me to the right examples) to:
- always create/write a file in UTF-8 on every platform
- opportunistically open the file in binary, raw-text, UTF-8, etc. on
every platform
I'll use your suggestions in auth-source.el's authinfo/netrc backend.
Thanks
Ted
* Re: smtpmail and ~/.authinfo
2011-09-26 14:41 ` Ted Zlatanov
@ 2011-09-26 16:18 ` Eli Zaretskii
2011-09-26 16:53 ` Ted Zlatanov
2011-09-26 17:00 ` Stefan Monnier
0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-26 16:18 UTC (permalink / raw)
To: emacs-devel
> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 26 Sep 2011 09:41:08 -0500
>
> With raw-text they also get potentially unsafe characters, right?
They get what they put in the file. If we assume that what's there is
acceptable to their SMTP server, it's "safe".
> Could you help me (or point me to the right examples) to:
>
> - always create/write a file in UTF-8 on every platform
You mean, force Emacs to encode .authinfo in UTF-8 when creating it?
I guess that's the job for file-coding-system-alist.
> - opportunistically open the file in binary, raw-text, UTF-8, etc. on
> every platform
Sorry, I don't understand what you'd like to do. Please elaborate,
and I will gladly try to help.
* Re: smtpmail and ~/.authinfo
2011-09-26 16:18 ` Eli Zaretskii
@ 2011-09-26 16:53 ` Ted Zlatanov
2011-09-26 17:15 ` Eli Zaretskii
2011-09-26 17:00 ` Stefan Monnier
1 sibling, 1 reply; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-26 16:53 UTC (permalink / raw)
To: emacs-devel
On Mon, 26 Sep 2011 19:18:39 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Mon, 26 Sep 2011 09:41:08 -0500
>>
>> With raw-text they also get potentially unsafe characters, right?
EZ> They get what they put in the file. If we assume that what's there is
EZ> acceptable to their SMTP server, it's "safe".
Exactly. So the UTF-8 encoding won't change anything, it will only make
it easier for the netrc/authinfo file to be shared :)
>> Could you help me (or point me to the right examples) to:
>>
>> - always create/write a file in UTF-8 on every platform
EZ> You mean, force Emacs to encode .authinfo in UTF-8 when creating it?
EZ> I guess that's the job for file-coding-system-alist.
So I would just override that when writing the netrc/authinfo file. I
can't imagine any value in letting the user override the UTF-8 encoding,
can you?
>> - opportunistically open the file in binary, raw-text, UTF-8, etc. on
>> every platform
EZ> Sorry, I don't understand what you'd like to do. Please elaborate,
EZ> and I will gladly try to help.
There must be netrc/authinfo files written in binary encoding because
that was the default. I'd like to open them, but also open UTF-8
encoded netrc/authinfo files, and also accept raw-text or any other
reasonably guessed encoding. For UTF-8 there are heuristics but Emacs
has them built-in, right? So I don't have to write special code to
guess?
The alternative is to try as utf-8, then try binary, then give up. But
that's less friendly to the user I think.
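The fallback order described above might look like this in Python (a
sketch only; the function name is made up, and Latin-1 stands in for
"binary" because it maps every byte to exactly one code point):

```python
from typing import Tuple

def decode_with_fallback(data: bytes) -> Tuple[str, str]:
    """Try strict UTF-8 first, then fall back to a byte-preserving decode."""
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # latin-1 never fails and round-trips every byte value.
        return data.decode("latin-1"), "latin-1"

print(decode_with_fallback("h\u00e9llo".encode("utf-8")))        # decoded as UTF-8
print(decode_with_fallback("h\u00e9llo".encode("iso-8859-15")))  # falls back
```

Note that the fallback cannot tell which legacy encoding was meant; it
only guarantees that the original bytes can be recovered.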
Ted
* Re: smtpmail and ~/.authinfo
2011-09-26 16:18 ` Eli Zaretskii
2011-09-26 16:53 ` Ted Zlatanov
@ 2011-09-26 17:00 ` Stefan Monnier
2011-09-26 17:28 ` Ted Zlatanov
1 sibling, 1 reply; 45+ messages in thread
From: Stefan Monnier @ 2011-09-26 17:00 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> You mean, force Emacs to encode .authinfo in UTF-8 when creating it?
> I guess that's the job for file-coding-system-alist.
Or adding a -*- coding: utf-8 -*- cookie to the file.
Stefan
* Re: smtpmail and ~/.authinfo
2011-09-26 16:53 ` Ted Zlatanov
@ 2011-09-26 17:15 ` Eli Zaretskii
2011-09-26 17:23 ` Eli Zaretskii
2011-09-26 17:31 ` Ted Zlatanov
0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-26 17:15 UTC (permalink / raw)
To: emacs-devel
> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 26 Sep 2011 11:53:18 -0500
>
> >> Could you help me (or point me to the right examples) to:
> >>
> >> - always create/write a file in UTF-8 on every platform
>
> EZ> You mean, force Emacs to encode .authinfo in UTF-8 when creating it?
> EZ> I guess that's the job for file-coding-system-alist.
>
> So I would just override that when writing the netrc/authinfo file. I
> can't imagine any value in letting the user override the UTF-8 encoding,
> can you?
No, I cannot.
> >> - opportunistically open the file in binary, raw-text, UTF-8, etc. on
> >> every platform
>
> EZ> Sorry, I don't understand what you'd like to do. Please elaborate,
> EZ> and I will gladly try to help.
>
> There must be netrc/authinfo files written in binary encoding because
> that was the default. I'd like to open them, but also open UTF-8
> encoded netrc/authinfo files, and also accept raw-text or any other
> reasonably guessed encoding. For UTF-8 there are heuristics but Emacs
> has them built-in, right? So I don't have to write special code to
> guess?
>
> The alternative is to try as utf-8, then try binary, then give up. But
> that's less friendly to the user I think.
Just let Emacs do its usual guesswork.
* Re: smtpmail and ~/.authinfo
2011-09-26 17:15 ` Eli Zaretskii
@ 2011-09-26 17:23 ` Eli Zaretskii
2011-09-26 17:31 ` Ted Zlatanov
1 sibling, 0 replies; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-26 17:23 UTC (permalink / raw)
To: emacs-devel
> Date: Mon, 26 Sep 2011 20:15:52 +0300
> From: Eli Zaretskii <eliz@gnu.org>
>
> > From: Ted Zlatanov <tzz@lifelogs.com>
> > Date: Mon, 26 Sep 2011 11:53:18 -0500
> >
> > >> Could you help me (or point me to the right examples) to:
> > >>
> > >> - always create/write a file in UTF-8 on every platform
> >
> > EZ> You mean, force Emacs to encode .authinfo in UTF-8 when creating it?
> > EZ> I guess that's the job for file-coding-system-alist.
> >
> > So I would just override that when writing the netrc/authinfo file.
It may be worthwhile to add this permanently to the alist we maintain
in Emacs.
> > I can't imagine any value in letting the user override the UTF-8
> > encoding, can you?
>
> No, I cannot.
That said, users can always override if they want with "C-x RET c".
* Re: smtpmail and ~/.authinfo
2011-09-26 17:00 ` Stefan Monnier
@ 2011-09-26 17:28 ` Ted Zlatanov
2011-09-26 21:27 ` Stefan Monnier
0 siblings, 1 reply; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-26 17:28 UTC (permalink / raw)
To: emacs-devel
On Mon, 26 Sep 2011 13:00:47 -0400 Stefan Monnier <monnier@IRO.UMontreal.CA> wrote:
>> You mean, force Emacs to encode .authinfo in UTF-8 when creating it?
>> I guess that's the job for file-coding-system-alist.
SM> Or adding a -*- coding: utf-8 -*- cookie to the file.
That's not standard, so a netrc/authinfo file created by someone else
would not have it and we're back to guessing. Better to guess on read,
enforce UTF-8 on write IMO.
Ted
* Re: smtpmail and ~/.authinfo
2011-09-26 17:15 ` Eli Zaretskii
2011-09-26 17:23 ` Eli Zaretskii
@ 2011-09-26 17:31 ` Ted Zlatanov
1 sibling, 0 replies; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-26 17:31 UTC (permalink / raw)
To: emacs-devel
On Mon, 26 Sep 2011 20:15:52 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Mon, 26 Sep 2011 11:53:18 -0500
>>
>> >> Could you help me (or point me to the right examples) to:
>> >>
>> >> - always create/write a file in UTF-8 on every platform
>>
EZ> You mean, force Emacs to encode .authinfo in UTF-8 when creating it?
EZ> I guess that's the job for file-coding-system-alist.
>>
>> So I would just override that when writing the netrc/authinfo file. I
>> can't imagine any value in letting the user override the UTF-8 encoding,
>> can you?
EZ> No, I cannot.
OK, I can add that. Stefan, would you consider this a bug fix (since
the previous writes were broken)? If not I'll hold off until the
pretest is done.
>> >> - opportunistically open the file in binary, raw-text, UTF-8, etc. on
>> >> every platform
...
EZ> Just let Emacs do its usual guesswork.
Great, thanks for the advice.
Ted
* Re: smtpmail and ~/.authinfo
2011-09-25 13:21 ` Ted Zlatanov
2011-09-25 17:08 ` Eli Zaretskii
@ 2011-09-26 18:04 ` Lars Magne Ingebrigtsen
2011-09-26 19:22 ` Ted Zlatanov
1 sibling, 1 reply; 45+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-09-26 18:04 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov <tzz@lifelogs.com> writes:
> I think authinfo/netrc files should be portable and support Unicode in a
> way that enables other (older or new!) software to use them too. IMHO
> enforcing UTF-8 encoding is the best way to achieve that.
That's not realistic, I think.
Look, these protocols (SMTP, NNTP, pop3, etc) are really old. Most of
them were created in a "just send ASCII" world, which then morphed into
a "just send 8bit, just make sure you don't send any null bytes" world,
which then again sort of morphed into a world that's somewhat cognisant
of charsets server-side.
But for NNTP basic auth, for instance, it's perfectly valid to use the
five-byte sequence representing "héllo" in iso-8859-15 as the password,
if that's what the user has set up, and it's what the NNTP server has
stored. (And the same goes for pop3 and SMTP. (For IMAP the situation
is different -- there they've actually defined the charset to use, and
it's a tweak on utf7.))
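For a concrete picture (a Python sketch; the password is the one from
the example above): the same five characters are five bytes in
ISO-8859-15 but six in UTF-8, so a server that stored the five-byte form
will not match a client that sends the UTF-8 form.

```python
password = "h\u00e9llo"  # "héllo"

latin9 = password.encode("iso-8859-15")
utf8 = password.encode("utf-8")

print(len(latin9), latin9)  # 5 bytes, with a single 0xE9 for the accented letter
print(len(utf8), utf8)      # 6 bytes, with 0xC3 0xA9 for the accented letter
print(latin9 == utf8)       # False: the two byte strings differ
```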
So there isn't any wiggle room here. The user has to be able to store a
random sequence of bytes into the .authinfo file to be able to contact
their servers -- if they have been careless enough to create a non-ASCII
user name or password.
Because using non-ASCII credentials is so fraught with problems, almost
nobody does it, which is why we don't get many (or any, really) bug
reports about this.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
* Re: smtpmail and ~/.authinfo
2011-09-26 18:04 ` Lars Magne Ingebrigtsen
@ 2011-09-26 19:22 ` Ted Zlatanov
2011-09-26 19:30 ` Lars Magne Ingebrigtsen
` (2 more replies)
0 siblings, 3 replies; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-26 19:22 UTC (permalink / raw)
To: emacs-devel
On Mon, 26 Sep 2011 20:04:42 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
LMI> Ted Zlatanov <tzz@lifelogs.com> writes:
>> I think authinfo/netrc files should be portable and support Unicode in a
>> way that enables other (older or new!) software to use them too. IMHO
>> enforcing UTF-8 encoding is the best way to achieve that.
LMI> That's not realistic, I think.
...
LMI> So there isn't any wiggle room here. The user has to be able to store a
LMI> random sequence of bytes into the .authinfo file to be able to contact
LMI> their servers -- if they have been careless enough to create a non-ASCII
LMI> user name or password.
I agree 100%. I'm saying we should save the netrc/authinfo file in the
UTF-8 coding system instead of raw-text so Unicode characters in there
are usable by other programs too. Forget the `auth-source-search'
callers, they won't know or care. There will be no difference to their
usage or the data they get.
I believe random bytes can be encoded just fine by UTF-8. If they are
read by a program that doesn't know UTF-8 that's a problem, but IMO we
can live with it and it's entirely theoretical.
Ted
* Re: smtpmail and ~/.authinfo
2011-09-26 19:22 ` Ted Zlatanov
@ 2011-09-26 19:30 ` Lars Magne Ingebrigtsen
2011-09-26 19:48 ` Ted Zlatanov
2011-09-26 19:34 ` Eli Zaretskii
2011-09-27 13:54 ` Jason Rumney
2 siblings, 1 reply; 45+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-09-26 19:30 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov <tzz@lifelogs.com> writes:
> I agree 100%. I'm saying we should save the netrc/authinfo file in the
> UTF-8 coding system instead of raw-text so Unicode characters in there
> are usable by other programs too.
No, if the sequence "héllo" is a five-byte sequence, it should be saved
as such. Otherwise it's not usable to other programs.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 19:22 ` Ted Zlatanov
2011-09-26 19:30 ` Lars Magne Ingebrigtsen
@ 2011-09-26 19:34 ` Eli Zaretskii
2011-09-26 19:40 ` Ted Zlatanov
2011-09-27 13:54 ` Jason Rumney
2 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-26 19:34 UTC (permalink / raw)
To: emacs-devel
> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 26 Sep 2011 14:22:36 -0500
>
> I believe random bytes can be encoded just fine by UTF-8.
No, they cannot. A given sequence of "random bytes" can be a valid
UTF-8 encoding of some character with a sufficiently large code point.
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 19:34 ` Eli Zaretskii
@ 2011-09-26 19:40 ` Ted Zlatanov
2011-09-27 2:51 ` Eli Zaretskii
0 siblings, 1 reply; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-26 19:40 UTC (permalink / raw)
To: emacs-devel
On Mon, 26 Sep 2011 22:34:21 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Mon, 26 Sep 2011 14:22:36 -0500
>>
>> I believe random bytes can be encoded just fine by UTF-8.
EZ> No, they cannot. A given sequence of "random bytes" can be a valid
EZ> UTF-8 encoding of some character with a sufficiently large code point.
But that's not what I said :) They can be *encoded* to a UTF-8 sequence
that can eventually be decoded back from UTF-8.
Ted
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 19:30 ` Lars Magne Ingebrigtsen
@ 2011-09-26 19:48 ` Ted Zlatanov
2011-09-26 21:31 ` Stefan Monnier
0 siblings, 1 reply; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-26 19:48 UTC (permalink / raw)
To: emacs-devel
On Mon, 26 Sep 2011 21:30:16 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
LMI> Ted Zlatanov <tzz@lifelogs.com> writes:
>> I agree 100%. I'm saying we should save the netrc/authinfo file in the
>> UTF-8 coding system instead of raw-text so Unicode characters in there
>> are usable by other programs too.
LMI> No, if the sequence "héllo" is a five-byte sequence, it should be saved
LMI> as such. Otherwise it's not usable to other programs.
That's exactly my point. Right now we save as raw-text, which is not
usable to other programs in the long term. In UTF-8 it would be saved
"as such" because IIRC all codepoints under 255 don't need to be encoded
(and your string goes up to 233).
Ted
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 17:28 ` Ted Zlatanov
@ 2011-09-26 21:27 ` Stefan Monnier
0 siblings, 0 replies; 45+ messages in thread
From: Stefan Monnier @ 2011-09-26 21:27 UTC (permalink / raw)
To: emacs-devel
>>> You mean, force Emacs to encode .authinfo in UTF-8 when creating it?
>>> I guess that's the job for file-coding-system-alist.
SM> Or adding a -*- coding: utf-8 -*- cookie to the file.
> That's not standard, so a netrc/authinfo file created by someone else
> would not have it and we're back to guessing. Better to guess on read,
> enforce UTF-8 on write IMO.
For the "read, modify, write", if the guess is wrong, the write will not
magically be fixed by using utf-8.
What do the files that we generated until now contain? utf-8?
something else?
Stefan
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 19:48 ` Ted Zlatanov
@ 2011-09-26 21:31 ` Stefan Monnier
2011-09-26 21:43 ` Lars Magne Ingebrigtsen
` (2 more replies)
0 siblings, 3 replies; 45+ messages in thread
From: Stefan Monnier @ 2011-09-26 21:31 UTC (permalink / raw)
To: emacs-devel
> That's exactly my point. Right now we save as raw-text, which is not
> usable to other programs in the long term. In UTF-8 it would be saved
> "as such" because IIRC all codepoints under 255 don't need to be encoded
> (and your string goes up to 233).
No, chars from the latin-1 set have identical *Unicode* code points
(i.e. between 128 and 255), but their encoding into utf-8 occupies
2 bytes.
As for saving random bytes, you can't either, at least not in a way that
is supported by all utf-8 implementations.
I think raw-text is more likely to work, based on what Lars says.
Stefan
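To make the point about Latin-1 characters concrete, here is a minimal Python sketch (Python is used only for illustration; nothing in it is Emacs code). It shows that "é" keeps the code point 233 in Unicode, yet takes one byte in Latin-1 and two bytes in UTF-8, so a five-character "héllo" is five bytes in one encoding and six in the other.

```python
# "héllo" written with an escape so the source itself is pure ASCII.
text = "h\u00e9llo"

latin1_bytes = text.encode("latin-1")  # one byte per character
utf8_bytes = text.encode("utf-8")      # U+00E9 becomes the two bytes C3 A9

print(ord("\u00e9"))          # same number in Latin-1 and in Unicode
print(latin1_bytes.hex(" "))  # byte values as stored in a Latin-1 file
print(utf8_bytes.hex(" "))    # byte values as stored in a UTF-8 file
print(len(latin1_bytes), len(utf8_bytes))
```

A program that expects the five-byte form will therefore not find it in a file that was saved as UTF-8.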
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 21:31 ` Stefan Monnier
@ 2011-09-26 21:43 ` Lars Magne Ingebrigtsen
2011-09-26 21:54 ` Ted Zlatanov
2011-09-27 4:07 ` Stephen J. Turnbull
2011-09-26 21:55 ` Ted Zlatanov
2011-09-27 2:57 ` Eli Zaretskii
2 siblings, 2 replies; 45+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-09-26 21:43 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
> I think raw-text is more likely to work, based on what Lars says.
On the other hand, if auth-source prompts for a password, and you type
in something non-ASCII, the result will probably be something utf8-ey, I
think? Which may or may not work on the server, but I don't really see
what to do about it. Except asking the user "You've typed in something
non-ASCII. What bit pattern are you imagining Emacs will actually send
to the server?" :-)
Or what charset to use. Probably slightly less confusing, but probably
not a whole lot more.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 21:43 ` Lars Magne Ingebrigtsen
@ 2011-09-26 21:54 ` Ted Zlatanov
2011-09-27 4:07 ` Stephen J. Turnbull
1 sibling, 0 replies; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-26 21:54 UTC (permalink / raw)
To: emacs-devel
On Mon, 26 Sep 2011 23:43:08 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
LMI> Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
>> I think raw-text is more likely to work, based on what Lars says.
LMI> On the other hand, if auth-source prompts for a password, and you type
LMI> in something non-ASCII, the result will probably be something utf8-ey, I
LMI> think? Which may or may not work on the server, but I don't really see
LMI> what to do about it. Except asking the user "You've typed in something
LMI> non-ASCII. What bit pattern are you imagining Emacs will actually send
LMI> to the server?" :-)
LMI> Or what charset to use. Probably slightly less confusing, but probably
LMI> not a whole lot more.
OK, let's start from the beginning. We should support Unicode
characters for secrets, yes? I think each API user should limit that
further, but there's no reason for auth-source to block some data
arbitrarily. So if I can't encode the secrets with UTF-8, what should I
use that gives me good compatibility with other GNU and other libraries
and programs like libcurl for instance?
Ted
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 21:31 ` Stefan Monnier
2011-09-26 21:43 ` Lars Magne Ingebrigtsen
@ 2011-09-26 21:55 ` Ted Zlatanov
2011-09-27 2:57 ` Eli Zaretskii
2 siblings, 0 replies; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-26 21:55 UTC (permalink / raw)
To: emacs-devel
On Mon, 26 Sep 2011 17:31:52 -0400 Stefan Monnier <monnier@IRO.UMontreal.CA> wrote:
>> That's exactly my point. Right now we save as raw-text, which is not
>> usable to other programs in the long term. In UTF-8 it would be saved
>> "as such" because IIRC all codepoints under 255 don't need to be encoded
>> (and your string goes up to 233).
SM> No, chars from the latin-1 set have identical *Unicode* code points
SM> (i.e. between 128 and 255), but their encoding into utf-8 occupies
SM> 2 bytes.
SM> As for saving random bytes, you can't either, at least not in a way that
SM> is supported by all utf-8 implementations.
SM> I think raw-text is more likely to work, based on what Lars says.
Thanks for explaining, my recollection of the high-bit extended ASCII
encoding was wrong. I hope we find a way that works for everyone.
Ted
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 19:40 ` Ted Zlatanov
@ 2011-09-27 2:51 ` Eli Zaretskii
0 siblings, 0 replies; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-27 2:51 UTC (permalink / raw)
To: emacs-devel
> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 26 Sep 2011 14:40:07 -0500
>
> On Mon, 26 Sep 2011 22:34:21 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>
> >> From: Ted Zlatanov <tzz@lifelogs.com>
> >> Date: Mon, 26 Sep 2011 14:22:36 -0500
> >>
> >> I believe random bytes can be encoded just fine by UTF-8.
>
> EZ> No, they cannot. A given sequence of "random bytes" can be a valid
> EZ> UTF-8 encoding of some character with a sufficiently large code point.
>
> But that's not what I said :) They can be *encoded* to a UTF-8 sequence
> that can eventually be decoded back from UTF-8.
No. Some byte sequences are invalid UTF-8, and are decoded into a
single special character. IOW, what you suggest is in general lossy
conversion.
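The lossy behaviour can be seen in a small Python sketch (an illustration only; Emacs has its own handling of raw bytes, which is not modelled here). The byte string below is a made-up example. A strict UTF-8 decoder rejects it, and a lenient one replaces each invalid byte with U+FFFD, after which the original bytes cannot be recovered.

```python
# Hypothetical credential bytes: 0xE9 and 0xFF are not valid UTF-8 here.
raw = b"pa\xe9s\xffword"

try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD for each invalid byte ...
decoded = raw.decode("utf-8", errors="replace")
# ... so encoding the result again does not give back the original bytes.
round_tripped = decoded.encode("utf-8")

print(strict_ok)
print(decoded.count("\ufffd"))
print(round_tripped == raw)
```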
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 21:31 ` Stefan Monnier
2011-09-26 21:43 ` Lars Magne Ingebrigtsen
2011-09-26 21:55 ` Ted Zlatanov
@ 2011-09-27 2:57 ` Eli Zaretskii
2011-09-27 10:38 ` Ted Zlatanov
2 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-27 2:57 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Date: Mon, 26 Sep 2011 17:31:52 -0400
>
> I think raw-text is more likely to work, based on what Lars says.
That was also my conclusion.
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 21:43 ` Lars Magne Ingebrigtsen
2011-09-26 21:54 ` Ted Zlatanov
@ 2011-09-27 4:07 ` Stephen J. Turnbull
2011-09-27 6:11 ` Lars Magne Ingebrigtsen
2011-09-27 10:29 ` Ted Zlatanov
1 sibling, 2 replies; 45+ messages in thread
From: Stephen J. Turnbull @ 2011-09-27 4:07 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: Stefan Monnier, emacs-devel
Lars Magne Ingebrigtsen writes:
> On the other hand, if auth-source prompts for a password, and you type
> in something non-ASCII, the result will probably be something utf8-ey, I
> think?
No. 1.3 billion Chinese are very likely to use GB2312, not to mention
130 million Japanese who use Shift JIS. These are not UTF-8-ey in
several ways, and Shift JIS even abuses octets in the ASCII range for
use in multibyte characters.
If you have *no* password and the user asks to store one, yes, use
UTF-8, and warn the user that Emacs has chosen to use the standard
Unicode encoding "UTF-8", but other applications (especially on
Windows) may choose something else. In which case the user will be
unable to log in from those applications.
If you already have a password, it should be read verbatim (binary, or
raw-text should do given the line-oriented nature of these
configuration files) and treated as a binary blob.
> Or what charset to use. Probably slightly less confusing, but probably
> not a whole lot more.
Mule should have a language-to-list-of-charset alist around somewhere.
Use that to generate a menu of suggestions. Ask Ken'ichi about how to
access it.
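The remark about Shift JIS reusing octets from the ASCII range can be checked with a short Python sketch (illustration only). The character chosen, KATAKANA LETTER SO, is just one well-known example; its second Shift JIS byte is 0x5C, which is the ASCII backslash. In UTF-8, every byte of a multibyte character is 0x80 or above.

```python
ch = "\u30bd"  # KATAKANA LETTER SO

sjis = ch.encode("shift_jis")
utf8 = ch.encode("utf-8")

# Does any byte of the encoded character fall in the ASCII range?
sjis_has_ascii_octet = any(b < 0x80 for b in sjis)
utf8_has_ascii_octet = any(b < 0x80 for b in utf8)

print(sjis.hex(" "), sjis_has_ascii_octet)
print(utf8.hex(" "), utf8_has_ascii_octet)
```

This is why a parser that scans bytes for ASCII delimiters or escape characters can misread Shift JIS text while being safe on UTF-8.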
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-27 4:07 ` Stephen J. Turnbull
@ 2011-09-27 6:11 ` Lars Magne Ingebrigtsen
2011-09-27 10:29 ` Ted Zlatanov
1 sibling, 0 replies; 45+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-09-27 6:11 UTC (permalink / raw)
To: Stephen J. Turnbull; +Cc: Stefan Monnier, emacs-devel
"Stephen J. Turnbull" <stephen@xemacs.org> writes:
> > On the other hand, if auth-source prompts for a password, and you type
> > in something non-ASCII, the result will probably be something utf8-ey, I
> > think?
>
> No. 1.3 billion Chinese are very likely to use GB2312, not to mention
> 130 million Japanese who use Shift JIS. These are not UTF-8-ey in
> several ways, and Shift JIS even abuses octets in the ASCII range for
> use in multibyte characters.
I meant: If you type something into auth-source today that is non-ASCII,
what you'll get in the .authinfo file is probably utf-8.
Which, as you point out, may not be what the user wants.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-27 4:07 ` Stephen J. Turnbull
2011-09-27 6:11 ` Lars Magne Ingebrigtsen
@ 2011-09-27 10:29 ` Ted Zlatanov
2011-09-27 12:33 ` Stephen J. Turnbull
1 sibling, 1 reply; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-27 10:29 UTC (permalink / raw)
To: emacs-devel
On Tue, 27 Sep 2011 13:07:42 +0900 "Stephen J. Turnbull" <stephen@xemacs.org> wrote:
SJT> Lars Magne Ingebrigtsen writes:
>> On the other hand, if auth-source prompts for a password, and you type
>> in something non-ASCII, the result will probably be something utf8-ey, I
>> think?
SJT> No. 1.3 billion Chinese are very likely to use GB2312, not to mention
SJT> 130 million Japanese who use Shift JIS. These are not UTF-8-ey in
SJT> several ways, and Shift JIS even abuses octets in the ASCII range for
SJT> use in multibyte characters.
UTF-8 is an encoding; you're talking about charsets. Can you explain
more precisely what you mean by "not UTF-8-ey in several ways"?
SJT> If you have *no* password and the user asks to store one, yes, use
SJT> UTF-8, and warn the user that Emacs has chosen to use the standard
SJT> Unicode encoding "UTF-8", but other applications (especially on
SJT> Windows) may choose something else. In which case the user will be
SJT> unable to log in from those applications.
Would it be enough to let the user override that coding system choice
through a defcustom? For all the use cases I have seen, UTF-8 is
enough, so I'd rather use it by default.
SJT> If you already have a password, it should be read verbatim (binary, or
SJT> raw-text should do given the line-oriented nature of these
SJT> configuration files) and treated as a binary blob.
That's not helpful when you need to encode it for IMAP, for instance.
You have to know the actual characters that make up the binary blob.
Ted
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-27 2:57 ` Eli Zaretskii
@ 2011-09-27 10:38 ` Ted Zlatanov
2011-09-27 11:31 ` Eli Zaretskii
2011-09-27 14:02 ` Jason Rumney
0 siblings, 2 replies; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-27 10:38 UTC (permalink / raw)
To: emacs-devel
On Tue, 27 Sep 2011 05:57:28 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
>> Date: Mon, 26 Sep 2011 17:31:52 -0400
>>
>> I think raw-text is more likely to work, based on what Lars says.
EZ> That was also my conclusion.
I think we should make an effort to make the netrc/authinfo file
shareable with other programs, or else what's the point of using such a
file? We may as well `print' straight to a file. raw-text encoding is,
to me, saying "we give up."
I thought today, on most popular platforms, UTF-8 was the safest choice
if you want to share data that covers UCS. I think the non-UCS data can
be covered by letting the user override the encoding in a defcustom.
The other objection to UTF-8 was that some binary sequences can't be
encoded by it. Remember, we're talking about passwords and other
legible tokens, not binary files. The likelihood of such a sequence in
a token is too small to matter IMO. So I still think raw-text is the
worse choice even though it's easier to make it.
Ted
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-27 10:38 ` Ted Zlatanov
@ 2011-09-27 11:31 ` Eli Zaretskii
2011-09-27 12:55 ` Stefan Monnier
2011-09-27 14:02 ` Jason Rumney
1 sibling, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-27 11:31 UTC (permalink / raw)
To: emacs-devel
> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Tue, 27 Sep 2011 05:38:28 -0500
> Reply-To: emacs-devel@gnu.org
>
> On Tue, 27 Sep 2011 05:57:28 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>
> >> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> >> Date: Mon, 26 Sep 2011 17:31:52 -0400
> >>
> >> I think raw-text is more likely to work, based on what Lars says.
>
> EZ> That was also my conclusion.
>
> I think we should make an effort to make the netrc/authinfo file
> shareable with other programs
I agree. But to do that, it sounds like we are lacking some knowledge
about the intended use of these files, especially when they are used
in conjunction with external services. If someone can prepare an
exhaustive list of such uses, or at least those we want to support,
and tell what encodings can be used with each of them, we can take it
from there the way you want it. But if such details are not known at
the moment, we may actually break some legitimate uses, which would be
a pity.
> raw-text encoding is, to me, saying "we give up."
Give up knowing exactly how the stuff is encoded, yes. There's
nothing wrong with that; after all, we do that when we edit binary
files, don't we?
> I thought today, on most popular platforms, UTF-8 was the safest choice
> if you want to share data that covers UCS.
UCS and UTF-8 are not the same thing. Windows uses UCS (well,
actually UTF-16) internally, but UTF-8 is seldom seen there, e.g. you
will never see a file name encoded in UTF-8 on a Windows filesystem,
except as an accident.
Stephen gave you examples with CJK locales, where UTF-8 might not be
as popular as you'd like it, even on Posix systems.
And even in Europe there are a few locales which prefer single-byte
encoding of some kind, AFAIK.
So I think you are being overly optimistic in asserting that UTF-8 is
"the safest choice".
> The other objection to UTF-8 was that some binary sequences can't be
> encoded by it. Remember, we're talking about passwords and other
> legible tokens, not binary files. The likelihood of such a sequence in
> a token is too small to matter IMO. So I still think raw-text is the
> worse choice even though it's easier to make it.
You read "binary" incorrectly. For the purposes of this discussion,
"binary" == "arbitrary byte values". Not every 8-bit byte is valid as
part of a UTF-8 sequence. If the authinfo file includes such bytes,
it cannot be encoded in UTF-8, except if we use the Emacs extensions,
which will be only useful for Emacs. Such bytes can easily come from
some single-byte encoding, for example. To DTRT with such bytes, we
_must_ know its precise encoding; then we could _recode_ it in UTF-8,
and encode back when we send the string to external services.
Once again, blindly assuming that UTF-8 is "safe" is not good enough,
IMO. We need more details, if someone can provide them.
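The recoding step described above can be sketched in Python as follows (illustration only). The stored bytes and the choice of Latin-1 as the "known" encoding are assumptions made for the example; the point is that the round trip only works when the original encoding is actually known.

```python
# Bytes as they might sit in the file; the encoding is assumed to be
# Latin-1 for this example (in practice someone has to tell us).
stored = b"h\xe9llo"
known_encoding = "latin-1"

# Recode to UTF-8 for internal use or display ...
text = stored.decode(known_encoding)
as_utf8 = text.encode("utf-8")

# ... and encode back to the original bytes before sending to a service.
wire = as_utf8.decode("utf-8").encode(known_encoding)

# Guessing a different single-byte encoding yields different text.
wrong_guess = stored.decode("cp437")

print(wire == stored)
print(wrong_guess == text)
```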
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-27 10:29 ` Ted Zlatanov
@ 2011-09-27 12:33 ` Stephen J. Turnbull
2011-09-27 20:15 ` Ted Zlatanov
0 siblings, 1 reply; 45+ messages in thread
From: Stephen J. Turnbull @ 2011-09-27 12:33 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov writes:
> UTF-8 is an encoding; you're talking about charsets.
No, I'm talking about encodings. I'm not entirely sure about GB 2312,
but I believe it has a defined preferred encoding (the one registered
as the MIME charset GB2312 -- MIME charsets are all encodings, they
specify what *bytes* will appear in the stream, not just an abstract
character to abstract integer mapping). Shift JIS is most definitely
an encoding for the JIS character set (although which JIS character
set is poorly defined).
> Can you explain more precisely what you mean by "not UTF-8-ey in
> several ways"?
In the case of Shift JIS, I already did: octets in the ASCII range are
used in multibyte characters. That *never* happens in valid UTF-8.
The distinctions for GB2312 are more nebulous. But Lars meant
something different, so it's not relevant.
> Would it be enough to let the user override that coding system choice
> through a defcustom?
No. That requires a huge amount of user sophistication, and is too
global; different applications might very well use different coding
systems for non-ASCII characters.
> For all the use cases I have seen, UTF-8 is enough, so I'd rather
> use it by default.
Isn't that what I said?
> SJT> If you already have a password, it should be read verbatim (binary, or
> SJT> raw-text should do given the line-oriented nature of these
> SJT> configuration files) and treated as a binary blob.
>
> That's not helpful when you need to encode it for IMAP, for instance.
> You have to know the actual characters that make up the binary blob.
Since when? I haven't paid much attention to IMAP since RFC 3501 was
an internet-draft, but in that document there are a few commands that
accept a CHARSET parameter. LOGIN and AUTHENTICATE aren't among them.
So you're just passing along binary blobs, which in the case of LOGIN
will often look like somebody's birthday or a child's name, but that's
just an unfortunate accident.
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-27 11:31 ` Eli Zaretskii
@ 2011-09-27 12:55 ` Stefan Monnier
0 siblings, 0 replies; 45+ messages in thread
From: Stefan Monnier @ 2011-09-27 12:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Here's my take on it:
.authinfo contains various things and is used in different ways, and
there isn't a single answer that covers all cases:
- each kind of field (hostname, username, password) may require
a different encoding/decoding.
- when reading a password from the file, it should be read using
raw-text (i.e. as a "unibyte string").
In other words, the password should not be decoded into chars but left
as a sequence of bytes that will be sent as-is to whoever needs it.
- when a password is typed by the user it'll be a sequence of chars, so
we'll have to convert it into a sequence of bytes. The best coding
system to use for that purpose is probably going to be
locale-coding-system. That sequence of bytes is then sent to whoever
needs it and saved as-is (using raw-text) into the .authinfo file.
- i.e. authinfo should be read as a unibyte file.
- i.e. when reading other fields than passwords, we'll have to
explicitly decode them using the coding system we want to use for
those fields.
- similarly, we'll have to encode those other fields manually when
writing them into .authinfo.
Of course, another option is to just read&write authinfo without
thinking about it, so Emacs will usually pick locale-coding-system for
it and it'll work just fine in 99.9% of the cases.
Stefan
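The per-field scheme listed above could look roughly like this in Python (a sketch under stated assumptions, not how auth-source is implemented). The function names, the sample line, and the default field encoding are all hypothetical, and the parser ignores quoting, comments, and multi-line entries. In a real program the encoding for a typed password might come from the locale, for example via `locale.getpreferredencoding(False)`.

```python
def parse_authinfo_line(line: bytes, field_encoding: str = "utf-8") -> dict:
    """Split one 'key value key value ...' line that was read in binary mode.

    The password is kept as raw bytes; other fields are decoded explicitly.
    """
    tokens = line.split()  # splits on ASCII whitespace only
    entry = {}
    for key, value in zip(tokens[0::2], tokens[1::2]):
        name = key.decode("ascii")
        if name == "password":
            entry[name] = value  # opaque bytes, passed on as-is
        else:
            entry[name] = value.decode(field_encoding)
    return entry


def encode_typed_password(typed: str, encoding: str) -> bytes:
    """Turn a password typed as characters into the bytes to send and store."""
    return typed.encode(encoding)


# Hypothetical line: UTF-8 login, password bytes in some other encoding.
line = b"machine smtp.example.org login jos\xc3\xa9 password s\xe9cret\n"
entry = parse_authinfo_line(line)
print(entry["login"], entry["password"])
print(encode_typed_password("s\u00e9cret", "latin-1"))
```

Keeping the password as bytes means no guess about its encoding is ever needed on the read path; a guess is only needed once, when the user types a new password.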
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-26 19:22 ` Ted Zlatanov
2011-09-26 19:30 ` Lars Magne Ingebrigtsen
2011-09-26 19:34 ` Eli Zaretskii
@ 2011-09-27 13:54 ` Jason Rumney
2 siblings, 0 replies; 45+ messages in thread
From: Jason Rumney @ 2011-09-27 13:54 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov <tzz@lifelogs.com> writes:
> I believe random bytes can be encoded just fine by UTF-8.
Not without Emacs extensions.
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-27 10:38 ` Ted Zlatanov
2011-09-27 11:31 ` Eli Zaretskii
@ 2011-09-27 14:02 ` Jason Rumney
1 sibling, 0 replies; 45+ messages in thread
From: Jason Rumney @ 2011-09-27 14:02 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov <tzz@lifelogs.com> writes:
> The other objection to UTF-8 was that some binary sequences can't be
> encoded by it. Remember, we're talking about passwords and other
> legible tokens, not binary files. The likelihood of such a sequence in
> a token is too small to matter IMO.
Where the binary sequence is non ASCII characters in an encoding other
than UTF-8, the likelihood that the sequence is not valid UTF-8 is very
high.
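A rough way to see this, sketched in Python (illustration only): take every possible high byte, place it between two ASCII letters as a single-byte encoding would, and ask a strict UTF-8 decoder whether the result is valid. The helper name is made up for the example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if a strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone non-ASCII byte between ASCII letters, for every value 0x80..0xFF.
invalid = [b for b in range(0x80, 0x100)
           if not is_valid_utf8(b"a" + bytes([b]) + b"b")]
print(len(invalid), "of 128 single high bytes are invalid UTF-8")

# Adjacent high bytes can form valid UTF-8 by accident, e.g. the two
# Latin-1 characters U+00C3 U+00A9 are the bytes C3 A9.
accidental = is_valid_utf8("\u00c3\u00a9".encode("latin-1"))
print(accidental)
```

So an isolated accented letter in a single-byte encoding never passes as UTF-8, while runs of high bytes occasionally do, which is the case where a wrong guess goes unnoticed.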
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-27 12:33 ` Stephen J. Turnbull
@ 2011-09-27 20:15 ` Ted Zlatanov
2011-09-28 1:41 ` Stephen J. Turnbull
0 siblings, 1 reply; 45+ messages in thread
From: Ted Zlatanov @ 2011-09-27 20:15 UTC (permalink / raw)
To: emacs-devel
On Tue, 27 Sep 2011 21:33:37 +0900 "Stephen J. Turnbull" <stephen@xemacs.org> wrote:
SJT> Ted Zlatanov writes:
>> UTF-8 is an encoding; you're talking about charsets.
SJT> No, I'm talking about encodings. I'm not entirely sure about GB 2312,
SJT> but I believe it has a defined preferred encoding (the one registered
SJT> as the MIME charset GB2312 -- MIME charsets are all encodings, they
SJT> specify what *bytes* will appear in the stream, not just an abstract
SJT> character to abstract integer mapping). Shift JIS is most definitely
SJT> an encoding for the JIS character set (although which JIS character
SJT> set is poorly defined).
Thanks for correcting my misunderstanding.
SJT> If you already have a password, it should be read verbatim (binary, or
SJT> raw-text should do given the line-oriented nature of these
SJT> configuration files) and treated as a binary blob.
>>
>> That's not helpful when you need to encode it for IMAP, for instance.
>> You have to know the actual characters that make up the binary blob.
SJT> Since when? I haven't paid much attention to IMAP since RFC 3501 was
SJT> an internet-draft, but in that document there are a few commands that
SJT> accept a CHARSET parameter. LOGIN and AUTHENTICATE aren't among them.
SJT> So you're just passing along binary blobs, which in the case of LOGIN
SJT> will often look like somebody's birthday or a child's name, but that's
SJT> just an unfortunate accident.
Ditto. I thought the CHARSET was used for passwords.
On Tue, 27 Sep 2011 07:31:23 -0400 Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Tue, 27 Sep 2011 05:38:28 -0500
>> Reply-To: emacs-devel@gnu.org
>>
>> On Tue, 27 Sep 2011 05:57:28 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>>
>> >> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
>> >> Date: Mon, 26 Sep 2011 17:31:52 -0400
>> >>
>> >> I think raw-text is more likely to work, based on what Lars says.
>>
EZ> That was also my conclusion.
>>
>> I think we should make an effort to make the netrc/authinfo file
>> shareable with other programs
EZ> I agree. But to do that, it sounds like we are lacking some knowledge
EZ> about the intended use of these files, especially when they are used
EZ> in conjunction with external services. If someone can prepare an
EZ> exhaustive list of such uses, or at least those we want to support,
EZ> and tell what encodings can be used with each of them, we can take it
EZ> from there the way you want it. But if such details are not known at
EZ> the moment, we may actually break some legitimate uses, which would be
EZ> a pity.
I know for sure only ASCII (up to 0xff) is supported by libcurl and
older FTP clients. I thought UTF-8 would be a good compatibility path
but apparently I'm wrong.
EZ> So I think you are being overly optimistic in asserting that UTF-8 is
EZ> "the safest choice".
OK.
EZ> You read "binary" incorrectly. For the purposes of this discussion,
EZ> "binary" == "arbitrary byte values". Not every 8-bit byte is valid as
EZ> part of a UTF-8 sequence. If the authinfo file includes such bytes,
EZ> it cannot be encoded in UTF-8, except if we use the Emacs extensions,
EZ> which will be only useful for Emacs. Such bytes can easily come from
EZ> some single-byte encoding, for example. To DTRT with such bytes, we
EZ> _must_ know its precise encoding; then we could _recode_ it in UTF-8,
EZ> and encode back when we send the string to external services.
Got it.
On Tue, 27 Sep 2011 08:55:45 -0400 Stefan Monnier <monnier@iro.umontreal.ca> wrote:
SM> Here's my take on it:
SM> .authinfo contains various things and is used in different ways, and
SM> there isn't a single answer that covers all cases:
SM> - each kind of field (hostname, username, password) may require
SM> a different encoding/decoding.
SM> - when reading a password from the file, it should be read using
SM> raw-text (i.e. as a "unibyte string").
SM> In other words, the password should not be decoded into chars but left
SM> as a sequence of bytes that will be sent as-is to whoever needs it.
SM> - when a password is typed by the user it'll be a sequence of chars, so
SM> we'll have to convert it into a sequence of bytes. The best coding
SM> system to use for that purpose is probably going to be
SM> locale-coding-system. That sequence of bytes is then sent to whoever
SM> needs it and saved as-is (using raw-text) into the .authinfo file.
SM> - i.e. authinfo should be read as a unibyte file.
SM> - i.e. when reading other fields than passwords, we'll have to
SM> explicitly decode them using the coding system we want to use for
SM> those fields.
SM> - similarly, we'll have to encode those other fields manually when
SM> writing them into .authinfo.
SM> Of course, another option is to just read&write authinfo without
SM> thinking about it, so Emacs will usually pick locale-coding-system for
SM> it and it'll work just fine in 99.9% of the cases.
It sounds like the latter option is the least work and most reliable.
Users should be able to override the coding system as with any other
file, and we'll just keep the status quo. I appreciate all the details
and corrections; I thought UTF-8 was better and more widely useful than
it really is.
Thanks
Ted
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-27 20:15 ` Ted Zlatanov
@ 2011-09-28 1:41 ` Stephen J. Turnbull
2011-09-28 8:38 ` Eli Zaretskii
0 siblings, 1 reply; 45+ messages in thread
From: Stephen J. Turnbull @ 2011-09-28 1:41 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov writes:
> I appreciate all the details and corrections; I thought UTF-8 was
> better and more widely useful than it really is.
Please hang on to that impression. UTF-8 really is the best thing
since sliced bread (but also like sliced bread you still need to drink
milk and eat fruit to get all essential vitamins). Although in many
localizations, Windows defaults to something other than UTF-8 (AFAIK)
for most text operations (including file system access etc), most
Windows text applications do fine with UTF-8. What UTF-8 is not
(yet), is backward compatible with legacy systems -- a lot of people
have not yet converted from 60s- and 70s-era encodings to Unicode,
even where that is almost trivial even for non-techies.
IOW, your general impression is correct: UTF-8 is now an appropriate
(ie, "usable") *system* default even on Windows (not that text files
are in great vogue on Windows, except for program sources). Please
don't hesitate to advocate it in that role. However, by default in a
portable *application* that needs to deal with both variation *among*
platforms and local customization within any given platform, Emacs
needs to ask the system what its default is (or in some cases we can
be a little more fine-grained, but POSIX localization isn't very
useful in that direction).
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: smtpmail and ~/.authinfo
2011-09-28 1:41 ` Stephen J. Turnbull
@ 2011-09-28 8:38 ` Eli Zaretskii
0 siblings, 0 replies; 45+ messages in thread
From: Eli Zaretskii @ 2011-09-28 8:38 UTC (permalink / raw)
To: Stephen J. Turnbull; +Cc: emacs-devel
> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Date: Wed, 28 Sep 2011 10:41:09 +0900
>
> Ted Zlatanov writes:
>
> > I appreciate all the details and corrections; I thought UTF-8 was
> > better and more widely useful than it really is.
>
> Please hang on to that impression. UTF-8 really is the best thing
> since sliced bread
FWIW, I agree.
> Although in many localizations, Windows defaults to something other
> than UTF-8 (AFAIK) for most text operations (including file system
> access etc)
To set the record straight: AFAIK there's not a single locale where
Windows uses UTF-8 as the default encoding. Internal operations all
use UTF-16, and file names are encoded by the NTFS filesystem in
UTF-16 (FAT32 uses the locale-specific encoding, and thus can support
only the characters in that encoding). Clipboard works in UTF-16.
Etc. etc.
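The limitation of a locale-specific encoding can be sketched in a few lines of Python. This only demonstrates the encodings themselves, not how Windows or any filesystem stores names; the function name representable and the sample file name are made up for the illustration.

```python
def representable(name: str, encoding: str) -> bool:
    """Report whether every character of NAME can be encoded in ENCODING."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# A hypothetical file name made of four Hebrew letters plus ".txt".
hebrew_name = "\u05e9\u05dc\u05d5\u05dd.txt"

# UTF-16 can represent any Unicode character.
print(representable(hebrew_name, "utf-16-le"))
# The Hebrew codepage covers these letters.
print(representable(hebrew_name, "cp1255"))
# The Western European codepage has no Hebrew letters at all.
print(representable(hebrew_name, "cp1252"))
```

A name that encodes under one codepage and fails under another is exactly the case where a locale-encoded name cannot be carried over unchanged to a system using a different locale.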
> most Windows text applications do fine with UTF-8.
True. Even Notepad can. There's a single exception, though: the
shell (a.k.a. console) window. I cannot get "emacs -nw" on Windows to
use UTF-8 as its terminal encoding, nor can I get other Windows
programs to display UTF-8 in the console window. Windows allegedly
supports a UTF-8 codepage, but if I set the console window to use
that codepage, I get gibberish or a crashed application. Maybe I'm
doing something wrong.
end of thread, newest message: 2011-09-28 8:38 UTC
Thread overview: 45+ messages
2011-08-20 10:26 smtpmail and ~/.authinfo Eli Zaretskii
2011-08-21 4:39 ` Lars Magne Ingebrigtsen
2011-08-21 6:12 ` Eli Zaretskii
2011-08-21 19:25 ` Lars Magne Ingebrigtsen
2011-08-21 19:59 ` Eli Zaretskii
2011-08-21 20:17 ` Lars Magne Ingebrigtsen
2011-08-22 5:35 ` Eli Zaretskii
2011-09-10 19:01 ` Lars Magne Ingebrigtsen
2011-09-25 12:33 ` Ted Zlatanov
2011-09-25 12:48 ` Eli Zaretskii
2011-09-25 13:21 ` Ted Zlatanov
2011-09-25 17:08 ` Eli Zaretskii
2011-09-26 14:41 ` Ted Zlatanov
2011-09-26 16:18 ` Eli Zaretskii
2011-09-26 16:53 ` Ted Zlatanov
2011-09-26 17:15 ` Eli Zaretskii
2011-09-26 17:23 ` Eli Zaretskii
2011-09-26 17:31 ` Ted Zlatanov
2011-09-26 17:00 ` Stefan Monnier
2011-09-26 17:28 ` Ted Zlatanov
2011-09-26 21:27 ` Stefan Monnier
2011-09-26 18:04 ` Lars Magne Ingebrigtsen
2011-09-26 19:22 ` Ted Zlatanov
2011-09-26 19:30 ` Lars Magne Ingebrigtsen
2011-09-26 19:48 ` Ted Zlatanov
2011-09-26 21:31 ` Stefan Monnier
2011-09-26 21:43 ` Lars Magne Ingebrigtsen
2011-09-26 21:54 ` Ted Zlatanov
2011-09-27 4:07 ` Stephen J. Turnbull
2011-09-27 6:11 ` Lars Magne Ingebrigtsen
2011-09-27 10:29 ` Ted Zlatanov
2011-09-27 12:33 ` Stephen J. Turnbull
2011-09-27 20:15 ` Ted Zlatanov
2011-09-28 1:41 ` Stephen J. Turnbull
2011-09-28 8:38 ` Eli Zaretskii
2011-09-26 21:55 ` Ted Zlatanov
2011-09-27 2:57 ` Eli Zaretskii
2011-09-27 10:38 ` Ted Zlatanov
2011-09-27 11:31 ` Eli Zaretskii
2011-09-27 12:55 ` Stefan Monnier
2011-09-27 14:02 ` Jason Rumney
2011-09-26 19:34 ` Eli Zaretskii
2011-09-26 19:40 ` Ted Zlatanov
2011-09-27 2:51 ` Eli Zaretskii
2011-09-27 13:54 ` Jason Rumney