How to ship native modules?

unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed

* How to ship native modules?
@ 2017-02-20 10:00 Elias Mårtenson
  2017-02-20 15:27 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 128+ messages in thread
From: Elias Mårtenson @ 2017-02-20 10:00 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1332 bytes --]

As I am working on making the Gnus-GSS support ready for general
consumption, I need to know how this thing should be distributed.

Right now there are two separate issues, with the second dependent on the
first:

The first issue is that I have no idea how to actually distribute the
native GSSAPI module. It's just a single, reasonably small, C file that
needs to be compiled with the libgssapi_krb5.so library. If this is bundled
in Emacs proper, making it a part of the build system isn't too
complicated, although I don't believe this has been done before.

If it is to be shipped as part of ELPA, then a different question emerges:
Should I simply add code to the Elisp files that checks if it can find the
native library and if not, compile it on the fly? It's doable, but I'd hard
to reinvent several wheels to do so. Essentially I'd have to rebuild part
of autoconf in Elisp.

The second issue is how to ship my changes to Gnus. My changes are limited
to a changes to nnimap.el, but this new feature will now try to require the
‘gss’ feature when gssapi authentication is enabled. If the gssapi library
is shipped as part of ELPA, then we end up with the interesting situation
where the packaged version of Gnus depends on a package in ELPA.

How should I continue with this?

Regards,
Elias

[-- Attachment #2: Type: text/html, Size: 1504 bytes --]

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-20 10:00 How to ship native modules? Elias Mårtenson
@ 2017-02-20 15:27 ` Eli Zaretskii
  2017-02-20 16:01   ` Elias Mårtenson
  2017-03-02 14:59   ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Ted Zlatanov
  2017-02-20 15:33 ` How to ship native modules? Aurélien Aptel
  2017-02-21  4:50 ` Andreas Politz
  2 siblings, 2 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-02-20 15:27 UTC (permalink / raw)
  To: Elias Mårtenson; +Cc: emacs-devel

> From: Elias Mårtenson <lokedhs@gmail.com>
> Date: Mon, 20 Feb 2017 18:00:24 +0800
> 
> The first issue is that I have no idea how to actually distribute the native GSSAPI module. It's just a single,
> reasonably small, C file that needs to be compiled with the libgssapi_krb5.so library. If this is bundled in
> Emacs proper, making it a part of the build system isn't too complicated, although I don't believe this has been
> done before.

IMO, it makes little sense to distribute loadable modules with Emacs
itself: if they are part of the standard distribution, they should
just be "normal" source files in src/, and don't have to go through
all the trouble of using the emacs-modules machinery.

> If it is to be shipped as part of ELPA, then a different question emerges: Should I simply add code to the Elisp
> files that checks if it can find the native library and if not, compile it on the fly? It's doable, but I'd hard to
> reinvent several wheels to do so. Essentially I'd have to rebuild part of autoconf in Elisp.

I'm not sure I see the reason for "rebuilding part of autoconf".  Your
Lisp code should just (load "foo"), and leave it to the user and
package.el to put the compiled module where Emacs will find it (along
load-path).  Am I missing something?

> The second issue is how to ship my changes to Gnus. My changes are limited to a changes to nnimap.el, but
> this new feature will now try to require the ‘gss’ feature when gssapi authentication is enabled. If the gssapi
> library is shipped as part of ELPA, then we end up with the interesting situation where the packaged version
> of Gnus depends on a package in ELPA.

If the package is on ELPA, IMO it should provide a separate .el file
with a feature that Gnus will 'require', given some defcustom.

Thanks.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-20 10:00 How to ship native modules? Elias Mårtenson
  2017-02-20 15:27 ` Eli Zaretskii
@ 2017-02-20 15:33 ` Aurélien Aptel
  2017-02-21  4:50 ` Andreas Politz
  2 siblings, 0 replies; 128+ messages in thread
From: Aurélien Aptel @ 2017-02-20 15:33 UTC (permalink / raw)
  To: Elias Mårtenson; +Cc: emacs-devel

On Mon, Feb 20, 2017 at 11:00 AM, Elias Mårtenson <lokedhs@gmail.com> wrote:
> How should I continue with this?

As you probably already know, there are no official ways to ship
native modules at the moment (see previous discussions here [1] and
here [2]).

1: http://lists.gnu.org/archive/html/emacs-devel/2017-02/msg00245.html
2: http://lists.gnu.org/archive/html/emacs-devel/2017-02/msg00572.html



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-20 15:27 ` Eli Zaretskii
@ 2017-02-20 16:01   ` Elias Mårtenson
  2017-02-20 16:30     ` Eli Zaretskii
  2017-03-02 14:59   ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Ted Zlatanov
  1 sibling, 1 reply; 128+ messages in thread
From: Elias Mårtenson @ 2017-02-20 16:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2668 bytes --]

On 20 February 2017 at 23:27, Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Elias Mårtenson <lokedhs@gmail.com>
> > Date: Mon, 20 Feb 2017 18:00:24 +0800
> >
> > The first issue is that I have no idea how to actually distribute the
> native GSSAPI module. It's just a single,
> > reasonably small, C file that needs to be compiled with the
> libgssapi_krb5.so library. If this is bundled in
> > Emacs proper, making it a part of the build system isn't too
> complicated, although I don't believe this has been
> > done before.
>
> IMO, it makes little sense to distribute loadable modules with Emacs
> itself: if they are part of the standard distribution, they should
> just be "normal" source files in src/, and don't have to go through
> all the trouble of using the emacs-modules machinery.
>

Are you suggesting that I port the GSS module to use the native Emacs
internal API's? If I do that, would you accept that into Emacs proper?

Personally, I'd definitely prefer to not have to rely on a loadable module
for this kind of functionality. I guess the only drawback is that most
distributions won't ship Emacs with GSS support, since that required the
gssapi libraries to be available.

What is the best solution here?

> If it is to be shipped as part of ELPA, then a different question
> emerges: Should I simply add code to the Elisp
> > files that checks if it can find the native library and if not, compile
> it on the fly? It's doable, but I'd hard to
> > reinvent several wheels to do so. Essentially I'd have to rebuild part
> of autoconf in Elisp.
>
> I'm not sure I see the reason for "rebuilding part of autoconf".  Your
> Lisp code should just (load "foo"), and leave it to the user and
> package.el to put the compiled module where Emacs will find it (along
> load-path).  Am I missing something?
>

I'd expect the user to run M-x package-installl gss, and then have working
GSSAPI support in Gnus. Currently, ELPA doesn't have any facilities to
support compilation of native modules as part of ELPA package installation.

Are you suggesting that the user manually download the C part of the "gss"
package and compile it, and then run "emacs -L path/to/custom/libs" when
they start Emacs? That would work, but that doesn't seem to be the most
ideal user interface.

> If the package is on ELPA, IMO it should provide a separate .el file
> with a feature that Gnus will 'require', given some defcustom.
>

Fair enough. That can be done with a relatively minor patch to nnimap.el.
However, it still leaves the issue of how to deal with the native module
part of this.

Regards,
Elias

[-- Attachment #2: Type: text/html, Size: 3601 bytes --]

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-20 16:01   ` Elias Mårtenson
@ 2017-02-20 16:30     ` Eli Zaretskii
  2017-02-21  2:48       ` Elias Mårtenson
  2017-02-21 14:44       ` Stefan Monnier
  0 siblings, 2 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-02-20 16:30 UTC (permalink / raw)
  To: Elias Mårtenson; +Cc: emacs-devel

> From: Elias Mårtenson <lokedhs@gmail.com>
> Date: Tue, 21 Feb 2017 00:01:26 +0800
> Cc: emacs-devel <emacs-devel@gnu.org>
> 
>  IMO, it makes little sense to distribute loadable modules with Emacs
>  itself: if they are part of the standard distribution, they should
>  just be "normal" source files in src/, and don't have to go through
>  all the trouble of using the emacs-modules machinery.
> 
> Are you suggesting that I port the GSS module to use the native Emacs internal API's?

If the module is to come with Emacs out of the box, yes.

> If I do that, would you accept that into Emacs proper?

I don't know, I'd have to look at the sources first.

> Personally, I'd definitely prefer to not have to rely on a loadable module for this kind of functionality. I guess the
> only drawback is that most distributions won't ship Emacs with GSS support, since that required the gssapi
> libraries to be available.

Is it different from any other optional library in any significant
way?  E.g., GPM or the image libraries?  They all require to have the
library installed for Emacs to be able to be built with its support.

>  I'm not sure I see the reason for "rebuilding part of autoconf". Your
>  Lisp code should just (load "foo"), and leave it to the user and
>  package.el to put the compiled module where Emacs will find it (along
>  load-path). Am I missing something?
> 
> I'd expect the user to run M-x package-installl gss, and then have working GSSAPI support in Gnus. Currently,
> ELPA doesn't have any facilities to support compilation of native modules as part of ELPA package
> installation.

You can provide a small Makefile that the user will have to invoke in
order to compile the C source.  Same as in modules/mod-test/ in the
Emacs sources.

> Are you suggesting that the user manually download the C part of the "gss" package and compile it, and then
> run "emacs -L path/to/custom/libs" when they start Emacs? That would work, but that doesn't seem to be the
> most ideal user interface.

Compile: yes (except that I see no reason for downloading manually, it
can all come to the end-user machine when the package is installed).
But I see no reason to run Emacs specially: if the Makefile has an
'install' target that puts the compiled module in site-lisp, Emacs
will find it when some Lisp attempts to load it.

>  If the package is on ELPA, IMO it should provide a separate .el file
>  with a feature that Gnus will 'require', given some defcustom.
> 
> Fair enough. That can be done with a relatively minor patch to nnimap.el. However, it still leaves the issue of
> how to deal with the native module part of this.

The module should be installed in site-lisp, IMO.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-20 16:30     ` Eli Zaretskii
@ 2017-02-21  2:48       ` Elias Mårtenson
  2017-02-21  3:41         ` Eli Zaretskii
  2017-02-21 14:44       ` Stefan Monnier
  1 sibling, 1 reply; 128+ messages in thread
From: Elias Mårtenson @ 2017-02-21  2:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 3525 bytes --]

On 21 February 2017 at 00:30, Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Elias Mårtenson <lokedhs@gmail.com>
> > Date: Tue, 21 Feb 2017 00:01:26 +0800
> > Cc: emacs-devel <emacs-devel@gnu.org>
> >
> >  IMO, it makes little sense to distribute loadable modules with Emacs
> >  itself: if they are part of the standard distribution, they should
> >  just be "normal" source files in src/, and don't have to go through
> >  all the trouble of using the emacs-modules machinery.
> >
> > Are you suggesting that I port the GSS module to use the native Emacs
> internal API's?
>
> If the module is to come with Emacs out of the box, yes.
>

Since the main beneficiary of this feature is Gnus, it does make sense.

> > If I do that, would you accept that into Emacs proper?
>
> I don't know, I'd have to look at the sources first.
>

It's not that much work to implement the C part of this. I'll have to get
acquainted with the internal Emacs API's and then do the implementation.
Hopefully it should be a non-controversial thing to include.

> > Personally, I'd definitely prefer to not have to rely on a loadable
> module for this kind of functionality. I guess the
> > only drawback is that most distributions won't ship Emacs with GSS
> support, since that required the gssapi
> > libraries to be available.
>
> Is it different from any other optional library in any significant
> way?  E.g., GPM or the image libraries?  They all require to have the
> library installed for Emacs to be able to be built with its support.
>

Technically, it's the same thing. The difference is that most people don't
have the krb5 packages installed, and enabling the GSS feature would
require packagers to add another dependency to Emacs, giving them two
alternatives: 1) Not include GSS in their prebuilt version of Emacs, or 2)
add another dependency to Emacs, resulting in another package being
installed with it.

Now, krb5 is not a big library, so option 2 above is IMHO the right way to
go, but I worry that packagers won't see it that way and we'll end up with
a feature that users would have to rebuild their Emacs to use.

However, there is an upside. Most people who need this feature are in
corporate environments, usually on Windows (Kerberos is the standard
authentication mechanism in Active Directory) and the runtime libraries are
a standard part of Windows, so at on that platform there is no issue.

> > I'd expect the user to run M-x package-installl gss, and then have
> working GSSAPI support in Gnus. Currently,
> > ELPA doesn't have any facilities to support compilation of native
> modules as part of ELPA package
> > installation.
>
> You can provide a small Makefile that the user will have to invoke in
> order to compile the C source.  Same as in modules/mod-test/ in the
> Emacs sources.
>

I could, and I'd have to. But it would be very nice if this could be part
of the M‍-‍x package‍-‍install process.

If it was, then going the ELPA route is definitely the best way.

> Compile: yes (except that I see no reason for downloading manually, it
> can all come to the end-user machine when the package is installed).
> But I see no reason to run Emacs specially: if the Makefile has an
> 'install' target that puts the compiled module in site-lisp, Emacs
> will find it when some Lisp attempts to load it.
>

Installing in site‍-‍lisp requires root. Can I put it somewhere in
$HOME/.emacs.d?

Regards,
Elias

[-- Attachment #2: Type: text/html, Size: 4965 bytes --]

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-21  2:48       ` Elias Mårtenson
@ 2017-02-21  3:41         ` Eli Zaretskii
  2017-02-21  4:13           ` Elias Mårtenson
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-02-21  3:41 UTC (permalink / raw)
  To: Elias Mårtenson; +Cc: emacs-devel

> From: Elias Mårtenson <lokedhs@gmail.com>
> Date: Tue, 21 Feb 2017 10:48:03 +0800
> Cc: emacs-devel <emacs-devel@gnu.org>
> 
>  Is it different from any other optional library in any significant
>  way? E.g., GPM or the image libraries? They all require to have the
>  library installed for Emacs to be able to be built with its support.
> 
> Technically, it's the same thing. The difference is that most people don't have the krb5 packages installed, and
> enabling the GSS feature would require packagers to add another dependency to Emacs, giving them two
> alternatives: 1) Not include GSS in their prebuilt version of Emacs, or 2) add another dependency to Emacs,
> resulting in another package being installed with it.
> 
> Now, krb5 is not a big library, so option 2 above is IMHO the right way to go, but I worry that packagers won't
> see it that way and we'll end up with a feature that users would have to rebuild their Emacs to use.
> 
> However, there is an upside. Most people who need this feature are in corporate environments, usually on
> Windows (Kerberos is the standard authentication mechanism in Active Directory) and the runtime libraries
> are a standard part of Windows, so at on that platform there is no issue.

Really?  I was under the impression that one would need to install the
Kerberos libraries from http://web.mit.edu/kerberos/dist/, and then
have the GSSAPI library built for Windows.  These all don't seem to be
available out of the box on MS-Windows.  Am I missing something?

>  You can provide a small Makefile that the user will have to invoke in
>  order to compile the C source. Same as in modules/mod-test/ in the
>  Emacs sources.
> 
> I could, and I'd have to. But it would be very nice if this could be part of the M‍-‍x package‍-‍install process.

I think that making package.el be able to compile programs in
languages like C is an ambitious goal that is best left to end users
at this stage.  For simple enough modules, the Makefile can be made
small and portable, so IMO it isn't worth the complexity to do more
than that, at least not yet.

> Installing in site‍-‍lisp requires root. Can I put it somewhere in $HOME/.emacs.d?

Yes, like we do with Lisp packages.  A module can live anywhere a Lisp
package can live, in order for Emacs to find it.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-21  3:41         ` Eli Zaretskii
@ 2017-02-21  4:13           ` Elias Mårtenson
  2017-02-21 16:48             ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Elias Mårtenson @ 2017-02-21  4:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1734 bytes --]

On 21 February 2017 at 11:41, Eli Zaretskii <eliz@gnu.org> wrote:

> > However, there is an upside. Most people who need this feature are in
> corporate environments, usually on
> > Windows (Kerberos is the standard authentication mechanism in Active
> Directory) and the runtime libraries
> > are a standard part of Windows, so at on that platform there is no issue.
>
> Really?  I was under the impression that one would need to install the
> Kerberos libraries from http://web.mit.edu/kerberos/dist/, and then
> have the GSSAPI library built for Windows.  These all don't seem to be
> available out of the box on MS-Windows.  Am I missing something?
>

Yes, I was a bit hasty. If you want to use the standard C API, then you
need those libraries. However, you could use the Windows API, SSPI instead.
It implements the same functionality using a different API (because of
course they do).

Note that I don't have any intention of implementing SSPI support myself,
but implementing an SSPI backend would not be terribly difficult.

> I think that making package.el be able to compile programs in
> languages like C is an ambitious goal that is best left to end users
> at this stage.  For simple enough modules, the Makefile can be made
> small and portable, so IMO it isn't worth the complexity to do more
> than that, at least not yet.
>

Fair enough.

That leave me with only one question: With the main user of this
functionality being Gnus, should GSS support be implemented as part of
Emacs proper, or shipped using ELPA?

The pros and cons of each approach should be well understood now, and I'm
willing to implement it either way. So we're up to a policy decision on the
part of the Emacs maintainers now.

Regards,
Elias

[-- Attachment #2: Type: text/html, Size: 2476 bytes --]

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-20 10:00 How to ship native modules? Elias Mårtenson
  2017-02-20 15:27 ` Eli Zaretskii
  2017-02-20 15:33 ` How to ship native modules? Aurélien Aptel
@ 2017-02-21  4:50 ` Andreas Politz
  2017-02-21  5:12   ` Elias Mårtenson
  2 siblings, 1 reply; 128+ messages in thread
From: Andreas Politz @ 2017-02-21  4:50 UTC (permalink / raw)
  To: Elias Mårtenson; +Cc: emacs-devel

Elias Mårtenson <lokedhs@gmail.com> writes:

> If it is to be shipped as part of ELPA, then a different question
> emerges: Should I simply add code to the Elisp files that checks if it
> can find the native library and if not, compile it on the fly? It's
> doable, but I'd hard to reinvent several wheels to do so. Essentially
> I'd have to rebuild part of autoconf in Elisp.

Hi, I am the author of pdf-tools and I have a very similar, if not the
same, problem.

I think your analogy to autoconf is not the right one.  What's needed is
more like an abstraction over the various systems including package
manager and possibly naming schemes.  For example on Arch you need to
install the package krb5 with pacman, while on Debian its probably
libkrb5-dev and libkrb5 using aptitude and on Windows you'd download
some installer or whatever.

There are various other details to consider. For example make is called
gmake on BSD, or you need to setup an environment variable
(e.g. PKG_CONFIG_PATH on macos).

This could be implemented in a neat, small package, your package depends
on and calls into as part of the installation or first-time-running
process.

Adding a new system-configuration should be fairly easy.

I've been pondering to write such a thing,

what do you think ?

-ap

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-21  4:50 ` Andreas Politz
@ 2017-02-21  5:12   ` Elias Mårtenson
  2017-02-21  5:23     ` Andreas Politz
  0 siblings, 1 reply; 128+ messages in thread
From: Elias Mårtenson @ 2017-02-21  5:12 UTC (permalink / raw)
  To: Andreas Politz; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1884 bytes --]

On 21 February 2017 at 12:50, Andreas Politz <politza@hochschule-trier.de>
wrote:

> Elias Mårtenson <lokedhs@gmail.com> writes:
>
> > If it is to be shipped as part of ELPA, then a different question
> > emerges: Should I simply add code to the Elisp files that checks if it
> > can find the native library and if not, compile it on the fly? It's
> > doable, but I'd hard to reinvent several wheels to do so. Essentially
> > I'd have to rebuild part of autoconf in Elisp.
>
> Hi, I am the author of pdf-tools and I have a very similar, if not the
> same, problem.
>

I use pdf-tools, and I have definitely seen the same issue there.


> I think your analogy to autoconf is not the right one.  What's needed is
> more like an abstraction over the various systems including package
> manager and possibly naming schemes.  For example on Arch you need to
> install the package krb5 with pacman, while on Debian its probably
> libkrb5-dev and libkrb5 using aptitude and on Windows you'd download
> some installer or whatever.
>

Indeed. The MIT and Heimdal implementations also have some minor
differences that has to be taken into account, making the Makefile more
complex.


> There are various other details to consider. For example make is called
> gmake on BSD, or you need to setup an environment variable
> (e.g. PKG_CONFIG_PATH on macos).
>

I don't use OSX much anymore, so I wasn't aware of that.


> This could be implemented in a neat, small package, your package depends
> on and calls into as part of the installation or first-time-running
> process.
>
> Adding a new system-configuration should be fairly easy.
>
> I've been pondering to write such a thing,
>
> what do you think ?
>

I think that sounds like a great idea. If you build a prototype I'd be
happy to provide whatever assistance I can.

Regards,
Elias

[-- Attachment #2: Type: text/html, Size: 2828 bytes --]

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-21  5:12   ` Elias Mårtenson
@ 2017-02-21  5:23     ` Andreas Politz
  0 siblings, 0 replies; 128+ messages in thread
From: Andreas Politz @ 2017-02-21  5:23 UTC (permalink / raw)
  To: Elias Mårtenson; +Cc: emacs-devel

Elias Mårtenson <lokedhs@gmail.com> writes:

> I think that sounds like a great idea. If you build a prototype I'd be happy to provide whatever assistance I can.

Great, I'll get back here, if I have something working.

-ap



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-20 16:30     ` Eli Zaretskii
  2017-02-21  2:48       ` Elias Mårtenson
@ 2017-02-21 14:44       ` Stefan Monnier
       [not found]         ` <CADtN0WLjNcFRLCsJNZX+XfqOcq+veTaoGkwHQCV9bjvuQoEORA@mail.gmail.com>
  2017-02-21 16:59         ` Eli Zaretskii
  1 sibling, 2 replies; 128+ messages in thread
From: Stefan Monnier @ 2017-02-21 14:44 UTC (permalink / raw)
  To: emacs-devel

> You can provide a small Makefile that the user will have to invoke in
> order to compile the C source.

The user doesn't even have to invoke it: package-install can invoke it
(indirectly) by having something akin to

    (eval-when-compile (shell-command "make"))

in one of the Elisp files.


        Stefan




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
       [not found]         ` <CADtN0WLjNcFRLCsJNZX+XfqOcq+veTaoGkwHQCV9bjvuQoEORA@mail.gmail.com>
@ 2017-02-21 15:48           ` Elias Mårtenson
  2017-02-21 17:14             ` Stefan Monnier
  0 siblings, 1 reply; 128+ messages in thread
From: Elias Mårtenson @ 2017-02-21 15:48 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 533 bytes --]

On 21 Feb 2017 11:24 PM, "Stefan Monnier" <monnier@iro.umontreal.ca> wrote:

> You can provide a small Makefile that the user will have to invoke in
> order to compile the C source.

The user doesn't even have to invoke it: package-install can invoke it
(indirectly) by having something akin to

    (eval-when-compile (shell-command "make"))

in one of the Elisp files.


I thought ELPA packages could only contain el files? (granted, even if that
is the case, I could always encode the C code in elisp code, but that's a
bit ugly)

[-- Attachment #2: Type: text/html, Size: 1215 bytes --]

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-21  4:13           ` Elias Mårtenson
@ 2017-02-21 16:48             ` Eli Zaretskii
  2017-02-21 20:06               ` John Wiegley
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-02-21 16:48 UTC (permalink / raw)
  To: Elias Mårtenson, John Wiegley; +Cc: emacs-devel

> From: Elias Mårtenson <lokedhs@gmail.com>
> Date: Tue, 21 Feb 2017 12:13:44 +0800
> Cc: emacs-devel <emacs-devel@gnu.org>
> 
> That leave me with only one question: With the main user of this functionality being Gnus, should GSS
> support be implemented as part of Emacs proper, or shipped using ELPA?
> 
> The pros and cons of each approach should be well understood now, and I'm willing to implement it either
> way. So we're up to a policy decision on the part of the Emacs maintainers now.

I looked at the C source.  It's quite small, and if reimplemented as
part of Emacs, will be even smaller (no need to reinvent XCAR, intern,
etc.).  So I see no problems with adding this to the Emacs sources,
probably as a separate C file.

John?



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-21 14:44       ` Stefan Monnier
       [not found]         ` <CADtN0WLjNcFRLCsJNZX+XfqOcq+veTaoGkwHQCV9bjvuQoEORA@mail.gmail.com>
@ 2017-02-21 16:59         ` Eli Zaretskii
  1 sibling, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-02-21 16:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Tue, 21 Feb 2017 09:44:39 -0500
> 
> > You can provide a small Makefile that the user will have to invoke in
> > order to compile the C source.
> 
> The user doesn't even have to invoke it: package-install can invoke it
> (indirectly) by having something akin to
> 
>     (eval-when-compile (shell-command "make"))
> 
> in one of the Elisp files.

Yes, of course.  Provided that the 'make' doesn't need any
command-line options that only human can deduce (depends on how clever
and reliable the Makefile is).



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-21 15:48           ` Elias Mårtenson
@ 2017-02-21 17:14             ` Stefan Monnier
  0 siblings, 0 replies; 128+ messages in thread
From: Stefan Monnier @ 2017-02-21 17:14 UTC (permalink / raw)
  To: emacs-devel

> I thought ELPA packages could only contain el files? (granted, even if that
> is the case, I could always encode the C code in elisp code, but that's a
> bit ugly)

No, they can contain any file you like (they're just tarballs).
Some commonly used non-Elisp files are image files and .info files, but
C files will work just as well (tho package.el knows nothing about them).


        Stefan




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: How to ship native modules?
  2017-02-21 16:48             ` Eli Zaretskii
@ 2017-02-21 20:06               ` John Wiegley
  0 siblings, 0 replies; 128+ messages in thread
From: John Wiegley @ 2017-02-21 20:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Elias Mårtenson, emacs-devel

>>>>> Eli Zaretskii <eliz@gnu.org> writes:

> I looked at the C source. It's quite small, and if reimplemented as part of
> Emacs, will be even smaller (no need to reinvent XCAR, intern, etc.). So I
> see no problems with adding this to the Emacs sources, probably as a
> separate C file.

> John?

Sure, I'm fine with that being in core.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2



^ permalink raw reply	[flat|nested] 128+ messages in thread

* request to reconsider libnettle/libhogweed (was: How to ship native modules?)
  2017-02-20 15:27 ` Eli Zaretskii
  2017-02-20 16:01   ` Elias Mårtenson
@ 2017-03-02 14:59   ` Ted Zlatanov
  2017-03-02 15:19     ` request to reconsider libnettle/libhogweed Stefan Monnier
                       ` (2 more replies)
  1 sibling, 3 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-03-02 14:59 UTC (permalink / raw)
  To: emacs-devel

On Tue, 21 Feb 2017 12:06:40 -0800 John Wiegley <jwiegley@gmail.com> wrote: 

JW> Sure, I'm fine with [GSSAPI] being in core.

Hi, John, Eli, and everyone else. It's been a few years since I've
brought up the libnettle/libhogweed topic and we've had lots of change
in the C core and in our community.

Since it's apparently OK to add the GSS/Kerberos linkage to the core, I
want to ask again that the libnettle/libhogweed crypto functions be
allowed in the core, for the following reasons:

* they are more *generally* useful than GSS and will enable future
  development in many directions. (I'm not saying GSS is not useful,
  just that it's a more specific tool that serves a narrow purpose.)

* we already link to GnuTLS, which means those C functions are available
  already. So this is not going to require much extra C code, just some
  work to expose the C functions. My original patch, now many years old,
  can probably be adapted or rewritten pretty easily. Most of it was
  tests. Keeping up to date is similarly easy as long as users keep up
  to date with GnuTLS. Note also that on the Lisp side, no low-level
  crypto code will be written or maintained, we're only exposing
  libraries that are maintained and reviewed by an existing large
  community.

* recall that Stefan, the maintainer at the time, asked me to push for a
  module approach with the understanding that the crypto module would be
  distributed to let users and packages access those functions. We now
  have a module system, but there has been no support or interest in
  implementing a module *distribution* system that would allow users to
  access those cryptographic functions with a stock distribution of
  Emacs. Instead all the comments have been that it's a hard problem and
  the Emacs developers would rather not deal with it. Where the comments
  have suggested a solution, it has been "let them run make" which
  rhymes with "let them eat cake."

* personally, I would rather make the core more functional out of the
  box, so my proposal is to go back to the original C patch instead of
  inventing a module distribution system. (A module distribution system
  may yet make sense, and I have nothing against it, but I don't want to
  wait more years for it or invent it.)

Thank you for your consideration.
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: request to reconsider libnettle/libhogweed
  2017-03-02 14:59   ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Ted Zlatanov
@ 2017-03-02 15:19     ` Stefan Monnier
  2017-03-02 15:55     ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Eli Zaretskii
  2017-03-02 17:58     ` request to reconsider libnettle/libhogweed Paul Eggert
  2 siblings, 0 replies; 128+ messages in thread
From: Stefan Monnier @ 2017-03-02 15:19 UTC (permalink / raw)
  To: emacs-devel

>   Emacs. Instead all the comments have been that it's a hard problem and
>   the Emacs developers would rather not deal with it.  Where the comments
>   have suggested a solution, it has been "let them run make" which
>   rhymes with "let them eat cake."

Hmm... I'm sorry if that's how I'm coming across.  But that's really not
my intention.  Instead, what I mean is:
- I don't personally know exactly what is needed (I have some vague
  idea, but details matter here).
- We probably want to have a solution *now* and that basically means
  using something like the eval-when-compile+make hack.

But indeed, it looks like the new maintainers, being OK with adding
GSSAPI in the core, will be OK with adding random other libs to the
core, so the pressure to improve the dynload module infrastructure will
be low.


        Stefan




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: request to reconsider libnettle/libhogweed (was: How to ship native modules?)
  2017-03-02 14:59   ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Ted Zlatanov
  2017-03-02 15:19     ` request to reconsider libnettle/libhogweed Stefan Monnier
@ 2017-03-02 15:55     ` Eli Zaretskii
  2017-03-15 21:19       ` libnettle/libhogweed WIP (was: request to reconsider libnettle/libhogweed) Ted Zlatanov
  2017-03-02 17:58     ` request to reconsider libnettle/libhogweed Paul Eggert
  2 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-03-02 15:55 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Thu, 02 Mar 2017 09:59:59 -0500
> 
> * personally, I would rather make the core more functional out of the
>   box, so my proposal is to go back to the original C patch instead of
>   inventing a module distribution system. (A module distribution system
>   may yet make sense, and I have nothing against it, but I don't want to
>   wait more years for it or invent it.)

Please show the patch, perhaps after updating it to match the current
master, and let's discuss then.

Thanks.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: request to reconsider libnettle/libhogweed
  2017-03-02 14:59   ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Ted Zlatanov
  2017-03-02 15:19     ` request to reconsider libnettle/libhogweed Stefan Monnier
  2017-03-02 15:55     ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Eli Zaretskii
@ 2017-03-02 17:58     ` Paul Eggert
  2017-03-02 18:33       ` Ted Zlatanov
  2 siblings, 1 reply; 128+ messages in thread
From: Paul Eggert @ 2017-03-02 17:58 UTC (permalink / raw)
  To: emacs-devel

On 03/02/2017 06:59 AM, Ted Zlatanov wrote:
> * we already link to GnuTLS, which means those C functions are available
>    already.

If Emacs's module system were always present and worked well, this could 
be supported via a module that consisted entirely of glue, i.e., a 
module that merely exposes C functions already present in Emacs. Since 
the module system isn't guaranteed, though, it may be better to create 
new built-in Elisp functions for this, at least for now.

Is there some way that we can say "these functions are built-in now, but 
may be moved to a module later"? If not, perhaps there should be.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: request to reconsider libnettle/libhogweed
  2017-03-02 17:58     ` request to reconsider libnettle/libhogweed Paul Eggert
@ 2017-03-02 18:33       ` Ted Zlatanov
  0 siblings, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-03-02 18:33 UTC (permalink / raw)
  To: emacs-devel

On Thu, 2 Mar 2017 09:58:57 -0800 Paul Eggert <eggert@cs.ucla.edu> wrote: 

PE> On 03/02/2017 06:59 AM, Ted Zlatanov wrote:
>> * we already link to GnuTLS, which means those C functions are available
>> already.

PE> If Emacs's module system were always present and worked well, this could be
PE> supported via a module that consisted entirely of glue, i.e., a module that
PE> merely exposes C functions already present in Emacs. Since the module system
PE> isn't guaranteed, though, it may be better to create new built-in Elisp
PE> functions for this, at least for now.

There won't be anything in these functions that precludes moving them.

PE> Is there some way that we can say "these functions are built-in now, but may be
PE> moved to a module later"? If not, perhaps there should be.

I'll be happy to annotate them as needed, but I don't know who will care
about such annotations? Perhaps it's better to mark the functions as
dependent on a particular Emacs feature, sort of like
`gnutls-available-p' but as a per-function tag.

On Thu, 02 Mar 2017 17:55:07 +0200 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> Please show the patch, perhaps after updating it to match the current
EZ> master, and let's discuss then.

That's the technical side, and I'll gladly do it if John agrees it's
acceptable in principle. I'd rather not spend hours on it otherwise.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* libnettle/libhogweed WIP (was: request to reconsider libnettle/libhogweed)
  2017-03-02 15:55     ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Eli Zaretskii
@ 2017-03-15 21:19       ` Ted Zlatanov
  2017-03-16 15:28         ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-03-15 21:19 UTC (permalink / raw)
  To: emacs-devel

On Thu, 02 Mar 2017 17:55:07 +0200 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Thu, 02 Mar 2017 09:59:59 -0500
>> 
>> * personally, I would rather make the core more functional out of the
>> box, so my proposal is to go back to the original C patch instead of
>> inventing a module distribution system. (A module distribution system
>> may yet make sense, and I have nothing against it, but I don't want to
>> wait more years for it or invent it.)

EZ> Please show the patch, perhaps after updating it to match the current
EZ> master, and let's discuss then.

I did some work, at least to shut up the warnings, but it's far from
ready. I pushed it into the branch scratch/tzz/nettle for now. If you
could take a look, that would be very helpful.

Issues:

* the ERT tests need to be moved

* maybe there's a better way than using the NETTLE_STRING_EQUAL_UNIBYTE
  macro to compare Emacs and C strings?

* several times I've forced casts to shut up warnings, and generally
  could use some help converting between C bytes and the ELisp strings

* data is not wiped after use

* the APIs are primitive, matching the C API pretty closely. I would
  like to leave it to the ELisp packages to create abstractions on top
  of nettle.el, but maybe others have suggestions?

* it compiles but doesn't link: "error adding symbols: DSO missing from
  command line" which I hope is something trivial

Thanks
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP (was: request to reconsider libnettle/libhogweed)
  2017-03-15 21:19       ` libnettle/libhogweed WIP (was: request to reconsider libnettle/libhogweed) Ted Zlatanov
@ 2017-03-16 15:28         ` Eli Zaretskii
  2017-03-17 22:46           ` libnettle/libhogweed WIP Ted Zlatanov
                             ` (2 more replies)
  0 siblings, 3 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-03-16 15:28 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Wed, 15 Mar 2017 17:19:32 -0400
> 
> I did some work, at least to shut up the warnings, but it's far from
> ready. I pushed it into the branch scratch/tzz/nettle for now. If you
> could take a look, that would be very helpful.

I took a look, comments below.

> * maybe there's a better way than using the NETTLE_STRING_EQUAL_UNIBYTE
>   macro to compare Emacs and C strings?

I think this entire issue should be rethought, see the comments below.

> * several times I've forced casts to shut up warnings, and generally
>   could use some help converting between C bytes and the ELisp strings

If you agree with my comments below, most, if not all, of this issue
should go away.

> * data is not wiped after use

I'm not an expert, but AFAIU this is a serious flaw in
security-related software.  Should this be optional (as it probably
has non-negligible run-time costs)?

> * it compiles but doesn't link: "error adding symbols: DSO missing from
>   command line" which I hope is something trivial

Did the -lhogweed -lnettle switches appear on the link command line?

Here are some comments to the patch:

First, some general observations:

 . the patch is missing additions to the manual and NEWS
 . support for dynamic loading on MS-Windows should be added
 . did you consider exposing this functionality through corresponding
   GnuTLS functions?

Specific comments follow:

> +  # Windows loads libnettle dynamically
> +  if test "${opsys}" = "mingw32"; then
> +    LIBNETTLE_LIBS=
> +  else
> +    CFLAGS="$CFLAGS $LIBNETTLE_CFLAGS"
> +    LIBS="$LIBNETTLE_LIBS $LIBS"
> +  fi

CFLAGS should be set for the Windows build as well.  Only LIBS should
not be added.

> diff --git a/lisp/nettle.el b/lisp/nettle.el
> new file mode 100644
> index 0000000000..6f49c3f031
> --- /dev/null
> +++ b/lisp/nettle.el
> @@ -0,0 +1,335 @@
> +;;; nettle.el --- Interface to libnettle/libhogweed  -*- lexical-binding: t; -*-

Actually, this does not seem to be an interface at all.  What you have
here are defcustom that seems not to be used anywhere, a bunch of
diagnostic functions useful for debugging, and tests that should be
moved elsewhere anyway.  Do we really need this file?

> +;; Copyright (C) 2013  Teodor Zlatanov

This should cite the FSF as the copyright holder, I think.

> @@ -1369,6 +1373,10 @@ Using an Emacs configured with --with-x-toolkit=lucid does not have this problem
>    globals_of_kqueue ();
>  #endif

> +#ifdef HAVE_NETTLE
> +      syms_of_nettle ();
> +#endif

syms_of_nettle should be called only if !initialized.

> +#ifdef HAVE_NETTLE
> +
> +#include "nettle.h"
> +
> +DEFUN ("nettle-available-p", Fnettle_available_p, Snettle_available_p, 0, 0, 0,
> +       doc: /* Return t if Nettle is available in this instance of Emacs.
> +
> +The Nettle integration binds Emacs with libnettle and libhogweed.
> +See http://www.lysator.liu.se/~nisse/nettle for more on Nettle.  */)
> +     (void)
> +{
> +  return Qt;
> +}

This function should be outside of the "#ifdef HAVE_NETTLE"
conditional, otherwise it will not be available in an Emacs built
without the library, and Lisp programs will need to use fboundp
instead, which all but defeats the purpose of the function.

> +DEFUN ("nettle-hash", Fnettle_hash, Snettle_hash, 2, 2, 0,
> +       doc: /* Hash INPUT string with HASH-METHOD into a unibyte string.

Here and elsewhere, the doc string should explicitly tell that INPUT
must be a unibyte string.

A design question: would it make sense to support vectors as INPUT,
here and in the rest of the functions?

Another design question: should be support buffer regions, instead of
strings, as input to these functions?  The way your code is written,
Lisp programs must always cons a string from buffer text, before they
invoke these functions.  This could be gratuitous cost in some use
cases.

> +  (Lisp_Object input, Lisp_Object hash_method)
> +{
> +  Lisp_Object ret = Qnil;
> +
> +  CHECK_STRING (input);
> +  CHECK_STRING (hash_method);
> +
> +  for (int i = 0; NULL != nettle_hashes[i]; i++)
> +    {
> +      if (NETTLE_STRING_EQUAL_UNIBYTE (hash_method, nettle_hashes[i]->name))

I don't think it's a good idea to use strings for methods and other
such fixed parameters in your patch.  We usually use symbols for that.
The advantage of symbols is that you can test equality with EQ,
instead of the costly NETTLE_STRING_EQUAL_UNIBYTE, or any equivalent
code which compares strings.

> +          const struct nettle_hash *alg = nettle_hashes[i];
> +          unsigned int length = alg->digest_size;
> +          void *ctx = xzalloc (alg->context_size);
> +          uint8_t *digest;
> +          ctx = xzalloc (alg->context_size);
> +          alg->init (ctx);
> +          alg->update (ctx, SCHARS (input), SDATA (input));
> +
> +          digest = xzalloc (length);
> +          alg->digest (ctx, length, digest);
> +
> +          ret = make_unibyte_string ((const char*) digest, length);
> +
> +          xfree (digest);
> +          xfree (ctx);

Here and elsewhere, the size of the result is known in advance, so I
would avoid allocating a scratch buffer and then copying its data
(inside make_unibyte_string) into a newly-allocated string.  Instead,
use make_uninit_string, and then pass its string data pointer to the
algorithm that produces the digest.

> +          uint8_t *digest;
             ^^^^^^^
Why not 'unsigned char'?

> +DEFUN ("nettle-pbkdf2", Fnettle_pbkdf2, Snettle_pbkdf2, 5, 5, 0,
> +       doc: /* Make PBKDF2 of HASH-LENGTH from KEY with HASH-METHOD using ITERATIONS and SALT.
> +
> +The PBKDF2 data is stored in a unibyte string according to RFC 2898.
> +The list of hash-methods can be obtained with `nettle-hashes`.
> +The `sha1' and `sha256' hashing methods are most common and only supported now.  */)

This doc string doesn't document the ITERATIONS and SALT arguments.

> +DEFUN ("nettle-rsa-verify", Fnettle_rsa_verify, Snettle_rsa_verify, 4, 4, 0,
> +       doc: /* Verify the RSA SIGNATURE of DATA with PUBLIC-KEY and HASH-METHOD.
> +
> +The list of hash-methods can be obtained with `nettle-hashes`.
> +Only the `md5', `sha1', `sha256', and `sha512' hashing methods are supported.  */)

This doc string doesn't describe some of its argument, and doesn't
tell their data types.

> +DEFUN ("nettle-crypt", Fnettle_crypt, Snettle_crypt, 6, 6, 0,
> +       doc: /* Encrypt or decrypt INPUT in CRYPT-MODE with KEY, CIPHER, CIPHER-MODE, and IV.
> +
> +The INPUT will be zero-padded to be a multiple of the cipher's block size.
> +
> +The KEY will be zero-padded to the cipher's key size and will be
> +trimmed if it exceeds that key size.
> +
> +The list of ciphers can be obtained with `nettle-ciphers`.
> +The list of cipher modes can be obtained with `nettle-cipher-modes`.
> +The `aes256' cipher method is probably best for general use.
> +The `twofish256' cipher method may be better if you want to avoid NIST ciphers.  */)

This doesn't document the argument IV, and doesn't tell the data types
of the arguments.

> +  mode = CAR_SAFE (Fmember (cipher_mode, Fnettle_cipher_modes ()));

Wouldn't it be less expensive to access the data on the C level,
without consing a list?

> +          /* Increment input_length to the next multiple of block_size.  */
> +          if (0 != alg->block_size)
> +            {
> +              while (0 != (input_length % alg->block_size))
> +                {
> +                  input_length++;
> +                }
> +            }

I think there are more efficient ways of doing this, which don't need
an explicit loop.

> +          dest = xzalloc (input_length);
> +          memcpy (dest, SDATA (input), SCHARS (input));
> +          // message("Dest buffer: '%s' and size is %d", dest, input_length);

I don't understand why you copy the data here, instead of passing the
algorithm a pointer to the original string's data.

> +          key_hold = xzalloc (alg->key_size);
> +          memcpy (key_hold, SDATA (key), min (alg->key_size, SCHARS (key)));

Likewise here.  In addition, since you are using 'min', shouldn't we
signal an error if KEY is too long?

> +          if (0 == NETTLE_STRING_EQUAL_UNIBYTE (mode, "CBC"))
> +            {
> +              if (decrypt)
> +                {
> +                  cbc_decrypt (ctx, alg->decrypt,
> +                               alg->block_size, iv_hold,
> +                               input_length, dest, dest);
> +                }
> +              else
> +                {
> +                  cbc_encrypt (ctx, alg->encrypt,
> +                               alg->block_size, iv_hold,
> +                               input_length, dest, dest);
> +                }
> +            }
> +          else if (ctr_mode)
> +            {
> +              ctr_crypt (ctx, alg->encrypt,
> +                         alg->block_size, iv_hold,
> +                         input_length, dest, dest);
> +            }
> +          else
> +            {
> +              error ("Unexpected error: Nettle cipher mode %s was not found", SDATA (mode));
> +            }
> +
> +          // message("-> produced (%d) '%s'", input_length, dest);
> +          ret = make_unibyte_string ((const char*) dest, input_length);

Once again, since the length is known in advance, producing output
into an allocated buffer and then creating a Lisp string by copying
that buffer is wasteful.  It is better to produce output directly into
a string's data.

Thanks again for working on this.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-03-16 15:28         ` Eli Zaretskii
@ 2017-03-17 22:46           ` Ted Zlatanov
  2017-03-18  8:12             ` Eli Zaretskii
  2017-03-20 18:45           ` Ted Zlatanov
  2017-04-11 20:05           ` Ted Zlatanov
  2 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-03-17 22:46 UTC (permalink / raw)
  To: emacs-devel

On Thu, 16 Mar 2017 17:28:20 +0200 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Wed, 15 Mar 2017 17:19:32 -0400

>> * data is not wiped after use

EZ> I'm not an expert, but AFAIU this is a serious flaw in
EZ> security-related software.  Should this be optional (as it probably
EZ> has non-negligible run-time costs)?

I don't think the costs are huge, so this is definitely TODO before the
code is ready.

>> * it compiles but doesn't link: "error adding symbols: DSO missing from
>> command line" which I hope is something trivial

EZ> Did the -lhogweed -lnettle switches appear on the link command line?

Here's the full monster (I just did `./autogen.sh all && configure
--with-nettle && make' on a Ubuntu system):

gcc -Demacs  -I. -I. -I../lib -I../lib   -pthread -isystem /usr/include/gtk-3.0 -isystem /usr/include/at-spi2-atk/2.0 -isystem /usr/include/at-spi-2.0 -isystem /usr/include/dbus-1.0 -isystem /usr/lib/x86_64-linux-gnu/dbus-1.0/include -isystem /usr/include/gtk-3.0 -isystem /usr/include/gio-unix-2.0/ -isystem /usr/include/mirclient -isystem /usr/include/mircommon -isystem /usr/include/mircookie -isystem /usr/include/cairo -isystem /usr/include/pango-1.0 -isystem /usr/include/harfbuzz -isystem /usr/include/pango-1.0 -isystem /usr/include/atk-1.0 -isystem /usr/include/cairo -isystem /usr/include/pixman-1 -isystem /usr/include/freetype2 -isystem /usr/include/libpng16 -isystem /usr/include/gdk-pixbuf-2.0 -isystem /usr/include/libpng16 -isystem /usr/include/glib-2.0 -isystem /usr/lib/x86_64-linux-gnu/glib-2.0/include -isystem /usr/include/freetype2  -isystem /usr/include/alsa -pthread -isystem /usr/include/librsvg-2.0 -isystem /usr/include/gdk-pixbuf-2.0 -isystem /usr/include/libpng16 -isystem /usr/include/cairo -isystem /usr/include/glib-2.0 -isystem /usr/lib/x86_64-linux-gnu/glib-2.0/include -isystem /usr/include/pixman-1 -isystem /usr/include/freetype2 -isystem /usr/include/libpng16 -fopenmp -DMAGICKCORE_HDRI_ENABLE=0 -DMAGICKCORE_QUANTUM_DEPTH=16 -fopenmp -DMAGICKCORE_HDRI_ENABLE=0 -DMAGICKCORE_QUANTUM_DEPTH=16 -isystem /usr/include/x86_64-linux-gnu/ImageMagick-6 -isystem /usr/include/ImageMagick-6 -isystem /usr/include/x86_64-linux-gnu/ImageMagick-6 -isystem /usr/include/ImageMagick-6 -isystem /usr/include/libpng16 -isystem /usr/include/libxml2 -isystem /usr/include/dbus-1.0 -isystem /usr/lib/x86_64-linux-gnu/dbus-1.0/include      -pthread -isystem /usr/include/glib-2.0 -isystem /usr/lib/x86_64-linux-gnu/glib-2.0/include -pthread -isystem /usr/include/gconf/2 -isystem /usr/include/dbus-1.0 -isystem /usr/lib/x86_64-linux-gnu/dbus-1.0/include -isystem /usr/include/glib-2.0 -isystem /usr/lib/x86_64-linux-gnu/glib-2.0/include -isystem /usr/include/glib-2.0 -isystem /usr/lib/x86_64-linux-gnu/glib-2.0/include -isystem /usr/include/freetype2 -isystem /usr/include/freetype2 -isystem /usr/include/freetype2  -MMD -MF deps/.d -MP   -lnettle -isystem /usr/include/p11-kit-1    -fno-common -W -Wabi -Waddress -Waggressive-loop-optimizations -Wall -Wattributes -Wbool-compare -Wbuiltin-macro-redefined -Wcast-align -Wchar-subscripts -Wchkp -Wclobbered -Wcomment -Wcomments -Wcoverage-mismatch -Wcpp -Wdate-time -Wdeprecated -Wdeprecated-declarations -Wdesignated-init -Wdisabled-optimization -Wdiscarded-array-qualifiers -Wdiscarded-qualifiers -Wdiv-by-zero -Wdouble-promotion -Wduplicated-cond -Wempty-body -Wendif-labels -Wenum-compare -Wextra -Wformat-contains-nul -Wformat-extra-args -Wformat-security -Wformat-signedness -Wformat-y2k -Wformat-zero-length -Wframe-address -Wfree-nonheap-object -Whsa -Wignored-attributes -Wignored-qualifiers -Wimplicit -Wimplicit-function-declaration -Wimplicit-int -Wincompatible-pointer-types -Winit-self -Wint-conversion -Wint-to-pointer-cast -Winvalid-memory-model -Winvalid-pch -Wjump-misses-init -Wlogical-not-parentheses -Wlogical-op -Wmain -Wmaybe-uninitialized -Wmemset-transposed-args -Wmisleading-indentation -Wmissing-braces -Wmissing-declarations -Wmissing-include-dirs -Wmissing-parameter-type -Wmissing-prototypes -Wmultichar -Wnarrowing -Wnested-externs -Wnonnull -Wnonnull-compare -Wnull-dereference -Wodr -Wold-style-declaration -Wold-style-definition -Wopenmp-simd -Woverflow -Woverride-init -Wpacked -Wpacked-bitfield-compat -Wparentheses -Wpointer-arith -Wpointer-sign -Wpointer-to-int-cast -Wpragmas -Wreturn-local-addr -Wreturn-type -Wscalar-storage-order -Wsequence-point -Wshift-count-negative -Wshift-count-overflow -Wshift-negative-value -Wsizeof-array-argument -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstrict-prototypes -Wsuggest-attribute=format -Wsuggest-attribute=noreturn -Wsuggest-final-methods -Wsuggest-final-types -Wswitch-bool -Wtautological-compare -Wtrampolines -Wtrigraphs -Wuninitialized -Wunknown-pragmas -Wunused -Wunused-but-set-parameter -Wunused-but-set-variable -Wunused-function -Wunused-label -Wunused-local-typedefs -Wunused-macros -Wunused-result -Wunused-value -Wunused-variable -Wvarargs -Wvariadic-macros -Wvector-operation-performance -Wvolatile-register-var -Wwrite-strings -Warray-bounds=2 -Wnormalized=nfc -Wshift-overflow=2 -Wredundant-decls -Wno-missing-field-initializers -Wno-sign-compare -Wno-type-limits -Wno-unused-parameter -Wno-format-nonliteral -g3 -O2  -Wl,-znocombreloc  -no-pie  \
  -o temacs   dispnew.o frame.o scroll.o xdisp.o menu.o xmenu.o window.o charset.o coding.o category.o ccl.o character.o chartab.o bidi.o cm.o term.o terminal.o xfaces.o xterm.o xfns.o xselect.o xrdb.o xsmfns.o xsettings.o gtkutil.o emacsgtkfixed.o dbusbind.o emacs.o keyboard.o macros.o keymap.o sysdep.o buffer.o filelock.o insdel.o marker.o minibuf.o fileio.o dired.o cmds.o casetab.o casefiddle.o indent.o search.o regex.o undo.o alloc.o data.o doc.o editfns.o callint.o eval.o floatfns.o fns.o font.o print.o lread.o  syntax.o unexelf.o bytecode.o process.o gnutls.o nettle.o callproc.o region-cache.o sound.o atimer.o doprnt.o intervals.o textprop.o composite.o xml.o inotify.o  profiler.o decompress.o thread.o systhread.o sheap.o     xfont.o ftfont.o xftfont.o ftxfont.o  fontset.o fringe.o image.o xgselect.o  terminfo.o lastfile.o gmalloc.o     ../lib/libegnu.a       -ltiff -ljpeg -lpng16 -lgif -lXpm -lgtk-3 -lgdk-3 -lpangocairo-1.0 -lpango-1.0 -latk-1.0 -lcairo-gobject -lcairo -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lSM -lICE -lX11 -lX11-xcb -lxcb -lXrender -lXft -lasound -lrsvg-2 -lm -lgio-2.0 -lgdk_pixbuf-2.0 -lgobject-2.0 -lglib-2.0 -lcairo -lMagickWand-6.Q16 -lMagickCore-6.Q16 -lacl     -lrt -ldbus-1  -lXrandr -lXinerama -lXfixes -lXext -lxml2 -lgpm   -ltinfo  -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lgconf-2 -lglib-2.0 -lgobject-2.0 -lglib-2.0 -lselinux -lfreetype -lfontconfig -lfreetype -lfreetype -lotf -lfreetype  -lhogweed -lgnutls -lpthread -lanl  -lm -lz 

And the output:

  CCLD     temacs
/usr/bin/ld: nettle.o: undefined reference to symbol 'nettle_hmac_set_key@@NETTLE_6'
/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/libnettle.so: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
Makefile:612: recipe for target 'temacs' failed

I'm still not sure why it's failing. Is it the order of the libraries?
What does the error mean? I'm sure I had it working before.

EZ>  . the patch is missing additions to the manual and NEWS

Of course, those are TODOs.

EZ>  . support for dynamic loading on MS-Windows should be added

Can we agree that's a followup task? I don't think it should be a
blocker. I have neither the expertise nor a machine running this OS so
it needs a developer.

EZ>  . did you consider exposing this functionality through corresponding
EZ>    GnuTLS functions?

I'm not sure what you mean.

EZ> Specific comments follow:

>> +  # Windows loads libnettle dynamically
>> +  if test "${opsys}" = "mingw32"; then
>> +    LIBNETTLE_LIBS=
>> +  else
>> +    CFLAGS="$CFLAGS $LIBNETTLE_CFLAGS"
>> +    LIBS="$LIBNETTLE_LIBS $LIBS"
>> +  fi

EZ> CFLAGS should be set for the Windows build as well.  Only LIBS should
EZ> not be added.

OK.

>> diff --git a/lisp/nettle.el b/lisp/nettle.el
>> new file mode 100644
>> index 0000000000..6f49c3f031
>> --- /dev/null
>> +++ b/lisp/nettle.el
>> @@ -0,0 +1,335 @@
>> +;;; nettle.el --- Interface to libnettle/libhogweed  -*- lexical-binding: t; -*-

EZ> Actually, this does not seem to be an interface at all.  What you have
EZ> here are defcustom that seems not to be used anywhere, a bunch of
EZ> diagnostic functions useful for debugging, and tests that should be
EZ> moved elsewhere anyway.  Do we really need this file?

This will be the test code, I think. The defcustoms and functions will
mostly be useful for testing and debugging. I think when I did the
original patch the tests directory was not as evolved :)

Once the code compiles, I'll move this to the tests.

>> +;; Copyright (C) 2013  Teodor Zlatanov

EZ> This should cite the FSF as the copyright holder, I think.

Fixed.

>> +#ifdef HAVE_NETTLE
>> +      syms_of_nettle ();
>> +#endif

EZ> syms_of_nettle should be called only if !initialized.

Fixed.

>> +DEFUN ("nettle-available-p", Fnettle_available_p, Snettle_available_p, 0, 0, 0,

EZ> This function should be outside of the "#ifdef HAVE_NETTLE"
EZ> conditional, otherwise it will not be available in an Emacs built
EZ> without the library, and Lisp programs will need to use fboundp
EZ> instead, which all but defeats the purpose of the function.

Yup, fixed.

>> +DEFUN ("nettle-hash", Fnettle_hash, Snettle_hash, 2, 2, 0,
>> +       doc: /* Hash INPUT string with HASH-METHOD into a unibyte string.

EZ> Here and elsewhere, the doc string should explicitly tell that INPUT
EZ> must be a unibyte string.

Yeah, it's a pain.

EZ> A design question: would it make sense to support vectors as INPUT,
EZ> here and in the rest of the functions?

I think so--I was heading that way. Since I see this as a low-level
stateless interface, it makes sense to require consumers to use bindat,
and it would remove the ambiguity. Any objections?

EZ> Another design question: should be support buffer regions, instead of
EZ> strings, as input to these functions?  The way your code is written,
EZ> Lisp programs must always cons a string from buffer text, before they
EZ> invoke these functions.  This could be gratuitous cost in some use
EZ> cases.

Yes, but I think a low-level interface shouldn't know about buffers.
This has security implications as well--it's a little bit harder to see
the data that was passed to the function, but I feel that's not enough
to offset the complexity of translating from buffer data to C.

>> +  (Lisp_Object input, Lisp_Object hash_method)
>> +{
>> +  Lisp_Object ret = Qnil;
>> +
>> +  CHECK_STRING (input);
>> +  CHECK_STRING (hash_method);
>> +
>> +  for (int i = 0; NULL != nettle_hashes[i]; i++)
>> +    {
>> +      if (NETTLE_STRING_EQUAL_UNIBYTE (hash_method, nettle_hashes[i]->name))

EZ> I don't think it's a good idea to use strings for methods and other
EZ> such fixed parameters in your patch.  We usually use symbols for that.
EZ> The advantage of symbols is that you can test equality with EQ,
EZ> instead of the costly NETTLE_STRING_EQUAL_UNIBYTE, or any equivalent
EZ> code which compares strings.

Right. But I can't pre-define the symbols because I don't know what
hashes and ciphers will be dynamically available. So is it OK to check
if the symbol name equals a C string? Is there a fast safe way to do
that (considering that symbol names have a pretty rich charset)?

EZ> Here and elsewhere, the size of the result is known in advance, so I
EZ> would avoid allocating a scratch buffer and then copying its data
EZ> (inside make_unibyte_string) into a newly-allocated string.  Instead,
EZ> use make_uninit_string, and then pass its string data pointer to the
EZ> algorithm that produces the digest.

Got it, will be a TODO once compilation works.

>> +          uint8_t *digest;
EZ>              ^^^^^^^
EZ> Why not 'unsigned char'?

I was matching nettle-types.h which uses uint8_t. I've switched to
unsigned char.

EZ> This doc string doesn't document the ITERATIONS and SALT arguments.
(x several times)

Definitely TODO all these docs.

>> +  mode = CAR_SAFE (Fmember (cipher_mode, Fnettle_cipher_modes ()));

EZ> Wouldn't it be less expensive to access the data on the C level,
EZ> without consing a list?
...
EZ> I think there are more efficient ways of doing this, which don't need
EZ> an explicit loop.
...
EZ> I don't understand why you copy the data here, instead of passing the
EZ> algorithm a pointer to the original string's data.
...
EZ> Likewise here.  In addition, since you are using 'min', shouldn't we
EZ> signal an error if KEY is too long?

I'll put these on the TODO list.

EZ> Once again, since the length is known in advance, producing output
EZ> into an allocated buffer and then creating a Lisp string by copying
EZ> that buffer is wasteful.  It is better to produce output directly into
EZ> a string's data.

I don't know enough about the Lisp string internals to do this safely.

I've pushed your latest suggestions (the "fixed" ones above) to the branch.

I'll proceed with the TODOs (for now, simply dropped in nettle.c) as
time allows but I need the compilation to work.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-03-17 22:46           ` libnettle/libhogweed WIP Ted Zlatanov
@ 2017-03-18  8:12             ` Eli Zaretskii
  0 siblings, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-03-18  8:12 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Fri, 17 Mar 2017 18:46:29 -0400
> 
> >> * it compiles but doesn't link: "error adding symbols: DSO missing from
> >> command line" which I hope is something trivial
> 
> EZ> Did the -lhogweed -lnettle switches appear on the link command line?
> 
> Here's the full monster (I just did `./autogen.sh all && configure
> --with-nettle && make' on a Ubuntu system):

Looks like -lhogweed is not followed by -lnettle, perhaps that's the
problem.

>   CCLD     temacs
> /usr/bin/ld: nettle.o: undefined reference to symbol 'nettle_hmac_set_key@@NETTLE_6'
> /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/libnettle.so: error adding symbols: DSO missing from command line
> collect2: error: ld returned 1 exit status
> Makefile:612: recipe for target 'temacs' failed
> 
> I'm still not sure why it's failing. Is it the order of the libraries?

Probably.

> What does the error mean?

It means the linker didn't find that function in the libraries it
scanned after encountering the first reference to the function.

> EZ>  . support for dynamic loading on MS-Windows should be added
> 
> Can we agree that's a followup task?

Yes.  Once you get the branch to a good shape, someone should add that
to the branch, before it is merged.

> EZ>  . did you consider exposing this functionality through corresponding
> EZ>    GnuTLS functions?
> 
> I'm not sure what you mean.

AFAICT, GnuTLS exposes at least some of the functions you want to use,
so we could instead implement these APIs by calling GnuTLS.

> EZ> Another design question: should be support buffer regions, instead of
> EZ> strings, as input to these functions?  The way your code is written,
> EZ> Lisp programs must always cons a string from buffer text, before they
> EZ> invoke these functions.  This could be gratuitous cost in some use
> EZ> cases.
> 
> Yes, but I think a low-level interface shouldn't know about buffers.

Why not?

> Right. But I can't pre-define the symbols because I don't know what
> hashes and ciphers will be dynamically available.

You could intern all the symbols at startup time, or the first time
any of these functions is called.

> So is it OK to check if the symbol name equals a C string?

You have SYMBOL_NAME for that.  But I think interning the symbols at
startup is better.

> Is there a fast safe way to do
> that (considering that symbol names have a pretty rich charset)?

Not sure what you mean: since the internal representation is a
superset of UTF-8, strcmp should work just fine, no?

> EZ> Once again, since the length is known in advance, producing output
> EZ> into an allocated buffer and then creating a Lisp string by copying
> EZ> that buffer is wasteful.  It is better to produce output directly into
> EZ> a string's data.
> 
> I don't know enough about the Lisp string internals to do this safely.

That's the same comment about make_uninit_string above.

Thanks.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-03-16 15:28         ` Eli Zaretskii
  2017-03-17 22:46           ` libnettle/libhogweed WIP Ted Zlatanov
@ 2017-03-20 18:45           ` Ted Zlatanov
  2017-04-11 20:05           ` Ted Zlatanov
  2 siblings, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-03-20 18:45 UTC (permalink / raw)
  To: emacs-devel

On Thu, 16 Mar 2017 17:28:20 +0200 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> did you consider exposing this functionality through corresponding
EZ> GnuTLS functions?

Thanks for that suggestion. I looked into those crypto functions. They
look good so far. I'll look at rebasing my older test code and whatever
I can from gnutls.git/lib/crypto-selftests.c on top of these functions
and killing the libnettle/libhogweed dependency.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-03-16 15:28         ` Eli Zaretskii
  2017-03-17 22:46           ` libnettle/libhogweed WIP Ted Zlatanov
  2017-03-20 18:45           ` Ted Zlatanov
@ 2017-04-11 20:05           ` Ted Zlatanov
  2017-04-14 20:48             ` Ted Zlatanov
  2 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-11 20:05 UTC (permalink / raw)
  To: emacs-devel

I updated the branch scratch/tzz/nettle with comprehensive
tests. It now mirrors the low-level API in the GnuTLS crypto.h
functions, so there's no libnettle/libhogweed dependency. The API was
pretty different so I had to rewrite almost everything.

The tests show usage and exercise many edge cases (e.g. AEAD with a nil
auth string).

There are many things still missing but I have some notes following up
to Eli's previous comments and my own, and think it's good enough for
another review round. Just keep the following in mind.

* the AEAD ciphers "CHACHA20-POLY1305" "AES-128-CCM-8" "AES-256-CCM-8"
  are not working yet.

* For AEAD, I pin to GnuTLS 3.4.0 instead of AC_CHECK_FUNCS_ONCE because
  I couldn't get that autoconf macro to work!

* the ERT tests look at the environment variable `GNUTLS_TEST_VERBOSE=1'
  to trigger verbose behavior. I'm not sure if there's a better way, and
  would like verbosity control and maybe even per-test-tag output
  settings (to make a specific type of test more verbose). It feels like
  something ERT should provide.

* other TODO: add the PK algorithms

* bookkeeping TODO list before merge: doc strings, additions to the
  manual and NEWS

* should I cache `gnutls-macs' and `gnutls-ciphers'? I'm not sure. It
  seems unnecessary, these are very fast and produce small data structures.

* should I distinguish between an AEAD decryption failure (e.g. bad
  auth) and a general error? Right now both return nil, but I could have
  the decryption failure return 'fail.

* TODO from Eli: avoid allocating a scratch buffer and then copying its
  data (inside make_unibyte_string) into a newly-allocated string.
  Instead, use make_uninit_string.

* I believe all data is wiped at the C level by the GnuTLS API, but I
  don't make special efforts at the Lisp level to wipe inputs, keys,
  IVs, or auth strings. If you think it's worthwhile, let me know what's
  a good way to do it (or point me to an example in the C code).

Thanks
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-11 20:05           ` Ted Zlatanov
@ 2017-04-14 20:48             ` Ted Zlatanov
  2017-04-15  9:32               ` Eli Zaretskii
  2017-04-16  7:47               ` Toon Claes
  0 siblings, 2 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-14 20:48 UTC (permalink / raw)
  To: emacs-devel

On Tue, 11 Apr 2017 16:05:16 -0400 Ted Zlatanov <tzz@lifelogs.com> wrote: 

TZ> I updated the branch scratch/tzz/nettle with comprehensive
TZ> tests. It now mirrors the low-level API in the GnuTLS crypto.h
TZ> functions, so there's no libnettle/libhogweed dependency. The API was
TZ> pretty different so I had to rewrite almost everything.

I've put this work up on
https://gitlab.com/emacs-ci/emacs/merge_requests/2 for those who want to
use Gitlab to comment.

Updates from my previous post:

TZ> * the AEAD ciphers "CHACHA20-POLY1305" "AES-128-CCM-8" "AES-256-CCM-8"
TZ>   are not working yet.

This is fixed, all ciphers (AEAD and non-AEAD) work now.

TZ> * TODO from Eli: avoid allocating a scratch buffer and then copying its
TZ>   data (inside make_unibyte_string) into a newly-allocated string.
TZ>   Instead, use make_uninit_string.

I've done this as much as possible. For AEAD output, I'm not sure how to
set the length of an already-allocated string; I didn't want to modify
`s.size' directly. I didn't see a function or macro to do it. This is
needed for gnutls_symmetric_aead().

I also wanted to mention two possible avenues to explore:

1) how about a special C data structure that has to be called to get its
payload data, but can't be inspected or printed otherwise, and after N
reads is inaccessible? Would that be resistant to Lisp-level attempts to
grab the data?

2) Could there be a built-in C way to let C core functions take strings,
but callers can invoke them with '(buffer-string) to tell *the core
function* to call that form. In other words, I want the eval to be done
at the C level, so that looking at the call tree won't reveal the actual
string that was passed to the function. I think that would simplify my
code and other C code too. I can probably fake it with eval()? WDYT?

If that's not possible, I'll pull the first part of fns.c:secure_hash()
out into a common function for the gnutls.c crypto functions as well, so
they can take buffer contents etc. easily.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-14 20:48             ` Ted Zlatanov
@ 2017-04-15  9:32               ` Eli Zaretskii
  2017-04-15 14:27                 ` Ted Zlatanov
  2017-04-16  7:47               ` Toon Claes
  1 sibling, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-15  9:32 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Fri, 14 Apr 2017 16:48:39 -0400
> 
> TZ> * TODO from Eli: avoid allocating a scratch buffer and then copying its
> TZ>   data (inside make_unibyte_string) into a newly-allocated string.
> TZ>   Instead, use make_uninit_string.
> 
> I've done this as much as possible. For AEAD output, I'm not sure how to
> set the length of an already-allocated string; I didn't want to modify
> `s.size' directly. I didn't see a function or macro to do it. This is
> needed for gnutls_symmetric_aead().

I'm not sure I understand what you say here.  In particular, I see no
s.size in gnutls_symmetric_aead.  What did I miss?

I do see some issues in gnutls_symmetric_aead with how you create Lisp
strings.  You first correctly call make_uninit_string, which gives you
a unibyte string with no contents.  Then you populate that string's
data by calling gnutls_aead_cipher_encrypt resp. decrypt functions.
But then you call make_unibyte_string with the resulting data, which
is redundant: you already have the unibyte string with the correct
data in the 'storage' variable.  So you should just return 'storage',
like you do in, e.g., gnutls_symmetric.

I see your methods are still strings, whereas I suggested making them
symbols.  Any reasons why you didn't?

A minor nit: in doc strings, please always leave 2 spaces between
sentences, not 1.

In gnutls-ciphers, is it known in advance that all cipher names are
pure-ASCII?  If not, I'd advise against using make_unibyte_string
there, as Lisp data structures returned to Lisp programs should not
include unibyte non-ASCII strings unless strictly necessary.  Likewise
in gnutls-macs and in gnutls-digests.  (Of course, if methods become
symbols, this will become a moot point.)

> 1) how about a special C data structure that has to be called to get its
> payload data, but can't be inspected or printed otherwise, and after N
> reads is inaccessible? Would that be resistant to Lisp-level attempts to
> grab the data?

Only data structures defined via DEFVAR are accessible to Lisp, so
keeping the data in C and providing accessors for Lisp programs will
achieve the result, I think.  The accessor could wipe the data after N
accesses.

> 2) Could there be a built-in C way to let C core functions take strings,
> but callers can invoke them with '(buffer-string) to tell *the core
> function* to call that form. In other words, I want the eval to be done
> at the C level, so that looking at the call tree won't reveal the actual
> string that was passed to the function. I think that would simplify my
> code and other C code too. I can probably fake it with eval()? WDYT?

Why not simply pass nil as the input, with the meaning that it stands
for the current buffer's text?  Or, better yet, pass START and END to
allow a substring of current buffer's text.  We do that in a lot of
places (for different reasons, of course).

IOW, I see no reason to involve the Lisp interpreter for this job.  Am
I missing something?

Thanks.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-15  9:32               ` Eli Zaretskii
@ 2017-04-15 14:27                 ` Ted Zlatanov
  2017-04-15 14:55                   ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-15 14:27 UTC (permalink / raw)
  To: emacs-devel

On Sat, 15 Apr 2017 12:32:32 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Fri, 14 Apr 2017 16:48:39 -0400
>> 
TZ> * TODO from Eli: avoid allocating a scratch buffer and then copying its
TZ> data (inside make_unibyte_string) into a newly-allocated string.
TZ> Instead, use make_uninit_string.
>> 
>> I've done this as much as possible. For AEAD output, I'm not sure how to
>> set the length of an already-allocated string; I didn't want to modify
>> `s.size' directly. I didn't see a function or macro to do it. This is
>> needed for gnutls_symmetric_aead().

EZ> I'm not sure I understand what you say here.  In particular, I see no
EZ> s.size in gnutls_symmetric_aead.  What did I miss?

EZ> I do see some issues in gnutls_symmetric_aead with how you create Lisp
EZ> strings.  You first correctly call make_uninit_string, which gives you
EZ> a unibyte string with no contents.  Then you populate that string's
EZ> data by calling gnutls_aead_cipher_encrypt resp. decrypt functions.
EZ> But then you call make_unibyte_string with the resulting data, which
EZ> is redundant: you already have the unibyte string with the correct
EZ> data in the 'storage' variable.  So you should just return 'storage',
EZ> like you do in, e.g., gnutls_symmetric.

These two comments are related: for example, the decryption with
CAMELLIA-256-GCM produces less bytes of output that the input. But I
don't want to try to anticipate that byte count--it complicates the code
needlessly. So instead I want to cut the Lisp string `storage' to
`storage_length' bytes after gnutls_aead_cipher_{encrypt,decrypt}()
modifies `storage_length'. I can't find a macro or function to do it, so
I used make_unibyte_string() for now and am asking how to do it better.

EZ> I see your methods are still strings, whereas I suggested making them
EZ> symbols.  Any reasons why you didn't?

Forgot :) Done now.

EZ> A minor nit: in doc strings, please always leave 2 spaces between
EZ> sentences, not 1.

That was the auto reformat. Fixed, thanks.

EZ> Only data structures defined via DEFVAR are accessible to Lisp, so
EZ> keeping the data in C and providing accessors for Lisp programs will
EZ> achieve the result, I think.  The accessor could wipe the data after N
EZ> accesses.

OK, I'll work on that later.

>> 2) Could there be a built-in C way to let C core functions take strings,
>> but callers can invoke them with '(buffer-string) to tell *the core
>> function* to call that form. In other words, I want the eval to be done
>> at the C level, so that looking at the call tree won't reveal the actual
>> string that was passed to the function. I think that would simplify my
>> code and other C code too. I can probably fake it with eval()? WDYT?

EZ> Why not simply pass nil as the input, with the meaning that it stands
EZ> for the current buffer's text?  Or, better yet, pass START and END to
EZ> allow a substring of current buffer's text.  We do that in a lot of
EZ> places (for different reasons, of course).

EZ> IOW, I see no reason to involve the Lisp interpreter for this job.  Am
EZ> I missing something?

We're assuming that there's only three ways to pass data to the
function, and they can all be expressed in the parameters instead of
code: buffer-string, buffer-substring, and direct string. I think there
may be other use cases, but maybe I'm wrong? Streaming data, event
handlers, and coding system adjustments maybe?

This is not standardized for core C functions AFAIK; I don't want to
rewrite the first 150 lines of secure_hash() and extend them when I need
to support more ways to pass data. So I think this should be provided by
the core somehow, in a way that can be reused. I thought of the
following possibilities:

a) Could the core C functions use the `interactive' spec? That may be the
best way, but I don't know of any core C functions that use it.

b) Another way is to write a special core C function to interpret these
special parameters and give back text. I could start writing this
function with the first 150 lines of secure_hash() and then try to
standardize the parameters.

c) my earlier idea, to eval a form in the core C function, but that's
slow and awkward. It *could* be a little better for performance, if the
C function doesn't call the form when it's not needed. And it's very
flexible.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-15 14:27                 ` Ted Zlatanov
@ 2017-04-15 14:55                   ` Eli Zaretskii
  2017-04-16  2:39                     ` Ted Zlatanov
  2017-04-16  3:37                     ` libnettle/libhogweed WIP Stefan Monnier
  0 siblings, 2 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-15 14:55 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Sat, 15 Apr 2017 10:27:33 -0400
> 
> On Sat, 15 Apr 2017 12:32:32 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 
> 
> >> From: Ted Zlatanov <tzz@lifelogs.com>
> >> Date: Fri, 14 Apr 2017 16:48:39 -0400
> >> 
> TZ> * TODO from Eli: avoid allocating a scratch buffer and then copying its
> TZ> data (inside make_unibyte_string) into a newly-allocated string.
> TZ> Instead, use make_uninit_string.
> >> 
> >> I've done this as much as possible. For AEAD output, I'm not sure how to
> >> set the length of an already-allocated string; I didn't want to modify
> >> `s.size' directly. I didn't see a function or macro to do it. This is
> >> needed for gnutls_symmetric_aead().
> 
> EZ> I'm not sure I understand what you say here.  In particular, I see no
> EZ> s.size in gnutls_symmetric_aead.  What did I miss?
> 
> EZ> I do see some issues in gnutls_symmetric_aead with how you create Lisp
> EZ> strings.  You first correctly call make_uninit_string, which gives you
> EZ> a unibyte string with no contents.  Then you populate that string's
> EZ> data by calling gnutls_aead_cipher_encrypt resp. decrypt functions.
> EZ> But then you call make_unibyte_string with the resulting data, which
> EZ> is redundant: you already have the unibyte string with the correct
> EZ> data in the 'storage' variable.  So you should just return 'storage',
> EZ> like you do in, e.g., gnutls_symmetric.
> 
> These two comments are related: for example, the decryption with
> CAMELLIA-256-GCM produces less bytes of output that the input. But I
> don't want to try to anticipate that byte count--it complicates the code
> needlessly. So instead I want to cut the Lisp string `storage' to
> `storage_length' bytes after gnutls_aead_cipher_{encrypt,decrypt}()
> modifies `storage_length'. I can't find a macro or function to do it, so
> I used make_unibyte_string() for now and am asking how to do it better.

I think STRING_SET_CHARS is what you want here.

> >> 2) Could there be a built-in C way to let C core functions take strings,
> >> but callers can invoke them with '(buffer-string) to tell *the core
> >> function* to call that form. In other words, I want the eval to be done
> >> at the C level, so that looking at the call tree won't reveal the actual
> >> string that was passed to the function. I think that would simplify my
> >> code and other C code too. I can probably fake it with eval()? WDYT?
> 
> EZ> Why not simply pass nil as the input, with the meaning that it stands
> EZ> for the current buffer's text?  Or, better yet, pass START and END to
> EZ> allow a substring of current buffer's text.  We do that in a lot of
> EZ> places (for different reasons, of course).
> 
> EZ> IOW, I see no reason to involve the Lisp interpreter for this job.  Am
> EZ> I missing something?
> 
> We're assuming that there's only three ways to pass data to the
> function, and they can all be expressed in the parameters instead of
> code: buffer-string, buffer-substring, and direct string. I think there
> may be other use cases, but maybe I'm wrong? Streaming data, event
> handlers, and coding system adjustments maybe?

What other types of text except strings and buffer text are there?  I
think there are no others, and all the other features are necessarily
built on top of those two.

> This is not standardized for core C functions AFAIK; I don't want to
> rewrite the first 150 lines of secure_hash() and extend them when I need
> to support more ways to pass data.

I don't think I see the problem; can you give an example?

Buffer text is just a C string (with the slight complication of the
gap), so I don't understand why you'd need to rewrite code that much.

> a) Could the core C functions use the `interactive' spec? That may be the
> best way, but I don't know of any core C functions that use it.
> 
> b) Another way is to write a special core C function to interpret these
> special parameters and give back text. I could start writing this
> function with the first 150 lines of secure_hash() and then try to
> standardize the parameters.
> 
> c) my earlier idea, to eval a form in the core C function, but that's
> slow and awkward. It *could* be a little better for performance, if the
> C function doesn't call the form when it's not needed. And it's very
> flexible.

These seem to be complications for which I see no justifications for
now.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-15 14:55                   ` Eli Zaretskii
@ 2017-04-16  2:39                     ` Ted Zlatanov
  2017-04-16  6:25                       ` Eli Zaretskii
                                         ` (2 more replies)
  2017-04-16  3:37                     ` libnettle/libhogweed WIP Stefan Monnier
  1 sibling, 3 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-16  2:39 UTC (permalink / raw)
  To: emacs-devel

On Sat, 15 Apr 2017 17:55:00 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Sat, 15 Apr 2017 10:27:33 -0400
>> 
>> These two comments are related: for example, the decryption with
>> CAMELLIA-256-GCM produces less bytes of output that the input. But I
>> don't want to try to anticipate that byte count--it complicates the code
>> needlessly. So instead I want to cut the Lisp string `storage' to
>> `storage_length' bytes after gnutls_aead_cipher_{encrypt,decrypt}()
>> modifies `storage_length'. I can't find a macro or function to do it, so
>> I used make_unibyte_string() for now and am asking how to do it better.

EZ> I think STRING_SET_CHARS is what you want here.

Thank you for letting me know. That macro has a terrible name and should
probably be called STRING_SET_{SIZE,LENGTH} or similar. But yeah, it
seems right.

When I replace gnutls.c:1828

    return make_unibyte_string (SSDATA (storage), storage_length);

with

    STRING_SET_CHARS (storage, storage_length);
    return storage;

I get a segfault running the tests:

#+begin_src text
Thread 1 "emacs" received signal SIGSEGV, Segmentation fault.
compact_small_strings () at alloc.c:2227
2227		      nbytes = s ? STRING_BYTES (s) : SDATA_NBYTES (from);
[gdb]> bt
#0  compact_small_strings () at alloc.c:2227
#1  sweep_strings () at alloc.c:2159
#2  0x00000000005542d4 in gc_sweep () at alloc.c:7088
#3  garbage_collect_1 (end=<optimized out>) at alloc.c:5965
#4  Fgarbage_collect () at alloc.c:6096
#5  0x000000000056b1ef in maybe_gc () at lisp.h:4667
#6  eval_sub (form=0x14ef193) at eval.c:2128
...
[gdb]> p s
$1 = (struct Lisp_String *) 0x2020202020202020
[gdb]> p *s
Cannot access memory at address 0x2020202020202020

#+end_src

It happens during the tests with 100% certainty but I'm not sure what
test or string specifically caused it.

EZ> Buffer text is just a C string (with the slight complication of the
EZ> gap), so I don't understand why you'd need to rewrite code that much.

Like I said, the first 150 lines of secure_hash() demonstrate the
complexity. I'll just pull them out into a common function and use that,
and assume you're right that only buffer extraction and direct strings
are needed, and that the other ideas I had were not so good.

Thanks for your help
Ted




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-15 14:55                   ` Eli Zaretskii
  2017-04-16  2:39                     ` Ted Zlatanov
@ 2017-04-16  3:37                     ` Stefan Monnier
  2017-04-16  6:19                       ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Stefan Monnier @ 2017-04-16  3:37 UTC (permalink / raw)
  To: emacs-devel

> I think STRING_SET_CHARS is what you want here.

Really?  This sets ->size without changing ->size_bytes, so it doesn't
look right.  I think we don't have the function that Ted wants.
Basically, we'd need to provide a `resize_string_data` function, which
will have to massage the sdata blocks appropriately, paying attention to
LARGE_STRING_BYTES and things like that.
I think when shrinking the string, we can avoid copying in most cases,
but IIUC there are some cases where we can't avoid it (when the string
is smaller than LARGE_STRING_BYTES and we shrink it by too few bytes to
be able to place a new (free) sdata block in the freed space.
This said, it could get a bit ugly around the LARGE_STRING_BYTES
boundary because I can't see an easy way to efficiently figure out if
a string data was allocated into a large_sblock(s) or into a small block
(kept in oldest_sblock instead) so if the old size was larger than
LARGE_STRING_BYTES but the new size is smaller, we may have to add
a dummy "free sdata" as well (which again is only possible if the
shrinkage is sufficient to make room for such a dummy sdata)).

        Stefan

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-16  3:37                     ` libnettle/libhogweed WIP Stefan Monnier
@ 2017-04-16  6:19                       ` Eli Zaretskii
  2017-04-16 13:20                         ` Stefan Monnier
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-16  6:19 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 15 Apr 2017 23:37:57 -0400
> 
> > I think STRING_SET_CHARS is what you want here.
> 
> Really?  This sets ->size without changing ->size_bytes, so it doesn't
> look right.

This is a unibyte string, so size_byte is -1 anyway.  We do use this
macro in a few places in our sources; are you saying they are all
wrong?



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-16  2:39                     ` Ted Zlatanov
@ 2017-04-16  6:25                       ` Eli Zaretskii
  2017-04-16  6:51                       ` Eli Zaretskii
  2017-04-17 16:00                       ` rename STRING_SET_CHARS to STRING_SET_SIZE (was: libnettle/libhogweed WIP) Ted Zlatanov
  2 siblings, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-16  6:25 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Sat, 15 Apr 2017 22:39:57 -0400
> 
> EZ> Buffer text is just a C string (with the slight complication of the
> EZ> gap), so I don't understand why you'd need to rewrite code that much.
> 
> Like I said, the first 150 lines of secure_hash() demonstrate the
> complexity.

Most of that is encoding the text, which is not relevant to your
functions.  What's left is about 10 lines for a string and 30 for a
buffer substring, not too much IMO.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-16  2:39                     ` Ted Zlatanov
  2017-04-16  6:25                       ` Eli Zaretskii
@ 2017-04-16  6:51                       ` Eli Zaretskii
  2017-04-17 16:23                         ` Ted Zlatanov
  2017-04-17 16:00                       ` rename STRING_SET_CHARS to STRING_SET_SIZE (was: libnettle/libhogweed WIP) Ted Zlatanov
  2 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-16  6:51 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Sat, 15 Apr 2017 22:39:57 -0400
> 
> When I replace gnutls.c:1828
> 
>     return make_unibyte_string (SSDATA (storage), storage_length);
> 
> with
> 
>     STRING_SET_CHARS (storage, storage_length);
>     return storage;
> 
> I get a segfault running the tests:

An alternative is to use SAFE_ALLOCA to allocate a C string while you
manipulate it, and then convert it to a unibyte Lisp string with
make_unibyte_string, just before the function returns.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-14 20:48             ` Ted Zlatanov
  2017-04-15  9:32               ` Eli Zaretskii
@ 2017-04-16  7:47               ` Toon Claes
  1 sibling, 0 replies; 128+ messages in thread
From: Toon Claes @ 2017-04-16  7:47 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 526 bytes --]

> On 14 Apr 2017, at 22:48, Ted Zlatanov <tzz@lifelogs.com> wrote: 
> 
> I've put this work up on
> https://gitlab.com/emacs-ci/emacs/merge_requests/2 for those who want to
> use Gitlab to comment.

FYI: you can visit https://gitlab.com/emacs-ci/emacs/merge_requests/2.patch to see the plain text diff for all the changes made in Ted's branch. So if Ted pushes more changes, it's not needed to send the full patch over mail. 
This link does not require running any JavaScript and there is no need to log in. 

-- Toon

[-- Attachment #2: Type: text/html, Size: 1010 bytes --]

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-16  6:19                       ` Eli Zaretskii
@ 2017-04-16 13:20                         ` Stefan Monnier
  0 siblings, 0 replies; 128+ messages in thread
From: Stefan Monnier @ 2017-04-16 13:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> > I think STRING_SET_CHARS is what you want here.
>> Really?  This sets ->size without changing ->size_bytes, so it doesn't
>> look right.
> This is a unibyte string, so size_byte is -1 anyway.  We do use this
> macro in a few places in our sources; are you saying they are all
> wrong?

No, the three places where we use it are correct: they do not change the
byte-size, only the char-size.


        Stefan



^ permalink raw reply	[flat|nested] 128+ messages in thread

* rename STRING_SET_CHARS to STRING_SET_SIZE (was: libnettle/libhogweed WIP)
  2017-04-16  2:39                     ` Ted Zlatanov
  2017-04-16  6:25                       ` Eli Zaretskii
  2017-04-16  6:51                       ` Eli Zaretskii
@ 2017-04-17 16:00                       ` Ted Zlatanov
  2017-04-17 16:24                         ` rename STRING_SET_CHARS to STRING_SET_SIZE Eli Zaretskii
  2017-04-17 16:29                         ` Stefan Monnier
  2 siblings, 2 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-17 16:00 UTC (permalink / raw)
  To: emacs-devel

On Sat, 15 Apr 2017 22:39:57 -0400 Ted Zlatanov <tzz@lifelogs.com> wrote: 

TZ> On Sat, 15 Apr 2017 17:55:00 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> I think STRING_SET_CHARS is what you want here.

TZ> Thank you for letting me know. That macro has a terrible name and should
TZ> probably be called STRING_SET_{SIZE,LENGTH} or similar. But yeah, it
TZ> seems right.

Eli, anyone else, is it OK to do that renaming of STRING_SET_CHARS to
STRING_SET_SIZE? It's a small thing but annoys me :)

Ted




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-16  6:51                       ` Eli Zaretskii
@ 2017-04-17 16:23                         ` Ted Zlatanov
  2017-04-17 16:34                           ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-17 16:23 UTC (permalink / raw)
  To: emacs-devel

On Sun, 16 Apr 2017 09:25:13 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Sat, 15 Apr 2017 22:39:57 -0400
>> 
>> Like I said, the first 150 lines of secure_hash() demonstrate the
>> complexity.

EZ> Most of that is encoding the text, which is not relevant to your
EZ> functions.  What's left is about 10 lines for a string and 30 for a
EZ> buffer substring, not too much IMO.

Why do you think the new functions don't need it?

Either way I'll pull the code out into a shared function, but right now
my patch assumes the input is always unibyte and in a Lisp string, and
the return is always binary. secure_hash() does much better on both
input and output, respecting coding systems and multibyte strings, and
can produce binary or hex-encoded output. So I think all of that
extra code is useful.

On Sat, 15 Apr 2017 23:37:57 -0400 Stefan Monnier <monnier@iro.umontreal.ca> wrote: 

SM> I think we don't have the function that Ted wants. Basically, we'd
SM> need to provide a `resize_string_data` function

That seems pretty complicated. I'll leave the patch as is, doing an
extra copy, and add a TODO referencing this potential function.

Somewhat related--is there a sure way to wipe Lisp strings in C? I've done

    memset(SSDATA (storage), 0, storage_length);

but maybe that's not ideal. Does the core allow C functions to say "GC
this Lisp object right away and make sure it's wiped" or some subset of
that?

Thanks
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: rename STRING_SET_CHARS to STRING_SET_SIZE
  2017-04-17 16:00                       ` rename STRING_SET_CHARS to STRING_SET_SIZE (was: libnettle/libhogweed WIP) Ted Zlatanov
@ 2017-04-17 16:24                         ` Eli Zaretskii
  2017-04-17 16:29                         ` Stefan Monnier
  1 sibling, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-17 16:24 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 17 Apr 2017 12:00:56 -0400
> 
> Eli, anyone else, is it OK to do that renaming of STRING_SET_CHARS to
> STRING_SET_SIZE?

It is called STRING_SET_CHARS because it sets the same member that
SCHARS returns.  And CHARS is actually correct, because the 'size'
member counts string characters (as opposed to bytes).  So I don't
think there's a problem with the name here, you just need ti get used
to thinking about it as 'chars', not 'size'.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: rename STRING_SET_CHARS to STRING_SET_SIZE
  2017-04-17 16:00                       ` rename STRING_SET_CHARS to STRING_SET_SIZE (was: libnettle/libhogweed WIP) Ted Zlatanov
  2017-04-17 16:24                         ` rename STRING_SET_CHARS to STRING_SET_SIZE Eli Zaretskii
@ 2017-04-17 16:29                         ` Stefan Monnier
  2017-04-17 16:34                           ` Ted Zlatanov
  1 sibling, 1 reply; 128+ messages in thread
From: Stefan Monnier @ 2017-04-17 16:29 UTC (permalink / raw)
  To: emacs-devel

> Eli, anyone else, is it OK to do that renaming of STRING_SET_CHARS to
> STRING_SET_SIZE? It's a small thing but annoys me :)

That macro is only meant to change the number of chars in a string
without changing the number of bytes, so the current name doesn't sound
bad to me.


        Stefan




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 16:23                         ` Ted Zlatanov
@ 2017-04-17 16:34                           ` Eli Zaretskii
  2017-04-17 16:55                             ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-17 16:34 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 17 Apr 2017 12:23:29 -0400
> 
> On Sun, 16 Apr 2017 09:25:13 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 
> 
> >> Like I said, the first 150 lines of secure_hash() demonstrate the
> >> complexity.
> 
> EZ> Most of that is encoding the text, which is not relevant to your
> EZ> functions.  What's left is about 10 lines for a string and 30 for a
> EZ> buffer substring, not too much IMO.
> 
> Why do you think the new functions don't need it?

Because you already require unibyte text as input.  Unibyte text
doesn't need encoding, it's either ASCII or was already encoded before
calling these functions.

> Either way I'll pull the code out into a shared function, but right now
> my patch assumes the input is always unibyte and in a Lisp string, and
> the return is always binary. secure_hash() does much better on both
> input and output, respecting coding systems and multibyte strings, and
> can produce binary or hex-encoded output. So I think all of that
> extra code is useful.

If you want to allow multibyte input that needs to be encoded as part
of these functions, then yes.

> SM> I think we don't have the function that Ted wants. Basically, we'd
> SM> need to provide a `resize_string_data` function
> 
> That seems pretty complicated. I'll leave the patch as is, doing an
> extra copy, and add a TODO referencing this potential function.

Why not use my suggestion, producing a Lisp string out of C string
just before returning?

> Somewhat related--is there a sure way to wipe Lisp strings in C? I've done
> 
>     memset(SSDATA (storage), 0, storage_length);

I think you want clear-string, a.k.a. Fclear_string (although it does
almost exactly what you did).

> Does the core allow C functions to say "GC this Lisp object right
> away and make sure it's wiped" or some subset of that?

No.  And you are aware that GC doesn't wipe memory, only reshuffles it
and marks it free, yes?  So clearing the contents is required anyway,
and after that, why do you care when it will be GC'ed?



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: rename STRING_SET_CHARS to STRING_SET_SIZE
  2017-04-17 16:29                         ` Stefan Monnier
@ 2017-04-17 16:34                           ` Ted Zlatanov
  0 siblings, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-17 16:34 UTC (permalink / raw)
  To: emacs-devel

On Mon, 17 Apr 2017 12:29:11 -0400 Stefan Monnier <monnier@iro.umontreal.ca> wrote: 

>> Eli, anyone else, is it OK to do that renaming of STRING_SET_CHARS to
>> STRING_SET_SIZE? It's a small thing but annoys me :)

SM> That macro is only meant to change the number of chars in a string
SM> without changing the number of bytes, so the current name doesn't sound
SM> bad to me.

On Mon, 17 Apr 2017 19:24:28 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> It is called STRING_SET_CHARS because it sets the same member that
EZ> SCHARS returns.  And CHARS is actually correct, because the 'size'
EZ> member counts string characters (as opposed to bytes).  So I don't
EZ> think there's a problem with the name here, you just need ti get used
EZ> to thinking about it as 'chars', not 'size'.

All right, leaving this alone. Thanks for explaining.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 16:34                           ` Eli Zaretskii
@ 2017-04-17 16:55                             ` Ted Zlatanov
  2017-04-17 17:11                               ` Eli Zaretskii
                                                 ` (2 more replies)
  0 siblings, 3 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-17 16:55 UTC (permalink / raw)
  To: emacs-devel

On Mon, 17 Apr 2017 19:34:10 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> Because you already require unibyte text as input.  Unibyte text
EZ> doesn't need encoding, it's either ASCII or was already encoded before
EZ> calling these functions.
...
EZ> If you want to allow multibyte input that needs to be encoded as part
EZ> of these functions, then yes.

I see the confusion. Yes, I want to allow multibyte input. Users
shouldn't have to jump through hoops to use these functions.

SM> I think we don't have the function that Ted wants. Basically, we'd
SM> need to provide a `resize_string_data` function
>> 
>> That seems pretty complicated. I'll leave the patch as is, doing an
>> extra copy, and add a TODO referencing this potential function.

EZ> Why not use my suggestion, producing a Lisp string out of C string
EZ> just before returning?

I don't see the difference between allocating a C string +
make_unibyte_string(), and doing

  Lisp_Object storage = make_uninit_string (storage_length);
  ...
  return make_unibyte_string (SSDATA (storage), storage_length);

because either way the data has to be copied, and the latter needs less
care with freeing the memory. It's really not a big deal to switch to
your suggestion, I just don't know why it matters?

>> Somewhat related--is there a sure way to wipe Lisp strings in C? I've done
>> 
>> memset(SSDATA (storage), 0, storage_length);

EZ> I think you want clear-string, a.k.a. Fclear_string (although it does
EZ> almost exactly what you did).

Perfect, thanks.

>> Does the core allow C functions to say "GC this Lisp object right
>> away and make sure it's wiped" or some subset of that?

EZ> No.  And you are aware that GC doesn't wipe memory, only reshuffles it
EZ> and marks it free, yes?  So clearing the contents is required anyway,
EZ> and after that, why do you care when it will be GC'ed?

Understood, I'll use clear-string and worry less :) Thank you again.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 16:55                             ` Ted Zlatanov
@ 2017-04-17 17:11                               ` Eli Zaretskii
  2017-04-17 17:34                                 ` Ted Zlatanov
  2017-04-17 20:50                               ` Ted Zlatanov
  2017-04-19 12:22                               ` Stefan Monnier
  2 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-17 17:11 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 17 Apr 2017 12:55:32 -0400
> 
> EZ> Why not use my suggestion, producing a Lisp string out of C string
> EZ> just before returning?
> 
> I don't see the difference between allocating a C string +
> make_unibyte_string(), and doing
> 
>   Lisp_Object storage = make_uninit_string (storage_length);
>   ...
>   return make_unibyte_string (SSDATA (storage), storage_length);

It's a minor issue, admittedly.  make_unibyte_string allocates a Lisp
string, which does more than just allocate the string data (e.g., Lisp
memory allocation is non-reentrant, so it calls block_input).  It also
requires allocation whose alignment is potentially more stringent than
simple C memory allocation, certainly so if SAFE_ALLOCA decides to use
alloca, not malloc, so it slightly increases memory pressure.

IOW, it's a kind of aesthetic issue: code that allocates a Lisp string
just so it could use its data as storage, and then actually makes a
different string to return to the caller, looks less elegant to me.

> because either way the data has to be copied, and the latter needs less
> care with freeing the memory.

Using SAFE_ALLOCA is not complicated, either.

> It's really not a big deal to switch to your suggestion, I just
> don't know why it matters?

I hope the above explains why.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 17:11                               ` Eli Zaretskii
@ 2017-04-17 17:34                                 ` Ted Zlatanov
  2017-04-17 17:46                                   ` Ted Zlatanov
  2017-04-17 18:11                                   ` Eli Zaretskii
  0 siblings, 2 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-17 17:34 UTC (permalink / raw)
  To: emacs-devel

On Mon, 17 Apr 2017 20:11:32 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> Using SAFE_ALLOCA is not complicated, either.

OK, thanks for the advice. Here's the output when I used it (also pushed
to the branch). Let me know if I'm doing something wrong, or should go
ahead and silence the warnings:

    USE_SAFE_ALLOCA;
    unsigned char *storage = SAFE_ALLOCA (storage_length);

gnutls.c: In function ‘gnutls_symmetric_aead’:
gnutls.c:1839:45: warning: pointer targets in passing argument 1 of ‘make_unibyte_string’ differ in signedness [-Wpointer-sign]
   Lisp_Object output = make_unibyte_string (storage, storage_length);
                                             ^~~~~~~
In file included from gnutls.c:23:0:
lisp.h:3583:20: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
 extern Lisp_Object make_unibyte_string (const char *, ptrdiff_t);
                    ^~~~~~~~~~~~~~~~~~~
In file included from gnutls.c:23:0:
lisp.h:4425:47: warning: variable ‘sa_must_free’ set but not used [-Wunused-but-set-variable]
   ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false
                                               ^
gnutls.c:1769:3: note: in expansion of macro ‘USE_SAFE_ALLOCA’
   USE_SAFE_ALLOCA;
   ^~~~~~~~~~~~~~~
lisp.h:4425:13: warning: unused variable ‘sa_count’ [-Wunused-variable]
   ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false
             ^
gnutls.c:1769:3: note: in expansion of macro ‘USE_SAFE_ALLOCA’
   USE_SAFE_ALLOCA;
   ^~~~~~~~~~~~~~~




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 17:34                                 ` Ted Zlatanov
@ 2017-04-17 17:46                                   ` Ted Zlatanov
  2017-04-17 18:11                                   ` Eli Zaretskii
  1 sibling, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-17 17:46 UTC (permalink / raw)
  To: emacs-devel

On Mon, 17 Apr 2017 13:34:00 -0400 Ted Zlatanov <tzz@lifelogs.com> wrote: 

TZ> On Mon, 17 Apr 2017 20:11:32 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 
EZ> Using SAFE_ALLOCA is not complicated, either.

(...warning from where I forgot to do SAFE_FREE...)

Bah, never mind, sorry, I thought I had done SAFE_FREE() but somehow
missed it. Fixed, it's using SAFE_ALLOCA() with no warnings now.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 17:34                                 ` Ted Zlatanov
  2017-04-17 17:46                                   ` Ted Zlatanov
@ 2017-04-17 18:11                                   ` Eli Zaretskii
  1 sibling, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-17 18:11 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 17 Apr 2017 13:34:00 -0400
> 
>     USE_SAFE_ALLOCA;
>     unsigned char *storage = SAFE_ALLOCA (storage_length);
> 
> gnutls.c: In function ‘gnutls_symmetric_aead’:
> gnutls.c:1839:45: warning: pointer targets in passing argument 1 of ‘make_unibyte_string’ differ in signedness [-Wpointer-sign]
>    Lisp_Object output = make_unibyte_string (storage, storage_length);
>                                              ^~~~~~~
> In file included from gnutls.c:23:0:
> lisp.h:3583:20: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
>  extern Lisp_Object make_unibyte_string (const char *, ptrdiff_t);
>                     ^~~~~~~~~~~~~~~~~~~

You need to declare/initialize 'storage' like this:

  char *storage = SAFE_ALLOCA (storage_length);

> In file included from gnutls.c:23:0:
> lisp.h:4425:47: warning: variable ‘sa_must_free’ set but not used [-Wunused-but-set-variable]
>    ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false
>                                                ^
> gnutls.c:1769:3: note: in expansion of macro ‘USE_SAFE_ALLOCA’
>    USE_SAFE_ALLOCA;
>    ^~~~~~~~~~~~~~~
> lisp.h:4425:13: warning: unused variable ‘sa_count’ [-Wunused-variable]
>    ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false
>              ^
> gnutls.c:1769:3: note: in expansion of macro ‘USE_SAFE_ALLOCA’
>    USE_SAFE_ALLOCA;
>    ^~~~~~~~~~~~~~~

Not sure why this happens: SAFE_FREE uses these variables.  Perhaps
because you didn't call SAFE_FREE before the first 'return'?  Or maybe
it's a side effect of the warning?



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 16:55                             ` Ted Zlatanov
  2017-04-17 17:11                               ` Eli Zaretskii
@ 2017-04-17 20:50                               ` Ted Zlatanov
  2017-04-17 21:19                                 ` Noam Postavsky
  2017-04-18 17:44                                 ` Ted Zlatanov
  2017-04-19 12:22                               ` Stefan Monnier
  2 siblings, 2 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-17 20:50 UTC (permalink / raw)
  To: emacs-devel

On Mon, 17 Apr 2017 12:55:32 -0400 Ted Zlatanov <tzz@lifelogs.com> wrote: 

TZ> I see the confusion. Yes, I want to allow multibyte input. Users
TZ> shouldn't have to jump through hoops to use these functions.

OK, the code has been changed as follows:

* since I was in fns.c, I added `secure-hash-algorithms' so we know what
  algorithms are supported by `secure-hash'. At least for tests it's
  nice to have this.

* factored out extract_data_from_object() from secure_hash() which
  incidentally defines a data format for data extraction
  (BUFFER-OR-STRING INPUT-START INPUT-END CODING-SYSTEM NOERROR) because
  I really didn't want my functions to take 20+ parameters.

* secure_hash() could also take that format but I left it alone for now.

* migrated secure_hash() and all the new gnutls.c crypto code to
  use extract_data_from_object()

* added tests to verify that buffer extraction (whole buffer) works

* IV, AUTH, and KEY parameters are still wiped if they were strings
  originally.

phew...

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 20:50                               ` Ted Zlatanov
@ 2017-04-17 21:19                                 ` Noam Postavsky
  2017-04-17 23:29                                   ` Ted Zlatanov
  2017-04-18 17:44                                 ` Ted Zlatanov
  1 sibling, 1 reply; 128+ messages in thread
From: Noam Postavsky @ 2017-04-17 21:19 UTC (permalink / raw)
  To: Emacs developers

On Mon, Apr 17, 2017 at 4:50 PM, Ted Zlatanov <tzz@lifelogs.com> wrote:
>
> * IV, AUTH, and KEY parameters are still wiped if they were strings
>   originally.

Is there any need to wipe IV and AUTH? Unless I've misunderstood,
those are not expected to be secret anyway.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 21:19                                 ` Noam Postavsky
@ 2017-04-17 23:29                                   ` Ted Zlatanov
  2017-04-19  2:08                                     ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-17 23:29 UTC (permalink / raw)
  To: emacs-devel

On Mon, 17 Apr 2017 17:19:07 -0400 Noam Postavsky <npostavs@users.sourceforge.net> wrote: 

NP> On Mon, Apr 17, 2017 at 4:50 PM, Ted Zlatanov <tzz@lifelogs.com> wrote:
>> 
>> * IV, AUTH, and KEY parameters are still wiped if they were strings
>> originally.

NP> Is there any need to wipe IV and AUTH? Unless I've misunderstood,
NP> those are not expected to be secret anyway.

Yes, that was just mental inertia, thanks for catching it. I've removed
that. We don't want IVs to be reused for the same KEY, I'll work on that
as recommended in https://tools.ietf.org/html/rfc5116#section-3.2

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 20:50                               ` Ted Zlatanov
  2017-04-17 21:19                                 ` Noam Postavsky
@ 2017-04-18 17:44                                 ` Ted Zlatanov
  1 sibling, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-18 17:44 UTC (permalink / raw)
  To: emacs-devel

On Mon, 17 Apr 2017 16:50:48 -0400 Ted Zlatanov <tzz@lifelogs.com> wrote: 

TZ> * factored out extract_data_from_object() from secure_hash() which
TZ>   incidentally defines a data format for data extraction
TZ>   (BUFFER-OR-STRING INPUT-START INPUT-END CODING-SYSTEM NOERROR) because
TZ>   I really didn't want my functions to take 20+ parameters.

I'd like to allow files here, but I'd also like to avoid reading them
into a buffer or a string just to use with extract_data_from_object().

There's no Lisp_Object AFAIK to represent a file. So for
BUFFER-OR-STRING what do I use to indicate a file?

"file:///the/path" ; a special string format: URL format

(file "/the/path") ; a nested list with a symbol

(insert-file-contents-literally "/the/path") ; a form, called in a temp buffer?

If there's any precedent for this, let me know please.

Thanks
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 23:29                                   ` Ted Zlatanov
@ 2017-04-19  2:08                                     ` Ted Zlatanov
  2017-04-19  2:42                                       ` Noam Postavsky
                                                         ` (2 more replies)
  0 siblings, 3 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-19  2:08 UTC (permalink / raw)
  To: emacs-devel

Update on the patch, summarizing the last few days and restating all my
remaining questions. It's in the scratch/tzz/nettle branch of emacs.git
or at https://gitlab.com/emacs-ci/emacs/merge_requests/2

I appreciate all your help and review.

* all the ciphers, macs, and digests from the GnuTLS crypto API are
  available. I'll leave the PK crypto for later, this patch is huge
  already.

* docs are up to date with current code

* tests too

* We don't want IVs to be reused for the same KEY, I'll work on that as
  recommended in https://tools.ietf.org/html/rfc5116#section-3.2 with an
  internal IV counter that can't be overridden and increments every time
  it's utilized, which is not too bad.

* the KEY, if it's a string, is cleared by all the new functions that
  take it. I think it's best to only allow the key to come from a buffer
  or a file anyway, maybe even just a file. So this may not be needed
  later. In any case, I don't clear the INPUT or the rest of the
  parameters (thanks to Noam for noting that).

* factored out extract_data_from_object() from secure_hash() which
  incidentally defines a data format for data extraction
  (BUFFER-OR-STRING INPUT-START INPUT-END CODING-SYSTEM NOERROR). This
  supports coding systems etc. so it's a good reuse of the secure_hash
  code for the GnuTLS crypto API glue.

  I'd like to allow files here, but I'd also like to avoid reading them
  into a buffer or a string just to use with extract_data_from_object().

  There's no Lisp_Object AFAIK to represent a file. So for
  BUFFER-OR-STRING what do I use to indicate a file?

  "file:///the/path" ; a special string format: URL format?

  (file "/the/path") ; a nested list with a symbol?

  (insert-file-contents-literally "/the/path") ; a form, called in a temp buffer?

* added `secure-hash-algorithms' so we know what algorithms are
  supported by `secure-hash'. At least for tests it's nice to have this.

* I pin to GnuTLS 3.4.0 instead of AC_CHECK_FUNCS_ONCE because I
  couldn't get that autoconf macro to work! I would appreciate some help
  for how to use that macro for GnuTLS API functions. I think it needs
  to be told to include "gnutls/crypto.h" because the resulting C test
  doesn't.

* the ERT tests look at the environment variable `GNUTLS_TEST_VERBOSE=1'
  to trigger verbose behavior. I'm not sure if there's a better way, and
  would like verbosity control and maybe even per-test-tag output
  settings (to make a specific type of test more verbose). It feels like
  something ERT should provide.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19  2:08                                     ` Ted Zlatanov
@ 2017-04-19  2:42                                       ` Noam Postavsky
  2017-04-19 15:24                                       ` Davis Herring
  2017-04-19 15:45                                       ` Eli Zaretskii
  2 siblings, 0 replies; 128+ messages in thread
From: Noam Postavsky @ 2017-04-19  2:42 UTC (permalink / raw)
  To: Emacs developers

On Tue, Apr 18, 2017 at 10:08 PM, Ted Zlatanov <tzz@lifelogs.com> wrote:
>
> * I pin to GnuTLS 3.4.0 instead of AC_CHECK_FUNCS_ONCE because I
>   couldn't get that autoconf macro to work! I would appreciate some help
>   for how to use that macro for GnuTLS API functions. I think it needs
>   to be told to include "gnutls/crypto.h" because the resulting C test
>   doesn't.

Missing header would only be a warning. I guess what's actually
missing is the -lgnutls in $LIBS. Possibly you must use AC_CHECK_FUNCS
instead of AC_CHECK_FUNCS_ONCE since you need to manipulate $LIBS
around the call.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-17 16:55                             ` Ted Zlatanov
  2017-04-17 17:11                               ` Eli Zaretskii
  2017-04-17 20:50                               ` Ted Zlatanov
@ 2017-04-19 12:22                               ` Stefan Monnier
  2017-04-19 13:38                                 ` Ted Zlatanov
                                                   ` (2 more replies)
  2 siblings, 3 replies; 128+ messages in thread
From: Stefan Monnier @ 2017-04-19 12:22 UTC (permalink / raw)
  To: emacs-devel

> I see the confusion. Yes, I want to allow multibyte input. Users
> shouldn't have to jump through hoops to use these functions.

FWIW, I disagree.  You should check that either the string is unibyte
(AKA byte_size<0) or that it's ASCII-only (byte_size==size).

Forcing your users to (de|en)code explicitly before calling your
functions isn't making them jump through hoops: it's helping them have
the correct mental model of what "multibyte text" means.


        Stefan




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 12:22                               ` Stefan Monnier
@ 2017-04-19 13:38                                 ` Ted Zlatanov
  2017-04-19 14:16                                 ` Lars Ingebrigtsen
  2017-04-19 14:41                                 ` Eli Zaretskii
  2 siblings, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-19 13:38 UTC (permalink / raw)
  To: emacs-devel

On Wed, 19 Apr 2017 08:22:10 -0400 Stefan Monnier <monnier@iro.umontreal.ca> wrote: 

>> I see the confusion. Yes, I want to allow multibyte input. Users
>> shouldn't have to jump through hoops to use these functions.

SM> FWIW, I disagree.  You should check that either the string is unibyte
SM> (AKA byte_size<0) or that it's ASCII-only (byte_size==size).

SM> Forcing your users to (de|en)code explicitly before calling your
SM> functions isn't making them jump through hoops: it's helping them have
SM> the correct mental model of what "multibyte text" means.

I think inside the Emacs environment, users won't care too much about
the coding system or byte count, they'll just expect strings and buffers
to work. But you're right that for interoperability with other tools and
for file interactions, this is not trivial. And internally, I see it
can't be assumed that the decrypted text will have the same coding
system and unibyte/multibyte state as the original (this wasn't an issue
with `secure-hash' since it only does digests).

Since `secure-hash' supports multibyte in strings and buffers I think it
makes sense to be consistent, and that's why I factored out
extract_data_from_object(). So if we decide to make a change for how
multibyte text is treated, let's do that consistently, thinking about
interoperability and files, and consider the output as well. I think it
will require a payload structure. What do you think?

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 12:22                               ` Stefan Monnier
  2017-04-19 13:38                                 ` Ted Zlatanov
@ 2017-04-19 14:16                                 ` Lars Ingebrigtsen
  2017-04-19 14:48                                   ` Stefan Monnier
  2017-04-19 14:41                                 ` Eli Zaretskii
  2 siblings, 1 reply; 128+ messages in thread
From: Lars Ingebrigtsen @ 2017-04-19 14:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Forcing your users to (de|en)code explicitly before calling your
> functions isn't making them jump through hoops: it's helping them have
> the correct mental model of what "multibyte text" means.

I agree.  While "just encrypt this string already" is a tempting thing
to be able to say, encryption doesn't really work on characters, but on
octet streams, so it turns into a head-scratcher when you get a weird
result.

But Emacs does have functions already that are rather... vague... about
what they do in this area and people seem to live with the ambiguity:

(base64-decode-string (base64-encode-string "Héllo"))
=> "H\351llo"

Which is...  correct?  But only if you know that the Emacs represents
non-ASCII strings as (approx.) UTF-8.

The less confusion in this area the better, and especially for
encryption functions where you really want to get the correct text back
again, I think it would be better, long-term, to not allow non-unibyte
text inputs.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 12:22                               ` Stefan Monnier
  2017-04-19 13:38                                 ` Ted Zlatanov
  2017-04-19 14:16                                 ` Lars Ingebrigtsen
@ 2017-04-19 14:41                                 ` Eli Zaretskii
  2017-04-19 14:54                                   ` Stefan Monnier
  2017-04-19 15:48                                   ` Ted Zlatanov
  2 siblings, 2 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-19 14:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Wed, 19 Apr 2017 08:22:10 -0400
> 
> > I see the confusion. Yes, I want to allow multibyte input. Users
> > shouldn't have to jump through hoops to use these functions.
> 
> FWIW, I disagree.  You should check that either the string is unibyte
> (AKA byte_size<0) or that it's ASCII-only (byte_size==size).
> 
> Forcing your users to (de|en)code explicitly before calling your
> functions isn't making them jump through hoops: it's helping them have
> the correct mental model of what "multibyte text" means.

I tend to agree.

But maybe we should make a step back and discuss the various use cases
for passing non-ASCII text to these functions.  Why would a user want
that, and what would they expect to happen, when this is done from
within Emacs (as opposed to reading text from some file or accepting
it as a string).

I think it's important to discuss the expected results, because we
could avoid encoding the string, either inside or outside of the
functions, and just use its bytes instead, disregarding their
interpretation as characters.  The question is: would that yield what
users will want and expect?  The answer could be YES in some use cases
and NO in others.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 14:16                                 ` Lars Ingebrigtsen
@ 2017-04-19 14:48                                   ` Stefan Monnier
  0 siblings, 0 replies; 128+ messages in thread
From: Stefan Monnier @ 2017-04-19 14:48 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel

> But Emacs does have functions already that are rather... vague... about
> what they do in this area and people seem to live with the ambiguity:
> (base64-decode-string (base64-encode-string "Héllo"))
> => "H\351llo"

Yes, we have such cases of permissiveness in many places.
I think they're all errors and would rather not make it worse.

> The less confusion in this area the better, and especially for
> encryption functions where you really want to get the correct text back
> again, I think it would be better, long-term, to not allow non-unibyte
> text inputs.

My thoughts exactly.


        Stefan



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 14:41                                 ` Eli Zaretskii
@ 2017-04-19 14:54                                   ` Stefan Monnier
  2017-04-19 15:31                                     ` Eli Zaretskii
  2017-04-19 15:48                                   ` Ted Zlatanov
  1 sibling, 1 reply; 128+ messages in thread
From: Stefan Monnier @ 2017-04-19 14:54 UTC (permalink / raw)
  To: emacs-devel

> I think it's important to discuss the expected results, because we
> could avoid encoding the string, either inside or outside of the
> functions, and just use its bytes instead, disregarding their
> interpretation as characters.  The question is: would that yield what
> users will want and expect?  The answer could be YES in some use cases
> and NO in others.

Indeed, in some cases we might want to work on multibyte text without
encoding it at all.  Maybe we could have an argument specifying that the
caller intends to operate on the internal encoding.

But what shouldn't be in those functions is encoding/decoding: if
encoding/decoding is needed, then it should be done before/after calling
the functions.


        Stefan




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19  2:08                                     ` Ted Zlatanov
  2017-04-19  2:42                                       ` Noam Postavsky
@ 2017-04-19 15:24                                       ` Davis Herring
  2017-04-19 15:45                                       ` Eli Zaretskii
  2 siblings, 0 replies; 128+ messages in thread
From: Davis Herring @ 2017-04-19 15:24 UTC (permalink / raw)
  To: emacs-devel

>   There's no Lisp_Object AFAIK to represent a file. So for
>   BUFFER-OR-STRING what do I use to indicate a file?
>
>   "file:///the/path" ; a special string format: URL format?

Certainly not this one -- such a string could of course _be_ the data to 
be treated.

>   (file "/the/path") ; a nested list with a symbol?

I like this one best, I think.

>   (insert-file-contents-literally "/the/path") ; a form, called in a temp buffer?

Certainly it's powerful, but it clashes with lexical binding, and it's 
not clear (to me) how it's superior to the caller simply creating the 
temporary buffer themselves.  (I seem to recall you were talking about 
protecting the data from Lisp inspection?  ...But evalling whatever Lisp 
inside the buffer that holds the data, however temporarily, would of 
course be quite a hole in such a scheme.)

Davis

-- 
This product is sold by volume, not by mass.  If it appears too dense or 
too sparse, it is because mass-energy conversion has occurred during 
shipping.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 14:54                                   ` Stefan Monnier
@ 2017-04-19 15:31                                     ` Eli Zaretskii
  0 siblings, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-19 15:31 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Wed, 19 Apr 2017 10:54:55 -0400
> 
> Indeed, in some cases we might want to work on multibyte text without
> encoding it at all.  Maybe we could have an argument specifying that the
> caller intends to operate on the internal encoding.

Yes, could be.

> But what shouldn't be in those functions is encoding/decoding: if
> encoding/decoding is needed, then it should be done before/after calling
> the functions.

But if we pass buffer text, we could always encode using
buffer-file-coding-system, and IMO that would be the expected result
(provided that the user didn't want to use the internal
representation).  We do this in the likes of write-region, so why not
here?



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19  2:08                                     ` Ted Zlatanov
  2017-04-19  2:42                                       ` Noam Postavsky
  2017-04-19 15:24                                       ` Davis Herring
@ 2017-04-19 15:45                                       ` Eli Zaretskii
  2017-04-20 17:24                                         ` Ted Zlatanov
  2 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-19 15:45 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Tue, 18 Apr 2017 22:08:50 -0400
> 
> Update on the patch, summarizing the last few days and restating all my
> remaining questions. It's in the scratch/tzz/nettle branch of emacs.git
> or at https://gitlab.com/emacs-ci/emacs/merge_requests/2

Thanks for working on this, Ted.

> * docs are up to date with current code

The NEWS entry should mention the section in the manual which
describes these features.

Also, you consistently leave only one space between sentences, which
is not our convention.

I would suggest to extract the common description of the
(BUFFER-OR-STRING START END CODING-SYSTEM NOERROR) form, so you could
have it only once, instead of repeating it with each function.

>   I'd like to allow files here, but I'd also like to avoid reading them
>   into a buffer or a string just to use with extract_data_from_object().
> 
>   There's no Lisp_Object AFAIK to represent a file. So for
>   BUFFER-OR-STRING what do I use to indicate a file?
> 
>   "file:///the/path" ; a special string format: URL format?
> 
>   (file "/the/path") ; a nested list with a symbol?
> 
>   (insert-file-contents-literally "/the/path") ; a form, called in a temp buffer?

I think (file "FOO") is the best.  (I understand that the file will be
submitted to GnuTLS functions for processing, is that right?)



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 14:41                                 ` Eli Zaretskii
  2017-04-19 14:54                                   ` Stefan Monnier
@ 2017-04-19 15:48                                   ` Ted Zlatanov
  2017-04-19 16:49                                     ` Lars Ingebrigtsen
  1 sibling, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-19 15:48 UTC (permalink / raw)
  To: emacs-devel

On Wed, 19 Apr 2017 10:54:55 -0400 Stefan Monnier <monnier@iro.umontreal.ca> wrote: 

On Wed, 19 Apr 2017 17:41:52 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> I think it's important to discuss the expected results, because we
>> could avoid encoding the string, either inside or outside of the
>> functions, and just use its bytes instead, disregarding their
>> interpretation as characters.  The question is: would that yield what
>> users will want and expect?  The answer could be YES in some use cases
>> and NO in others.

SM> Indeed, in some cases we might want to work on multibyte text without
SM> encoding it at all.  Maybe we could have an argument specifying that the
SM> caller intends to operate on the internal encoding.

SM> But what shouldn't be in those functions is encoding/decoding: if
SM> encoding/decoding is needed, then it should be done before/after calling
SM> the functions.

That's exactly what's happening, except it's done in C code, by
extract_data_from_object() before the function is called. So I can make
it a requirement. But I think there's two distinct use cases:
"Emacs to Emacs buffer/string" and "Emacs to non-Emacs file/string" and
maybe they should be considered separately.

Eli wrote:
>> But if we pass buffer text, we could always encode using
>> buffer-file-coding-system, and IMO that would be the expected result
>> (provided that the user didn't want to use the internal
>> representation).  We do this in the likes of write-region, so why not
>> here?

Right, I think user expectations are important here.

Lars wrote:
>> The less confusion in this area the better, and especially for
>> encryption functions where you really want to get the correct text back
>> again, I think it would be better, long-term, to not allow non-unibyte
>> text inputs.

Not allow them where? Just the new stuff I'm adding? Or also
`secure-hash' and `md5' etc? Or anything that deals with crypto (which
could also affect EPG)?

If it's just the code in my patch, let me know what I need to change in
the way it calls extract_data_from_object() and I'll adjust the code,
the tests, and the docs. IIUC Stefan wants the call to fail if the
string or buffer is mulibyte and Lars agreed?

But we can also look to define a payload structure that would express
the coding system, unibyte/multibyte, etc. better.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 15:48                                   ` Ted Zlatanov
@ 2017-04-19 16:49                                     ` Lars Ingebrigtsen
  2017-04-19 17:24                                       ` Eli Zaretskii
  2017-04-19 19:49                                       ` Stefan Monnier
  0 siblings, 2 replies; 128+ messages in thread
From: Lars Ingebrigtsen @ 2017-04-19 16:49 UTC (permalink / raw)
  To: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

> Not allow them where? Just the new stuff I'm adding? Or also
> `secure-hash' and `md5' etc? Or anything that deals with crypto (which
> could also affect EPG)?

I was thinking just the new stuff you're adding, because the ship has
sailed for the older functions.  (I'd expect the older functions to
become obsolete after a while...)

By just allowing non-multibyte text as input, you avoid stuff like

---
If CODING-SYSTEM is nil or omitted, the default depends on OBJECT.  If
OBJECT is a buffer, the default for CODING-SYSTEM is whatever coding
system would be chosen by default for writing this text into a file.

If OBJECT is a string, the most preferred coding system (see the
command ‘prefer-coding-system’) is used.

If NOERROR is non-nil, silently assume the ‘raw-text’ coding if the
guesswork fails.  Normally, an error is signaled in such case.
---

in the needlessly complicated call signature to `md5'.  I mean, we
already have these functions for handling characters separately, if the
user needs to encode text, and re-stuffing them into the signatures of
the encryption functions is just confusing, I think.

That is, I think

(md5 (encode-coding-string "Héllo" 'iso-8859-1))

is a better interface than

(md5 "Héllo" nil nil 'iso-8859-1)

> If it's just the code in my patch, let me know what I need to change in
> the way it calls extract_data_from_object() and I'll adjust the code,
> the tests, and the docs. IIUC Stefan wants the call to fail if the
> string or buffer is mulibyte and Lars agreed?

Yes, but as Eli and Stefan said, perhaps there's something to be said
for just allowing encrypting the internal representation of Emacs
objects, too.  (Which is what base64-encode-string/base64-encode-region
does.)

This is more flexible, but is this a flexibility that's useful?  I'm not
sure.  I think in the base64 case, it's led to confusion over the years.

Anyway, I may have forgotten to say this: Whee!  Encryption functions in
Emacs!  Yay!  Thanks for implementing this stuff, Ted.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 16:49                                     ` Lars Ingebrigtsen
@ 2017-04-19 17:24                                       ` Eli Zaretskii
  2017-04-19 19:53                                         ` Stefan Monnier
  2017-04-19 19:49                                       ` Stefan Monnier
  1 sibling, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-19 17:24 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Wed, 19 Apr 2017 18:49:56 +0200
> 
> By just allowing non-multibyte text as input, you avoid stuff like
> 
> ---
> If CODING-SYSTEM is nil or omitted, the default depends on OBJECT.  If
> OBJECT is a buffer, the default for CODING-SYSTEM is whatever coding
> system would be chosen by default for writing this text into a file.
> 
> If OBJECT is a string, the most preferred coding system (see the
> command ‘prefer-coding-system’) is used.
> 
> If NOERROR is non-nil, silently assume the ‘raw-text’ coding if the
> guesswork fails.  Normally, an error is signaled in such case.
> ---
> 
> in the needlessly complicated call signature to `md5'.

We could simply use choose_write_coding_system instead, we already use
it for write-region, which can write either buffer text or a string.
Then all the complexity that bothers you will go away, and be no more
visible than it is when you call write-region.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 16:49                                     ` Lars Ingebrigtsen
  2017-04-19 17:24                                       ` Eli Zaretskii
@ 2017-04-19 19:49                                       ` Stefan Monnier
  1 sibling, 0 replies; 128+ messages in thread
From: Stefan Monnier @ 2017-04-19 19:49 UTC (permalink / raw)
  To: emacs-devel

> That is, I think
> (md5 (encode-coding-string "Héllo" 'iso-8859-1))
> is a better interface than
> (md5 "Héllo" nil nil 'iso-8859-1)

That's right.

> Yes, but as Eli and Stefan said, perhaps there's something to be said
> for just allowing encrypting the internal representation of Emacs
> objects, too.  (Which is what base64-encode-string/base64-encode-region
> does.)
> This is more flexible, but is this a flexibility that's useful?

This is useful for purposes like "check that the hash of a buffer is not
affected by an operation".  Where encoding would be more costly than
computing the hash and would bring no benefit (unless you specifically
want to ignore changes to the buffer which are not visible in the
encoded output).

> I think in the base64 case, it's led to confusion over the years.

I agree that for base64-encode-*, this optimisation doesn't make much
sense (when would you want to do (base64-encode (encoding-coding-string
FOO 'emacs-internal)) ?) so it only leads to bugs.
I think it was not a conscious choice but just an oversight in
the implementation.

        Stefan

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 17:24                                       ` Eli Zaretskii
@ 2017-04-19 19:53                                         ` Stefan Monnier
  2017-04-20  2:30                                           ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Stefan Monnier @ 2017-04-19 19:53 UTC (permalink / raw)
  To: emacs-devel

> We could simply use choose_write_coding_system instead, we already use
> it for write-region, which can write either buffer text or a string.
> Then all the complexity that bothers you will go away, and be no more
> visible than it is when you call write-region.

In my experience it's important for the coder to carefully choose which
coding-system should be used.  So if we want the functions to do
encoding internally, then I'd prefer we have a non-optional
coding-system argument.  "Automatically choose an appropriate
coding-system" encourages bugs.


        Stefan




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 19:53                                         ` Stefan Monnier
@ 2017-04-20  2:30                                           ` Eli Zaretskii
  2017-04-20  3:36                                             ` Stefan Monnier
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-20  2:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Wed, 19 Apr 2017 15:53:43 -0400
> 
> > We could simply use choose_write_coding_system instead, we already use
> > it for write-region, which can write either buffer text or a string.
> > Then all the complexity that bothers you will go away, and be no more
> > visible than it is when you call write-region.
> 
> In my experience it's important for the coder to carefully choose which
> coding-system should be used.  So if we want the functions to do
> encoding internally, then I'd prefer we have a non-optional
> coding-system argument.  "Automatically choose an appropriate
> coding-system" encourages bugs.

How is this different from write-region and its ilks?



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20  2:30                                           ` Eli Zaretskii
@ 2017-04-20  3:36                                             ` Stefan Monnier
  2017-04-20 15:46                                               ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Stefan Monnier @ 2017-04-20  3:36 UTC (permalink / raw)
  To: emacs-devel

>> In my experience it's important for the coder to carefully choose which
>> coding-system should be used.  So if we want the functions to do
>> encoding internally, then I'd prefer we have a non-optional
>> coding-system argument.  "Automatically choose an appropriate
>> coding-system" encourages bugs.
> How is this different from write-region and its ilks?

Off the top of my head:
- it's new, so we get to avoid past errors
- it's not a command
- it's not limited to interaction with files


        Stefan




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20  3:36                                             ` Stefan Monnier
@ 2017-04-20 15:46                                               ` Eli Zaretskii
  2017-04-20 15:59                                                 ` Lars Ingebrigtsen
  2017-04-20 17:14                                                 ` Stefan Monnier
  0 siblings, 2 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-20 15:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Wed, 19 Apr 2017 23:36:11 -0400
> 
> >> In my experience it's important for the coder to carefully choose which
> >> coding-system should be used.  So if we want the functions to do
> >> encoding internally, then I'd prefer we have a non-optional
> >> coding-system argument.  "Automatically choose an appropriate
> >> coding-system" encourages bugs.
> > How is this different from write-region and its ilks?
> 
> Off the top of my head:
> - it's new, so we get to avoid past errors

Not sure what past errors you had in mind.  Any errors we made in the
encoding/decoding department were fixed by Emacs 23, and the stuff is
remarkably stable since then, with a single minor improvement in Emacs
24.4.  From my POV, this one of the greatest success stories in Emacs.
Paid for with sweat, blood, and tears, but success nonetheless.  Why
would we want to refrain from reusing it?

> - it's not a command

I don't see how this is relevant: write-region is mostly used
non-interactively.

> - it's not limited to interaction with files

write-region was just an example; we use basically the same rules with
process I/O.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 15:46                                               ` Eli Zaretskii
@ 2017-04-20 15:59                                                 ` Lars Ingebrigtsen
  2017-04-20 16:24                                                   ` Eli Zaretskii
  2017-04-20 17:14                                                 ` Stefan Monnier
  1 sibling, 1 reply; 128+ messages in thread
From: Lars Ingebrigtsen @ 2017-04-20 15:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> > How is this different from write-region and its ilks?
>> 
>> Off the top of my head:
>> - it's new, so we get to avoid past errors
>
> Not sure what past errors you had in mind.  Any errors we made in the
> encoding/decoding department were fixed by Emacs 23, and the stuff is
> remarkably stable since then, with a single minor improvement in Emacs
> 24.4.  From my POV, this one of the greatest success stories in Emacs.
> Paid for with sweat, blood, and tears, but success nonetheless.  Why
> would we want to refrain from reusing it?

These are encryption primitives.  Stuffing code that guesses What I Mean
into these primitives doesn't seem like the most productive way to
proceed.  If somebody later wants to add a DWIM framework on top of
these primitives (to, say, automatically encrypt everything that Emacs
saves), that's a different thing.

But the primitives themselves should, in my opinion, be functional and
predictable: You give them explicit inputs and you always get the same
result back.

Encryption is also often used in conjunction with various protocols, and
(as opposed to saving files locally) the local user's preferences are
irrelevant to how the octets are supposed to be encoded on the wire.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 15:59                                                 ` Lars Ingebrigtsen
@ 2017-04-20 16:24                                                   ` Eli Zaretskii
  2017-04-20 17:25                                                     ` Stefan Monnier
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-20 16:24 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: monnier, emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,  emacs-devel@gnu.org
> Date: Thu, 20 Apr 2017 17:59:51 +0200
> 
> These are encryption primitives.  Stuffing code that guesses What I Mean
> into these primitives doesn't seem like the most productive way to
> proceed.  If somebody later wants to add a DWIM framework on top of
> these primitives (to, say, automatically encrypt everything that Emacs
> saves), that's a different thing.
> 
> But the primitives themselves should, in my opinion, be functional and
> predictable: You give them explicit inputs and you always get the same
> result back.

You can have predictability if you bind coding-system-for-write to
whatever you want.  Just like you do with file and process I/O.

> Encryption is also often used in conjunction with various protocols, and
> (as opposed to saving files locally) the local user's preferences are
> irrelevant to how the octets are supposed to be encoded on the wire.

How is this different from sending the stiff unencrypted via the same
protocols?

Once again, please tell why the considerations you raise do NOT apply
to write-region, insert-file-contents, send-string etc., because if
they do apply, we already have very reasonable solutions for them,
which are well-tested by time and many users.  In the I/O department,
experience shows that most users and programmers don't want/need this
level of control, so the heuristics that does TRT without burdening
users/programmers is a Good Thing.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 15:46                                               ` Eli Zaretskii
  2017-04-20 15:59                                                 ` Lars Ingebrigtsen
@ 2017-04-20 17:14                                                 ` Stefan Monnier
  2017-04-20 19:29                                                   ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Stefan Monnier @ 2017-04-20 17:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> - it's new, so we get to avoid past errors
> Not sure what past errors you had in mind.

Design errors with which we have to live.


        Stefan



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-19 15:45                                       ` Eli Zaretskii
@ 2017-04-20 17:24                                         ` Ted Zlatanov
  2017-04-20 19:38                                           ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-20 17:24 UTC (permalink / raw)
  To: emacs-devel

On Wed, 19 Apr 2017 18:45:40 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> The NEWS entry should mention the section in the manual which
EZ> describes these features.

I think I did it right? That NEWS format is really confusing.

EZ> Also, you consistently leave only one space between sentences, which
EZ> is not our convention.

Hrm, yeah, and I have my paragraph reformat set to use two spaces.
Weird. I'll try to watch for it but maybe it can also be a dir-locals
setting in the docs directory? Sorry in any case.

EZ> I would suggest to extract the common description of the
EZ> (BUFFER-OR-STRING START END CODING-SYSTEM NOERROR) form, so you could
EZ> have it only once, instead of repeating it with each function.

I'm not sure how to make that look good in the texi docs. Is there an
existing convention for defining a data format that will be reused in
function definitions?

It would be handy if I could refer to it from the C function docs too. I
think it would be generally useful for the C core and Lisp functions, so
it's worth making it look nice.

Finally, I'd like a similar format for output as well, since writing the
output to a buffer or a file is a very common need.

EZ> I think (file "FOO") is the best.  (I understand that the file will be
EZ> submitted to GnuTLS functions for processing, is that right?)

Since you and David agreed, I've made that the third allowed format. It
errors out for now. Should I use an internal buffer (not in the visible
list) and read into it, and then use the buffer data? Or should I use
emacs_open() etc. and actually implement a raw file read operation,
which should be a bit more secure and less subject to coding systems?

NP> On Tue, Apr 18, 2017 at 10:08 PM, Ted Zlatanov <tzz@lifelogs.com> wrote:
>> 
>> * I pin to GnuTLS 3.4.0 instead of AC_CHECK_FUNCS_ONCE because I
>> couldn't get that autoconf macro to work! I would appreciate some help
>> for how to use that macro for GnuTLS API functions. I think it needs
>> to be told to include "gnutls/crypto.h" because the resulting C test
>> doesn't.

NP> Missing header would only be a warning. I guess what's actually
NP> missing is the -lgnutls in $LIBS. Possibly you must use AC_CHECK_FUNCS
NP> instead of AC_CHECK_FUNCS_ONCE since you need to manipulate $LIBS
NP> around the call.

I tried that and a dozen other macros and ideas, and couldn't get any of
itto work. The headers were not getting included. I hate autoconf and
gave up for now.

On Wed, 19 Apr 2017 18:49:56 +0200 Lars Ingebrigtsen <larsi@gnus.org> wrote: 

LI> Ted Zlatanov <tzz@lifelogs.com> writes:

>> Not allow them where? Just the new stuff I'm adding? Or also
>> `secure-hash' and `md5' etc? Or anything that deals with crypto (which
>> could also affect EPG)?

LI> I was thinking just the new stuff you're adding, because the ship has
LI> sailed for the older functions.  (I'd expect the older functions to
LI> become obsolete after a while...)

OK, understood. So I'd only allow (BUFFER-OR-STRING-OR-FILE START-BYTE
END-BYTE NOERROR), and I'd error out if the buffer or string have
multibyte data.

Since Eli and Stefan are still discussing this, I'll hold off making
the change, but it looks like the consensus is to only do unibyte.

LI> I think in the base64 case, it's led to confusion over the years.

I think you're right. The (file "filename") input/output format spec
should make the coding system issues less important, since users won't
have to read the file into a string or buffer in order to operate on it.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 16:24                                                   ` Eli Zaretskii
@ 2017-04-20 17:25                                                     ` Stefan Monnier
  2017-04-20 19:40                                                       ` Lars Ingebrigtsen
  2017-04-20 19:58                                                       ` Eli Zaretskii
  0 siblings, 2 replies; 128+ messages in thread
From: Stefan Monnier @ 2017-04-20 17:25 UTC (permalink / raw)
  To: emacs-devel

> You can have predictability if you bind coding-system-for-write to
> whatever you want.

Why would the casual coder do that?  He'll just call the primitive, see
that it works for his use-case and not even realize that there's some
encoding/decoding, encouraging him along to way to overlook the big
difference between strings of chars and strings of bytes.

And by the time that carelessness bites the user, it'll be a lot more
difficult to track down its source and fix it, because the
misunderstanding will have carried over various parts of the design of
his code.  Lars understands this very well because Gnus's code is full
of such problems.

        Stefan

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 17:14                                                 ` Stefan Monnier
@ 2017-04-20 19:29                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-20 19:29 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: emacs-devel@gnu.org
> Date: Thu, 20 Apr 2017 13:14:26 -0400
> 
> >> - it's new, so we get to avoid past errors
> > Not sure what past errors you had in mind.
> 
> Design errors with which we have to live.

If you are serious, please describe the details of those design
errors.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 17:24                                         ` Ted Zlatanov
@ 2017-04-20 19:38                                           ` Eli Zaretskii
  2017-04-20 20:24                                             ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-20 19:38 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Thu, 20 Apr 2017 13:24:50 -0400
> 
> On Wed, 19 Apr 2017 18:45:40 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 
> 
> EZ> The NEWS entry should mention the section in the manual which
> EZ> describes these features.
> 
> I think I did it right?

Not sure what you mean.  I just did "git pull", and your branch's NEWS
doesn't mention the manual section where this is described.  Something
like "See the node 'FOO' in the ELisp manual for more details."  Am I
blind?

> EZ> I would suggest to extract the common description of the
> EZ> (BUFFER-OR-STRING START END CODING-SYSTEM NOERROR) form, so you could
> EZ> have it only once, instead of repeating it with each function.
> 
> I'm not sure how to make that look good in the texi docs. Is there an
> existing convention for defining a data format that will be reused in
> function definitions?

Something like this at the beginning of the node:

  All of the functions described here accept argument lists of the
  form @code{(BUFFER-OR-STRING START END CODING-SYSTEM NOERROR)},
  where BUFFER-OR-STRING ...

> EZ> I think (file "FOO") is the best.  (I understand that the file will be
> EZ> submitted to GnuTLS functions for processing, is that right?)
> 
> Since you and David agreed, I've made that the third allowed format. It
> errors out for now. Should I use an internal buffer (not in the visible
> list) and read into it, and then use the buffer data? Or should I use
> emacs_open() etc. and actually implement a raw file read operation,
> which should be a bit more secure and less subject to coding systems?

I'm confused.  I thought this form of an argument will cause the file
name be passed to GnuTLS functions, and GnuTLS will then open the file
and read its data.  If that's not true, and Emacs needs to read the
file anyway, then what's the purpose of having this form of the
argument?  How is it better than just reading the file into a
temporary buffer and submitting the buffer to the function?  I thought
you wanted to avoid having the file's contents in Emacs's memory for
security reasons.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 17:25                                                     ` Stefan Monnier
@ 2017-04-20 19:40                                                       ` Lars Ingebrigtsen
  2017-04-20 20:31                                                         ` Eli Zaretskii
  2017-04-20 19:58                                                       ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Lars Ingebrigtsen @ 2017-04-20 19:40 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> And by the time that carelessness bites the user, it'll be a lot more
> difficult to track down its source and fix it, because the
> misunderstanding will have carried over various parts of the design of
> his code.  Lars understands this very well because Gnus's code is full
> of such problems.

Indeed.  Even trivial things like forwarding files drifts irregularly
between states of "works now" and "destroys your files" as Gnus fights
and tries to work with the ever-changing Emacs DWIM machinery.

What's right for interactive use are often not the correct primitives
for implementing protocols.  But if you have predictable primitives, you
can make Emacs DWIM interactively.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 17:25                                                     ` Stefan Monnier
  2017-04-20 19:40                                                       ` Lars Ingebrigtsen
@ 2017-04-20 19:58                                                       ` Eli Zaretskii
  2017-04-20 20:36                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-20 19:58 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Thu, 20 Apr 2017 13:25:43 -0400
> 
> > You can have predictability if you bind coding-system-for-write to
> > whatever you want.
> 
> Why would the casual coder do that?

Casual coders won't.  And why should they?

> He'll just call the primitive, see
> that it works for his use-case and not even realize that there's some
> encoding/decoding, encouraging him along to way to overlook the big
> difference between strings of chars and strings of bytes.

I don't see any problem with that.  Perhaps you could try explaining
why you think there's a problem in this scenario.  E.g., compare:

  . a C program reads a file encoded in Latin-1 and passes the text it
    read to a function that encrypts it

and

  . an Emacs Lisp program reads a file encoded in Latin-1 into a
    buffer, then passes that buffer to a function that encrypts it

How are these two different, from the user's POV?  If they are not
different, then in the 2nd case encoding the buffer in Latin-1 before
passing the bytes to GnuTLS is TRT.

> Lars understands this very well because Gnus's code is full of such
> problems.

Gnus us full of such problems because some Gnus contributors tried to
fix problems with non-ASCII text without understanding what they are
doing and how this stuff should work, and also because Gnus messes too
much with unibyte text.  Let's "learn from past mistakes" and not do
this with these new functions.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 19:38                                           ` Eli Zaretskii
@ 2017-04-20 20:24                                             ` Ted Zlatanov
  2017-04-20 20:42                                               ` Lars Ingebrigtsen
  2017-04-21  6:14                                               ` Eli Zaretskii
  0 siblings, 2 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-20 20:24 UTC (permalink / raw)
  To: emacs-devel

On Thu, 20 Apr 2017 22:38:05 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> On Wed, 19 Apr 2017 18:45:40 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> Not sure what you mean.  I just did "git pull", and your branch's NEWS
EZ> doesn't mention the manual section where this is described.  Something
EZ> like "See the node 'FOO' in the ELisp manual for more details."  Am I
EZ> blind?

Oh! I see now. I thought there was some special formatting in NEWS to
link to nodes. I've now done as you suggested.

Incidentally it would be nice if NEWS (Outline mode) supported clickable
links to the manuals like "(elisp) Formatting Strings" or
(info "(elisp) Formatting Strings") or something else. Right now it's a
bit awkward.

EZ> Something like this at the beginning of the node:

EZ>   All of the functions described here accept argument lists of the
EZ>   form @code{(BUFFER-OR-STRING START END CODING-SYSTEM NOERROR)},
EZ>   where BUFFER-OR-STRING ...

OK, but I want to link to that from the C function docs too, since they
also will reference that format. Would I link to the manual node? Is
that OK for C functions, which are kind of standalone as far as docs go?

EZ> I think (file "FOO") is the best.  (I understand that the file will be
EZ> submitted to GnuTLS functions for processing, is that right?)
>> 
>> Since you and David agreed, I've made that the third allowed format. It
>> errors out for now. Should I use an internal buffer (not in the visible
>> list) and read into it, and then use the buffer data? Or should I use
>> emacs_open() etc. and actually implement a raw file read operation,
>> which should be a bit more secure and less subject to coding systems?

EZ> I'm confused.  I thought this form of an argument will cause the file
EZ> name be passed to GnuTLS functions, and GnuTLS will then open the file
EZ> and read its data.  If that's not true, and Emacs needs to read the
EZ> file anyway, then what's the purpose of having this form of the
EZ> argument?  How is it better than just reading the file into a
EZ> temporary buffer and submitting the buffer to the function?  I thought
EZ> you wanted to avoid having the file's contents in Emacs's memory for
EZ> security reasons.

Sorry for the confusion; let me explain better.

extract_data_from_object() is the API; any function that uses it
(including secure_hash()) can benefit. I didn't want to write special
cases in gnutls.c for file vs. buffer vs. string handling.

So I said one approach is to use an internal (not visible to Lisp code)
buffer and read the file into that, so the extract_data_from_object()
function doesn't change much, and files reuse the buffer functionality.
One advantage of that approach is that it would respect coding systems
automatically, which seemed important earlier. But I should have been
clear that it was just one possible approach.

Then I listed another approach: to read the file directly with
extract_data_from_object(). Based on your comments, and based on how
we're sort of agreeing to limit the code to unibyte data, it's becoming
clear that this is the better way. I also think it's more secure and
less ambiguous. So I'll implement it.

Thus extract_data_from_object() would allocate and read N bytes into a C
memory buffer and pass that on, and the caller would have to free it.
I'll see how that works out in code, but shouldn't be too bad. It will
be a single place to review and fix, at least.

I think I mentioned already that I am also considering how to specify
the *output* of these functions better. It feels like it should parallel
the same BUFFER-OR-STRING-OR-FILE structure so the functions can write
to the desired place with minimal unintended visibility from Lisp.

Thanks!
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 19:40                                                       ` Lars Ingebrigtsen
@ 2017-04-20 20:31                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-20 20:31 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: monnier, emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Thu, 20 Apr 2017 21:40:14 +0200
> Cc: emacs-devel@gnu.org
> 
> What's right for interactive use are often not the correct primitives
> for implementing protocols.

write-region, insert-file-contents, send-string, etc. _are_
primitives.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 19:58                                                       ` Eli Zaretskii
@ 2017-04-20 20:36                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-20 20:36 UTC (permalink / raw)
  To: monnier; +Cc: emacs-devel

> Date: Thu, 20 Apr 2017 22:58:09 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> Gnus us full of such problems because some Gnus contributors tried to
> fix problems with non-ASCII text without understanding what they are
> doing and how this stuff should work

Ouch, this sounds unfair, which is not what I meant.  So let me
elaborate a bit to explain myself.

Gnus bumped into issues with non-ASCII text when the core
functionality was still in diapers, and had quite a few bugs, both
design bugs and implementation bugs.  Gnus developers wanted to fix
those bugs, but instead of taking them up with core developers and
demanding that the problems be fixed in core, they worked around the
problems in Gnus application code.  The fact that Gnus was developed
with only a very lose coupling to Emacs didn't help.  Then those
workarounds were kept intact long after the core issues were fixed or
redesigned.  The result was what you describe.

IMO, none of this is relevant to the issue at hand, since we are
talking about core functionality to begin with.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 20:24                                             ` Ted Zlatanov
@ 2017-04-20 20:42                                               ` Lars Ingebrigtsen
  2017-04-20 21:54                                                 ` Ted Zlatanov
  2017-04-21  6:14                                               ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Lars Ingebrigtsen @ 2017-04-20 20:42 UTC (permalink / raw)
  To: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

> Then I listed another approach: to read the file directly with
> extract_data_from_object(). Based on your comments, and based on how
> we're sort of agreeing to limit the code to unibyte data, it's becoming
> clear that this is the better way. I also think it's more secure and
> less ambiguous. So I'll implement it.
>
> Thus extract_data_from_object() would allocate and read N bytes into a C
> memory buffer and pass that on, and the caller would have to free it.
> I'll see how that works out in code, but shouldn't be too bad. It will
> be a single place to review and fix, at least.

While we're bikeshedding interfaces.  :-)

 -- Function: gnutls-symmetric-encrypt cipher key iv input &optional
          aead_auth
     The CIPHER can be the whole plist from ‘gnutls-ciphers’, or just
     the symbol key, or a string with the name of that symbol.

     The KEY can be a string or a buffer or a list in the format
     (BUFFER-OR-STRING START END CODING-SYSTEM NOERROR) where only the
     first element is required.  The KEY will be wiped after use if it’s
     a string.

     The IV and INPUT and the optional AEAD_AUTH can be a string or a
     buffer or a list in the format (BUFFER-OR-STRING START END
     CODING-SYSTEM NOERROR) where only the first element is required.

     In the list form of KEY, IV, INPUT, and the optional AEAD_AUTH you
     can specify a buffer or a string, an optional range to be
     extracted, and an optional coding system.  The last optional item,
     NOERROR, overrides the normal error when the text can’t be encoded
     using the specified or chosen coding system.  When NOERROR is
     non-‘nil’, this function silently uses ‘raw-text’ coding instead.
     This is similar to how ‘md5’ and ‘secure-hash’ operate but the
     arguments are packed in a list.

To me, as a naive application programmer that apparently doesn't
understand anything, this is all rather more complicated and more
flexible than optimal.  :-)

When encrypting something, you usually have a (short) passphrase from
somewhere (KEY), a (short) salt (the binary IV data) and the (long) text
(INPUT).  To me, this suggests that KEY, IV and INPUT doesn't really have
to allow all these various input types: We can limit KEY and IV to be
strings, and INPUT can be both a string and a buffer.  That's really the
use case for 99.72% of the cases, isn't it?

Then the natural call signature here would be

gnutls-symmetric-encrypt cipher key iv input &optional aead-auth start end

If input is a buffer, then the text in the buffer could perhaps be
replaced with the encrypted data, a la many other transformation
functions.

In any case, the `file' case you're discussing here doesn't really feel
that useful, but also makes things more complicated.  If the user wants
to encrypt a file, then it's more flexible to just have the caller
insert the file into a buffer and call the function as normal:

(with-temp-buffer
  (set-buffer-multibyte nil)
  (insert-file-contents "/tmp/rms.jpg")
  (gnutls-symmetric-encrypt 'AES-256-CBC key iv (current-buffer))
  (write-region ...))

Your call signatures are extremely flexible, and as such I think they
may be a bit overwhelming.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 20:42                                               ` Lars Ingebrigtsen
@ 2017-04-20 21:54                                                 ` Ted Zlatanov
  2017-04-21  6:21                                                   ` Eli Zaretskii
  2017-04-21 18:45                                                   ` Lars Ingebrigtsen
  0 siblings, 2 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-04-20 21:54 UTC (permalink / raw)
  To: emacs-devel

On Thu, 20 Apr 2017 22:42:37 +0200 Lars Ingebrigtsen <larsi@gnus.org> wrote: 

LI> While we're bikeshedding interfaces.  :-)

Damn it.

LI> When encrypting something, you usually have a (short) passphrase from
LI> somewhere (KEY), a (short) salt (the binary IV data) and the (long) text
LI> (INPUT).  To me, this suggests that KEY, IV and INPUT doesn't really have
LI> to allow all these various input types: We can limit KEY and IV to be
LI> strings, and INPUT can be both a string and a buffer.  That's really the
LI> use case for 99.72% of the cases, isn't it?

The KEY is secret and ideally would come from a file and never be
seen at the Lisp level. But tests and other use cases may need it from a
buffer (more secure but still accessible to Lisp) or a string (visible
to all as a function parameter).

Getting the INPUT from a file enables large files (not in the first
version probably) and other interesting use cases.

The IV will probably be internal and part of the payload. It probably
won't be part of the function signature normally.

So I think the flexibility is important.

LI> In any case, the `file' case you're discussing here doesn't really feel
LI> that useful, but also makes things more complicated.  If the user wants
LI> to encrypt a file, then it's more flexible to just have the caller
LI> insert the file into a buffer and call the function as normal

Aboslutely. It would be nice if the Emacs C core had "readers" like Java
or Go because then this discussion would be really simple: "did you use
a reader" - "yes" - "good" :)

LI> Your call signatures are extremely flexible, and as such I think they
LI> may be a bit overwhelming.  :-)

Understood. I think there will have to be an interface on top. These are
building blocks.

However it works out, I am positive it would be a mistake to limit
this C code to just strings and buffers.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 20:24                                             ` Ted Zlatanov
  2017-04-20 20:42                                               ` Lars Ingebrigtsen
@ 2017-04-21  6:14                                               ` Eli Zaretskii
  2017-05-15 21:55                                                 ` Ted Zlatanov
  1 sibling, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-21  6:14 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Thu, 20 Apr 2017 16:24:33 -0400
> 
> Incidentally it would be nice if NEWS (Outline mode) supported clickable
> links to the manuals like "(elisp) Formatting Strings" or
> (info "(elisp) Formatting Strings") or something else. Right now it's a
> bit awkward.

Patches to add that are welcome, I think.

> EZ> Something like this at the beginning of the node:
> 
> EZ>   All of the functions described here accept argument lists of the
> EZ>   form @code{(BUFFER-OR-STRING START END CODING-SYSTEM NOERROR)},
> EZ>   where BUFFER-OR-STRING ...
> 
> OK, but I want to link to that from the C function docs too, since they
> also will reference that format. Would I link to the manual node? Is
> that OK for C functions, which are kind of standalone as far as docs go?

If you mean primitives implemented in C that are exposed to Lisp,
yes.  For internal C functions inaccessible from Lisp, just describe
that in a comment preceding the functions.

> I think I mentioned already that I am also considering how to specify
> the *output* of these functions better. It feels like it should parallel
> the same BUFFER-OR-STRING-OR-FILE structure so the functions can write
> to the desired place with minimal unintended visibility from Lisp.

I think both this and the other sub-thread arrived at a point where it
is important to present a list of use cases we envision and would like
to support, because these kinds of decisions are prone to errors if we
don't hold the use cases in our minds.

So could you please present such a list, describing the source of the
text to be encrypted/decrypted/hashed, the purpose of the operation
(i.e. some higher-level context), and the destination where the
encrypted/decrypted/hashed text will go?  The list doesn't have to be
exhaustive, but it should include the most important use cases.

Thanks.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 21:54                                                 ` Ted Zlatanov
@ 2017-04-21  6:21                                                   ` Eli Zaretskii
  2017-04-21 18:45                                                   ` Lars Ingebrigtsen
  1 sibling, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-21  6:21 UTC (permalink / raw)
  To: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Thu, 20 Apr 2017 17:54:32 -0400
> 
> LI> When encrypting something, you usually have a (short) passphrase from
> LI> somewhere (KEY), a (short) salt (the binary IV data) and the (long) text
> LI> (INPUT).  To me, this suggests that KEY, IV and INPUT doesn't really have
> LI> to allow all these various input types: We can limit KEY and IV to be
> LI> strings, and INPUT can be both a string and a buffer.  That's really the
> LI> use case for 99.72% of the cases, isn't it?
> 
> The KEY is secret and ideally would come from a file and never be
> seen at the Lisp level. But tests and other use cases may need it from a
> buffer (more secure but still accessible to Lisp) or a string (visible
> to all as a function parameter).

For testing, we could always write the key to a file before using it.
What other use cases would need the key from other sources?

> Getting the INPUT from a file enables large files (not in the first
> version probably) and other interesting use cases.

What other cases?  Large files is only theoretically useful, since
generally Emacs cannot do useful things on files larger than
most-positive-fixnum, and on 64-bit machines that is far enough to not
care.

> The IV will probably be internal and part of the payload. It probably
> won't be part of the function signature normally.
> 
> So I think the flexibility is important.

I think we need to weigh flexibility against the complexity, and find
the optimal balance.  So making the interfaces too complicated for use
cases that will happen only very rarely, if at all, should be avoided.

So I agree with Lars: we should rethink what types of data we want to
support for each argument, and shouldn't complicate the APIs beyond
what's necessary, at least at this point.

Once again, I think a list of use cases with some details will be
helpful to make this discussion more focused and make sure we are all
on the same page.

Thanks.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-20 21:54                                                 ` Ted Zlatanov
  2017-04-21  6:21                                                   ` Eli Zaretskii
@ 2017-04-21 18:45                                                   ` Lars Ingebrigtsen
  2017-04-21 19:15                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Lars Ingebrigtsen @ 2017-04-21 18:45 UTC (permalink / raw)
  To: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

> The KEY is secret and ideally would come from a file and never be
> seen at the Lisp level. But tests and other use cases may need it from a
> buffer (more secure but still accessible to Lisp) or a string (visible
> to all as a function parameter).

Hm...  Having a file that just has a passphrase in it sounds like an
unusual use case.  I think in Emacs these tokens would normally come
from auth-source in most applications.  At least that what I see when I
salivate at use cases.  :-)

> Getting the INPUT from a file enables large files (not in the first
> version probably) and other interesting use cases.

Emacs buffers are surprisingly efficient at handling large files:
They're basically just (sort of) contiguous areas of memory with some
structs describing their contents.  Here's how long it takes this
machine to put a 4GB .iso file into a buffer (and then kill Emacs):

[larsi@stories ~]$ time emacs -batch --eval "(with-temp-buffer (set-buffer-multibyte nil) (let ((coding-system-for-read 'binary)) (insert-file-contents \"~/Downloads/debian-8.6.0-amd64-DVD-1.iso\") (message \"%s\" (buffer-size))))"
3994091520

real    0m1.008s
user    0m0.012s
sys     0m0.988s

To compare, this is how long it takes this machine to just output it all
to /dev/null:

[larsi@stories ~]$ time cat ~/Downloads/debian-8.6.0-amd64-DVD-1.iso > /dev/null

real    0m0.294s
user    0m0.000s
sys     0m0.292s

So the Emacs primitives are definitely competitive in the "read a huge
file" stakes.  I think asking Emacs to encrypt a 4GB file will be a very
common use case, but it's doable without creating special handling.

If I understand the code correctly (and I may definitely not be doing
that; I've just skimmed it very, very briefly), you may be able to point
the encryption code at the Emacs buffer contents directly without
copying it anywhere beforehand, and then (since the results are usually
of very similar length) back to the same Emacs buffer afterwards.

4GB Emacs buffer -> encrypted to 4GB GnuTLS buffer -> 4GB Emacs buffer

instead of

4GB Emacs buffer -> copy to 4GB gnutls.c buffer -> encrypted to 4GB
GnuTLS buffer -> made into Emacs string or something

so you save at least one 4GB buffer by just taking the data directly
from the buffer and putting it back in the same place.  (So 8GB total
memory print instead of 12GB or even possibly 16GB in the current code.)

> LI> In any case, the `file' case you're discussing here doesn't really feel
> LI> that useful, but also makes things more complicated.  If the user wants
> LI> to encrypt a file, then it's more flexible to just have the caller
> LI> insert the file into a buffer and call the function as normal
>
> Aboslutely. It would be nice if the Emacs C core had "readers" like Java
> or Go because then this discussion would be really simple: "did you use
> a reader" - "yes" - "good" :)

I guess what I'm saying is that Emacs has readers, and we call those
"Emacs buffers".  :-)

The other problem with having a special file handler in the GnuTLS code
is that users will expect to be able to encrypt all files that they see
visible from Emacs, including the ones from Tramp, and application
writers will also have differing opinions on whether encrypting a .gz
file means encrypting the contents of the file or the file itself: That
is, Emacs has a very rich file handler jungle that it would be nice if
still works when you ask Emacs to encrypt something.

You'd have to handle

(file "~/foo)
(file "c:/foo/bar")
(file "Héllo") ; in iso-8859-1
(file "/ssh:host:/tmp/foo")

both as input and output specifiers if you never want the file contents
to his Elisp Land...

It all sounds a bit daunting.  To me, at least.  :-)

Instead we have most of the primitives we need for safe handling of
secrets in Emacs already; a few more should be added.  But I think this
pattern for handling secret files could be tweaked and macroised after
some code review:

(with-temp-buffer
  (set-buffer-multibyte nil)
  (let ((coding-system-for-read 'binary)
        (coding-system-for-write 'binary))
    (unwind-protect
      (progn
       (insert-file-contents "My DVD.iso")
       (gnutls-encrypt ... ... (current-buffer))
       (write-region ...))
     (clear-buffer (current-buffer))))) ;; New function that runs memset
                                        ;; over the buffer area

Or something.  We have to look at what buffers write-region creates and
stuff, but in the 'binary case, I don't think it creates copies of the
Emacs buffer anywhere.  Of course, if these files read and written are
via Tramp or a complex file handler, we can't guarantee that those don't
leave a buffer anywhere, but...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-21 18:45                                                   ` Lars Ingebrigtsen
@ 2017-04-21 19:15                                                     ` Eli Zaretskii
  0 siblings, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-04-21 19:15 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Fri, 21 Apr 2017 20:45:58 +0200
> 
> If I understand the code correctly (and I may definitely not be doing
> that; I've just skimmed it very, very briefly), you may be able to point
> the encryption code at the Emacs buffer contents directly without
> copying it anywhere beforehand, and then (since the results are usually
> of very similar length) back to the same Emacs buffer afterwards.
> 
> 4GB Emacs buffer -> encrypted to 4GB GnuTLS buffer -> 4GB Emacs buffer

But since (AFAIU) the size of the encrypted text is not always known
in advance, the encryption might be significantly slower than just
reading in the original file, because of the need to enlarge the gap
(and thus reallocate text) several times.  Similar things happen when
you visit a compressed file, which triggers its on-the-fly
decompression.

> (with-temp-buffer
>   (set-buffer-multibyte nil)  <<<<<<<<<<<<<<< this is not needed
>   (let ((coding-system-for-read 'binary)
>         (coding-system-for-write 'binary))
>     (unwind-protect
>       (progn
>        (insert-file-contents "My DVD.iso")
>        (gnutls-encrypt ... ... (current-buffer))
>        (write-region ...))
>      (clear-buffer (current-buffer))))) ;; New function that runs memset
>                                         ;; over the buffer area
>     
> Or something.  We have to look at what buffers write-region creates and
> stuff, but in the 'binary case, I don't think it creates copies of the
> Emacs buffer anywhere.

The data will always leave traces, because doing the above involves
reallocation of memory, so you are likely to leave traces in the page
file and in memory.  But I don't think you can avoid that, whatever
you do: as long as data needs to be read into memory to process it, it
will always leave traces.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-04-21  6:14                                               ` Eli Zaretskii
@ 2017-05-15 21:55                                                 ` Ted Zlatanov
  2017-05-16 22:19                                                   ` Ted Zlatanov
  2017-05-17 16:22                                                   ` Eli Zaretskii
  0 siblings, 2 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-05-15 21:55 UTC (permalink / raw)
  To: emacs-devel

On Fri, 21 Apr 2017 09:14:02 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> Something like this at the beginning of the node:
>> 
EZ> All of the functions described here accept argument lists of the
EZ> form @code{(BUFFER-OR-STRING START END CODING-SYSTEM NOERROR)},
EZ> where BUFFER-OR-STRING ...
>> 
>> OK, but I want to link to that from the C function docs too, since they
>> also will reference that format. Would I link to the manual node? Is
>> that OK for C functions, which are kind of standalone as far as docs go?

EZ> If you mean primitives implemented in C that are exposed to Lisp,
EZ> yes.  For internal C functions inaccessible from Lisp, just describe
EZ> that in a comment preceding the functions.

OK, I'll add those links instead of the full text.

EZ> I think both this and the other sub-thread arrived at a point where it
EZ> is important to present a list of use cases we envision and would like
EZ> to support, because these kinds of decisions are prone to errors if we
EZ> don't hold the use cases in our minds.

EZ> So could you please present such a list, describing the source of the
EZ> text to be encrypted/decrypted/hashed, the purpose of the operation
EZ> (i.e. some higher-level context), and the destination where the
EZ> encrypted/decrypted/hashed text will go?  The list doesn't have to be
EZ> exhaustive, but it should include the most important use cases.

Right now, I am mirroring the GnuTLS API. That API is in wide use. So I
think the use cases are a large subset of the general GnuTLS use cases;
we're enabling the same things.

On Fri, 21 Apr 2017 09:21:14 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Thu, 20 Apr 2017 17:54:32 -0400
>> 
>> The KEY is secret and ideally would come from a file and never be
>> seen at the Lisp level. But tests and other use cases may need it from a
>> buffer (more secure but still accessible to Lisp) or a string (visible
>> to all as a function parameter).

EZ> For testing, we could always write the key to a file before using it.
EZ> What other use cases would need the key from other sources?

I think if the key is not on disk, we shouldn't force it to disk just to
fit a always-a-file usage model. OTOH if it's already on disk, we
shouldn't slurp it into memory to fit the always-in-memory usage model.
Both of those situations expose the key data by making copies.

>> Getting the INPUT from a file enables large files (not in the first
>> version probably) and other interesting use cases.

EZ> What other cases?  Large files is only theoretically useful, since
EZ> generally Emacs cannot do useful things on files larger than
EZ> most-positive-fixnum, and on 64-bit machines that is far enough to not
EZ> care.

I think anything over 1 MB is pretty big. There also pipes (pretty easy
to fit the file model) and network streams (probably a separate spec).

EZ> I think we need to weigh flexibility against the complexity, and find
EZ> the optimal balance.  So making the interfaces too complicated for use
EZ> cases that will happen only very rarely, if at all, should be avoided.

I agree in general, BUT these are not end-user APIs. They are for
application developers. They have to be flexible so we don't have to
bolt-on these things later. Users will get something much simpler or
they won't even see these interfaces (the application will hide them).

On Fri, 21 Apr 2017 20:45:58 +0200 Lars Ingebrigtsen <larsi@gnus.org> wrote: 

LI> Ted Zlatanov <tzz@lifelogs.com> writes:

LI> Hm...  Having a file that just has a passphrase in it sounds like an
LI> unusual use case.  I think in Emacs these tokens would normally come
LI> from auth-source in most applications.  At least that what I see when I
LI> salivate at use cases.  :-)

Private SSH keys are a good example; see
https://github.com/jschauma/jass for instance. But generally as I said
above, we shouldn't force a copy file->string or string->file if the
private data is already in one form.

LI> Emacs buffers are surprisingly efficient at handling large files:
LI> They're basically just (sort of) contiguous areas of memory with some
LI> structs describing their contents.

OK, but buffers are a copy of the file data. I'd rather not make a copy.

LI> If I understand the code correctly (and I may definitely not be doing
LI> that; I've just skimmed it very, very briefly), you may be able to point
LI> the encryption code at the Emacs buffer contents directly without
LI> copying it anywhere beforehand, and then (since the results are usually
LI> of very similar length) back to the same Emacs buffer afterwards.

LI> 4GB Emacs buffer -> encrypted to 4GB GnuTLS buffer -> 4GB Emacs buffer

LI> instead of

LI> 4GB Emacs buffer -> copy to 4GB gnutls.c buffer -> encrypted to 4GB
LI> GnuTLS buffer -> made into Emacs string or something

Yes, definitely possible. But it's more secure, I think, to read chunks
from the file and process them (possibly overwriting the data) in a
small loop, narrowing the scope and risk of the data exposure. The
GnuTLS APIs are designed for that usage. It's faster but less
interruptible as well. So it's not ideal for every situation, but I
would like to support it.

On Fri, 21 Apr 2017 22:15:16 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> The data will always leave traces, because doing the above involves
EZ> reallocation of memory, so you are likely to leave traces in the page
EZ> file and in memory.  But I don't think you can avoid that, whatever
EZ> you do: as long as data needs to be read into memory to process it, it
EZ> will always leave traces.

Would you agree the tight loop that overwrites the read block will leave
fewer traces and offer fewer exposure opportunities?

LI> The other problem with having a special file handler in the GnuTLS code
LI> is that users will expect to be able to encrypt all files that they see
LI> visible from Emacs, including the ones from Tramp, and application
LI> writers will also have differing opinions on whether encrypting a .gz
LI> file means encrypting the contents of the file or the file itself: That
LI> is, Emacs has a very rich file handler jungle that it would be nice if
LI> still works when you ask Emacs to encrypt something.

LI> You'd have to handle

LI> (file "~/foo)
LI> (file "c:/foo/bar")
LI> (file "Héllo") ; in iso-8859-1
LI> (file "/ssh:host:/tmp/foo")

Right, I understand. I am OK with restricting the API to local files
only but agree the name handling needs to be done carefully.

Lars, I think your read-into-buffer macro would work nicely in a wrapper
API (something like EPA's contexts). We can modify the macro if the
`(file "foo")' spec becomes available, and end users won't know the
difference.

So how about a compromise for now: I can leave the `(file "foo")'
capability out. I'll adjust the docs to remove mentions of it. Then,
after the main patch is done, I can propose a followup patch to
implement `(file "foo")' and we can decide if it's good or bad.

That will get the GnuTLS API integration working, and we can have a
separate discussion about `(file "foo")' later. Lars, Eli, would that be
acceptable?

Thanks!
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-05-15 21:55                                                 ` Ted Zlatanov
@ 2017-05-16 22:19                                                   ` Ted Zlatanov
  2017-05-17 16:22                                                   ` Eli Zaretskii
  1 sibling, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-05-16 22:19 UTC (permalink / raw)
  To: emacs-devel

On Mon, 15 May 2017 17:55:34 -0400 Ted Zlatanov <tzz@lifelogs.com> wrote: 

TZ> OK, I'll add those links instead of the full text.

This is done and I created a new subsection for the format spec; the
code is ready for one last round of review. My texi editing skills are
not great but I think it looks OK.

TZ> So how about a compromise for now: I can leave the `(file "foo")'
TZ> capability out. I'll adjust the docs to remove mentions of it. Then,
TZ> after the main patch is done, I can propose a followup patch to
TZ> implement `(file "foo")' and we can decide if it's good or bad.

This is also done; the only formats accepted are buffer, string, or the
list format that can specify the range and coding system and error
handling. So we can discuss direct file inputs later.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-05-15 21:55                                                 ` Ted Zlatanov
  2017-05-16 22:19                                                   ` Ted Zlatanov
@ 2017-05-17 16:22                                                   ` Eli Zaretskii
  2017-05-17 20:05                                                     ` Ted Zlatanov
  1 sibling, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-05-17 16:22 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 15 May 2017 17:55:34 -0400
> 
> On Fri, 21 Apr 2017 09:14:02 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

It's hard to have a dialog with 4 weeks between responses ;-)

> EZ> I think both this and the other sub-thread arrived at a point where it
> EZ> is important to present a list of use cases we envision and would like
> EZ> to support, because these kinds of decisions are prone to errors if we
> EZ> don't hold the use cases in our minds.
> 
> EZ> So could you please present such a list, describing the source of the
> EZ> text to be encrypted/decrypted/hashed, the purpose of the operation
> EZ> (i.e. some higher-level context), and the destination where the
> EZ> encrypted/decrypted/hashed text will go?  The list doesn't have to be
> EZ> exhaustive, but it should include the most important use cases.
> 
> Right now, I am mirroring the GnuTLS API. That API is in wide use. So I
> think the use cases are a large subset of the general GnuTLS use cases;
> we're enabling the same things.

I don't think this could be the case.  We are talking about using
these APIs in Emacs Lisp programs, where objects being manipulated
aren't C strings, and where file I/O is relatively rare and mostly
implied.  The use cases I was asking about are use cases for Lisp
programs.

> 
> On Fri, 21 Apr 2017 09:21:14 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 
> 
> >> From: Ted Zlatanov <tzz@lifelogs.com>
> >> Date: Thu, 20 Apr 2017 17:54:32 -0400
> >> 
> >> The KEY is secret and ideally would come from a file and never be
> >> seen at the Lisp level. But tests and other use cases may need it from a
> >> buffer (more secure but still accessible to Lisp) or a string (visible
> >> to all as a function parameter).
> 
> EZ> For testing, we could always write the key to a file before using it.
> EZ> What other use cases would need the key from other sources?
> 
> I think if the key is not on disk, we shouldn't force it to disk just to
> fit a always-a-file usage model.

My point was that the test suite cannot be a use case that drives our
design, because we can tweak any test to fit into any usage pattern we
want.  Valid and interesting use cases should come from outside the
test suite, they should come from real-life uses of these features by
Lisp programs.

> >> Getting the INPUT from a file enables large files (not in the first
> >> version probably) and other interesting use cases.
> 
> EZ> What other cases?  Large files is only theoretically useful, since
> EZ> generally Emacs cannot do useful things on files larger than
> EZ> most-positive-fixnum, and on 64-bit machines that is far enough to not
> EZ> care.
> 
> I think anything over 1 MB is pretty big.

Not for Emacs.  1MB is actually pretty small, and shouldn't even
bother us.  A system that cannot spare 1MB is already in trouble.

> There also pipes (pretty easy to fit the file model)

How do you fit a pipe into the (file "FOO") model?

> and network streams (probably a separate spec).

Well, this part of the discussion was specifically about the
(file "FOO") specs, so additional specs are a separate discussion.

OTOH, all of these are readable into a buffer, so the buffer model
could easily cover them all, no?

> EZ> I think we need to weigh flexibility against the complexity, and find
> EZ> the optimal balance.  So making the interfaces too complicated for use
> EZ> cases that will happen only very rarely, if at all, should be avoided.
> 
> I agree in general, BUT these are not end-user APIs. They are for
> application developers. They have to be flexible so we don't have to
> bolt-on these things later.

They have to be flexible enough, but not more flexible.  We are
talking about where to draw the line.

> EZ> The data will always leave traces, because doing the above involves
> EZ> reallocation of memory, so you are likely to leave traces in the page
> EZ> file and in memory.  But I don't think you can avoid that, whatever
> EZ> you do: as long as data needs to be read into memory to process it, it
> EZ> will always leave traces.
> 
> Would you agree the tight loop that overwrites the read block will leave
> fewer traces and offer fewer exposure opportunities?

I don't know, I'm not a specialist on how these usage patterns affect
VM and swap.  Maybe you are right.

> So how about a compromise for now: I can leave the `(file "foo")'
> capability out. I'll adjust the docs to remove mentions of it. Then,
> after the main patch is done, I can propose a followup patch to
> implement `(file "foo")' and we can decide if it's good or bad.
> 
> That will get the GnuTLS API integration working, and we can have a
> separate discussion about `(file "foo")' later. Lars, Eli, would that be
> acceptable?

I think this should be fine, thanks.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-05-17 16:22                                                   ` Eli Zaretskii
@ 2017-05-17 20:05                                                     ` Ted Zlatanov
  2017-05-31 18:17                                                       ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-05-17 20:05 UTC (permalink / raw)
  To: emacs-devel

On Wed, 17 May 2017 19:22:13 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Mon, 15 May 2017 17:55:34 -0400
>> 
>> On Fri, 21 Apr 2017 09:14:02 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> It's hard to have a dialog with 4 weeks between responses ;-)

Sorry, I had finals, had to prioritize.

EZ> I don't think this could be the case.  We are talking about using
EZ> these APIs in Emacs Lisp programs, where objects being manipulated
EZ> aren't C strings, and where file I/O is relatively rare and mostly
EZ> implied.  The use cases I was asking about are use cases for Lisp
EZ> programs.

I expect most early use cases to be simply existing applications already
using EPA/EPG or nothing taking advantage of this functionality to upgrade
security opportunistically or forcefully, especially on platforms and in
situations where GnuPG is not available or suitable.

An indirect way is to look for applications in the Emacs source that use
EPA/EPG or auth-source: Tramp, RMAIL, Gnus (mail and cloud), Dired,
allout-mode, plstore, Org, ERC, EWW, url. Maybe that's helpful?

Below I'll list some use cases differentiated by usage modes but not
really at the application level; I hope that's a helpful complement to
the explanation above.

One use case is signing a data message (text or binary, string or
file) quickly.

Another: keep data on disk encrypted and only decrypt it exactly when
it's needed, then wipe it. The key or passphrase can be in a file as
well, or in a Lisp closure or other opaque storage. That would be ideal
for auth-source, for instance. It would enable the same file to work on
multiple OS platforms.

Another: you don't trust the network but you do trust the security of a
shared file. So you encrypt network data with a shared key in a file as
it's sent and decrypt it on the other side as it's received, with the
same file, without having to read the file into memory.

EZ> My point was that the test suite cannot be a use case that drives our
EZ> design, because we can tweak any test to fit into any usage pattern we
EZ> want.  Valid and interesting use cases should come from outside the
EZ> test suite, they should come from real-life uses of these features by
EZ> Lisp programs.

You're right. I hope I was helpful above.

>> There also pipes (pretty easy to fit the file model)

EZ> How do you fit a pipe into the (file "FOO") model?

I'm not very sure about non-Unix, but on Unix platforms they are pretty
similar. So I'd just let FOO be a pipe or a file and deal with the
differences at the C level.

EZ> OTOH, all of these are readable into a buffer, so the buffer model
EZ> could easily cover them all, no?

Buffers are a kind of Reader API as Lars mentioned, but they have a lot
of baggage and visibility to Emacs Lisp code. They're good enough to get
us going and we can extend the spec later.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-05-17 20:05                                                     ` Ted Zlatanov
@ 2017-05-31 18:17                                                       ` Ted Zlatanov
  2017-06-03  7:23                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-05-31 18:17 UTC (permalink / raw)
  To: emacs-devel

On Wed, 17 May 2017 16:05:01 -0400 Ted Zlatanov <tzz@lifelogs.com> wrote: 

I've pushed the last commits I wanted to introduce to the
scratch/tzz/nettle branch: generate IVs using GNUTLS_RND_NONCE. The
input spec and the output of the functions is changed slightly. The docs
and tests are updated.

The output is now (OUTPUT ACTUAL-IV) which lets callers grab the IV that
was used. This detail will be hidden by wrapper libraries but maybe a
plist or alist would be better than a simple list? I'm not sure.

I had an alternate IV generator working following
https://tools.ietf.org/html/rfc5116#section-3.2 generating a fixed hash
with the key and appending a counter, but was not happy with the
performance using hashtables. Also hashtables don't seem to work well
with binary keys and the C mechanics got really annoying. The main
requirement is that an IV is never reused with the same key, which
I think GNUTLS_RND_NONCE satisfies pretty well. Comments welcome.

I'd love to merge this branch, if there are no objections or comments on
the two items above or otherwise. It's been sitting for a while.

Thanks
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-05-31 18:17                                                       ` Ted Zlatanov
@ 2017-06-03  7:23                                                         ` Eli Zaretskii
  2017-06-03  9:00                                                           ` Andreas Schwab
  2017-06-27 22:58                                                           ` Ted Zlatanov
  0 siblings, 2 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-06-03  7:23 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Wed, 31 May 2017 14:17:54 -0400
> 
> I'd love to merge this branch, if there are no objections or comments on
> the two items above or otherwise. It's been sitting for a while.

I think it's okay to merge this, after fixing the following minor
issues:

There are a few TODOs in the documentation and in the code.  The one
in the docs should be either removed or moved to a comment so that it
doesn't appear in the manual.  Those in the code should be reviewed,
and in each case please decide whether the TODO will be handled any
time soon, or maybe should be simply deleted.

This fragment from the docs:

  +The functions described here accept argument lists of the
  +form @code{(BUFFER-OR-STRING START END CODING-SYSTEM NOERROR)},
  +where BUFFER-OR-STRING ...
  +
  +can be a string or a buffer or a list in the format
  +@var{(BUFFER-OR-STRING START END CODING-SYSTEM NOERROR)} where only
  +the first element is required.

should have @w{..} around the long form, to prevent it from being
broken between 2 lines.  I also don't understand the ellipsis which
breaks the sentence in two -- I think it should simply be deleted.  As
for using @var, you should do it for every element individually, and
the elements should not be up-cased.  Likewise in other uses of @var.

In several cases in the docs (manual and doc strings) and in error
messages you have only one space between sentences; please use two
everywhere.

This text:

  +@item (@code{iv-auto} @var{length})
  +This will generate an IV of the specified length using the GnuTLS
  +GNUTLS_RND_NONCE generator.

uses "IV" and "iv-auto" without explaining what that is.  Other parts
have the same problem.

In this fragment:

  +The @var{input} can be specified as a buffer or string or in other
  +ways (@xref{Format of GnuTLS Cryptography Inputs}).

please use @pxref, as @xref is not appropriate for parenthesized
reference.  There are several instances of this in the docs changes.

Please make sure that "nil" is always in @code in the docs.

  +  Lisp_Object object        = Fnth (make_number (0), spec);
  +  Lisp_Object start        = Fnth (make_number (1), spec);
  +  Lisp_Object end          = Fnth (make_number (2), spec);
  +  Lisp_Object coding_system = Fnth (make_number (3), spec);
  +  Lisp_Object noerror      = Fnth (make_number (4), spec);

Isn't it simpler to use XCAR and XCDR here?  You are extracting
all the elements of a list in strict order, so Fnth is an overkill,
and will be slower.

  +      if (! INTEGERP (start))
  +        {
  +          error ("Without a length, 'iv-auto can't be used. See manual.");
  +          object = Qnil;

I think the quoting of iv-auto here is incorrect.  Likewise in other
similar error messages

  +  void *(*hash_func) (const char *, size_t, void *);

The leftmost '*' is unnecessary, I think.

  +      Lisp_Object cp = listn (CONSTYPE_HEAP, 15,
  +                              // A symbol representing the cipher
  +                              intern (gnutls_cipher_get_name (gca)),
  +                              // The internally meaningful cipher ID

I think we still prefer the /* .. */ style of comments.

  +  if (ret < GNUTLS_E_SUCCESS)
  +    {
  +      const char* str = gnutls_strerror (ret);
  +      if (!str)
  +        str = "unknown";
  +      error ("GnuTLS AEAD cipher %s/%s initialization failed: %s",
  +             gnutls_cipher_get_name (gca), desc, str);
  +      return Qnil;
  +    }

You don't need a 'return' after calling 'error' (here and elsewhere),
as the latter doesn't return.

Some of the lines in doc strings you wrote are too long, please make
sure they are no longer than 76 characters, and in any case fit on a
single 80-column line.

Also, the first line of a doc string should be a complete sentence.
This one, for example, isn't:

  +DEFUN ("gnutls-digests", Fgnutls_digests, Sgnutls_digests, 0, 0, 0,
  +       doc: /* Return alist of GnuTLS digest-algorithm method
  +descriptions as plists.  Use the value of the alist (extract it with
  +`alist-get' for instance) with `gnutls-hash-digest'.  The alist key is
  +the digest-algorithm method name. */)

  +  if (STRINGP (hash_method))
  +    {
  +      hash_method = intern (SSDATA (hash_method));
  +    }

No need for braces when there's only 1 line in the block.

Thanks again for working on this.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-06-03  7:23                                                         ` Eli Zaretskii
@ 2017-06-03  9:00                                                           ` Andreas Schwab
  2017-06-03 10:01                                                             ` Eli Zaretskii
  2017-06-27 22:58                                                           ` Ted Zlatanov
  1 sibling, 1 reply; 128+ messages in thread
From: Andreas Schwab @ 2017-06-03  9:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ted Zlatanov, emacs-devel

On Jun 03 2017, Eli Zaretskii <eliz@gnu.org> wrote:

>   +  Lisp_Object object        = Fnth (make_number (0), spec);
>   +  Lisp_Object start        = Fnth (make_number (1), spec);
>   +  Lisp_Object end          = Fnth (make_number (2), spec);
>   +  Lisp_Object coding_system = Fnth (make_number (3), spec);
>   +  Lisp_Object noerror      = Fnth (make_number (4), spec);
>
> Isn't it simpler to use XCAR and XCDR here?

Wouldn't it be even simpler to just pass each argument separately?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-06-03  9:00                                                           ` Andreas Schwab
@ 2017-06-03 10:01                                                             ` Eli Zaretskii
  2017-06-03 10:09                                                               ` Andreas Schwab
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-06-03 10:01 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: tzz, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: Ted Zlatanov <tzz@lifelogs.com>,  emacs-devel@gnu.org
> Date: Sat, 03 Jun 2017 11:00:16 +0200
> 
> On Jun 03 2017, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> >   +  Lisp_Object object        = Fnth (make_number (0), spec);
> >   +  Lisp_Object start        = Fnth (make_number (1), spec);
> >   +  Lisp_Object end          = Fnth (make_number (2), spec);
> >   +  Lisp_Object coding_system = Fnth (make_number (3), spec);
> >   +  Lisp_Object noerror      = Fnth (make_number (4), spec);
> >
> > Isn't it simpler to use XCAR and XCDR here?
> 
> Wouldn't it be even simpler to just pass each argument separately?

Maybe it would, but given the amount of procrastination about the form
of the arguments to these function, I'm not sure we want to re-argue
all that again.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-06-03 10:01                                                             ` Eli Zaretskii
@ 2017-06-03 10:09                                                               ` Andreas Schwab
  2017-06-03 10:47                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Andreas Schwab @ 2017-06-03 10:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tzz, emacs-devel

On Jun 03 2017, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Cc: Ted Zlatanov <tzz@lifelogs.com>,  emacs-devel@gnu.org
>> Date: Sat, 03 Jun 2017 11:00:16 +0200
>> 
>> On Jun 03 2017, Eli Zaretskii <eliz@gnu.org> wrote:
>> 
>> >   +  Lisp_Object object        = Fnth (make_number (0), spec);
>> >   +  Lisp_Object start        = Fnth (make_number (1), spec);
>> >   +  Lisp_Object end          = Fnth (make_number (2), spec);
>> >   +  Lisp_Object coding_system = Fnth (make_number (3), spec);
>> >   +  Lisp_Object noerror      = Fnth (make_number (4), spec);
>> >
>> > Isn't it simpler to use XCAR and XCDR here?
>> 
>> Wouldn't it be even simpler to just pass each argument separately?
>
> Maybe it would, but given the amount of procrastination about the form
> of the arguments to these function, I'm not sure we want to re-argue
> all that again.

I don't understand.  This is just an internal helper function.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-06-03 10:09                                                               ` Andreas Schwab
@ 2017-06-03 10:47                                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-06-03 10:47 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: tzz, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: tzz@lifelogs.com,  emacs-devel@gnu.org
> Date: Sat, 03 Jun 2017 12:09:17 +0200
> 
> On Jun 03 2017, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> >> >   +  Lisp_Object object        = Fnth (make_number (0), spec);
> >> >   +  Lisp_Object start        = Fnth (make_number (1), spec);
> >> >   +  Lisp_Object end          = Fnth (make_number (2), spec);
> >> >   +  Lisp_Object coding_system = Fnth (make_number (3), spec);
> >> >   +  Lisp_Object noerror      = Fnth (make_number (4), spec);
> >> >
> >> > Isn't it simpler to use XCAR and XCDR here?
> >> 
> >> Wouldn't it be even simpler to just pass each argument separately?
> >
> > Maybe it would, but given the amount of procrastination about the form
> > of the arguments to these function, I'm not sure we want to re-argue
> > all that again.
> 
> I don't understand.  This is just an internal helper function.

It is used by all the other functions to get the input from the
arguments.  E.g., see gnutls-symmetric-encrypt: its arguments KEY, IV,
and INPUT are all of the form from which extract_data_from_object
extracts the individual values.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-06-03  7:23                                                         ` Eli Zaretskii
  2017-06-03  9:00                                                           ` Andreas Schwab
@ 2017-06-27 22:58                                                           ` Ted Zlatanov
  2017-06-28 16:54                                                             ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-06-27 22:58 UTC (permalink / raw)
  To: emacs-devel

On Sat, 03 Jun 2017 10:23:39 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> There are a few TODOs in the documentation and in the code.  The one
EZ> in the docs should be either removed or moved to a comment so that it
EZ> doesn't appear in the manual.  Those in the code should be reviewed,
EZ> and in each case please decide whether the TODO will be handled any
EZ> time soon, or maybe should be simply deleted.

I hope we can resume work on this. I've set aside time for it. Sorry for
the repeated delays.

There is only one major TODO I can't resolve: I asked for help before,
and still can't make the autoconf code detect the GnuTLS functions
individually. So there are TODOs in the code and docs about it. I can't
merge without that--it would be a pretty bad situation for portability
and future compatibility to pin on just the GnuTLS version.

The other TODO below maybe can be answered here? Should I remove it or
leave it?

  // TODO: switch this to use a resize_string_data() function when
  // that's provided in the C core, to avoid the extra copy.

I think I've fixed the doc issues you noted. Thank you for the thorough
reading. I couldn't explain IV and other cryptographic terms in detail
in the Lisp reference manual so I pointed to the GnuTLS home page.

EZ> I think we still prefer the /* .. */ style of comments.

I changed all but the TODOs to that style.

EZ> You don't need a 'return' after calling 'error' (here and elsewhere),
EZ> as the latter doesn't return.

I got warnings for it in a few places so I added those returns
consistently. I also think it's cleaner to do it that way. Is this
important enough to change these instances?

EZ> Some of the lines in doc strings you wrote are too long, please make
EZ> sure they are no longer than 76 characters, and in any case fit on a
EZ> single 80-column line.

I tried. Several are really hard to condense further and I'm right
around 80.

EZ> No need for braces when there's only 1 line in the block.

I removed them, but it's not my favorite thing to omit the braces,
especially in the

if a
 b
else if c
 d
else
 e

situation. Do you think I should reintroduce them in those places?

On Sat, 03 Jun 2017 11:00:16 +0200 Andreas Schwab <schwab@linux-m68k.org> wrote: 

AS> Wouldn't it be even simpler to just pass each argument separately?

I want the format to be flexible to accomodate future expansion, and
passing arguments explicitly would defeat that purpose.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-06-27 22:58                                                           ` Ted Zlatanov
@ 2017-06-28 16:54                                                             ` Eli Zaretskii
  2017-06-28 19:44                                                               ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-06-28 16:54 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Tue, 27 Jun 2017 18:58:30 -0400
> 
> There is only one major TODO I can't resolve: I asked for help before,
> and still can't make the autoconf code detect the GnuTLS functions
> individually.

Doesn't something like AC_CHECK_FUNCS do the job?  If not, why not?
(In general, the Autoconf manual should be able to answer these
questions.)

> The other TODO below maybe can be answered here? Should I remove it or
> leave it?
> 
>   // TODO: switch this to use a resize_string_data() function when
>   // that's provided in the C core, to avoid the extra copy.

Remove it.

> EZ> You don't need a 'return' after calling 'error' (here and elsewhere),
> EZ> as the latter doesn't return.
> 
> I got warnings for it in a few places so I added those returns
> consistently.

Which warnings and at what places?

If the function doesn't return, it should be marked _Noreturn, as we
do in other places.  Maybe that was your original problem.

> I removed them, but it's not my favorite thing to omit the braces,
> especially in the
> 
> if a
>  b
> else if c
>  d
> else
>  e
> 
> situation. Do you think I should reintroduce them in those places?

We don't use braces in such situations, unless there are comments that
make the blocks larger than 1 line.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-06-28 16:54                                                             ` Eli Zaretskii
@ 2017-06-28 19:44                                                               ` Ted Zlatanov
  2017-07-13 18:35                                                                 ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-06-28 19:44 UTC (permalink / raw)
  To: emacs-devel

On Wed, 28 Jun 2017 19:54:38 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Tue, 27 Jun 2017 18:58:30 -0400
>> 
>> There is only one major TODO I can't resolve: I asked for help before,
>> and still can't make the autoconf code detect the GnuTLS functions
>> individually.

EZ> Doesn't something like AC_CHECK_FUNCS do the job?  If not, why not?

I tried to use that macro and many others, but it wouldn't include the
right headers (gnutls/gnutls.h and gnutls/crypto.h). I eventually got it
working with AC_COMPILE_IFELSE+AC_LANG_PROGRAM which I think will test
definitions but not linkage. If anyone wants to provide a better
solution, they could do it after merge or push it on top of my branch?

EZ> You don't need a 'return' after calling 'error' (here and elsewhere),
EZ> as the latter doesn't return.

Those returns have now been removed. I'm not sure why I had warnings
before, I must have misinterpreted something. Thanks for pointing it
out, the code is that much shorter now.

The "void *(*hash_func) (const char *, size_t, void *)" piece in fns.c
gives a warning (incompatible assignment) if the first * is removed. I
believe that's correct and we should keep it the way I had it before.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-06-28 19:44                                                               ` Ted Zlatanov
@ 2017-07-13 18:35                                                                 ` Ted Zlatanov
  2017-07-14 15:10                                                                   ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-13 18:35 UTC (permalink / raw)
  To: emacs-devel

The code in the branch scratch/tzz/nettle is ready for merge and I
waited a few weeks since making the last round of changes. It has tests
you can review to see the API and usage.

Barring any objections, I'll rebase+squash, write the commit message,
and merge it tomorrow July 14.

Thanks
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-13 18:35                                                                 ` Ted Zlatanov
@ 2017-07-14 15:10                                                                   ` Ted Zlatanov
  2017-07-14 19:04                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-14 15:10 UTC (permalink / raw)
  To: emacs-devel

On Thu, 13 Jul 2017 14:35:15 -0400 Ted Zlatanov <tzz@lifelogs.com> wrote: 

TZ> The code in the branch scratch/tzz/nettle is ready for merge and I
TZ> waited a few weeks since making the last round of changes. It has tests
TZ> you can review to see the API and usage.

TZ> Barring any objections, I'll rebase+squash, write the commit message,
TZ> and merge it tomorrow July 14.

This is done.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-14 15:10                                                                   ` Ted Zlatanov
@ 2017-07-14 19:04                                                                     ` Eli Zaretskii
  2017-07-14 19:43                                                                       ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-07-14 19:04 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Fri, 14 Jul 2017 11:10:09 -0400
> 
> On Thu, 13 Jul 2017 14:35:15 -0400 Ted Zlatanov <tzz@lifelogs.com> wrote: 
> 
> TZ> The code in the branch scratch/tzz/nettle is ready for merge and I
> TZ> waited a few weeks since making the last round of changes. It has tests
> TZ> you can review to see the API and usage.
> 
> TZ> Barring any objections, I'll rebase+squash, write the commit message,
> TZ> and merge it tomorrow July 14.
> 
> This is done.

Thank you for doing this, Ted.

I needed to add some Windows-specific code to get it to build on
Windows.  Also, the 7th test fails for me as below; any idea?

  Test test-gnutls-005-aead-ciphers backtrace:
    signal(ert-test-failed (((should (gnutls-tests-hexstring-equal input
    ert-fail(((should (gnutls-tests-hexstring-equal input reverse)) :for
    (if (unwind-protect (setq value-152 (apply fn-150 args-151)) (setq f
    (let (form-description-154) (if (unwind-protect (setq value-152 (app
    (let ((value-152 'ert-form-evaluation-aborted-153)) (let (form-descr
    (let ((fn-150 (function gnutls-tests-hexstring-equal)) (args-151 (li
    (let* ((iv (gnutls-tests-pad-or-trim iv (plist-get cplist :cipher-iv
    (while --dolist-tail-- (setq iv (car --dolist-tail--)) (gnutls-tests
    (let ((--dolist-tail-- (append ivs (list (list 'iv-auto ivsize)))) i
    (let* ((cplist (cdr (assq cipher (gnutls-ciphers)))) (key (gnutls-te
    (while --dolist-tail-- (setq key (car --dolist-tail--)) (let* ((cpli
    (let ((--dolist-tail-- keys) key) (while --dolist-tail-- (setq key (
    (while --dolist-tail-- (setq auth (car --dolist-tail--)) (let ((--do
    (let ((--dolist-tail-- auths) auth) (while --dolist-tail-- (setq aut
    (while --dolist-tail-- (setq input (car --dolist-tail--)) (let ((--d
    (let ((--dolist-tail-- inputs) input) (while --dolist-tail-- (setq i
    (while --dolist-tail-- (setq cipher (car --dolist-tail--)) (let ((--
    (let ((--dolist-tail-- ciphers) cipher) (while --dolist-tail-- (setq
    (let ((keys '("mykey" "mykey2")) (inputs gnutls-tests-mondo-strings)
    (lambda nil (let ((fn-120 (function memq)) (args-121 (list 'AEAD-cip
    ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
    ert-run-test(#s(ert-test :name test-gnutls-005-aead-ciphers :documen
    ert-run-or-rerun-test(#s(ert--stats :selector t :tests [#s(ert-test
    ert-run-tests(t #f(compiled-function (event-type &rest event-args) #
    ert-run-tests-batch(nil)
    ert-run-tests-batch-and-exit(nil)
    eval((ert-run-tests-batch-and-exit nil))
    command-line-1(("-L" ";." "-l" "ert" "-l" "lisp/net/gnutls-tests.el"
    command-line()
    normal-top-level()
  Test test-gnutls-005-aead-ciphers condition:
      (ert-test-failed
       ((should
	 (gnutls-tests-hexstring-equal input reverse))
	:form
	(gnutls-tests-hexstring-equal "                " "                ►   ►  ∟\231\202    \200")
	:value nil))
     FAILED  7/7  test-gnutls-005-aead-ciphers




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-14 19:04                                                                     ` Eli Zaretskii
@ 2017-07-14 19:43                                                                       ` Ted Zlatanov
  2017-07-14 20:04                                                                         ` Eli Zaretskii
  2017-07-15  9:15                                                                         ` Eli Zaretskii
  0 siblings, 2 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-14 19:43 UTC (permalink / raw)
  To: emacs-devel

On Fri, 14 Jul 2017 22:04:50 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> I needed to add some Windows-specific code to get it to build on
EZ> Windows.

Oh, thank you.

EZ> Also, the 7th test fails for me as below; any idea?
EZ>   Test test-gnutls-005-aead-ciphers condition:
EZ>       (ert-test-failed
EZ>        ((should
EZ> 	 (gnutls-tests-hexstring-equal input reverse))
EZ> 	:form
EZ> 	(gnutls-tests-hexstring-equal "                " "                ►   ►  ∟\231\202    \200")
EZ> 	:value nil))
EZ>      FAILED  7/7  test-gnutls-005-aead-ciphers

It works for me, on Ubuntu 17.04, GnuTLS 3.5.6 and a fresh checkout.

Could you run with `make gnutls-tests GNUTLS_TEST_VERBOSE=1' and look at
the output? This full output is normally silenced because it's so
verbose, but in this case we'll need it because so many different test
cases are attempted.

(If there's a better way to control ERT verbosity, let me know.)

Thanks
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-14 19:43                                                                       ` Ted Zlatanov
@ 2017-07-14 20:04                                                                         ` Eli Zaretskii
  2017-07-15 18:30                                                                           ` Ted Zlatanov
  2017-07-15  9:15                                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-07-14 20:04 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Fri, 14 Jul 2017 15:43:53 -0400
> 
> EZ> Also, the 7th test fails for me as below; any idea?
> EZ>   Test test-gnutls-005-aead-ciphers condition:
> EZ>       (ert-test-failed
> EZ>        ((should
> EZ> 	 (gnutls-tests-hexstring-equal input reverse))
> EZ> 	:form
> EZ> 	(gnutls-tests-hexstring-equal "                " "                ►   ►  ∟\231\202    \200")
> EZ> 	:value nil))
> EZ>      FAILED  7/7  test-gnutls-005-aead-ciphers
> 
> It works for me, on Ubuntu 17.04, GnuTLS 3.5.6 and a fresh checkout.
> 
> Could you run with `make gnutls-tests GNUTLS_TEST_VERBOSE=1' and look at
> the output? This full output is normally silenced because it's so
> verbose, but in this case we'll need it because so many different test
> cases are attempted.

I will look into this tomorrow.

One more thing: at least one of the functions you use is (according to
the GnuTLS manual) available only since v3.4.0, and another since
3.2.0.  So I think we need more fine-grained conditionals for the APIs
which use those functions, or else some of them will fail the build or
cause Emacs to crash at run time.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-14 19:43                                                                       ` Ted Zlatanov
  2017-07-14 20:04                                                                         ` Eli Zaretskii
@ 2017-07-15  9:15                                                                         ` Eli Zaretskii
  2017-07-15 18:40                                                                           ` Ted Zlatanov
  1 sibling, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-07-15  9:15 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Fri, 14 Jul 2017 15:43:53 -0400
> 
> EZ> Also, the 7th test fails for me as below; any idea?
> EZ>   Test test-gnutls-005-aead-ciphers condition:
> EZ>       (ert-test-failed
> EZ>        ((should
> EZ> 	 (gnutls-tests-hexstring-equal input reverse))
> EZ> 	:form
> EZ> 	(gnutls-tests-hexstring-equal "                " "                ►   ►  ∟\231\202    \200")
> EZ> 	:value nil))
> EZ>      FAILED  7/7  test-gnutls-005-aead-ciphers
> 
> It works for me, on Ubuntu 17.04, GnuTLS 3.5.6 and a fresh checkout.
> 
> Could you run with `make gnutls-tests GNUTLS_TEST_VERBOSE=1' and look at
> the output? This full output is normally silenced because it's so
> verbose, but in this case we'll need it because so many different test
> cases are attempted.

I've run the test under a debugger.  Tell me what you want to know.

What I see (without really understanding what should happen) is that
gnutls-symmetric-decrypt returns a string that is 32-byte long,
whereas the original input before encryption was a 16-byte string.  In
the decrypted string, the first 16 bytes are identical to the input
before encryption, the rest seem to be binary garbage.  When the
original 16-byte input is encrypted by gnutls-symmetric-encrypt, the
result is a 32-byte string.  All of this happens in the first call to
encryption and decryption functions produced by the loop in
test-gnutls-005-aead-ciphers.  Does this makes sense and/or ring any
bells?  Do you need me to provide any further information?

The version of GnuTLS library I have here is 3.4.15, btw.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-14 20:04                                                                         ` Eli Zaretskii
@ 2017-07-15 18:30                                                                           ` Ted Zlatanov
  0 siblings, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-15 18:30 UTC (permalink / raw)
  To: emacs-devel

On Fri, 14 Jul 2017 23:04:30 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> One more thing: at least one of the functions you use is (according to
EZ> the GnuTLS manual) available only since v3.4.0, and another since
EZ> 3.2.0.  So I think we need more fine-grained conditionals for the APIs
EZ> which use those functions, or else some of them will fail the build or
EZ> cause Emacs to crash at run time.

Yeah unfortunately this is a problem, and 3.4.0 is not available on many
platforms but is the only choice on others. I will work with the other
developers to clear out these edge cases.

I'll also try to make Docker images with these specific libraries to
help me find these problems on Linux, at least.

Thanks
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-15  9:15                                                                         ` Eli Zaretskii
@ 2017-07-15 18:40                                                                           ` Ted Zlatanov
  2017-07-15 19:12                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-15 18:40 UTC (permalink / raw)
  To: emacs-devel

On Sat, 15 Jul 2017 12:15:20 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> What I see (without really understanding what should happen) is that
EZ> gnutls-symmetric-decrypt returns a string that is 32-byte long,
EZ> whereas the original input before encryption was a 16-byte string.  In
EZ> the decrypted string, the first 16 bytes are identical to the input
EZ> before encryption, the rest seem to be binary garbage.  When the
EZ> original 16-byte input is encrypted by gnutls-symmetric-encrypt, the
EZ> result is a 32-byte string.  All of this happens in the first call to
EZ> encryption and decryption functions produced by the loop in
EZ> test-gnutls-005-aead-ciphers.  Does this makes sense and/or ring any
EZ> bells?  Do you need me to provide any further information?

EZ> The version of GnuTLS library I have here is 3.4.15, btw.

The size of the output is determined by the cipher's parameters. So we
need to know the parameters, which are in cplist.

Can you capture the full parameters and memory buffers passed into the
GnuTLS functions?

Here's the sequence we expect in the test, this in a let* form:

1. The IV may be actual data or it may be a list of 'iv-auto and ivsize
which tells extract_data_from_object() to generate a random IV (which is
returned as the second element by `gnutls-symmetric-encrypt').

2. We call (gnutls-symmetric-encrypt cplist (copy-sequence key) iv input (copy-sequence auth))

3. We need to make sure the cplist, key, iv, input, and auth make it
down to the GnuTLS C functions.

4. We need to look at the output returned from the GnuTLS C function and
make sure it makes it to the return of `gnutls-symmetric-encrypt'
together with the actual IV used.

5. We need to do the same as steps 1-4 for decryption.

If you can provide a recipe for testing your case, that would be
helpful. I may be able to duplicate it in a Docker container.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-15 18:40                                                                           ` Ted Zlatanov
@ 2017-07-15 19:12                                                                             ` Eli Zaretskii
  2017-07-22  9:10                                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-07-15 19:12 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Sat, 15 Jul 2017 14:40:37 -0400
> 
> The size of the output is determined by the cipher's parameters. So we
> need to know the parameters, which are in cplist.

As I wrote, the problem happens on the first iteration through the
loop in test-gnutls-005-aead-ciphers, so the cipher's parameters
should be known, as they are set up by the code.  But in case this
isn't telling the whole story, you will see the values below.

> Can you capture the full parameters and memory buffers passed into the
> GnuTLS functions?

I attach below the GDB transcript with this information.  Let me know
if you need more data.

> Here's the sequence we expect in the test, this in a let* form:
> 
> 1. The IV may be actual data or it may be a list of 'iv-auto and ivsize
> which tells extract_data_from_object() to generate a random IV (which is
> returned as the second element by `gnutls-symmetric-encrypt').
> 
> 2. We call (gnutls-symmetric-encrypt cplist (copy-sequence key) iv input (copy-sequence auth))
> 
> 3. We need to make sure the cplist, key, iv, input, and auth make it
> down to the GnuTLS C functions.
> 
> 4. We need to look at the output returned from the GnuTLS C function and
> make sure it makes it to the return of `gnutls-symmetric-encrypt'
> together with the actual IV used.
> 
> 5. We need to do the same as steps 1-4 for decryption.

That's what I did, and my observations were in the previous mail.  I
just didn't know what to expect, so I couldn't tell whether some of
the data was incorrect.

> If you can provide a recipe for testing your case, that would be
> helpful.

I'm just running test-gnutls-005-aead-ciphers in an interactive
session after loading gnutls-tests.el by hand.

Here's the GDB transcript:

  (gdb) break Fgnutls_symmetric_encrypt
  Breakpoint 3 at 0x12aa8f5: file gnutls.c, line 2142.
  (gdb) break Fgnutls_symmetric_decrypt
  Breakpoint 4 at 0x12aa98b: file gnutls.c, line 2169.
  (gdb) r -Q

  Thread 1 hit Breakpoint 3, Fgnutls_symmetric_encrypt (
      cipher=XIL(0xc000000006c4c2a0), key=XIL(0x8000000006c336f0),
      iv=XIL(0x8000000006c33700), input=XIL(0x8000000006c34990),
      aead_auth=XIL(0)) at gnutls.c:2142
  2142      return gnutls_symmetric (true, cipher, key, iv, input, aead_auth);
  (gdb) pp cipher
  (:cipher-id 16 :type gnutls-symmetric-cipher :cipher-aead-capable t :cipher-tagsize 16 :cipher-blocksize 16 :cipher-keysize 32 :cipher-ivsize 12)
  (gdb) pp key
  "                           mykey"
  (gdb) pp iv
  "            "
  (gdb) pp input
  "                "
  (gdb) pp aead_auth
  nil
  (gdb) c
  Continuing.

  Thread 1 hit Breakpoint 4, Fgnutls_symmetric_decrypt (
      cipher=XIL(0xc000000006c4c2a0), key=XIL(0x8000000006c32fa0),
      iv=XIL(0x8000000006c336a0), input=XIL(0x8000000006c32fb0),
      aead_auth=XIL(0)) at gnutls.c:2169
  2169      return gnutls_symmetric (false, cipher, key, iv, input, aead_auth);
  (gdb) pp cipher
  (:cipher-id 16 :type gnutls-symmetric-cipher :cipher-aead-capable t :cipher-tagsize 16 :cipher-blocksize 16 :cipher-keysize 32 :cipher-ivsize 12)
  (gdb) pp key
  "                           mykey"
  (gdb) pp iv
  "            "
  (gdb) pp input
  "% F[MM   ¼  t
  E  ↑ %  >*Rº [z  "
  (gdb) p input
  $1 = XIL(0x8000000006c32fb0)
  (gdb) xstring
  $2 = (struct Lisp_String *) 0x6c32fb0
  "%\231F[MM∩\237\212¼µ≤t\212\nEπ \030\376%τµ>*Rº╬[zו\200"
  (gdb) p *$
  $3 = {
    size = 32,
    size_byte = -1,
    intervals = 0x0,
    data = 0x6ce0a40 "%\231F[MM∩\237\212¼µ≤t\212\nEπ \030\376%τµ>*Rº╬[zו\200"
  }
  (gdb) pp aead_auth
  nil



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-15 19:12                                                                             ` Eli Zaretskii
@ 2017-07-22  9:10                                                                               ` Eli Zaretskii
  2017-07-26  6:58                                                                                 ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-07-22  9:10 UTC (permalink / raw)
  To: tzz; +Cc: emacs-devel

Ping!  Anything further on this?  Anything I can do to help debugging
this problem?

> Date: Sat, 15 Jul 2017 22:12:59 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> > From: Ted Zlatanov <tzz@lifelogs.com>
> > Date: Sat, 15 Jul 2017 14:40:37 -0400
> > 
> > The size of the output is determined by the cipher's parameters. So we
> > need to know the parameters, which are in cplist.
> 
> As I wrote, the problem happens on the first iteration through the
> loop in test-gnutls-005-aead-ciphers, so the cipher's parameters
> should be known, as they are set up by the code.  But in case this
> isn't telling the whole story, you will see the values below.
> 
> > Can you capture the full parameters and memory buffers passed into the
> > GnuTLS functions?
> 
> I attach below the GDB transcript with this information.  Let me know
> if you need more data.
> 
> > Here's the sequence we expect in the test, this in a let* form:
> > 
> > 1. The IV may be actual data or it may be a list of 'iv-auto and ivsize
> > which tells extract_data_from_object() to generate a random IV (which is
> > returned as the second element by `gnutls-symmetric-encrypt').
> > 
> > 2. We call (gnutls-symmetric-encrypt cplist (copy-sequence key) iv input (copy-sequence auth))
> > 
> > 3. We need to make sure the cplist, key, iv, input, and auth make it
> > down to the GnuTLS C functions.
> > 
> > 4. We need to look at the output returned from the GnuTLS C function and
> > make sure it makes it to the return of `gnutls-symmetric-encrypt'
> > together with the actual IV used.
> > 
> > 5. We need to do the same as steps 1-4 for decryption.
> 
> That's what I did, and my observations were in the previous mail.  I
> just didn't know what to expect, so I couldn't tell whether some of
> the data was incorrect.
> 
> > If you can provide a recipe for testing your case, that would be
> > helpful.
> 
> I'm just running test-gnutls-005-aead-ciphers in an interactive
> session after loading gnutls-tests.el by hand.
> 
> Here's the GDB transcript:
> 
>   (gdb) break Fgnutls_symmetric_encrypt
>   Breakpoint 3 at 0x12aa8f5: file gnutls.c, line 2142.
>   (gdb) break Fgnutls_symmetric_decrypt
>   Breakpoint 4 at 0x12aa98b: file gnutls.c, line 2169.
>   (gdb) r -Q
> 
>   Thread 1 hit Breakpoint 3, Fgnutls_symmetric_encrypt (
>       cipher=XIL(0xc000000006c4c2a0), key=XIL(0x8000000006c336f0),
>       iv=XIL(0x8000000006c33700), input=XIL(0x8000000006c34990),
>       aead_auth=XIL(0)) at gnutls.c:2142
>   2142      return gnutls_symmetric (true, cipher, key, iv, input, aead_auth);
>   (gdb) pp cipher
>   (:cipher-id 16 :type gnutls-symmetric-cipher :cipher-aead-capable t :cipher-tagsize 16 :cipher-blocksize 16 :cipher-keysize 32 :cipher-ivsize 12)
>   (gdb) pp key
>   "                           mykey"
>   (gdb) pp iv
>   "            "
>   (gdb) pp input
>   "                "
>   (gdb) pp aead_auth
>   nil
>   (gdb) c
>   Continuing.
> 
>   Thread 1 hit Breakpoint 4, Fgnutls_symmetric_decrypt (
>       cipher=XIL(0xc000000006c4c2a0), key=XIL(0x8000000006c32fa0),
>       iv=XIL(0x8000000006c336a0), input=XIL(0x8000000006c32fb0),
>       aead_auth=XIL(0)) at gnutls.c:2169
>   2169      return gnutls_symmetric (false, cipher, key, iv, input, aead_auth);
>   (gdb) pp cipher
>   (:cipher-id 16 :type gnutls-symmetric-cipher :cipher-aead-capable t :cipher-tagsize 16 :cipher-blocksize 16 :cipher-keysize 32 :cipher-ivsize 12)
>   (gdb) pp key
>   "                           mykey"
>   (gdb) pp iv
>   "            "
>   (gdb) pp input
>   "% F[MM   ¼  t
>   E  ↑ %  >*Rº [z  "
>   (gdb) p input
>   $1 = XIL(0x8000000006c32fb0)
>   (gdb) xstring
>   $2 = (struct Lisp_String *) 0x6c32fb0
>   "%\231F[MM∩\237\212¼µ≤t\212\nEπ \030\376%τµ>*Rº╬[zו\200"
>   (gdb) p *$
>   $3 = {
>     size = 32,
>     size_byte = -1,
>     intervals = 0x0,
>     data = 0x6ce0a40 "%\231F[MM∩\237\212¼µ≤t\212\nEπ \030\376%τµ>*Rº╬[zו\200"
>   }
>   (gdb) pp aead_auth
>   nil
> 
> 



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-22  9:10                                                                               ` Eli Zaretskii
@ 2017-07-26  6:58                                                                                 ` Ted Zlatanov
  2017-07-26 14:52                                                                                   ` Eli Zaretskii
  2017-07-27  0:19                                                                                   ` Paul Eggert
  0 siblings, 2 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-26  6:58 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1493 bytes --]

On Sat, 22 Jul 2017 12:10:34 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> Ping!  Anything further on this?  Anything I can do to help debugging
EZ> this problem?

Hi Eli,

I had to travel urgently and had trouble testing this on my Mac OS X
laptop; it took me a while to build all the pieces. Sorry for the delay.

Using a Ubuntu 17.04 Docker image, I compiled GnuTLS from source (a
GnuTLS 3.4.15 dev package was not available) and all the GnuTLS Emacs
tests passed in master against it (see summary and Dockerfile attached
below). A few of the Emacs tests failed in master, but that was not
related to GnuTLS.

Can you give me a recipe to reproduce the problem with your distribution
and packages? It may be irreproducible in Docker, or I tested
incorrectly, or maybe there's something specific about the optimizations
of the packaged libraries.

Thanks!
Ted

root@611652fe3513:/tmp/emacs/test/lisp/net# grep GNUTLS_VERSION\  /usr/include/gnutls/gnutls.h
#define GNUTLS_VERSION "3.4.15"
root@611652fe3513:/tmp/emacs/test/lisp/net# grep passed gnutls-tests.log
   passed  1/7  test-gnutls-000-availability
   passed  2/7  test-gnutls-000-data-extractions
   passed  3/7  test-gnutls-001-hashes-internal-digests
   passed  4/7  test-gnutls-002-hashes-digests
   passed  5/7  test-gnutls-003-hashes-hmacs
   passed  6/7  test-gnutls-004-symmetric-ciphers
   passed  7/7  test-gnutls-005-aead-ciphers
root@611652fe3513:/tmp/emacs/test/lisp/net# ../../../src/emacs --version
GNU Emacs 26.0.50


[-- Attachment #2: Dockerfile.multistage --]
[-- Type: text/plain, Size: 1934 bytes --]

ARG BASE_IMAGE=ubuntu:17.04
# Multi-stage building requires Docker 17.05 or later
FROM ${BASE_IMAGE} as gnutls-builder

ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y -o=Dpkg::Use-Pty=0 \
 autoconf \
 autogen \
 automake \
 autopoint \
 bison \
 dns-root-data \
 gawk \
 gettext \
 git-core \
 gperf \
 gtk-doc-tools \
 guile-2.0-dev \
 help2man \
 libidn2-0-dev \
 libp11-kit-dev \
 libtasn1-6-dev \
 libtool \
 libtspi-dev \
 libunbound-dev \
 libunistring-dev \
 nettle-dev \
 texinfo \
 texlive \
 texlive-extra-utils \
 texlive-generic-recommended

ARG GNUTLS_REPOSITORY="https://gitlab.com/gnutls/gnutls.git"
ARG GNUTLS_BRANCH="master"
# override to "check" here to run the GnuTLS tests
ARG GNUTLS_EXTRA_TARGETS=""

RUN git clone --depth 1 --branch ${GNUTLS_BRANCH} ${GNUTLS_REPOSITORY} /tmp/gnutls

WORKDIR /tmp/gnutls
RUN git submodule update --init
RUN make bootstrap autoreconf

RUN mkdir -p /tmp/gnutls/build
WORKDIR /tmp/gnutls/build
RUN ../configure --prefix=/usr --disable-doc --disable-guile --disable-cxx --disable-full-test-suite
RUN make -j8 ${GNUTLS_EXTRA_TARGETS}

FROM gnutls-builder as emacs-builder

ENV DEBIAN_FRONTEND noninteractive
# Install dependencies
RUN apt-get update && apt-get install --no-install-recommends -y -o=Dpkg::Use-Pty=0 \
 curl \
 git

RUN apt-get build-dep -y emacs24

# Now install GnuTLS
COPY --from=gnutls-builder /tmp/gnutls /tmp/gnutls
WORKDIR /tmp/gnutls/build
RUN make install

# Build emacs
ARG EMACS_REPOSITORY="git://git.sv.gnu.org/emacs.git"
ARG EMACS_BRANCH="master"

RUN git clone --depth 1 --branch ${EMACS_BRANCH} ${EMACS_REPOSITORY} /tmp/emacs
WORKDIR /tmp/emacs
RUN ./autogen.sh
RUN ./configure
RUN make -j 8

FROM emacs-builder as emacs-tester

WORKDIR /tmp/emacs/test
RUN make

# Final image
FROM emacs-builder as emacs-final

WORKDIR /tmp/emacs
RUN make -j 8 install
RUN rm -rf /tmp/emacs /tmp/gnutls

WORKDIR /rootfs
ENTRYPOINT ["emacs"]

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-26  6:58                                                                                 ` Ted Zlatanov
@ 2017-07-26 14:52                                                                                   ` Eli Zaretskii
  2017-07-26 15:34                                                                                     ` Ted Zlatanov
  2017-07-27  0:19                                                                                   ` Paul Eggert
  1 sibling, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-07-26 14:52 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Wed, 26 Jul 2017 09:58:25 +0300
> 
> Using a Ubuntu 17.04 Docker image, I compiled GnuTLS from source (a
> GnuTLS 3.4.15 dev package was not available) and all the GnuTLS Emacs
> tests passed in master against it (see summary and Dockerfile attached
> below). A few of the Emacs tests failed in master, but that was not
> related to GnuTLS.
> 
> Can you give me a recipe to reproduce the problem with your distribution
> and packages?

Once again, I just run the gnutls-tests file, that's all.  You can run
it interactively, if you want.

Frankly, I don't quite understand why you want to reproduce it on your
machine, since it seems to be so hard for you.  Just tell me what you
need to know, and I will collect any data you need, because I can run
this under a debugger.

> root@611652fe3513:/tmp/emacs/test/lisp/net# grep passed gnutls-tests.log
>    passed  1/7  test-gnutls-000-availability
>    passed  2/7  test-gnutls-000-data-extractions
>    passed  3/7  test-gnutls-001-hashes-internal-digests
>    passed  4/7  test-gnutls-002-hashes-digests
>    passed  5/7  test-gnutls-003-hashes-hmacs
>    passed  6/7  test-gnutls-004-symmetric-ciphers
>    passed  7/7  test-gnutls-005-aead-ciphers

The last one fails for me, all the rest pass.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-26 14:52                                                                                   ` Eli Zaretskii
@ 2017-07-26 15:34                                                                                     ` Ted Zlatanov
  2017-07-26 15:49                                                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-26 15:34 UTC (permalink / raw)
  To: emacs-devel

On Wed, 26 Jul 2017 17:52:59 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> Can you give me a recipe to reproduce the problem with your distribution
>> and packages?

EZ> Once again, I just run the gnutls-tests file, that's all.  You can run
EZ> it interactively, if you want.

I did! With the same GnuTLS version as you! And it worked.

I don't have enough information (although you've been very helpful) to
know why it's failing for you. The tests work for me in many Docker
configurations I've tried, spending many hours to set them up. Also our
current build system passes them, I believe.

So I'm asking for more information about your environment so I can
reproduce it exactly.

EZ> Frankly, I don't quite understand why you want to reproduce it on your
EZ> machine, since it seems to be so hard for you.  Just tell me what you
EZ> need to know, and I will collect any data you need, because I can run
EZ> this under a debugger.

I'm sorry it's frustrating, but I just don't know what and where is
failing. There shouldn't be a failure where you get it. The AEAD ciphers
are a bit newer than the rest, but your GnuTLS release shouldn't have
any problems according to the ChangeLog.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-26 15:34                                                                                     ` Ted Zlatanov
@ 2017-07-26 15:49                                                                                       ` Eli Zaretskii
  2017-07-26 16:08                                                                                         ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-07-26 15:49 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Wed, 26 Jul 2017 18:34:49 +0300
> 
> >> Can you give me a recipe to reproduce the problem with your distribution
> >> and packages?
> 
> EZ> Once again, I just run the gnutls-tests file, that's all.  You can run
> EZ> it interactively, if you want.
> 
> I did! With the same GnuTLS version as you! And it worked.

Yes, I know.  But I do nothing else special.

> So I'm asking for more information about your environment so I can
> reproduce it exactly.

What information about the environment do you need, specifically?
The GnuTLS library is the same you can find on ezwinports, and Emacs
is a 32-bit build --with-wide-int.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-26 15:49                                                                                       ` Eli Zaretskii
@ 2017-07-26 16:08                                                                                         ` Ted Zlatanov
  2017-07-26 18:51                                                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-26 16:08 UTC (permalink / raw)
  To: emacs-devel

On Wed, 26 Jul 2017 18:49:51 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Wed, 26 Jul 2017 18:34:49 +0300
>> 
>> >> Can you give me a recipe to reproduce the problem with your distribution
>> >> and packages?
>> 
EZ> Once again, I just run the gnutls-tests file, that's all.  You can run
EZ> it interactively, if you want.
>> 
>> I did! With the same GnuTLS version as you! And it worked.

EZ> Yes, I know.  But I do nothing else special.

I'm sure. So it must be something about your environment.

>> So I'm asking for more information about your environment so I can
>> reproduce it exactly.

EZ> What information about the environment do you need, specifically?
EZ> The GnuTLS library is the same you can find on ezwinports, and Emacs
EZ> is a 32-bit build --with-wide-int.

I was hoping to reproduce it in a virtual machine. How could I do that?
I'm not knowledgeable about that platform anyway.

I'd love it if someone with access to that platform could help out.

Ted




^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-26 16:08                                                                                         ` Ted Zlatanov
@ 2017-07-26 18:51                                                                                           ` Eli Zaretskii
  2017-07-26 20:48                                                                                             ` Ted Zlatanov
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-07-26 18:51 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Wed, 26 Jul 2017 19:08:21 +0300
> 
> EZ> What information about the environment do you need, specifically?
> EZ> The GnuTLS library is the same you can find on ezwinports, and Emacs
> EZ> is a 32-bit build --with-wide-int.
> 
> I was hoping to reproduce it in a virtual machine. How could I do that?

I don't know.  Since no one else reported the same problem, perhaps
it's something specific to my local configuration, although I doubt
that.

> I'd love it if someone with access to that platform could help out.

I offer my help.  What would you like me to do, try, find out?  Just
ask.  Did the information I provided already give any clues, or any
leads, or any ideas for further tinkering?



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-26 18:51                                                                                           ` Eli Zaretskii
@ 2017-07-26 20:48                                                                                             ` Ted Zlatanov
  0 siblings, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-26 20:48 UTC (permalink / raw)
  To: emacs-devel

On Wed, 26 Jul 2017 21:51:30 +0300 Eli Zaretskii <eliz@gnu.org> wrote: 

EZ> I don't know.  Since no one else reported the same problem, perhaps
EZ> it's something specific to my local configuration, although I doubt
EZ> that.

Yes, I also think it's likely to be a real problem, that's why I've
spent a lot of time on it. But it seems to only occur on that platform.
Do you have a way to test with a newer GnuTLS release?

Is there anything in the code that looks suspicious to you? I've looked
it over several times and don't see how it could be causing garbage data.
Maybe the storage math is wrong somehow?

Can you trace the exact point at which the GnuTLS functions are called
on encryption and decryption?

  ret = ((encrypting ? gnutls_aead_cipher_encrypt : gnutls_aead_cipher_decrypt)
	 (acipher, vdata, vsize, aead_auth_data, aead_auth_size,
	  cipher_tag_size, idata, isize, storage, &storage_length));

Do those numbers match up with the cipher parameters?

  (:cipher-id 16 :type gnutls-symmetric-cipher :cipher-aead-capable t :cipher-tagsize 16 :cipher-blocksize 16 :cipher-keysize 32 :cipher-ivsize 12)

EZ> I offer my help.  What would you like me to do, try, find out?  Just
EZ> ask.  Did the information I provided already give any clues, or any
EZ> leads, or any ideas for further tinkering?

Your information was thorough but didn't tell me where the error is in
the Emacs code, and I don't yet know how to reproduce it.

Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-26  6:58                                                                                 ` Ted Zlatanov
  2017-07-26 14:52                                                                                   ` Eli Zaretskii
@ 2017-07-27  0:19                                                                                   ` Paul Eggert
  2017-07-27  2:34                                                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Paul Eggert @ 2017-07-27  0:19 UTC (permalink / raw)
  To: emacs-devel

Ted Zlatanov wrote:
> Can you give me a recipe to reproduce the problem with your distribution
> and packages?

I can reliably reproduce what appears to be this problem (or one closely related 
to it) on Ubuntu 16.04.2 LTS x86-64 when running Emacs master. My guess is that 
GnuTLS silently changed the semantics of the affected C functions. The 
problematic version of Ubuntu has GnuTLS 3.4.10. I do not observe a problem on 
Fedora 26, which has GnuTLS 3.5.14. If my guess is right, a simple workaround 
would be to not use the affected functions unless running GnuTLS 3.5.14 or later.

"make check" reports:

    passed  6/7  test-gnutls-004-symmetric-ciphers
Test test-gnutls-005-aead-ciphers backtrace:
   signal(ert-test-failed (((should (gnutls-tests-hexstring-equal input
   ert-fail(((should (gnutls-tests-hexstring-equal input reverse)) :for
   (if (unwind-protect (setq value-152 (apply fn-150 args-151)) (setq f
   (let (form-description-154) (if (unwind-protect (setq value-152 (app
   (let ((value-152 'ert-form-evaluation-aborted-153)) (let (form-descr
   (let ((fn-150 (function gnutls-tests-hexstring-equal)) (args-151 (li
   (let* ((iv (gnutls-tests-pad-or-trim iv (plist-get cplist :cipher-iv
   (while --dolist-tail-- (setq iv (car --dolist-tail--)) (gnutls-tests
   (let ((--dolist-tail-- (append ivs (list (list 'iv-auto ivsize)))) i
   (let* ((cplist (cdr (assq cipher (gnutls-ciphers)))) (key (gnutls-te
   (while --dolist-tail-- (setq key (car --dolist-tail--)) (let* ((cpli
   (let ((--dolist-tail-- keys) key) (while --dolist-tail-- (setq key (
   (while --dolist-tail-- (setq auth (car --dolist-tail--)) (let ((--do
   (let ((--dolist-tail-- auths) auth) (while --dolist-tail-- (setq aut
   (while --dolist-tail-- (setq input (car --dolist-tail--)) (let ((--d
   (let ((--dolist-tail-- inputs) input) (while --dolist-tail-- (setq i
   (while --dolist-tail-- (setq cipher (car --dolist-tail--)) (let ((--
   (let ((--dolist-tail-- ciphers) cipher) (while --dolist-tail-- (setq
   (let ((keys '("mykey" "mykey2")) (inputs gnutls-tests-mondo-strings)
   (lambda nil (let ((fn-120 (function memq)) (args-121 (list 'AEAD-cip
   ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
   ert-run-test(#s(ert-test :name test-gnutls-005-aead-ciphers :documen
   ert-run-or-rerun-test(#s(ert--stats :selector (not (tag :expensive-t
   ert-run-tests((not (tag :expensive-test)) #f(compiled-function (even
   ert-run-tests-batch((not (tag :expensive-test)))
   ert-run-tests-batch-and-exit((not (tag :expensive-test)))
   eval((ert-run-tests-batch-and-exit '(not (tag :expensive-test))))
   command-line-1(("-L" ":." "-l" "ert" "-l" "lisp/net/gnutls-tests.el"
   command-line()
   normal-top-level()
Test test-gnutls-005-aead-ciphers condition:
     (ert-test-failed
      ((should
        (gnutls-tests-hexstring-equal input reverse))
       :form
       (gnutls-tests-hexstring-equal "                " " 
\240~\267Q\377^?^@^@^P^@^@^@^@^@^@^@")
       :value nil))
    FAILED  7/7  test-gnutls-005-aead-ciphers


In the above report, the "^?^@^@^P^@^@^@^@^@^@^@" stands for a sequence of 11 
control characters.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-27  0:19                                                                                   ` Paul Eggert
@ 2017-07-27  2:34                                                                                     ` Eli Zaretskii
  2017-07-27  4:36                                                                                       ` Paul Eggert
  0 siblings, 1 reply; 128+ messages in thread
From: Eli Zaretskii @ 2017-07-27  2:34 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 26 Jul 2017 17:19:10 -0700
> 
> Ted Zlatanov wrote:
> > Can you give me a recipe to reproduce the problem with your distribution
> > and packages?
> 
> I can reliably reproduce what appears to be this problem (or one closely related 
> to it) on Ubuntu 16.04.2 LTS x86-64 when running Emacs master. My guess is that 
> GnuTLS silently changed the semantics of the affected C functions. The 
> problematic version of Ubuntu has GnuTLS 3.4.10. I do not observe a problem on 
> Fedora 26, which has GnuTLS 3.5.14. If my guess is right, a simple workaround 
> would be to not use the affected functions unless running GnuTLS 3.5.14 or later.

It would be a pity to give up these functions if we can make the code
work with some conditional code.

Can you tell what change in semantics could explain this failure?

Thanks.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-27  2:34                                                                                     ` Eli Zaretskii
@ 2017-07-27  4:36                                                                                       ` Paul Eggert
  2017-07-27 15:56                                                                                         ` Ted Zlatanov
  2017-08-03  8:02                                                                                         ` Paul Eggert
  0 siblings, 2 replies; 128+ messages in thread
From: Paul Eggert @ 2017-07-27  4:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii wrote:
> Can you tell what change in semantics could explain this failure?

In principle, yes. But it would take some time for me to come up to speed on GnuTLS.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-27  4:36                                                                                       ` Paul Eggert
@ 2017-07-27 15:56                                                                                         ` Ted Zlatanov
  2017-08-03 19:52                                                                                           ` Ted Zlatanov
  2017-08-03  8:02                                                                                         ` Paul Eggert
  1 sibling, 1 reply; 128+ messages in thread
From: Ted Zlatanov @ 2017-07-27 15:56 UTC (permalink / raw)
  To: emacs-devel

On Wed, 26 Jul 2017 21:36:31 -0700 Paul Eggert <eggert@cs.ucla.edu> wrote: 

PE> Eli Zaretskii wrote:
>> Can you tell what change in semantics could explain this failure?

PE> In principle, yes. But it would take some time for me to come up to speed on GnuTLS.

Thanks so much for the hint, Paul. I was able to put together a Docker
image using Ubuntu 16.04 that reproduces the problem. It happens with
the OS packages of libgnutls-dev.

I'll try to put more time into chasing this, since 16.04 is a LTS
release and just requiring a newer GnuTLS will inconvenience users that
compile there. Maybe we'll have to disable AEAD ciphers there, but I
hope to find a better solution. If you have more insight into the
specific breakage, let me know, and I'll post as soon as I have useful
information.

Thanks again
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-27  4:36                                                                                       ` Paul Eggert
  2017-07-27 15:56                                                                                         ` Ted Zlatanov
@ 2017-08-03  8:02                                                                                         ` Paul Eggert
  2017-08-03 16:49                                                                                           ` Eli Zaretskii
  1 sibling, 1 reply; 128+ messages in thread
From: Paul Eggert @ 2017-08-03  8:02 UTC (permalink / raw)
  To: Emacs Development

[-- Attachment #1: Type: text/plain, Size: 308 bytes --]

Paul Eggert wrote:
> it would take some time for me to come up to speed on GnuTLS.

Although I'm still not up to speed at all, I think I tracked the problem down. I 
installed the attached patch which fixes things for me on Ubuntu 16.04.2 LTS, by 
having Emacs use AEAD only on GnuTLS 3.5.1 and later.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Port-GnuTLS-usage-to-Ubuntu-16.04.2-LTS.patch --]
[-- Type: text/x-patch; name="0001-Port-GnuTLS-usage-to-Ubuntu-16.04.2-LTS.patch", Size: 1380 bytes --]

From fcd7c6b8605dbfb0c38da39a50d5ac882c2c0ba3 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 3 Aug 2017 01:00:10 -0700
Subject: [PATCH] Port GnuTLS usage to Ubuntu 16.04.2 LTS

* src/gnutls.h (HAVE_GNUTLS3_AEAD): Define only if GnuTLS 3.5.1 or
later, as opposed to the old 3.4.0 or later.
---
 src/gnutls.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/gnutls.h b/src/gnutls.h
index 19c1686..8fe4ac3 100644
--- a/src/gnutls.h
+++ b/src/gnutls.h
@@ -29,12 +29,21 @@ along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.  */
 #endif
 
 #if 0x030400 <= GNUTLS_VERSION_NUMBER
-# define HAVE_GNUTLS3_AEAD
 # define HAVE_GNUTLS3_CIPHER
 # define HAVE_GNUTLS3_DIGEST
 # define HAVE_GNUTLS3_HMAC
 #endif
 
+/* Although AEAD support started in GnuTLS 3.4.0 and works in 3.5.14,
+   it was broken through at least GnuTLS 3.4.10; see:
+   https://lists.gnu.org/archive/html/emacs-devel/2017-07/msg00992.html
+   The relevant fix seems to have been made in GnuTLS 3.5.1; see:
+   https://gitlab.com/gnutls/gnutls/commit/568935848dd6b82b9315d8b6c529d00e2605e03d
+   So use 3.5.1 for now.  */
+#if 0x030501 <= GNUTLS_VERSION_NUMBER
+# define HAVE_GNUTLS3_AEAD
+#endif
+
 #include "lisp.h"
 
 /* This limits the attempts to handshake per process (connection).  It
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-08-03  8:02                                                                                         ` Paul Eggert
@ 2017-08-03 16:49                                                                                           ` Eli Zaretskii
  0 siblings, 0 replies; 128+ messages in thread
From: Eli Zaretskii @ 2017-08-03 16:49 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Thu, 3 Aug 2017 01:02:13 -0700
> 
> Although I'm still not up to speed at all, I think I tracked the problem down. I 
> installed the attached patch which fixes things for me on Ubuntu 16.04.2 LTS, by 
> having Emacs use AEAD only on GnuTLS 3.5.1 and later.

The tests pass for me now, of course, but it would be a pity to lose
this functionality entirely for 3.4.X, if we can somehow do better.

Thanks.



^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: libnettle/libhogweed WIP
  2017-07-27 15:56                                                                                         ` Ted Zlatanov
@ 2017-08-03 19:52                                                                                           ` Ted Zlatanov
  0 siblings, 0 replies; 128+ messages in thread
From: Ted Zlatanov @ 2017-08-03 19:52 UTC (permalink / raw)
  To: emacs-devel

On Thu, 27 Jul 2017 18:56:08 +0300 Ted Zlatanov <tzz@lifelogs.com> wrote: 

TZ> On Wed, 26 Jul 2017 21:36:31 -0700 Paul Eggert <eggert@cs.ucla.edu> wrote: 
PE> Eli Zaretskii wrote:
>>> Can you tell what change in semantics could explain this failure?

PE> In principle, yes. But it would take some time for me to come up to speed on GnuTLS.

TZ> Thanks so much for the hint, Paul. I was able to put together a Docker
TZ> image using Ubuntu 16.04 that reproduces the problem. It happens with
TZ> the OS packages of libgnutls-dev.

TZ> I'll try to put more time into chasing this, since 16.04 is a LTS
TZ> release and just requiring a newer GnuTLS will inconvenience users that
TZ> compile there. Maybe we'll have to disable AEAD ciphers there, but I
TZ> hope to find a better solution. If you have more insight into the
TZ> specific breakage, let me know, and I'll post as soon as I have useful
TZ> information.

I'm finally back from traveling and found that this morning Paul pushed
a fix to block AEAD ciphers in 3.4.x GnuTLS, citing a fix made a year
ago in 3.5.x. Since we can't easily do more on the Emacs side, I agree
with the fix and it should resolve the problem for you and other 3.4.x
users by avoiding AEAD tests. Sorry this took so long from my side.

Thanks
Ted

^ permalink raw reply	[flat|nested] 128+ messages in thread

end of thread, other threads:[~2017-08-03 19:52 UTC | newest]

Thread overview: 128+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-20 10:00 How to ship native modules? Elias Mårtenson
2017-02-20 15:27 ` Eli Zaretskii
2017-02-20 16:01   ` Elias Mårtenson
2017-02-20 16:30     ` Eli Zaretskii
2017-02-21  2:48       ` Elias Mårtenson
2017-02-21  3:41         ` Eli Zaretskii
2017-02-21  4:13           ` Elias Mårtenson
2017-02-21 16:48             ` Eli Zaretskii
2017-02-21 20:06               ` John Wiegley
2017-02-21 14:44       ` Stefan Monnier
     [not found]         ` <CADtN0WLjNcFRLCsJNZX+XfqOcq+veTaoGkwHQCV9bjvuQoEORA@mail.gmail.com>
2017-02-21 15:48           ` Elias Mårtenson
2017-02-21 17:14             ` Stefan Monnier
2017-02-21 16:59         ` Eli Zaretskii
2017-03-02 14:59   ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Ted Zlatanov
2017-03-02 15:19     ` request to reconsider libnettle/libhogweed Stefan Monnier
2017-03-02 15:55     ` request to reconsider libnettle/libhogweed (was: How to ship native modules?) Eli Zaretskii
2017-03-15 21:19       ` libnettle/libhogweed WIP (was: request to reconsider libnettle/libhogweed) Ted Zlatanov
2017-03-16 15:28         ` Eli Zaretskii
2017-03-17 22:46           ` libnettle/libhogweed WIP Ted Zlatanov
2017-03-18  8:12             ` Eli Zaretskii
2017-03-20 18:45           ` Ted Zlatanov
2017-04-11 20:05           ` Ted Zlatanov
2017-04-14 20:48             ` Ted Zlatanov
2017-04-15  9:32               ` Eli Zaretskii
2017-04-15 14:27                 ` Ted Zlatanov
2017-04-15 14:55                   ` Eli Zaretskii
2017-04-16  2:39                     ` Ted Zlatanov
2017-04-16  6:25                       ` Eli Zaretskii
2017-04-16  6:51                       ` Eli Zaretskii
2017-04-17 16:23                         ` Ted Zlatanov
2017-04-17 16:34                           ` Eli Zaretskii
2017-04-17 16:55                             ` Ted Zlatanov
2017-04-17 17:11                               ` Eli Zaretskii
2017-04-17 17:34                                 ` Ted Zlatanov
2017-04-17 17:46                                   ` Ted Zlatanov
2017-04-17 18:11                                   ` Eli Zaretskii
2017-04-17 20:50                               ` Ted Zlatanov
2017-04-17 21:19                                 ` Noam Postavsky
2017-04-17 23:29                                   ` Ted Zlatanov
2017-04-19  2:08                                     ` Ted Zlatanov
2017-04-19  2:42                                       ` Noam Postavsky
2017-04-19 15:24                                       ` Davis Herring
2017-04-19 15:45                                       ` Eli Zaretskii
2017-04-20 17:24                                         ` Ted Zlatanov
2017-04-20 19:38                                           ` Eli Zaretskii
2017-04-20 20:24                                             ` Ted Zlatanov
2017-04-20 20:42                                               ` Lars Ingebrigtsen
2017-04-20 21:54                                                 ` Ted Zlatanov
2017-04-21  6:21                                                   ` Eli Zaretskii
2017-04-21 18:45                                                   ` Lars Ingebrigtsen
2017-04-21 19:15                                                     ` Eli Zaretskii
2017-04-21  6:14                                               ` Eli Zaretskii
2017-05-15 21:55                                                 ` Ted Zlatanov
2017-05-16 22:19                                                   ` Ted Zlatanov
2017-05-17 16:22                                                   ` Eli Zaretskii
2017-05-17 20:05                                                     ` Ted Zlatanov
2017-05-31 18:17                                                       ` Ted Zlatanov
2017-06-03  7:23                                                         ` Eli Zaretskii
2017-06-03  9:00                                                           ` Andreas Schwab
2017-06-03 10:01                                                             ` Eli Zaretskii
2017-06-03 10:09                                                               ` Andreas Schwab
2017-06-03 10:47                                                                 ` Eli Zaretskii
2017-06-27 22:58                                                           ` Ted Zlatanov
2017-06-28 16:54                                                             ` Eli Zaretskii
2017-06-28 19:44                                                               ` Ted Zlatanov
2017-07-13 18:35                                                                 ` Ted Zlatanov
2017-07-14 15:10                                                                   ` Ted Zlatanov
2017-07-14 19:04                                                                     ` Eli Zaretskii
2017-07-14 19:43                                                                       ` Ted Zlatanov
2017-07-14 20:04                                                                         ` Eli Zaretskii
2017-07-15 18:30                                                                           ` Ted Zlatanov
2017-07-15  9:15                                                                         ` Eli Zaretskii
2017-07-15 18:40                                                                           ` Ted Zlatanov
2017-07-15 19:12                                                                             ` Eli Zaretskii
2017-07-22  9:10                                                                               ` Eli Zaretskii
2017-07-26  6:58                                                                                 ` Ted Zlatanov
2017-07-26 14:52                                                                                   ` Eli Zaretskii
2017-07-26 15:34                                                                                     ` Ted Zlatanov
2017-07-26 15:49                                                                                       ` Eli Zaretskii
2017-07-26 16:08                                                                                         ` Ted Zlatanov
2017-07-26 18:51                                                                                           ` Eli Zaretskii
2017-07-26 20:48                                                                                             ` Ted Zlatanov
2017-07-27  0:19                                                                                   ` Paul Eggert
2017-07-27  2:34                                                                                     ` Eli Zaretskii
2017-07-27  4:36                                                                                       ` Paul Eggert
2017-07-27 15:56                                                                                         ` Ted Zlatanov
2017-08-03 19:52                                                                                           ` Ted Zlatanov
2017-08-03  8:02                                                                                         ` Paul Eggert
2017-08-03 16:49                                                                                           ` Eli Zaretskii
2017-04-18 17:44                                 ` Ted Zlatanov
2017-04-19 12:22                               ` Stefan Monnier
2017-04-19 13:38                                 ` Ted Zlatanov
2017-04-19 14:16                                 ` Lars Ingebrigtsen
2017-04-19 14:48                                   ` Stefan Monnier
2017-04-19 14:41                                 ` Eli Zaretskii
2017-04-19 14:54                                   ` Stefan Monnier
2017-04-19 15:31                                     ` Eli Zaretskii
2017-04-19 15:48                                   ` Ted Zlatanov
2017-04-19 16:49                                     ` Lars Ingebrigtsen
2017-04-19 17:24                                       ` Eli Zaretskii
2017-04-19 19:53                                         ` Stefan Monnier
2017-04-20  2:30                                           ` Eli Zaretskii
2017-04-20  3:36                                             ` Stefan Monnier
2017-04-20 15:46                                               ` Eli Zaretskii
2017-04-20 15:59                                                 ` Lars Ingebrigtsen
2017-04-20 16:24                                                   ` Eli Zaretskii
2017-04-20 17:25                                                     ` Stefan Monnier
2017-04-20 19:40                                                       ` Lars Ingebrigtsen
2017-04-20 20:31                                                         ` Eli Zaretskii
2017-04-20 19:58                                                       ` Eli Zaretskii
2017-04-20 20:36                                                         ` Eli Zaretskii
2017-04-20 17:14                                                 ` Stefan Monnier
2017-04-20 19:29                                                   ` Eli Zaretskii
2017-04-19 19:49                                       ` Stefan Monnier
2017-04-17 16:00                       ` rename STRING_SET_CHARS to STRING_SET_SIZE (was: libnettle/libhogweed WIP) Ted Zlatanov
2017-04-17 16:24                         ` rename STRING_SET_CHARS to STRING_SET_SIZE Eli Zaretskii
2017-04-17 16:29                         ` Stefan Monnier
2017-04-17 16:34                           ` Ted Zlatanov
2017-04-16  3:37                     ` libnettle/libhogweed WIP Stefan Monnier
2017-04-16  6:19                       ` Eli Zaretskii
2017-04-16 13:20                         ` Stefan Monnier
2017-04-16  7:47               ` Toon Claes
2017-03-02 17:58     ` request to reconsider libnettle/libhogweed Paul Eggert
2017-03-02 18:33       ` Ted Zlatanov
2017-02-20 15:33 ` How to ship native modules? Aurélien Aptel
2017-02-21  4:50 ` Andreas Politz
2017-02-21  5:12   ` Elias Mårtenson
2017-02-21  5:23     ` Andreas Politz

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).