unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH] Use libgcrypt for hashing.
@ 2009-11-28  3:28 Jeffrey C. Ollie
  2009-11-28  3:31 ` Mikhail Gusarov
  2009-11-28  6:22 ` Carl Worth
  0 siblings, 2 replies; 12+ messages in thread
From: Jeffrey C. Ollie @ 2009-11-28  3:28 UTC (permalink / raw)
  To: Not Much Mail

Instead of including a private implementation of the SHA1 hash, use
libgcrypt.  This means less code of our own to maintain and it will be
easier to switch to a different hash function like SHA256.

libgcrypt was chosen because it has a fairly simple API, it's well
tested (it's used in gnutls and gnupg2), and it's licensed under the
LGPL.

Signed-off-by: Jeffrey C. Ollie <jeff@ocjtech.us>
---
 Makefile           |    6 +-
 configure          |   15 +++-
 lib/Makefile.local |    1 -
 lib/libsha1.c      |  242 ----------------------------------------------------
 lib/libsha1.h      |   67 --------------
 lib/sha1.c         |   44 +++++-----
 6 files changed, 42 insertions(+), 333 deletions(-)
 delete mode 100644 lib/libsha1.c
 delete mode 100644 lib/libsha1.h

diff --git a/Makefile b/Makefile
index 2cd1b1b..a5ff3e7 100644
--- a/Makefile
+++ b/Makefile
@@ -10,7 +10,8 @@ gzip = gzip
 
 # Additional flags that we will append to whatever the user set.
 # These aren't intended for the user to manipulate.
-extra_cflags := $(shell pkg-config --cflags glib-2.0 gmime-2.4 talloc)
+extra_cflags := $(shell pkg-config --cflags glib-2.0 gmime-2.4 talloc) \
+	$(shell libgcrypt-config --cflags)
 extra_cxxflags := $(shell xapian-config --cxxflags)
 
 emacs_lispdir := $(shell pkg-config emacs --variable sitepkglispdir)
@@ -28,7 +29,8 @@ override CXXFLAGS += $(WARN_FLAGS) $(extra_cflags) $(extra_cxxflags)
 
 override LDFLAGS += \
 	$(shell pkg-config --libs glib-2.0 gmime-2.4 talloc) \
-	$(shell xapian-config --libs)
+	$(shell xapian-config --libs) \
+	$(shell libgcrypt-config --libs)
 
 # Include our local Makefile.local first so that its first target is default
 include Makefile.local
diff --git a/configure b/configure
index b4770ec..22f8066 100755
--- a/configure
+++ b/configure
@@ -61,6 +61,15 @@ else
     have_valgrind=
 fi
 
+if libgcrypt-config --version > /dev/null 2>&1; then
+    echo "Checking for libgcrypt development files... Yes."
+    have_libgcrypt=1
+else
+    echo "Checking for libgcrypt development files... No."
+    have_libgcrypt=0
+    errors=$((errors + 1))
+fi
+
 if [ $errors -gt 0 ]; then
     cat <<EOF
 
@@ -81,13 +90,17 @@ EOF
 	echo "	The talloc library (including development files such as headers)"
 	echo "	http://talloc.samba.org/"
     fi
+    if [ $have_libgcrypt -eq 0 ]; then
+	echo "	The libgcrypt library (including development files such as headers)"
+	echo "	http://directory.fsf.org/project/libgcrypt/"
+    fi
     cat <<EOF
 
 On a modern, package-based operating system such as Debian, you can
 install all of the dependencies with the following simple command
 line:
 
-	sudo apt-get install libxapian-dev libgmime-2.4-dev libtalloc-dev
+	sudo apt-get install libxapian-dev libgmime-2.4-dev libtalloc-dev libgcrypt-dev
 
 On other systems, a similar command can be used, but the details of the 
 package names may be different, (such as "devel" in place of "dev").
diff --git a/lib/Makefile.local b/lib/Makefile.local
index a7562c9..28c024c 100644
--- a/lib/Makefile.local
+++ b/lib/Makefile.local
@@ -2,7 +2,6 @@ dir=lib
 extra_cflags += -I$(dir)
 
 libnotmuch_c_srcs =		\
-	$(dir)/libsha1.c	\
 	$(dir)/message-file.c	\
 	$(dir)/messages.c	\
 	$(dir)/sha1.c		\
diff --git a/lib/libsha1.c b/lib/libsha1.c
deleted file mode 100644
index c39a5a1..0000000
--- a/lib/libsha1.c
+++ /dev/null
@@ -1,242 +0,0 @@
-/*
- ---------------------------------------------------------------------------
- Copyright (c) 2002, Dr Brian Gladman, Worcester, UK.   All rights reserved.
-
- LICENSE TERMS
-
- The free distribution and use of this software in both source and binary
- form is allowed (with or without changes) provided that:
-
-   1. distributions of this source code include the above copyright
-      notice, this list of conditions and the following disclaimer;
-
-   2. distributions in binary form include the above copyright
-      notice, this list of conditions and the following disclaimer
-      in the documentation and/or other associated materials;
-
-   3. the copyright holder's name is not used to endorse products
-      built using this software without specific written permission.
-
- ALTERNATIVELY, provided that this notice is retained in full, this product
- may be distributed under the terms of the GNU General Public License (GPL),
- in which case the provisions of the GPL apply INSTEAD OF those given above.
-
- DISCLAIMER
-
- This software is provided 'as is' with no explicit or implied warranties
- in respect of its properties, including, but not limited to, correctness
- and/or fitness for purpose.
- ---------------------------------------------------------------------------
- Issue Date: 01/08/2005
-
- This is a byte oriented version of SHA1 that operates on arrays of bytes
- stored in memory.
-*/
-
-#include <string.h>     /* for memcpy() etc.        */
-
-#include "libsha1.h"
-
-#if defined(__cplusplus)
-extern "C"
-{
-#endif
-
-#define SHA1_BLOCK_SIZE  64
-
-#define rotl32(x,n)   (((x) << n) | ((x) >> (32 - n)))
-#define rotr32(x,n)   (((x) >> n) | ((x) << (32 - n)))
-
-#define bswap_32(x) ((rotr32((x), 24) & 0x00ff00ff) | (rotr32((x), 8) & 0xff00ff00))
-
-#if (PLATFORM_BYTE_ORDER == IS_LITTLE_ENDIAN)
-#define bsw_32(p,n) \
-    { int _i = (n); while(_i--) ((uint32_t*)p)[_i] = bswap_32(((uint32_t*)p)[_i]); }
-#else
-#define bsw_32(p,n)
-#endif
-
-#define SHA1_MASK   (SHA1_BLOCK_SIZE - 1)
-
-#if 0
-
-#define ch(x,y,z)       (((x) & (y)) ^ (~(x) & (z)))
-#define parity(x,y,z)   ((x) ^ (y) ^ (z))
-#define maj(x,y,z)      (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))
-
-#else   /* Discovered by Rich Schroeppel and Colin Plumb   */
-
-#define ch(x,y,z)       ((z) ^ ((x) & ((y) ^ (z))))
-#define parity(x,y,z)   ((x) ^ (y) ^ (z))
-#define maj(x,y,z)      (((x) & (y)) | ((z) & ((x) ^ (y))))
-
-#endif
-
-/* Compile 64 bytes of hash data into SHA1 context. Note    */
-/* that this routine assumes that the byte order in the     */
-/* ctx->wbuf[] at this point is in such an order that low   */
-/* address bytes in the ORIGINAL byte stream will go in     */
-/* this buffer to the high end of 32-bit words on BOTH big  */
-/* and little endian systems                                */
-
-#ifdef ARRAY
-#define q(v,n)  v[n]
-#else
-#define q(v,n)  v##n
-#endif
-
-#define one_cycle(v,a,b,c,d,e,f,k,h)            \
-    q(v,e) += rotr32(q(v,a),27) +               \
-              f(q(v,b),q(v,c),q(v,d)) + k + h;  \
-    q(v,b)  = rotr32(q(v,b), 2)
-
-#define five_cycle(v,f,k,i)                 \
-    one_cycle(v, 0,1,2,3,4, f,k,hf(i  ));   \
-    one_cycle(v, 4,0,1,2,3, f,k,hf(i+1));   \
-    one_cycle(v, 3,4,0,1,2, f,k,hf(i+2));   \
-    one_cycle(v, 2,3,4,0,1, f,k,hf(i+3));   \
-    one_cycle(v, 1,2,3,4,0, f,k,hf(i+4))
-
-static void sha1_compile(sha1_ctx ctx[1])
-{   uint32_t    *w = ctx->wbuf;
-
-#ifdef ARRAY
-    uint32_t    v[5];
-    memcpy(v, ctx->hash, 5 * sizeof(uint32_t));
-#else
-    uint32_t    v0, v1, v2, v3, v4;
-    v0 = ctx->hash[0]; v1 = ctx->hash[1];
-    v2 = ctx->hash[2]; v3 = ctx->hash[3];
-    v4 = ctx->hash[4];
-#endif
-
-#define hf(i)   w[i]
-
-    five_cycle(v, ch, 0x5a827999,  0);
-    five_cycle(v, ch, 0x5a827999,  5);
-    five_cycle(v, ch, 0x5a827999, 10);
-    one_cycle(v,0,1,2,3,4, ch, 0x5a827999, hf(15)); \
-
-#undef  hf
-#define hf(i) (w[(i) & 15] = rotl32(                    \
-                 w[((i) + 13) & 15] ^ w[((i) + 8) & 15] \
-               ^ w[((i) +  2) & 15] ^ w[(i) & 15], 1))
-
-    one_cycle(v,4,0,1,2,3, ch, 0x5a827999, hf(16));
-    one_cycle(v,3,4,0,1,2, ch, 0x5a827999, hf(17));
-    one_cycle(v,2,3,4,0,1, ch, 0x5a827999, hf(18));
-    one_cycle(v,1,2,3,4,0, ch, 0x5a827999, hf(19));
-
-    five_cycle(v, parity, 0x6ed9eba1,  20);
-    five_cycle(v, parity, 0x6ed9eba1,  25);
-    five_cycle(v, parity, 0x6ed9eba1,  30);
-    five_cycle(v, parity, 0x6ed9eba1,  35);
-
-    five_cycle(v, maj, 0x8f1bbcdc,  40);
-    five_cycle(v, maj, 0x8f1bbcdc,  45);
-    five_cycle(v, maj, 0x8f1bbcdc,  50);
-    five_cycle(v, maj, 0x8f1bbcdc,  55);
-
-    five_cycle(v, parity, 0xca62c1d6,  60);
-    five_cycle(v, parity, 0xca62c1d6,  65);
-    five_cycle(v, parity, 0xca62c1d6,  70);
-    five_cycle(v, parity, 0xca62c1d6,  75);
-
-#ifdef ARRAY
-    ctx->hash[0] += v[0]; ctx->hash[1] += v[1];
-    ctx->hash[2] += v[2]; ctx->hash[3] += v[3];
-    ctx->hash[4] += v[4];
-#else
-    ctx->hash[0] += v0; ctx->hash[1] += v1;
-    ctx->hash[2] += v2; ctx->hash[3] += v3;
-    ctx->hash[4] += v4;
-#endif
-}
-
-void sha1_begin(sha1_ctx ctx[1])
-{
-    ctx->count[0] = ctx->count[1] = 0;
-    ctx->hash[0] = 0x67452301;
-    ctx->hash[1] = 0xefcdab89;
-    ctx->hash[2] = 0x98badcfe;
-    ctx->hash[3] = 0x10325476;
-    ctx->hash[4] = 0xc3d2e1f0;
-}
-
-/* SHA1 hash data in an array of bytes into hash buffer and */
-/* call the hash_compile function as required.              */
-
-void sha1_hash(const unsigned char data[], unsigned long len, sha1_ctx ctx[1])
-{   uint32_t pos = (uint32_t)(ctx->count[0] & SHA1_MASK),
-            space = SHA1_BLOCK_SIZE - pos;
-    const unsigned char *sp = data;
-
-    if((ctx->count[0] += len) < len)
-        ++(ctx->count[1]);
-
-    while(len >= space)     /* tranfer whole blocks if possible  */
-    {
-        memcpy(((unsigned char*)ctx->wbuf) + pos, sp, space);
-        sp += space; len -= space; space = SHA1_BLOCK_SIZE; pos = 0;
-        bsw_32(ctx->wbuf, SHA1_BLOCK_SIZE >> 2);
-        sha1_compile(ctx);
-    }
-
-    memcpy(((unsigned char*)ctx->wbuf) + pos, sp, len);
-}
-
-/* SHA1 final padding and digest calculation  */
-
-void sha1_end(unsigned char hval[], sha1_ctx ctx[1])
-{   uint32_t    i = (uint32_t)(ctx->count[0] & SHA1_MASK);
-
-    /* put bytes in the buffer in an order in which references to   */
-    /* 32-bit words will put bytes with lower addresses into the    */
-    /* top of 32 bit words on BOTH big and little endian machines   */
-    bsw_32(ctx->wbuf, (i + 3) >> 2);
-
-    /* we now need to mask valid bytes and add the padding which is */
-    /* a single 1 bit and as many zero bits as necessary. Note that */
-    /* we can always add the first padding byte here because the    */
-    /* buffer always has at least one empty slot                    */
-    ctx->wbuf[i >> 2] &= 0xffffff80 << 8 * (~i & 3);
-    ctx->wbuf[i >> 2] |= 0x00000080 << 8 * (~i & 3);
-
-    /* we need 9 or more empty positions, one for the padding byte  */
-    /* (above) and eight for the length count. If there is not      */
-    /* enough space, pad and empty the buffer                       */
-    if(i > SHA1_BLOCK_SIZE - 9)
-    {
-        if(i < 60) ctx->wbuf[15] = 0;
-        sha1_compile(ctx);
-        i = 0;
-    }
-    else    /* compute a word index for the empty buffer positions  */
-        i = (i >> 2) + 1;
-
-    while(i < 14) /* and zero pad all but last two positions        */
-        ctx->wbuf[i++] = 0;
-
-    /* the following 32-bit length fields are assembled in the      */
-    /* wrong byte order on little endian machines but this is       */
-    /* corrected later since they are only ever used as 32-bit      */
-    /* word values.                                                 */
-    ctx->wbuf[14] = (ctx->count[1] << 3) | (ctx->count[0] >> 29);
-    ctx->wbuf[15] = ctx->count[0] << 3;
-    sha1_compile(ctx);
-
-    /* extract the hash value as bytes in case the hash buffer is   */
-    /* misaligned for 32-bit words                                  */
-    for(i = 0; i < SHA1_DIGEST_SIZE; ++i)
-        hval[i] = (unsigned char)(ctx->hash[i >> 2] >> (8 * (~i & 3)));
-}
-
-void sha1(unsigned char hval[], const unsigned char data[], unsigned long len)
-{   sha1_ctx    cx[1];
-
-    sha1_begin(cx); sha1_hash(data, len, cx); sha1_end(hval, cx);
-}
-
-#if defined(__cplusplus)
-}
-#endif
diff --git a/lib/libsha1.h b/lib/libsha1.h
deleted file mode 100644
index b4dca93..0000000
--- a/lib/libsha1.h
+++ /dev/null
@@ -1,67 +0,0 @@
-/*
- ---------------------------------------------------------------------------
- Copyright (c) 2002, Dr Brian Gladman, Worcester, UK.   All rights reserved.
-
- LICENSE TERMS
-
- The free distribution and use of this software in both source and binary
- form is allowed (with or without changes) provided that:
-
-   1. distributions of this source code include the above copyright
-      notice, this list of conditions and the following disclaimer;
-
-   2. distributions in binary form include the above copyright
-      notice, this list of conditions and the following disclaimer
-      in the documentation and/or other associated materials;
-
-   3. the copyright holder's name is not used to endorse products
-      built using this software without specific written permission.
-
- ALTERNATIVELY, provided that this notice is retained in full, this product
- may be distributed under the terms of the GNU General Public License (GPL),
- in which case the provisions of the GPL apply INSTEAD OF those given above.
-
- DISCLAIMER
-
- This software is provided 'as is' with no explicit or implied warranties
- in respect of its properties, including, but not limited to, correctness
- and/or fitness for purpose.
- ---------------------------------------------------------------------------
- Issue Date: 01/08/2005
-*/
-
-#ifndef _SHA1_H
-#define _SHA1_H
-
-#if defined(__cplusplus)
-extern "C"
-{
-#endif
-#if 0
-} /* Appleasing Emacs */
-#endif
-
-#include <stdint.h>
-
-/* Size of SHA1 digest */
-
-#define SHA1_DIGEST_SIZE 20
-
-/* type to hold the SHA1 context  */
-
-typedef struct
-{   uint32_t count[2];
-    uint32_t hash[5];
-    uint32_t wbuf[16];
-} sha1_ctx;
-
-void sha1_begin(sha1_ctx ctx[1]);
-void sha1_hash(const unsigned char data[], unsigned long len, sha1_ctx ctx[1]);
-void sha1_end(unsigned char hval[], sha1_ctx ctx[1]);
-void sha1(unsigned char hval[], const unsigned char data[], unsigned long len);
-
-#if defined(__cplusplus)
-}
-#endif
-
-#endif
diff --git a/lib/sha1.c b/lib/sha1.c
index cc48108..0d22cac 100644
--- a/lib/sha1.c
+++ b/lib/sha1.c
@@ -20,21 +20,24 @@
 
 #include "notmuch-private.h"
 
-#include "libsha1.h"
+#include <gcrypt.h>
 
-/* Just some simple interfaces on top of libsha1 so that we can leave
- * libsha1 as untouched as possible. */
+/* Just some simple interfaces on top of libgcrypt */
 
 static char *
-_hex_of_sha1_digest (const unsigned char digest[SHA1_DIGEST_SIZE])
+_hex_of_sha1_digest (gcry_md_hd_t c)
 {
     char *result, *r;
+    unsigned char *digest;
     int i;
+    int digest_size;
 
-    result = xcalloc (SHA1_DIGEST_SIZE * 2 + 1, 1);
+    digest_size = gcry_md_get_algo_dlen(gcry_md_get_algo(c));
+    digest = gcry_md_read(c, 0);
+    result = xcalloc (digest_size * 2 + 1, 1);
 
     for (r = result, i = 0;
-	 i < SHA1_DIGEST_SIZE;
+	 i < digest_size;
 	 r += 2, i++)
     {
 	sprintf (r, "%02x", digest[i]);
@@ -52,16 +55,16 @@ _hex_of_sha1_digest (const unsigned char digest[SHA1_DIGEST_SIZE])
 char *
 notmuch_sha1_of_string (const char *str)
 {
-    sha1_ctx sha1;
-    unsigned char digest[SHA1_DIGEST_SIZE];
-
-    sha1_begin (&sha1);
-
-    sha1_hash ((unsigned char *) str, strlen (str) + 1, &sha1);
+    char *result;
+    gcry_md_hd_t c;
 
-    sha1_end (digest, &sha1);
+    gcry_md_open (&c, GCRY_MD_SHA1, 0);
+    gcry_md_write (c, str, strlen (str) + 1);
+    gcry_md_final (c);
+    result = _hex_of_sha1_digest (c);
+    gcry_md_close (c);
 
-    return _hex_of_sha1_digest (digest);
+    return result;
 }
 
 /* Create a hexadecimal string version of the SHA-1 digest of the
@@ -80,15 +83,14 @@ notmuch_sha1_of_file (const char *filename)
 #define BLOCK_SIZE 4096
     unsigned char block[BLOCK_SIZE];
     size_t bytes_read;
-    sha1_ctx sha1;
-    unsigned char digest[SHA1_DIGEST_SIZE];
+    gcry_md_hd_t c;
     char *result;
 
     file = fopen (filename, "r");
     if (file == NULL)
 	return NULL;
 
-    sha1_begin (&sha1);
+    gcry_md_open(&c, GCRY_MD_SHA1, 0);
 
     while (1) {
 	bytes_read = fread (block, 1, 4096, file);
@@ -100,13 +102,15 @@ notmuch_sha1_of_file (const char *filename)
 		return NULL;
 	    }
 	} else {
-	    sha1_hash (block, bytes_read, &sha1);
+	    gcry_md_write (c, block, bytes_read);
 	}
     }
 
-    sha1_end (digest, &sha1);
+    gcry_md_final (c);
+
+    result = _hex_of_sha1_digest (c);
 
-    result = _hex_of_sha1_digest (digest);
+    gcry_md_close (c);
 
     fclose (file);
 
-- 
1.6.5.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  3:28 [PATCH] Use libgcrypt for hashing Jeffrey C. Ollie
@ 2009-11-28  3:31 ` Mikhail Gusarov
  2009-11-28  3:59   ` Ingmar Vanhassel
                     ` (2 more replies)
  2009-11-28  6:22 ` Carl Worth
  1 sibling, 3 replies; 12+ messages in thread
From: Mikhail Gusarov @ 2009-11-28  3:31 UTC (permalink / raw)
  To: Not Much Mail

[-- Attachment #1: Type: text/plain, Size: 462 bytes --]


Twas brillig at 21:28:03 27.11.2009 UTC-06 when jeff@ocjtech.us did gyre and gimble:

 JCO> Instead of including a private implementation of the SHA1 hash

xserver went this road, and now it has
--with-sha1=libc|libmd|libgcrypt|libcrypto|libsha1|CommonCrypto in
configure.

 JCO> This means less code of our own to maintain and

As libsha1 maintainer I'm volunteering to maintain in-tree copy in
notmuch :)

-- 
  http://fossarchy.blogspot.com/

[-- Attachment #2: Type: application/pgp-signature, Size: 834 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  3:31 ` Mikhail Gusarov
@ 2009-11-28  3:59   ` Ingmar Vanhassel
  2009-11-28  5:43     ` Jeffrey Ollie
  2009-11-28  5:41   ` Jeffrey Ollie
  2009-11-28  6:23   ` Carl Worth
  2 siblings, 1 reply; 12+ messages in thread
From: Ingmar Vanhassel @ 2009-11-28  3:59 UTC (permalink / raw)
  To: notmuch

Excerpts from Mikhail Gusarov's message of Sat Nov 28 04:31:15 +0100 2009:
> 
> Twas brillig at 21:28:03 27.11.2009 UTC-06 when jeff@ocjtech.us did gyre and
> gimble:
> 
>  JCO> Instead of including a private implementation of the SHA1 hash
> 
> xserver went this road, and now it has
> --with-sha1=libc|libmd|libgcrypt|libcrypto|libsha1|CommonCrypto in
> configure.

From a distribution & security point of view I'd much rather be able to
choose one hashing library & use that as widely as possible, rather than
having every application ship its own copy.

>  JCO> This means less code of our own to maintain and
> 
> As libsha1 maintainer I'm volunteering to maintain in-tree copy in
> notmuch :)

Right, but on top of that, it would still be preferable to keep the
option for packagers to use a system library instead.
Most distributions have a rather strict policy to use system libraries
over internal copies.

-- 
Exherbo KDE, X.org maintainer

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  3:31 ` Mikhail Gusarov
  2009-11-28  3:59   ` Ingmar Vanhassel
@ 2009-11-28  5:41   ` Jeffrey Ollie
  2009-11-28  6:43     ` Carl Worth
  2009-11-28  6:23   ` Carl Worth
  2 siblings, 1 reply; 12+ messages in thread
From: Jeffrey Ollie @ 2009-11-28  5:41 UTC (permalink / raw)
  To: Mikhail Gusarov; +Cc: Not Much Mail

On Fri, Nov 27, 2009 at 9:31 PM, Mikhail Gusarov
<dottedmag@dottedmag.net> wrote:
> Twas brillig at 21:28:03 27.11.2009 UTC-06 when jeff@ocjtech.us did gyre and gimble:
>
>  JCO> Instead of including a private implementation of the SHA1 hash
>
> xserver went this road, and now it has
> --with-sha1=libc|libmd|libgcrypt|libcrypto|libsha1|CommonCrypto in
> configure.

I doubt that we'll get to that level of insanity.

>  JCO> This means less code of our own to maintain and
>
> As libsha1 maintainer I'm volunteering to maintain in-tree copy in
> notmuch :)

That's great that you're willing to take on the task, but as I do a
lot of work for Fedora I tend to think about these things differently.
 It's not about a project here or there making private copies of some
code, it's about tracking down *all* of the projects that have private
copies of the code when something goes wrong, especially when there
are security implications.

The most famous example is Zlib.  Based upon what Google turned up I
see that you've been around long enough that you should have heard of
the problems the computer world had when security flaws turned up in
Zlib.  There's even a project[1] that developed methods for
identifying vulnerable code in binaries the problem was so pervasive.

A more recent example is a cross-site ajax request vulnerability in
the Prototype JavaScript framework[2].  A lot of projects copied that
code into their repositories as well, and now someone has to track all
of those down and get them fixed as well.

When projects copy code into their repositories like this, it makes
things a lot more difficult for someone else down the road.  Both
Fedora[3] and Debian[4] (and Ubuntu by extension), while not expressly
forbidding the practice, *strongly* recommend against it.  I'd really
recommend reading the Fedora page that explains the policy as it has a
great summation of the problem.

[1] http://www.enyo.de/fw/security/zlib-fingerprint/
[2] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-7220
[3] https://fedoraproject.org/wiki/Packaging:No_Bundled_Libraries
[4] http://www.debian.org/doc/debian-policy/ch-source.html#s-embeddedfiles

-- 
Jeff Ollie

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  3:59   ` Ingmar Vanhassel
@ 2009-11-28  5:43     ` Jeffrey Ollie
  2009-11-28  5:52       ` Alexander Botero-Lowry
  0 siblings, 1 reply; 12+ messages in thread
From: Jeffrey Ollie @ 2009-11-28  5:43 UTC (permalink / raw)
  To: Ingmar Vanhassel; +Cc: notmuch

On Fri, Nov 27, 2009 at 9:59 PM, Ingmar Vanhassel <ingmar@exherbo.org> wrote:
>
> Most distributions have a rather strict policy to use system libraries
> over internal copies.

Fedora:

https://fedoraproject.org/wiki/Packaging:No_Bundled_Libraries

Debian:

http://www.debian.org/doc/debian-policy/ch-source.html#s-embeddedfiles

If there are other distributions out there that have similar policies
I'd be interested in learning about them.

-- 
Jeff Ollie

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  5:43     ` Jeffrey Ollie
@ 2009-11-28  5:52       ` Alexander Botero-Lowry
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Botero-Lowry @ 2009-11-28  5:52 UTC (permalink / raw)
  To: Jeffrey Ollie, Ingmar Vanhassel; +Cc: notmuch

On Fri, 27 Nov 2009 23:43:34 -0600, Jeffrey Ollie <jeff@ocjtech.us> wrote:
> On Fri, Nov 27, 2009 at 9:59 PM, Ingmar Vanhassel <ingmar@exherbo.org> wrote:
> >
> > Most distributions have a rather strict policy to use system libraries
> > over internal copies.
> 
> Fedora [...], Debian [...]
> 
> If there are other distributions out there that have similar policies
> I'd be interested in learning about them.
> 
FreeBSD ports are expected to not use bundled libraries as well.

Alex

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  3:28 [PATCH] Use libgcrypt for hashing Jeffrey C. Ollie
  2009-11-28  3:31 ` Mikhail Gusarov
@ 2009-11-28  6:22 ` Carl Worth
  2010-01-08 20:43   ` micah anderson
  1 sibling, 1 reply; 12+ messages in thread
From: Carl Worth @ 2009-11-28  6:22 UTC (permalink / raw)
  To: Jeffrey C. Ollie, Not Much Mail

[-- Attachment #1: Type: text/plain, Size: 1102 bytes --]

On Fri, 27 Nov 2009 21:28:03 -0600, "Jeffrey C. Ollie" <jeff@ocjtech.us> wrote:
> Instead of including a private implementation of the SHA1 hash, use
> libgcrypt.  This means less code of our own to maintain and it will be
> easier to switch to a different hash function like SHA256.

I don't believe we have a significant code-maintenance burden with
libsha1.c. And as for different hash functions, the only use of sha-1 in
notmuch is as a fallback in the case of a message not including a
Message-ID header.

So I don't see it as important at all to try to remove this code.

> libgcrypt was chosen because it has a fairly simple API, it's well
> tested (it's used in gnutls and gnupg2), and it's licensed under the
> LGPL.

What might make more sense is an option to compile against an existing
library (if present) but not to introduce an error in the build if the
library is not present, (in which case just build the builtin libsha1.c
code).

But if that wouldn't solve the problem you were trying to solve, (to
actually remove libsha1.c), then maybe we don't need to do anything for
now?

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  3:31 ` Mikhail Gusarov
  2009-11-28  3:59   ` Ingmar Vanhassel
  2009-11-28  5:41   ` Jeffrey Ollie
@ 2009-11-28  6:23   ` Carl Worth
  2 siblings, 0 replies; 12+ messages in thread
From: Carl Worth @ 2009-11-28  6:23 UTC (permalink / raw)
  To: Mikhail Gusarov, Not Much Mail

[-- Attachment #1: Type: text/plain, Size: 485 bytes --]

On Sat, 28 Nov 2009 09:31:15 +0600, Mikhail Gusarov <dottedmag@dottedmag.net> wrote:
>  JCO> This means less code of our own to maintain and
> 
> As libsha1 maintainer I'm volunteering to maintain in-tree copy in
> notmuch :)

And Mikhail,

I wanted to thank you publicly for your maintenance work of libsha1. It
was a pleasant surprise to see you on the notmuch list and contributing
patches after I had independently found and incorporated libsha1
previously.

-Carl


[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  5:41   ` Jeffrey Ollie
@ 2009-11-28  6:43     ` Carl Worth
  2009-11-28  7:38       ` Jeffrey Ollie
  0 siblings, 1 reply; 12+ messages in thread
From: Carl Worth @ 2009-11-28  6:43 UTC (permalink / raw)
  To: Jeffrey Ollie, Mikhail Gusarov; +Cc: Not Much Mail

[-- Attachment #1: Type: text/plain, Size: 2755 bytes --]

On Fri, 27 Nov 2009 23:41:40 -0600, Jeffrey Ollie <jeff@ocjtech.us> wrote:
> That's great that you're willing to take on the task, but as I do a
> lot of work for Fedora I tend to think about these things differently.
>  It's not about a project here or there making private copies of some
> code, it's about tracking down *all* of the projects that have private
> copies of the code when something goes wrong, especially when there
> are security implications.

Hi Jeffrey,

Have you actually *looked* at the implementation of libsha1.c that we
have in notmuch? I can't say with 100% certainty that it's free of any
buffer overruns, but I can see that it's not doing any memory allocation
nor network communication. So there are entire classes of security
problems, (such as have afflicted libraries in your examples), that just
aren't present here.

And as for security compromises due to a bug in the cryptographic nature
of this function---well, notmuch isn't even *using* SHA-1 for any secure
purpose.

The actual functionality that we need here is *so* small that I am
unwilling to introduce a required dependency on any library as large as
libcrypt. I mean, look at the actual sizes we're talking about

$ size lib/notmuch.a
   text	   data	    bss	    dec	    hex	filename
   6774	      0	      0	   6774	   1a76	libsha1.o (ex lib/notmuch.a)
   2372	      0	      4	   2376	    948	message-file.o (ex lib/notmuch.a)
    756	      0	      0	    756	    2f4	messages.o (ex lib/notmuch.a)
    405	      0	      0	    405	    195	sha1.o (ex lib/notmuch.a)
    406	      0	      0	    406	    196	tags.o (ex lib/notmuch.a)
    842	      0	      0	    842	    34a	xutil.o (ex lib/notmuch.a)
  15834	    100	      1	  15935	   3e3f	database.o (ex lib/notmuch.a)
   2826	      0	      4	   2830	    b0e	index.o (ex lib/notmuch.a)
  11834	      0	      4	  11838	   2e3e	message.o (ex lib/notmuch.a)
   7042	      0	      0	   7042	   1b82	query.o (ex lib/notmuch.a)
   2553	      0	      0	   2553	    9f9	thread.o (ex lib/notmuch.a)

$ size /usr/lib/libgcrypt.so.11.5.2 
   text	   data	    bss	    dec	    hex	filename
 466236	   8424	    748	 475408	  74110	/usr/lib/libgcrypt.so.11.5.2

You can see that libgcrypt is 7 times the size of all of libnotmuch.a
combined.

Now, if somebody wanted to maintain libsha1 inside a distribution like
Debian, say, then I'd be happy to link against that version rather than
a locally compiled version. And like I said earlier, if people would
rather link against a large cyptographic library for this one tiny
function, then we could arrange that too, but I don't think that
justifies dropping this code from notmuch and introducing a hard
dependency.

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  6:43     ` Carl Worth
@ 2009-11-28  7:38       ` Jeffrey Ollie
  0 siblings, 0 replies; 12+ messages in thread
From: Jeffrey Ollie @ 2009-11-28  7:38 UTC (permalink / raw)
  To: Carl Worth; +Cc: Not Much Mail

On Sat, Nov 28, 2009 at 12:43 AM, Carl Worth <cworth@cworth.org> wrote:
>
> Have you actually *looked* at the implementation of libsha1.c that we
> have in notmuch? I can't say with 100% certainty that it's free of any
> buffer overruns, but I can see that it's not doing any memory allocation
> nor network communication. So there are entire classes of security
> problems, (such as have afflicted libraries in your examples), that just
> aren't present here.

I've looked at the code, if only briefly.  But you're wrong that the
code doesn't do any "network communication" - we feed libsha1 hostile
data every time we take the hash of a message.

> And as for security compromises due to a bug in the cryptographic nature
> of this function---well, notmuch isn't even *using* SHA-1 for any secure
> purpose.

From a distributor's point of view, it doesn't matter what you use the
code for, it only matters that it has the bug and someone has to spend
the time to track down all of the copies of the code and replace the
code with a fixed version.  If the code is confined to one shared
library it's trivial to update the shared library, if the code has
been copied to N packages, it's at least N times the work to verify
that all of those packages get updated.

> The actual functionality that we need here is *so* small that I am
> unwilling to introduce a required dependency on any library as large as
> libcrypt. I mean, look at the actual sizes we're talking about

If libgcrypt were some obscure library that wasn't already packaged up
by your favorite distribution or took up hundreds of megabytes of RAM
and/or disk you might have a point.  But the fact is that it *doesn't
matter* how big libgcrypt is because we essentially get it for free -
I'd bet that libgcrypt is already installed on most people's systems.
As a test I tried to remove libgcrypt from my laptop - if I actually
had gone though with it I would have crippled my system because things
like the X server and the package manager depend (albeit indirectly)
on it.

> Now, if somebody wanted to maintain libsha1 inside a distribution like
> Debian, say, then I'd be happy to link against that version rather than
> a locally compiled version. And like I said earlier, if people would
> rather link against a large cyptographic library for this one tiny
> function, then we could arrange that too, but I don't think that
> justifies dropping this code from notmuch and introducing a hard
> dependency.

Given Debian's reputation for packaging the kitchen sink I'm surprised
that it doesn't already.  Fedora tends not to package libraries unless
there is an application that is going to make use of it...

-- 
Jeff Ollie

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2009-11-28  6:22 ` Carl Worth
@ 2010-01-08 20:43   ` micah anderson
  2010-01-14 22:16     ` Carl Worth
  0 siblings, 1 reply; 12+ messages in thread
From: micah anderson @ 2010-01-08 20:43 UTC (permalink / raw)
  To: Carl Worth, Jeffrey C. Ollie, Not Much Mail

[-- Attachment #1: Type: text/plain, Size: 2491 bytes --]

On Fri, 27 Nov 2009 22:22:01 -0800, Carl Worth <cworth@cworth.org> wrote:
> On Fri, 27 Nov 2009 21:28:03 -0600, "Jeffrey C. Ollie" <jeff@ocjtech.us> wrote:
> > Instead of including a private implementation of the SHA1 hash, use
> > libgcrypt.  This means less code of our own to maintain and it will be
> > easier to switch to a different hash function like SHA256.
> 
> I don't believe we have a significant code-maintenance burden with
> libsha1.c. And as for different hash functions, the only use of sha-1 in
> notmuch is as a fallback in the case of a message not including a
> Message-ID header.
> 
> So I don't see it as important at all to try to remove this code.

Its good that this is not a burden to maintain for the notmuch project,
even better that Mikhail, the libsha1 maintainer, is currently active in
this project and has volunteered to maintain the in-tree copy. 

However, the problem that has been raised is about the code-maintenance
burden that distributions face. In fact, this is not an unique problem
to notmuch, if it was it wouldn't be such a big deal. The reality is
that the more projects which cargo-cult around 'convenience copies' of
code, the more of a burden is placed on the distributors.

In some ways, the notmuch project and the role of distributors are at
cross-purposes on this issue, each side has an argument that makes sense
From their individual perspectives.

> > libgcrypt was chosen because it has a fairly simple API, it's well
> > tested (it's used in gnutls and gnupg2), and it's licensed under the
> > LGPL.
> 
> What might make more sense is an option to compile against an existing
> library (if present) but not to introduce an error in the build if the
> library is not present, (in which case just build the builtin libsha1.c
> code).

This makes the most sense, and resolves the issue in a way that both
sides of the issue benefit!

> But if that wouldn't solve the problem you were trying to solve, (to
> actually remove libsha1.c), then maybe we don't need to do anything for
> now?

I think from a distribution point-of-view, if you are providing a
mechanism to link against libgcrypt, while still maintaining this
embedded code-copy for convenience's-sake, actually removing libsha1.c
is not so necessary. It does mean an exception must be noted on the
distribution side that indicates that although this code exists, its not
being used, but that is a negligible burden.

micah

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Use libgcrypt for hashing.
  2010-01-08 20:43   ` micah anderson
@ 2010-01-14 22:16     ` Carl Worth
  0 siblings, 0 replies; 12+ messages in thread
From: Carl Worth @ 2010-01-14 22:16 UTC (permalink / raw)
  To: micah anderson, Jeffrey C. Ollie, Not Much Mail

[-- Attachment #1: Type: text/plain, Size: 1648 bytes --]

On Fri, 08 Jan 2010 15:43:52 -0500, micah anderson <micah@riseup.net> wrote:
> Its good that this is not a burden to maintain for the notmuch project,
> even better that Mikhail, the libsha1 maintainer, is currently active in
> this project and has volunteered to maintain the in-tree copy. 
> 
> However, the problem that has been raised is about the code-maintenance
> burden that distributions face. In fact, this is not an unique problem
> to notmuch, if it was it wouldn't be such a big deal. The reality is
> that the more projects which cargo-cult around 'convenience copies' of
> code, the more of a burden is placed on the distributors.
> 
> In some ways, the notmuch project and the role of distributors are at
> cross-purposes on this issue, each side has an argument that makes sense
> From their individual perspectives.

Well, I think it's important for notmuch to ease the burden on the
distribution as well. That's just a matter of being a good citizen.

If notmuch were including code that existed as a library package in
Debian, say. Then that would definitely be problematic, and notmuch
should be fixed to link with the library.

We could get to that point if someone wanted to package libsha1, say.

> > What might make more sense is an option to compile against an existing
> > library (if present) but not to introduce an error in the build if the
> > library is not present, (in which case just build the builtin libsha1.c
> > code).
> 
> This makes the most sense, and resolves the issue in a way that both
> sides of the issue benefit!

I'd be glad to see a patch that does that.

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2010-01-14 22:16 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-28  3:28 [PATCH] Use libgcrypt for hashing Jeffrey C. Ollie
2009-11-28  3:31 ` Mikhail Gusarov
2009-11-28  3:59   ` Ingmar Vanhassel
2009-11-28  5:43     ` Jeffrey Ollie
2009-11-28  5:52       ` Alexander Botero-Lowry
2009-11-28  5:41   ` Jeffrey Ollie
2009-11-28  6:43     ` Carl Worth
2009-11-28  7:38       ` Jeffrey Ollie
2009-11-28  6:23   ` Carl Worth
2009-11-28  6:22 ` Carl Worth
2010-01-08 20:43   ` micah anderson
2010-01-14 22:16     ` Carl Worth

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).