unofficial mirror of emacs-devel@gnu.org 
From: Leo <sdl.web@gmail.com>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: Eli Zaretskii <eliz@gnu.org>,
	monnier@iro.umontreal.ca, sand@blarg.net,
	YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>,
	emacs-devel@gnu.org
Subject: Re: Adding sha256 and sha512 to C?
Date: Mon, 20 Jun 2011 00:08:28 +0800
Message-ID: <m1tybl248z.fsf@th041141.ip.tsinghua.edu.cn>
In-Reply-To: <4DF53FB3.9060208@cs.ucla.edu> (Paul Eggert's message of "Sun, 12 Jun 2011 15:37:39 -0700")

[-- Attachment #1: Type: text/plain, Size: 2197 bytes --]

Sorry for the delay.

On 2011-06-13 06:37 +0800, Paul Eggert wrote:
> That's better, thanks, but I still have two qualms.  First, the name
> "sha" is confusing at the Emacs Lisp level: it feels too much like
> "ash".  It's not like programmers will be using crypto functions in
> every expression; their names need not be *that* short.  How about the
> name "secure-hash" instead?  That's pretty short.

Sounds good.

> Second, naming algorithms via bit counts doesn't sound
> forward-looking.  SHA-3 is likely to have a 512-bit variant, for
> example.  How about using atoms to name the algorithms, e.g., SHA-1,
> SHA-224, SHA-256, etc.?  This is more likely to be robust after SHA-3
> comes out, not to mention SHA-4 etc.
>
> +      hash_func	  = &md5_buffer;
>
> There's no need for the "&" here, or in similar assignments to
> hash_func.  (And there's no need for multiple spaces before the "=".)
>
> +  digest = make_uninit_string (digest_size);
> ...
> +      Lisp_Object value = make_uninit_string (2 * digest_size);
>
> There's no need to call make_uninit_string twice, as only one
> string is being returned.  Any temporary buffer for the digest can
> be put into the C stack.  Or, perhaps better, use the same
> uninitialized string for both the binary digest and the text
> digest, and run the binary-to-text loop backwards (and without
> using sprintf) so that the loop doesn't stomp on its own work.
> Something like this:
>
>       unsigned char *p = SDATA (digest);
>       for (i = digest_size - 1; i >= 0; i--)
> 	{
> 	  static char const hexdigit[16] = "0123456789abcdef";
> 	  int p_i = p[i];
> 	  p[2 * i] = hexdigit[p_i >> 4];
> 	  p[2 * i + 1] = hexdigit[p_i & 0xf];
> 	}

Thanks for the suggestion.
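
A standalone sketch of the backwards in-place conversion quoted above may make the trick easier to see. It is not the Emacs code itself: the function name hexify_in_place and the plain unsigned char buffer are assumptions made for illustration, standing in for SDATA (digest) on a string allocated with 2 * digest_size bytes.

```c
/* Expand the first N raw bytes of BUF into 2 * N lowercase hex digits,
   in place.  BUF must have room for at least 2 * N bytes.
   hexify_in_place is a hypothetical name, not an Emacs function.  */
static void
hexify_in_place (unsigned char *buf, int n)
{
  static char const hexdigit[] = "0123456789abcdef";
  int i;

  /* Walk from the last byte to the first.  Byte I expands into slots
     2*I and 2*I+1, which are never below I, so every write lands at or
     beyond the byte just read and no unread input is overwritten.  */
  for (i = n - 1; i >= 0; i--)
    {
      int b = buf[i];		/* Read before the slots are overwritten.  */
      buf[2 * i] = hexdigit[b >> 4];
      buf[2 * i + 1] = hexdigit[b & 0xf];
    }
}
```

Called on an 8-byte buffer whose first four bytes are 0xde 0xad 0xbe 0xef with n = 4, this leaves the eight characters "deadbeef" in the buffer.  A forward loop would fail: writing slots 0 and 1 for byte 0 would destroy byte 1 before it was read.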

> The text-vs-binary checksum thing seems to be enough of a hassle that
> perhaps it should be pulled out into a separate function, rather than
> as a flag to the sha/secure-hash function.  That is, secure-hash could
> always return the text form, and if someone wants a binary form they
> could call the text-to-binary converter.
>
> Won't there need to be changes to the Emacs Lisp reference manual, and
> to NEWS?

Will take care of the NEWS entry when committing.
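
For reference, the separate text-to-binary converter suggested in the quoted text could look roughly like the sketch below. The attached patch does not take this route (it keeps the BINARY argument), and the names hex_value and hex_to_binary are hypothetical, as is the choice to work on plain C buffers rather than Lisp strings.

```c
#include <stddef.h>

/* Return the value of hex digit C, or -1 if C is not a hex digit.
   Assumes an ASCII-compatible character set.  */
static int
hex_value (int c)
{
  if ('0' <= c && c <= '9')
    return c - '0';
  if ('a' <= c && c <= 'f')
    return c - 'a' + 10;
  if ('A' <= c && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Decode HEXLEN hex digits from HEX into OUT, which must have room for
   HEXLEN / 2 bytes.  Return the number of bytes written, or -1 if
   HEXLEN is odd or HEX contains a non-hex character.  */
static long
hex_to_binary (char const *hex, size_t hexlen, unsigned char *out)
{
  size_t i;

  if (hexlen % 2 != 0)
    return -1;
  for (i = 0; i < hexlen / 2; i++)
    {
      int hi = hex_value ((unsigned char) hex[2 * i]);
      int lo = hex_value ((unsigned char) hex[2 * i + 1]);
      if (hi < 0 || lo < 0)
	return -1;
      out[i] = (unsigned char) (hi << 4 | lo);
    }
  return (long) (hexlen / 2);
}
```

With such a helper the hash function would only ever produce the text form, and callers wanting raw bytes would decode it afterwards, at the cost of one extra pass over the digest.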

Leo


[-- Attachment #2: sha-2.diff --]
[-- Type: text/x-diff, Size: 7995 bytes --]

=== modified file 'lisp/subr.el'
--- lisp/subr.el	2011-06-15 17:30:41 +0000
+++ lisp/subr.el	2011-06-19 16:06:58 +0000
@@ -2600,6 +2600,14 @@
 	(get-char-property (1- (field-end pos)) 'field)
       raw-field)))
 
+(defun sha1 (object &optional start end binary)
+  "Return the SHA1 (Secure Hash Algorithm) of an OBJECT.
+OBJECT is either a string or a buffer.  Optional arguments START and
+END are character positions specifying which portion of OBJECT to use
+for computing the hash.  If BINARY is non-nil, return a string in
+binary form."
+  (secure-hash 'sha1 object start end binary))
+
 \f
 ;;;; Support for yanking and text properties.
 

=== modified file 'src/deps.mk'
--- src/deps.mk	2011-05-24 08:22:58 +0000
+++ src/deps.mk	2011-06-19 15:46:13 +0000
@@ -284,8 +284,8 @@
 floatfns.o: floatfns.c syssignal.h lisp.h globals.h $(config_h)
 fns.o: fns.c commands.h lisp.h $(config_h) frame.h buffer.h character.h \
    keyboard.h keymap.h window.h $(INTERVALS_H) coding.h ../lib/md5.h \
-   ../lib/sha1.h blockinput.h atimer.h systime.h xterm.h ../lib/unistd.h \
-   globals.h
+   ../lib/sha1.h ../lib/sha256.h ../lib/sha512.h blockinput.h atimer.h \
+   systime.h xterm.h ../lib/unistd.h globals.h
 print.o: print.c process.h frame.h window.h buffer.h keyboard.h character.h \
    lisp.h globals.h $(config_h) termchar.h $(INTERVALS_H) msdos.h termhooks.h \
    blockinput.h atimer.h systime.h font.h charset.h coding.h ccl.h \

=== modified file 'src/fns.c'
--- src/fns.c	2011-06-17 15:18:54 +0000
+++ src/fns.c	2011-06-19 15:46:13 +0000
@@ -51,6 +51,8 @@
 static Lisp_Object Qwidget_type;
 static Lisp_Object Qcodeset, Qdays, Qmonths, Qpaper;
 
+static Lisp_Object Qmd5, Qsha1, Qsha224, Qsha256, Qsha384, Qsha512;
+
 static int internal_equal (Lisp_Object , Lisp_Object, int, int);
 
 #ifndef HAVE_UNISTD_H
@@ -4550,21 +4552,18 @@
 
 \f
 /************************************************************************
-			     MD5 and SHA1
+			MD5, SHA-1, and SHA-2
  ************************************************************************/
 
 #include "md5.h"
 #include "sha1.h"
-
-/* Convert a possibly-signed character to an unsigned character.  This is
-   a bit safer than casting to unsigned char, since it catches some type
-   errors that the cast doesn't.  */
-static inline unsigned char to_uchar (char ch) { return ch; }
-
-/* TYPE: 0 for md5, 1 for sha1. */
+#include "sha256.h"
+#include "sha512.h"
+
+/* ALGORITHM is a symbol: md5, sha1, sha224 and so on. */
 
 static Lisp_Object
-crypto_hash_function (int type, Lisp_Object object, Lisp_Object start, Lisp_Object end, Lisp_Object coding_system, Lisp_Object noerror, Lisp_Object binary)
+secure_hash (Lisp_Object algorithm, Lisp_Object object, Lisp_Object start, Lisp_Object end, Lisp_Object coding_system, Lisp_Object noerror, Lisp_Object binary)
 {
   int i;
   EMACS_INT size;
@@ -4574,7 +4573,11 @@
   register EMACS_INT b, e;
   register struct buffer *bp;
   EMACS_INT temp;
-  Lisp_Object res=Qnil;
+  int digest_size;
+  void *(*hash_func) (const char *, size_t, void *);
+  Lisp_Object digest;
+
+  CHECK_SYMBOL (algorithm);
 
   if (STRINGP (object))
     {
@@ -4745,47 +4748,61 @@
 	object = code_convert_string (object, coding_system, Qnil, 1, 0, 0);
     }
 
-  switch (type)
-    {
-    case 0:			/* MD5 */
-      {
-	char digest[16];
-	md5_buffer (SSDATA (object) + start_byte,
-		    SBYTES (object) - (size_byte - end_byte),
-		    digest);
-
-	if (NILP (binary))
-	  {
-	    char value[33];
-	    for (i = 0; i < 16; i++)
-	      sprintf (&value[2 * i], "%02x", to_uchar (digest[i]));
-	    res = make_string (value, 32);
-	  }
-	else
-	  res = make_string (digest, 16);
-	break;
-      }
-
-    case 1:			/* SHA1 */
-      {
-	char digest[20];
-	sha1_buffer (SSDATA (object) + start_byte,
-		     SBYTES (object) - (size_byte - end_byte),
-		     digest);
-	if (NILP (binary))
-	  {
-	    char value[41];
-	    for (i = 0; i < 20; i++)
-	      sprintf (&value[2 * i], "%02x", to_uchar (digest[i]));
-	    res = make_string (value, 40);
-	  }
-	else
-	  res = make_string (digest, 20);
-	break;
-      }
-    }
-
-  return res;
+  if (EQ (algorithm, Qmd5))
+    {
+      digest_size = MD5_DIGEST_SIZE;
+      hash_func = md5_buffer;
+    }
+  else if (EQ (algorithm, Qsha1))
+    {
+      digest_size = SHA1_DIGEST_SIZE;
+      hash_func = sha1_buffer;
+    }
+  else if (EQ (algorithm, Qsha224))
+    {
+      digest_size = SHA224_DIGEST_SIZE;
+      hash_func = sha224_buffer;
+    }
+  else if (EQ (algorithm, Qsha256))
+    {
+      digest_size = SHA256_DIGEST_SIZE;
+      hash_func = sha256_buffer;
+    }
+  else if (EQ (algorithm, Qsha384))
+    {
+      digest_size = SHA384_DIGEST_SIZE;
+      hash_func = sha384_buffer;
+    }
+  else if (EQ (algorithm, Qsha512))
+    {
+      digest_size = SHA512_DIGEST_SIZE;
+      hash_func = sha512_buffer;
+    }
+  else
+    error ("Invalid algorithm arg: %s", SDATA (Fsymbol_name (algorithm)));
+
+  /* Allocate 2 * digest_size bytes so that the same string can be
+     reused to hold the hex-encoded digest.  */
+  digest = make_uninit_string (digest_size * 2);
+
+  hash_func (SSDATA (object) + start_byte,
+	     SBYTES (object) - (size_byte - end_byte),
+	     SSDATA (digest));
+
+  if (NILP (binary))
+    {
+      unsigned char *p = SDATA (digest);
+      for (i = digest_size - 1; i >= 0; i--)
+	{
+	  static char const hexdigit[16] = "0123456789abcdef";
+	  int p_i = p[i];
+	  p[2 * i] = hexdigit[p_i >> 4];
+	  p[2 * i + 1] = hexdigit[p_i & 0xf];
+	}
+      return digest;
+    }
+  else
+    return make_unibyte_string (SSDATA (digest), digest_size);
 }
 
 DEFUN ("md5", Fmd5, Smd5, 1, 5, 0,
@@ -4817,25 +4834,31 @@
 guesswork fails.  Normally, an error is signaled in such case.  */)
   (Lisp_Object object, Lisp_Object start, Lisp_Object end, Lisp_Object coding_system, Lisp_Object noerror)
 {
-  return crypto_hash_function (0, object, start, end, coding_system, noerror, Qnil);
+  return secure_hash (Qmd5, object, start, end, coding_system, noerror, Qnil);
 }
 
-DEFUN ("sha1", Fsha1, Ssha1, 1, 4, 0,
-       doc: /* Return the SHA-1 (Secure Hash Algorithm) of an OBJECT.
-
-OBJECT is either a string or a buffer.  Optional arguments START and
-END are character positions specifying which portion of OBJECT for
-computing the hash.  If BINARY is non-nil, return a string in binary
-form.  */)
-     (Lisp_Object object, Lisp_Object start, Lisp_Object end, Lisp_Object binary)
+DEFUN ("secure-hash", Fsecure_hash, Ssecure_hash, 2, 5, 0,
+       doc: /* Return the secure hash of an OBJECT.
+ALGORITHM is a symbol: md5, sha1, sha224, sha256, sha384 or sha512.
+OBJECT is either a string or a buffer.
+Optional arguments START and END are character positions specifying
+which portion of OBJECT to use for computing the hash.  If BINARY is
+non-nil, return a string in binary form.  */)
+  (Lisp_Object algorithm, Lisp_Object object, Lisp_Object start, Lisp_Object end, Lisp_Object binary)
 {
-  return crypto_hash_function (1, object, start, end, Qnil, Qnil, binary);
+  return secure_hash (algorithm, object, start, end, Qnil, Qnil, binary);
 }
-
 \f
 void
 syms_of_fns (void)
 {
+  DEFSYM (Qmd5,    "md5");
+  DEFSYM (Qsha1,   "sha1");
+  DEFSYM (Qsha224, "sha224");
+  DEFSYM (Qsha256, "sha256");
+  DEFSYM (Qsha384, "sha384");
+  DEFSYM (Qsha512, "sha512");
+
   /* Hash table stuff.  */
   Qhash_table_p = intern_c_string ("hash-table-p");
   staticpro (&Qhash_table_p);
@@ -5004,7 +5027,7 @@
   defsubr (&Sbase64_encode_string);
   defsubr (&Sbase64_decode_string);
   defsubr (&Smd5);
-  defsubr (&Ssha1);
+  defsubr (&Ssecure_hash);
   defsubr (&Slocale_info);
 }
 

=== modified file 'src/makefile.w32-in'
--- src/makefile.w32-in	2011-06-12 02:48:18 +0000
+++ src/makefile.w32-in	2011-06-19 15:46:13 +0000
@@ -867,6 +867,8 @@
 	$(EMACS_ROOT)/nt/inc/sys/time.h \
 	$(EMACS_ROOT)/lib/md5.h \
 	$(EMACS_ROOT)/lib/sha1.h \
+	$(EMACS_ROOT)/lib/sha256.h \
+	$(EMACS_ROOT)/lib/sha512.h \
 	$(LISP_H) \
 	$(SRC)/atimer.h \
 	$(SRC)/blockinput.h \

