unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH] This patch is a little finger excercise for working with git. I found a piece of code that I didn't understand at first.
@ 2013-02-03 18:51 Robert Mast
  2013-02-03 20:01 ` Austin Clements
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Robert Mast @ 2013-02-03 18:51 UTC (permalink / raw)
  To: notmuch; +Cc: Robert Mast

Reading it in detail I thought it allocated way too much memory and
didn't use the full size of the allocated unsigned ints for storing
bits.

Am I right, and is this the right way to patch code to notmuch?

---
 lib/query.cc |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/query.cc b/lib/query.cc
index e9c1a2d..046663a 100644
--- a/lib/query.cc
+++ b/lib/query.cc
@@ -43,8 +43,8 @@ struct _notmuch_doc_id_set {
     unsigned int bound;
 };
 
-#define DOCIDSET_WORD(bit) ((bit) / sizeof (unsigned int))
-#define DOCIDSET_BIT(bit) ((bit) % sizeof (unsigned int))
+#define DOCIDSET_WORD(bit) ((bit) / (sizeof (unsigned int) * CHAR_BIT))
+#define DOCIDSET_BIT(bit) ((bit) % (sizeof (unsigned int) * CHAR_BIT))
 
 struct visible _notmuch_threads {
     notmuch_query_t *query;
@@ -363,7 +363,7 @@ _notmuch_doc_id_set_init (void *ctx,
 
     for (unsigned int i = 0; i < arr->len; i++)
 	max = MAX(max, g_array_index (arr, unsigned int, i));
-    bitmap = talloc_zero_array (ctx, unsigned int, 1 + max / sizeof (*bitmap));
+    bitmap = talloc_zero_array (ctx, unsigned int, (DOCIDSET_WORD(max) + 1) * sizeof (*bitmap));
 
     if (bitmap == NULL)
 	return FALSE;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] This patch is a little finger excercise for working with git. I found a piece of code that I didn't understand at first.
  2013-02-03 18:51 [PATCH] This patch is a little finger excercise for working with git. I found a piece of code that I didn't understand at first Robert Mast
@ 2013-02-03 20:01 ` Austin Clements
  2013-02-10 13:13 ` [PATCH] convert bitmap to unsigned char Robert Mast
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Austin Clements @ 2013-02-03 20:01 UTC (permalink / raw)
  To: Robert Mast; +Cc: notmuch

If I'm reading this right, this patch reduces a 8x memory waste to a
4x memory waste, but doesn't actually eliminate the waste.

Quoth Robert Mast on Feb 03 at  7:51 pm:
> Reading it in detail I thought it allocated way too much memory and
> didn't use the full size of the allocated unsigned ints for storing
> bits.
> 
> Am I right, and is this the right way to patch code to notmuch?

Side-comments about a patch like these should go under the "---" line,
while the commit message should explain what a patch does and why.
From your message, I didn't even understand this was fixing a bug; it
sounded like a cleanup.

  http://notmuchmail.org/contributing/#index5h2
has a bit to say about commit message contents and the notmuch git
history has a lot to say.

> 
> ---
>  lib/query.cc |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/query.cc b/lib/query.cc
> index e9c1a2d..046663a 100644
> --- a/lib/query.cc
> +++ b/lib/query.cc
> @@ -43,8 +43,8 @@ struct _notmuch_doc_id_set {
>      unsigned int bound;
>  };
>  
> -#define DOCIDSET_WORD(bit) ((bit) / sizeof (unsigned int))
> -#define DOCIDSET_BIT(bit) ((bit) % sizeof (unsigned int))
> +#define DOCIDSET_WORD(bit) ((bit) / (sizeof (unsigned int) * CHAR_BIT))
> +#define DOCIDSET_BIT(bit) ((bit) % (sizeof (unsigned int) * CHAR_BIT))

This is definitely the right thing to do, though a possibly clearer
way to fix this would be to change the type of
_notmuch_doc_id_set::bitmap to unsigned char.  Then the macros would
become simply

#define DOCIDSET_WORD(bit) ((bit) / CHAR_BIT)
#define DOCIDSET_BIT(bit) ((bit) % CHAR_BIT)

>  
>  struct visible _notmuch_threads {
>      notmuch_query_t *query;
> @@ -363,7 +363,7 @@ _notmuch_doc_id_set_init (void *ctx,
>  
>      for (unsigned int i = 0; i < arr->len; i++)
>  	max = MAX(max, g_array_index (arr, unsigned int, i));
> -    bitmap = talloc_zero_array (ctx, unsigned int, 1 + max / sizeof (*bitmap));
> +    bitmap = talloc_zero_array (ctx, unsigned int, (DOCIDSET_WORD(max) + 1) * sizeof (*bitmap));

This, however, isn't quite right.  I like your idea of using the
macros to compute the size, but the size argument should just be
  DOCIDSET_WORD(max) + 1
since talloc_zero_array expects the number of array elements, not the
number of bytes.

>  
>      if (bitmap == NULL)
>  	return FALSE;

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] convert bitmap to unsigned char
  2013-02-03 18:51 [PATCH] This patch is a little finger excercise for working with git. I found a piece of code that I didn't understand at first Robert Mast
  2013-02-03 20:01 ` Austin Clements
@ 2013-02-10 13:13 ` Robert Mast
  2013-02-11 18:56   ` Carl Worth
  2013-02-10 17:24 ` [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR Robert Mast
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Robert Mast @ 2013-02-10 13:13 UTC (permalink / raw)
  To: notmuch; +Cc: Robert Mast

---
 lib/query.cc |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/query.cc b/lib/query.cc
index 046663a..7381a54 100644
--- a/lib/query.cc
+++ b/lib/query.cc
@@ -39,12 +39,12 @@ typedef struct _notmuch_mset_messages {
 } notmuch_mset_messages_t;
 
 struct _notmuch_doc_id_set {
-    unsigned int *bitmap;
+    unsigned char *bitmap;
     unsigned int bound;
 };
 
-#define DOCIDSET_WORD(bit) ((bit) / (sizeof (unsigned int) * CHAR_BIT))
-#define DOCIDSET_BIT(bit) ((bit) % (sizeof (unsigned int) * CHAR_BIT))
+#define DOCIDSET_WORD(bit) ((bit) / CHAR_BIT)
+#define DOCIDSET_BIT(bit) ((bit) % CHAR_BIT)
 
 struct visible _notmuch_threads {
     notmuch_query_t *query;
@@ -359,11 +359,11 @@ _notmuch_doc_id_set_init (void *ctx,
 			  GArray *arr)
 {
     unsigned int max = 0;
-    unsigned int *bitmap;
+    unsigned char *bitmap;
 
     for (unsigned int i = 0; i < arr->len; i++)
 	max = MAX(max, g_array_index (arr, unsigned int, i));
-    bitmap = talloc_zero_array (ctx, unsigned int, (DOCIDSET_WORD(max) + 1) * sizeof (*bitmap));
+    bitmap = talloc_zero_array (ctx, unsigned char, DOCIDSET_WORD(max) + 1);
 
     if (bitmap == NULL)
 	return FALSE;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR
  2013-02-03 18:51 [PATCH] This patch is a little finger excercise for working with git. I found a piece of code that I didn't understand at first Robert Mast
  2013-02-03 20:01 ` Austin Clements
  2013-02-10 13:13 ` [PATCH] convert bitmap to unsigned char Robert Mast
@ 2013-02-10 17:24 ` Robert Mast
  2013-02-13 15:11 ` Robert Mast
  2013-02-13 15:32 ` Robert Mast
  4 siblings, 0 replies; 10+ messages in thread
From: Robert Mast @ 2013-02-10 17:24 UTC (permalink / raw)
  To: notmuch; +Cc: Robert Mast

A little bug-fix to learn how to contribute to nutmuch, this time a combined commit with git rebase to provide one patch from 15.1.
---
 lib/query.cc |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/query.cc b/lib/query.cc
index e9c1a2d..7381a54 100644
--- a/lib/query.cc
+++ b/lib/query.cc
@@ -39,12 +39,12 @@ typedef struct _notmuch_mset_messages {
 } notmuch_mset_messages_t;
 
 struct _notmuch_doc_id_set {
-    unsigned int *bitmap;
+    unsigned char *bitmap;
     unsigned int bound;
 };
 
-#define DOCIDSET_WORD(bit) ((bit) / sizeof (unsigned int))
-#define DOCIDSET_BIT(bit) ((bit) % sizeof (unsigned int))
+#define DOCIDSET_WORD(bit) ((bit) / CHAR_BIT)
+#define DOCIDSET_BIT(bit) ((bit) % CHAR_BIT)
 
 struct visible _notmuch_threads {
     notmuch_query_t *query;
@@ -359,11 +359,11 @@ _notmuch_doc_id_set_init (void *ctx,
 			  GArray *arr)
 {
     unsigned int max = 0;
-    unsigned int *bitmap;
+    unsigned char *bitmap;
 
     for (unsigned int i = 0; i < arr->len; i++)
 	max = MAX(max, g_array_index (arr, unsigned int, i));
-    bitmap = talloc_zero_array (ctx, unsigned int, 1 + max / sizeof (*bitmap));
+    bitmap = talloc_zero_array (ctx, unsigned char, DOCIDSET_WORD(max) + 1);
 
     if (bitmap == NULL)
 	return FALSE;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] convert bitmap to unsigned char
  2013-02-10 13:13 ` [PATCH] convert bitmap to unsigned char Robert Mast
@ 2013-02-11 18:56   ` Carl Worth
  0 siblings, 0 replies; 10+ messages in thread
From: Carl Worth @ 2013-02-11 18:56 UTC (permalink / raw)
  To: Robert Mast, notmuch; +Cc: Robert Mast

[-- Attachment #1: Type: text/plain, Size: 2439 bytes --]

Robert Mast <beheerder@tekenbeetziekten.nl> writes:
> ---

Hi Robert,

It looks like the git exercise is proving useful. Keep it up!

If you look through the git logs a bit you'll find that the overwhelming
convention is for a commit message to have a single-line summary
followed by a (potentially longer) expanded comment.

The convention I like best is for the single-line summary to
efficiently  describe everything about "what" the patch does. If it's
hard to fit this into a single line there's a good chance your patch
should be split up. With your commit message here, the one-line summary
is great, (and much better than in your first submission).

The expanded comment should describe the "why" of the change. And here,
your commit message doesn't have anything. So I'm left wondering why
your commit exists. Does this save memory? Does this fix some type error
or expected alignment somewhere? etc.

Please re-submit your patch with a little more explanation of the
motivation behind the patch.

-Carl

PS. I do know that you did have more in the commit message originally,
and Austin recommended removing "snide" issues, (doubts and
meta-questions about the patch---things that don't belong in the commit
history).

Your previous commit message was:

  Reading it in detail I thought it allocated way too much memory and
  didn't use the full size of the allocated unsigned ints for storing
  bits.

  Am I right, and is this the right way to patch code to notmuch?

I'm not actually looking at the code in context now, so I can't render a
correct commit message, but this attempted rewording should hopefully
give you the idea:

  Using char instead of int allows for simpler definitions of the
  DOCIDSET macros so the code is easier to understand.

  ---

  Am I reading that correctly? Or is there also some space saving
  happening here as well? Please re-submit a new patch with a commit
  message that makes things clear one way or the other.

Do you see how this "why" part of the commit message is ready to stand
alone in our commit history (assuming it's correct and
accepted). Meta-questions like "is this even a sane way of doing
things?" are great to ask, and very helpful for code reviewers, but best
placed after the "---" so that they don't become part of the commit
message.

Thanks for playing with notmuch!

-Carl

-- 
carl.d.worth@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR
  2013-02-03 18:51 [PATCH] This patch is a little finger excercise for working with git. I found a piece of code that I didn't understand at first Robert Mast
                   ` (2 preceding siblings ...)
  2013-02-10 17:24 ` [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR Robert Mast
@ 2013-02-13 15:11 ` Robert Mast
  2013-02-13 15:15   ` [english 100%] " Robert Mast
  2013-02-13 15:32 ` Robert Mast
  4 siblings, 1 reply; 10+ messages in thread
From: Robert Mast @ 2013-02-13 15:11 UTC (permalink / raw)
  To: notmuch; +Cc: Robert Mast

diff --git a/lib/query.cc b/lib/query.cc
index e9c1a2d..7381a54 100644
--- a/lib/query.cc
+++ b/lib/query.cc
@@ -39,12 +39,12 @@ typedef struct _notmuch_mset_messages {
 } notmuch_mset_messages_t;
 
 struct _notmuch_doc_id_set {
-    unsigned int *bitmap;
+    unsigned char *bitmap;
     unsigned int bound;
 };
 
-#define DOCIDSET_WORD(bit) ((bit) / sizeof (unsigned int))
-#define DOCIDSET_BIT(bit) ((bit) % sizeof (unsigned int))
+#define DOCIDSET_WORD(bit) ((bit) / CHAR_BIT)
+#define DOCIDSET_BIT(bit) ((bit) % CHAR_BIT)
 
 struct visible _notmuch_threads {
     notmuch_query_t *query;
@@ -359,11 +359,11 @@ _notmuch_doc_id_set_init (void *ctx,
 			  GArray *arr)
 {
     unsigned int max = 0;
-    unsigned int *bitmap;
+    unsigned char *bitmap;
 
     for (unsigned int i = 0; i < arr->len; i++)
 	max = MAX(max, g_array_index (arr, unsigned int, i));
-    bitmap = talloc_zero_array (ctx, unsigned int, 1 + max / sizeof (*bitmap));
+    bitmap = talloc_zero_array (ctx, unsigned char, DOCIDSET_WORD(max) + 1);
 
     if (bitmap == NULL)
 	return FALSE;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [english 100%] [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR
  2013-02-13 15:11 ` Robert Mast
@ 2013-02-13 15:15   ` Robert Mast
  0 siblings, 0 replies; 10+ messages in thread
From: Robert Mast @ 2013-02-13 15:15 UTC (permalink / raw)
  To: notmuch

Somehow my comments don't get through... I'll try it another way.

-----Oorspronkelijk bericht-----
Van: Robert Mast [mailto:beheerder@tekenbeetziekten.nl] 
Verzonden: woensdag 13 februari 2013 16:12
Aan: notmuch@notmuchmail.org
CC: Robert Mast
Onderwerp: [english 100%] [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR

diff --git a/lib/query.cc b/lib/query.cc index e9c1a2d..7381a54 100644
--- a/lib/query.cc
+++ b/lib/query.cc
@@ -39,12 +39,12 @@ typedef struct _notmuch_mset_messages {  } notmuch_mset_messages_t;
 
 struct _notmuch_doc_id_set {
-    unsigned int *bitmap;
+    unsigned char *bitmap;
     unsigned int bound;
 };
 
-#define DOCIDSET_WORD(bit) ((bit) / sizeof (unsigned int)) -#define DOCIDSET_BIT(bit) ((bit) % sizeof (unsigned int))
+#define DOCIDSET_WORD(bit) ((bit) / CHAR_BIT) #define DOCIDSET_BIT(bit) 
+((bit) % CHAR_BIT)
 
 struct visible _notmuch_threads {
     notmuch_query_t *query;
@@ -359,11 +359,11 @@ _notmuch_doc_id_set_init (void *ctx,
 			  GArray *arr)
 {
     unsigned int max = 0;
-    unsigned int *bitmap;
+    unsigned char *bitmap;
 
     for (unsigned int i = 0; i < arr->len; i++)
 	max = MAX(max, g_array_index (arr, unsigned int, i));
-    bitmap = talloc_zero_array (ctx, unsigned int, 1 + max / sizeof (*bitmap));
+    bitmap = talloc_zero_array (ctx, unsigned char, DOCIDSET_WORD(max) 
+ + 1);
 
     if (bitmap == NULL)
 	return FALSE;
--
1.7.9.5


.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR
  2013-02-03 18:51 [PATCH] This patch is a little finger excercise for working with git. I found a piece of code that I didn't understand at first Robert Mast
                   ` (3 preceding siblings ...)
  2013-02-13 15:11 ` Robert Mast
@ 2013-02-13 15:32 ` Robert Mast
  2013-02-13 16:21   ` Carl Worth
  2013-02-16 11:37   ` David Bremner
  4 siblings, 2 replies; 10+ messages in thread
From: Robert Mast @ 2013-02-13 15:32 UTC (permalink / raw)
  To: notmuch; +Cc: Robert Mast

Using char instead of int allows for simpler definitions of the
DOCIDSET macros so the code is easier to understand and consistent with
respect to memory-usage. Estimated reduction of memory-usage for
bitmap about 8 times.

---

Re-submission for comment-reasons.

---
---
 lib/query.cc |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/query.cc b/lib/query.cc
index e9c1a2d..7381a54 100644
--- a/lib/query.cc
+++ b/lib/query.cc
@@ -39,12 +39,12 @@ typedef struct _notmuch_mset_messages {
 } notmuch_mset_messages_t;
 
 struct _notmuch_doc_id_set {
-    unsigned int *bitmap;
+    unsigned char *bitmap;
     unsigned int bound;
 };
 
-#define DOCIDSET_WORD(bit) ((bit) / sizeof (unsigned int))
-#define DOCIDSET_BIT(bit) ((bit) % sizeof (unsigned int))
+#define DOCIDSET_WORD(bit) ((bit) / CHAR_BIT)
+#define DOCIDSET_BIT(bit) ((bit) % CHAR_BIT)
 
 struct visible _notmuch_threads {
     notmuch_query_t *query;
@@ -359,11 +359,11 @@ _notmuch_doc_id_set_init (void *ctx,
 			  GArray *arr)
 {
     unsigned int max = 0;
-    unsigned int *bitmap;
+    unsigned char *bitmap;
 
     for (unsigned int i = 0; i < arr->len; i++)
 	max = MAX(max, g_array_index (arr, unsigned int, i));
-    bitmap = talloc_zero_array (ctx, unsigned int, 1 + max / sizeof (*bitmap));
+    bitmap = talloc_zero_array (ctx, unsigned char, DOCIDSET_WORD(max) + 1);
 
     if (bitmap == NULL)
 	return FALSE;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR
  2013-02-13 15:32 ` Robert Mast
@ 2013-02-13 16:21   ` Carl Worth
  2013-02-16 11:37   ` David Bremner
  1 sibling, 0 replies; 10+ messages in thread
From: Carl Worth @ 2013-02-13 16:21 UTC (permalink / raw)
  To: Robert Mast, notmuch; +Cc: Robert Mast

[-- Attachment #1: Type: text/plain, Size: 237 bytes --]

Robert Mast <beheerder@tekenbeetziekten.nl> writes:
> Re-submission for comment-reasons.

Very nicely done, Robert. Thanks.

-Carl (who's not quite merging patches again yet, but likely will be soon)

-- 
carl.d.worth@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR
  2013-02-13 15:32 ` Robert Mast
  2013-02-13 16:21   ` Carl Worth
@ 2013-02-16 11:37   ` David Bremner
  1 sibling, 0 replies; 10+ messages in thread
From: David Bremner @ 2013-02-16 11:37 UTC (permalink / raw)
  To: Robert Mast, notmuch

Robert Mast <beheerder@tekenbeetziekten.nl> writes:

> Using char instead of int allows for simpler definitions of the
> DOCIDSET macros so the code is easier to understand and consistent with
> respect to memory-usage. Estimated reduction of memory-usage for
> bitmap about 8 times.

Pushed to master.

I did a bit of testing, and this yields about a 7% improvement in the
total memory allocated on the medium corpus (running
performance-tests/M00-new) and about a 3% speedup running the large
corpus.  These are definitely improvements worth having, but I think I
won't push it to the bug fix release, but rather wait for the next
regular release.

d

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-02-16 11:37 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-03 18:51 [PATCH] This patch is a little finger excercise for working with git. I found a piece of code that I didn't understand at first Robert Mast
2013-02-03 20:01 ` Austin Clements
2013-02-10 13:13 ` [PATCH] convert bitmap to unsigned char Robert Mast
2013-02-11 18:56   ` Carl Worth
2013-02-10 17:24 ` [PATCH] bitmap:improve memory usage using CHAR_BITS and unsigned CHAR Robert Mast
2013-02-13 15:11 ` Robert Mast
2013-02-13 15:15   ` [english 100%] " Robert Mast
2013-02-13 15:32 ` Robert Mast
2013-02-13 16:21   ` Carl Worth
2013-02-16 11:37   ` David Bremner

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).