unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH] after gzgets(), Z_STREAM_END means EOF, not error
@ 2020-04-14 17:38 Olivier Taïbi
  2020-04-14 20:38 ` David Bremner
  2020-04-16 11:11 ` David Bremner
  0 siblings, 2 replies; 4+ messages in thread
From: Olivier Taïbi @ 2020-04-14 17:38 UTC (permalink / raw)
  To: notmuch

As suggested by David Bremner in
https://notmuchmail.org/pipermail/notmuch/2020/029288.html
here is the patch for bug #3: after gzgets() returns NULL (meaning EOF
or error), the error code Z_STREAM_END means EOF and not error.

Context: I am compiling notmuch on OpenBSD which has a rather old zlib
1.2.3.  It seems that the behaviour of gzgets() changed slightly between
this version and more recent versions, but the manual does not reflect
that change.  Note that zlib's manual:
- does not specify which error code (Z_OK or Z_STREAM_END) is set when
  EOF is reached,
- does not indicate the meaning of Z_STREAM_END after gzgets(), but
  based on its meaning as a possible return value of inflate(), I would
  guess that it means EOF.

PS: out of curiosity, why bother with the --gzip feature in notmuch dump
and restore when the user can simply pipe to/from a gzip/bzip2/xz/...
command?  It seems that this feature adds complication, some of this
complication being due to the fact that the behaviour of zlib is not as
well-defined as stdio.  Moreover using pipes allows to check that the
compressed dump decompresses as it should, e.g.:
$ mkfifo pipe
$ <pipe gzip >dump.gz &
$ notmuch dump | tee pipe | cksum
$ wait && zcat dump.gz | cksum ; rm pipe
This could be simplified with process substitution for shells that have
this feature, and the checksum comparison can certainly be made
automatic in a backup script.
Disclosure: I am biased because I am currently patching notmuch-dump.c
to use stdio instead of zlib in order to port notmuch to OpenBSD, since
OpenBSD's older zlib does not have the "T" mode for gzopen(), so with
zlib the only choice would to compress the output.  Perhaps zlib will be
updated in OpenBSD in the future, but this is a short-term solution that
seems to not be too much trouble to maintain downstream.

PPS: apart from dump and restore (and the indirect use of xapian), it
seems that the only other use of zlib in notmuch is in
format_part_mbox() in notmuch-show.c, which is able to read a compressed
email (it seems that dovecot has an option to write emails in maildir
format in this way to save space).  Do I understand correctly that
notmuch does not support indexing compressed email, and if so what is
the point of using zlib in format_part_mbox()?


diff --git a/util/zlib-extra.c b/util/zlib-extra.c
index 623f6d62..2d2d2414 100644
--- a/util/zlib-extra.c
+++ b/util/zlib-extra.c
@@ -47,6 +47,7 @@ gz_getline (void *talloc_ctx, char **bufptr, ssize_t *bytes_read, gzFile stream)
 	    int zlib_status = 0;
 	    (void) gzerror (stream, &zlib_status);
 	    switch (zlib_status) {
+	    case Z_STREAM_END:
 	    case Z_OK:
 		/* no data read before EOF */
 		if (offset == 0)

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] after gzgets(), Z_STREAM_END means EOF, not error
  2020-04-14 17:38 [PATCH] after gzgets(), Z_STREAM_END means EOF, not error Olivier Taïbi
@ 2020-04-14 20:38 ` David Bremner
  2020-04-14 21:23   ` Olivier Taïbi
  2020-04-16 11:11 ` David Bremner
  1 sibling, 1 reply; 4+ messages in thread
From: David Bremner @ 2020-04-14 20:38 UTC (permalink / raw)
  To: Olivier Taïbi, notmuch

Olivier Taïbi <oli@olitb.net> writes:

> As suggested by David Bremner in
> https://notmuchmail.org/pipermail/notmuch/2020/029288.html
> here is the patch for bug #3: after gzgets() returns NULL (meaning EOF
> or error), the error code Z_STREAM_END means EOF and not error.

> PS: out of curiosity, why bother with the --gzip feature in notmuch dump
> and restore when the user can simply pipe to/from a gzip/bzip2/xz/...
> command?

I believe the original motivation (in 2014) was to make the dumping
process more atomic. I guess you could dig up the mailing list
discussion from the  time if you were interested. I'd be reluctant to
remove the feature after 6 six years.

> PPS: apart from dump and restore (and the indirect use of xapian), it
> seems that the only other use of zlib in notmuch is in
> format_part_mbox() in notmuch-show.c, which is able to read a compressed
> email (it seems that dovecot has an option to write emails in maildir
> format in this way to save space).>  Do I understand correctly that
> notmuch does not support indexing compressed email, and if so what is
> the point of using zlib in format_part_mbox()?

No, that's not correct. Notmuch does support indexing gzip compressed
mail as of version 0.29. IIRC that part uses GMime streams to do the
decompression (probably also using zlib indirectly).
>
> diff --git a/util/zlib-extra.c b/util/zlib-extra.c
> index 623f6d62..2d2d2414 100644
> --- a/util/zlib-extra.c
> +++ b/util/zlib-extra.c
> @@ -47,6 +47,7 @@ gz_getline (void *talloc_ctx, char **bufptr, ssize_t *bytes_read, gzFile stream)
>  	    int zlib_status = 0;
>  	    (void) gzerror (stream, &zlib_status);
>  	    switch (zlib_status) {
> +	    case Z_STREAM_END:
>  	    case Z_OK:
>  		/* no data read before EOF */
>  		if (offset == 0)

As you say, the lib docs could be improved, but I guess the "zlib error
numbers" referred to as re the same as "Return codes for the
compression/decompression functions", in which case this makes sense.

It would be helpful to move the discussion not intended to be part of
the commit message after --- (see https://notmuchmail.org/contributing/
for details).

d

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] after gzgets(), Z_STREAM_END means EOF, not error
  2020-04-14 20:38 ` David Bremner
@ 2020-04-14 21:23   ` Olivier Taïbi
  0 siblings, 0 replies; 4+ messages in thread
From: Olivier Taïbi @ 2020-04-14 21:23 UTC (permalink / raw)
  To: notmuch

On Tue, Apr 14, 2020 at 05:38:32PM -0300, David Bremner wrote:
> Olivier Taïbi <oli@olitb.net> writes:
> > PS: out of curiosity, why bother with the --gzip feature in notmuch dump
> > and restore when the user can simply pipe to/from a gzip/bzip2/xz/...
> > command?
> 
> I believe the original motivation (in 2014) was to make the dumping
> process more atomic. I guess you could dig up the mailing list
> discussion from the  time if you were interested. I'd be reluctant to
> remove the feature after 6 six years.

Thanks for the explanation.
 
> > PPS: apart from dump and restore (and the indirect use of xapian), it
> > seems that the only other use of zlib in notmuch is in
> > format_part_mbox() in notmuch-show.c, which is able to read a compressed
> > email (it seems that dovecot has an option to write emails in maildir
> > format in this way to save space).>  Do I understand correctly that
> > notmuch does not support indexing compressed email, and if so what is
> > the point of using zlib in format_part_mbox()?
> 
> No, that's not correct. Notmuch does support indexing gzip compressed
> mail as of version 0.29. IIRC that part uses GMime streams to do the
> decompression (probably also using zlib indirectly).

Thanks, that's good to know.

> It would be helpful to move the discussion not intended to be part of
> the commit message after --- (see https://notmuchmail.org/contributing/
> for details).

Sorry about that, I read the "bugs" section of the website but did not
notice these guidelines.  I'll do my best to do better next time.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] after gzgets(), Z_STREAM_END means EOF, not error
  2020-04-14 17:38 [PATCH] after gzgets(), Z_STREAM_END means EOF, not error Olivier Taïbi
  2020-04-14 20:38 ` David Bremner
@ 2020-04-16 11:11 ` David Bremner
  1 sibling, 0 replies; 4+ messages in thread
From: David Bremner @ 2020-04-16 11:11 UTC (permalink / raw)
  To: Olivier Taïbi, notmuch

Olivier Taïbi <oli@olitb.net> writes:

> As suggested by David Bremner in
> https://notmuchmail.org/pipermail/notmuch/2020/029288.html
> here is the patch for bug #3: after gzgets() returns NULL (meaning EOF
> or error), the error code Z_STREAM_END means EOF and not error.

pushed, with revised commit message.

d

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-04-16 11:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-14 17:38 [PATCH] after gzgets(), Z_STREAM_END means EOF, not error Olivier Taïbi
2020-04-14 20:38 ` David Bremner
2020-04-14 21:23   ` Olivier Taïbi
2020-04-16 11:11 ` David Bremner

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).