unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
From: Floris Bruynooghe <flub@devork.be>
To: Anton Khirnov <anton@khirnov.net>, notmuch@notmuchmail.org
Subject: Re: [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query
Date: Fri, 22 May 2020 00:17:38 +0200	[thread overview]
Message-ID: <87tv09gcfx.fsf@powell.devork.be> (raw)
In-Reply-To: <20200509050526.23148-1-anton@khirnov.net>

Hi Anton,

Thanks for improving the bindings!  Any my apologies for the late
response, I failed to spot this mail the first time round.

Also, this is a pretty serious bug, thanks for finding it.

This looks pretty solid, a few small style comments that aren't very
important notwithstanding.  Though I do think you should be able to add
a test for this in the pytest tests.  I think just triggering the
use-after-free you demonstrate in the commit message is fine as it will
work with your patch.  The fact it will crash rather then fail without
is unfortunate but probably fine.

On Sat 09 May 2020 at 07:05 +0200, Anton Khirnov wrote:

> Any messages retrieved from a query - either directly via
> search_messages() or indirectly via thread objects - are owned by that
> query. Retrieving the same message (i.e. corresponding to the same
> message ID / database object) several times will always yield the same
> C object.
>
> The caller is allowed to destroy message objects owned by a query before
> the query itself - which can save memory for long-lived queries.
> However, that message must then never be retrieved again from that
> query.
>
> The python-notmuch2 bindings will currently destroy every message object
> in Message._destroy(), which will lead to an invalid free if the same
> message is then retrieved again. E.g. the following python program leads
> to libtalloc abort()ing:
>
> import notmuch2
> db   = notmuch2.Database(mode = notmuch2.Database.MODE.READ_ONLY)
> t    = next(db.threads('*'))
> msgs = list(zip(t.toplevel(), t.toplevel()))
> msgs = list(zip(t.toplevel(), t.toplevel()))
>
> Fix this issue by creating a subclass of Message, which is used for
> "standalone" message which have to be freed by the caller. Message class
> is then used only for messages descended from a query, which do not need
> to be freed by the caller.
> ---
>  bindings/python-cffi/notmuch2/_database.py |  6 ++--
>  bindings/python-cffi/notmuch2/_message.py  | 42 ++++++++++++++--------
>  2 files changed, 30 insertions(+), 18 deletions(-)
>
> diff --git a/bindings/python-cffi/notmuch2/_database.py b/bindings/python-cffi/notmuch2/_database.py
> index 95f59ca0..f14eac78 100644
> --- a/bindings/python-cffi/notmuch2/_database.py
> +++ b/bindings/python-cffi/notmuch2/_database.py
> @@ -399,7 +399,7 @@ class Database(base.NotmuchObject):
>                capi.lib.NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID]
>          if ret not in ok:
>              raise errors.NotmuchError(ret)
> -        msg = message.Message(self, msg_pp[0], db=self)
> +        msg = message.StandaloneMessage(self, msg_pp[0], db=self)
>          if sync_flags:
>              msg.tags.from_maildir_flags()
>          return self.AddedMessage(
> @@ -468,7 +468,7 @@ class Database(base.NotmuchObject):
>          msg_p = msg_pp[0]
>          if msg_p == capi.ffi.NULL:
>              raise LookupError
> -        msg = message.Message(self, msg_p, db=self)
> +        msg = message.StandaloneMessage(self, msg_p, db=self)
>          return msg
>  
>      def get(self, filename):
> @@ -501,7 +501,7 @@ class Database(base.NotmuchObject):
>          msg_p = msg_pp[0]
>          if msg_p == capi.ffi.NULL:
>              raise LookupError
> -        msg = message.Message(self, msg_p, db=self)
> +        msg = message.StandaloneMessage(self, msg_p, db=self)
>          return msg
>  
>      @property
> diff --git a/bindings/python-cffi/notmuch2/_message.py b/bindings/python-cffi/notmuch2/_message.py
> index c5fdbf6d..416ce7ca 100644
> --- a/bindings/python-cffi/notmuch2/_message.py
> +++ b/bindings/python-cffi/notmuch2/_message.py
> @@ -14,7 +14,7 @@ __all__ = ['Message']
>  
>  
>  class Message(base.NotmuchObject):
> -    """An email message stored in the notmuch database.
> +    """An email message stored in the notmuch database retrieved via a query.
>  
>      This should not be directly created, instead it will be returned
>      by calling methods on :class:`Database`.  A message keeps a
> @@ -61,22 +61,10 @@ class Message(base.NotmuchObject):
>  
>      @property
>      def alive(self):
> -        if not self._parent.alive:
> -            return False
> -        try:
> -            self._msg_p
> -        except errors.ObjectDestroyedError:
> -            return False
> -        else:
> -            return True
> -
> -    def __del__(self):
> -        self._destroy()
> +        return self._parent.alive
>  
>      def _destroy(self):
> -        if self.alive:
> -            capi.lib.notmuch_message_destroy(self._msg_p)
> -        self._msg_p = None
> +        pass

I feel like some more description from the commit message could live
here to explain why this isn't destroying the message rather than having
to find it in the commit message.

>  
>      @property
>      def messageid(self):
> @@ -375,6 +363,30 @@ class Message(base.NotmuchObject):
>          if isinstance(other, self.__class__):
>              return self.messageid == other.messageid
>  
> +class StandaloneMessage(Message):

Minor style thing, between top-level code one usually leaves 2 blank
lines per PEP8.  Nowadays perhaps we should think about adopting an
automatic formatter but hey.

(This applies to lots of places in both patches btw)

> +    """An email message stored in the notmuch database.
> +
> +    This subclass of Message is used for messages that are retrieved from the
> +    database directly and are not owned by a query.
> +    """

Likewise I'd be tempted to add here that this message is destroyed as
soon as it is garbage collected pointing out that this is not the case
for Message.  And probably update the Message docstring with the inverse.

> +    @property
> +    def alive(self):
> +        if not self._parent.alive:
> +            return False
> +        try:
> +            self._msg_p
> +        except errors.ObjectDestroyedError:
> +            return False
> +        else:
> +            return True
> +
> +    def __del__(self):
> +        self._destroy()
> +
> +    def _destroy(self):
> +        if self.alive:
> +            capi.lib.notmuch_message_destroy(self._msg_p)
> +        self._msg_p = None
>  
>  class FilenamesIter(base.NotmuchIter):
>      """Iterator for binary filenames objects."""

  parent reply	other threads:[~2020-05-21 22:17 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-09  5:05 [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query Anton Khirnov
2020-05-09  5:05 ` [PATCH 2/2] python/notmuch2: add bindings for the database config strings Anton Khirnov
2020-05-21 22:45   ` Floris Bruynooghe
2020-05-21  7:08 ` [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query Anton Khirnov
2020-05-21 11:13   ` David Bremner
2020-06-10 21:02   ` Tomi Ollila
2020-06-13 12:38     ` [PATCH] wip test for id:20200509050526.23148-1-anton@khirnov.net David Bremner
2020-05-21 22:17 ` Floris Bruynooghe [this message]
  -- strict thread matches above, loose matches on Subject: below --
2020-06-15 20:58 python: Continuing message re-use fix Floris Bruynooghe
2020-06-15 20:58 ` [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query Floris Bruynooghe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://notmuchmail.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tv09gcfx.fsf@powell.devork.be \
    --to=flub@devork.be \
    --cc=anton@khirnov.net \
    --cc=notmuch@notmuchmail.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).