From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp2 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id mAKGBw/+xl4sTAAA0tVLHw (envelope-from ) for ; Thu, 21 May 2020 22:17:51 +0000 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp2 with LMTPS id 4LFqAw/+xl4RcAAAB5/wlQ (envelope-from ) for ; Thu, 21 May 2020 22:17:51 +0000 Received: from arlo.cworth.org (arlo.cworth.org [50.126.95.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 1C0379403E7 for ; Thu, 21 May 2020 22:17:50 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by arlo.cworth.org (Postfix) with ESMTP id EE2576DE1055; Thu, 21 May 2020 15:17:47 -0700 (PDT) X-Virus-Scanned: Debian amavisd-new at cworth.org Received: from arlo.cworth.org ([127.0.0.1]) by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id vVw31Txeq58l; Thu, 21 May 2020 15:17:47 -0700 (PDT) Received: from arlo.cworth.org (localhost [IPv6:::1]) by arlo.cworth.org (Postfix) with ESMTP id 3B8A46DE0F86; Thu, 21 May 2020 15:17:45 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by arlo.cworth.org (Postfix) with ESMTP id B4A276DE0F86 for ; Thu, 21 May 2020 15:17:43 -0700 (PDT) X-Virus-Scanned: Debian amavisd-new at cworth.org Received: from arlo.cworth.org ([127.0.0.1]) by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id d6zHx1TCv7RO for ; Thu, 21 May 2020 15:17:42 -0700 (PDT) Received: from mail-ej1-f66.google.com (mail-ej1-f66.google.com [209.85.218.66]) by arlo.cworth.org (Postfix) with ESMTPS id B09256DE0F60 for ; Thu, 21 May 2020 15:17:42 -0700 (PDT) Received: by mail-ej1-f66.google.com with SMTP id e2so10671943eje.13 for ; Thu, 21 May 2020 15:17:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:subject:in-reply-to:references:date:message-id :mime-version; bh=vvDloTQS+/Tjokty0ZQzcJBc3SqtIK7cqM45Bgkt0rs=; b=Kf0UtONmgTB5rJcSfsxNlAqz9P1zb2DKQDfoqPhOmx156A3BQ7Clo2wQGKuNgAwtY8 q/q2LXrpXY6ZuDf/p70VX6ji6O96AkuYEGum0MHAJ8iS79R9BkE6YgLYNOy+lO0OhAdK 2r+gwo7aZdkZGIdisrwVvsmwcSoRpR2lNAUYyP5Vm3IaJ5sLEeivQ909IhPT1IxUobiE iA+DbyDbOrW59D2n1LX76jXLSw+kh1dYN04wBWZ0A2BJhDosng8U89nRvZLpNqxDVcvf m2xdQzXMcgPVGX5Qb5x0C7MAccCxjtHL0u2/fHe8kqubiQoHYW0+fFRzQ5Vt20OgAEpF uB9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:subject:in-reply-to:references :date:message-id:mime-version; bh=vvDloTQS+/Tjokty0ZQzcJBc3SqtIK7cqM45Bgkt0rs=; b=LvFn/JRIxbMPas4RgUQvWXrBtF9fqu3qRiMdPCyWFCgEVJ9k95QWEST2s/aoIrf4v1 rbaEb0lw15xhKHt0cclaoyyonT3N9xG/xxZ0akyHcFPl8RDYToO5fa3Q8r/brItfn+RQ 3zq3Ab3s8RndZ1QRjdPkdaWeQiPaTGJ9dWRBiaKusVgx9I2mr/Vs2iiWfVGYws322cpV fS5Y5eJt3NrtOnB8rJ1XK+J2AStXXJ6wPiQZq8+Ob32nBc+OU0vQ/UTDYrFAPJuERSTc NDzmhwVYJbnwSEQzKN3t3Q3ZDskeYl8JClPDdy+EP3RHA3Zm+shQwjNWNDYYiUiM3K2U BO7Q== X-Gm-Message-State: AOAM53361VPIkhPKRwA8khEj/z1Ze2wIbagFT1IlWfrKPo1Ka1L6cor6 OTCT57YW90Rqxwp8nsMTOo7fr0sP X-Google-Smtp-Source: ABdhPJzfYf8RzEDdn1hhtScf+fcjZeVhuPf/ad6ch/p9zUuUEb3HyknfOvyOTJhu9bePMqDye7ilVA== X-Received: by 2002:a17:906:4bc1:: with SMTP id x1mr5388881ejv.13.1590099459967; Thu, 21 May 2020 15:17:39 -0700 (PDT) Received: from powell.devork.be (2a02-8388-8480-1180-4c18-fc69-8d8c-22b5.cable.dynamic.v6.surfer.at. [2a02:8388:8480:1180:4c18:fc69:8d8c:22b5]) by smtp.gmail.com with ESMTPSA id mb1sm6343197ejb.109.2020.05.21.15.17.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 May 2020 15:17:39 -0700 (PDT) Received: (nullmailer pid 667521 invoked by uid 1000); Thu, 21 May 2020 22:17:38 -0000 From: Floris Bruynooghe To: Anton Khirnov , notmuch@notmuchmail.org Subject: Re: [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query In-Reply-To: <20200509050526.23148-1-anton@khirnov.net> References: <20200509050526.23148-1-anton@khirnov.net> Date: Fri, 22 May 2020 00:17:38 +0200 Message-ID: <87tv09gcfx.fsf@powell.devork.be> MIME-Version: 1.0 X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: notmuch-bounces@notmuchmail.org Sender: "notmuch" X-Scanner: scn0 Authentication-Results: aspmx1.migadu.com; dkim=fail (body hash did not verify) header.d=gmail.com header.s=20161025 header.b=Kf0UtONm; dmarc=none; spf=pass (aspmx1.migadu.com: domain of notmuch-bounces@notmuchmail.org designates 50.126.95.6 as permitted sender) smtp.mailfrom=notmuch-bounces@notmuchmail.org X-Spam-Score: 1.49 X-TUID: TvQk651/s4qw Hi Anton, Thanks for improving the bindings! Any my apologies for the late response, I failed to spot this mail the first time round. Also, this is a pretty serious bug, thanks for finding it. This looks pretty solid, a few small style comments that aren't very important notwithstanding. Though I do think you should be able to add a test for this in the pytest tests. I think just triggering the use-after-free you demonstrate in the commit message is fine as it will work with your patch. The fact it will crash rather then fail without is unfortunate but probably fine. On Sat 09 May 2020 at 07:05 +0200, Anton Khirnov wrote: > Any messages retrieved from a query - either directly via > search_messages() or indirectly via thread objects - are owned by that > query. Retrieving the same message (i.e. corresponding to the same > message ID / database object) several times will always yield the same > C object. > > The caller is allowed to destroy message objects owned by a query before > the query itself - which can save memory for long-lived queries. > However, that message must then never be retrieved again from that > query. > > The python-notmuch2 bindings will currently destroy every message object > in Message._destroy(), which will lead to an invalid free if the same > message is then retrieved again. E.g. the following python program leads > to libtalloc abort()ing: > > import notmuch2 > db = notmuch2.Database(mode = notmuch2.Database.MODE.READ_ONLY) > t = next(db.threads('*')) > msgs = list(zip(t.toplevel(), t.toplevel())) > msgs = list(zip(t.toplevel(), t.toplevel())) > > Fix this issue by creating a subclass of Message, which is used for > "standalone" message which have to be freed by the caller. Message class > is then used only for messages descended from a query, which do not need > to be freed by the caller. > --- > bindings/python-cffi/notmuch2/_database.py | 6 ++-- > bindings/python-cffi/notmuch2/_message.py | 42 ++++++++++++++-------- > 2 files changed, 30 insertions(+), 18 deletions(-) > > diff --git a/bindings/python-cffi/notmuch2/_database.py b/bindings/python-cffi/notmuch2/_database.py > index 95f59ca0..f14eac78 100644 > --- a/bindings/python-cffi/notmuch2/_database.py > +++ b/bindings/python-cffi/notmuch2/_database.py > @@ -399,7 +399,7 @@ class Database(base.NotmuchObject): > capi.lib.NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID] > if ret not in ok: > raise errors.NotmuchError(ret) > - msg = message.Message(self, msg_pp[0], db=self) > + msg = message.StandaloneMessage(self, msg_pp[0], db=self) > if sync_flags: > msg.tags.from_maildir_flags() > return self.AddedMessage( > @@ -468,7 +468,7 @@ class Database(base.NotmuchObject): > msg_p = msg_pp[0] > if msg_p == capi.ffi.NULL: > raise LookupError > - msg = message.Message(self, msg_p, db=self) > + msg = message.StandaloneMessage(self, msg_p, db=self) > return msg > > def get(self, filename): > @@ -501,7 +501,7 @@ class Database(base.NotmuchObject): > msg_p = msg_pp[0] > if msg_p == capi.ffi.NULL: > raise LookupError > - msg = message.Message(self, msg_p, db=self) > + msg = message.StandaloneMessage(self, msg_p, db=self) > return msg > > @property > diff --git a/bindings/python-cffi/notmuch2/_message.py b/bindings/python-cffi/notmuch2/_message.py > index c5fdbf6d..416ce7ca 100644 > --- a/bindings/python-cffi/notmuch2/_message.py > +++ b/bindings/python-cffi/notmuch2/_message.py > @@ -14,7 +14,7 @@ __all__ = ['Message'] > > > class Message(base.NotmuchObject): > - """An email message stored in the notmuch database. > + """An email message stored in the notmuch database retrieved via a query. > > This should not be directly created, instead it will be returned > by calling methods on :class:`Database`. A message keeps a > @@ -61,22 +61,10 @@ class Message(base.NotmuchObject): > > @property > def alive(self): > - if not self._parent.alive: > - return False > - try: > - self._msg_p > - except errors.ObjectDestroyedError: > - return False > - else: > - return True > - > - def __del__(self): > - self._destroy() > + return self._parent.alive > > def _destroy(self): > - if self.alive: > - capi.lib.notmuch_message_destroy(self._msg_p) > - self._msg_p = None > + pass I feel like some more description from the commit message could live here to explain why this isn't destroying the message rather than having to find it in the commit message. > > @property > def messageid(self): > @@ -375,6 +363,30 @@ class Message(base.NotmuchObject): > if isinstance(other, self.__class__): > return self.messageid == other.messageid > > +class StandaloneMessage(Message): Minor style thing, between top-level code one usually leaves 2 blank lines per PEP8. Nowadays perhaps we should think about adopting an automatic formatter but hey. (This applies to lots of places in both patches btw) > + """An email message stored in the notmuch database. > + > + This subclass of Message is used for messages that are retrieved from the > + database directly and are not owned by a query. > + """ Likewise I'd be tempted to add here that this message is destroyed as soon as it is garbage collected pointing out that this is not the case for Message. And probably update the Message docstring with the inverse. > + @property > + def alive(self): > + if not self._parent.alive: > + return False > + try: > + self._msg_p > + except errors.ObjectDestroyedError: > + return False > + else: > + return True > + > + def __del__(self): > + self._destroy() > + > + def _destroy(self): > + if self.alive: > + capi.lib.notmuch_message_destroy(self._msg_p) > + self._msg_p = None > > class FilenamesIter(base.NotmuchIter): > """Iterator for binary filenames objects."""