From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp0 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms0.migadu.com with LMTPS id CAc5A0HH/WAKcgAAgWs5BA (envelope-from ) for ; Sun, 25 Jul 2021 22:19:13 +0200 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp0 with LMTPS id 0IOtOkDH/WDRAwAA1q6Kng (envelope-from ) for ; Sun, 25 Jul 2021 20:19:12 +0000 Received: from mail.notmuchmail.org (nmbug.tethera.net [IPv6:2607:5300:201:3100::1657]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 67965DAC2 for ; Sun, 25 Jul 2021 22:19:12 +0200 (CEST) Received: from nmbug.tethera.net (localhost [127.0.0.1]) by mail.notmuchmail.org (Postfix) with ESMTP id B0EB129087; Sun, 25 Jul 2021 16:19:06 -0400 (EDT) Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by mail.notmuchmail.org (Postfix) with ESMTPS id 791F429079 for ; Sun, 25 Jul 2021 16:19:04 -0400 (EDT) Received: by mail-wm1-x333.google.com with SMTP id f14-20020a05600c154eb02902519e4abe10so649810wmg.4 for ; Sun, 25 Jul 2021 13:19:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:in-reply-to:references:date:message-id :mime-version; bh=wLkBjyuFriLL5M/GQftmkPO/Je3cOR5lhpJiEr8PeIA=; b=IDGPBszk//vkKhVmh65ElEw91MRClYUYcEwJYYXbHsqLWOPePY185XxM3F0XyKsi+/ O3pp8km/a+ieCzyIsNovCJQCKLLWWN0UuQ6MN/H3BApi0fTjna2Z8s9c5eKX/OUJvRvg EFtEhs6r8wk7shMmoonUMz4az0rFrt05V9ObPOX7d+o3IFFoZn6fDiaLUmLIsLhWCAAc qMpYspTr11GV3ue7aKnIpgP6QWWG+NvQKGwi387N4ulj0bWH1v5Lksbccf8GGO+bM42s iNedP7FZlHQy3tuwtnxIilpVjYBV6mykTRx8iS5gX/AoBHoR0W5uFhNKARJwTCak6xxQ SVZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:in-reply-to:references :date:message-id:mime-version; bh=wLkBjyuFriLL5M/GQftmkPO/Je3cOR5lhpJiEr8PeIA=; b=Qxw5xsF/mAbKX0HIhD7LUOU4kevMUnHNZF54ZdLS8jc58hrXtwz5sNRxpXdIDQSj64 bUEKZcVgUs2GmzugfA86dF86b4L79BPHNNcD5mANxo5Cd+OI+Q8vZCZQaYa4CatSsTN7 8S8vJSgh3+0+jxSzIockpA1kE/vaY0/YpA5v2xI5wJcjKMeWW6yL9ifuqPnq37z1YnIX mdkAYjo2B/IUQhIoZ8zUkqgklZae5trwICI9rmQOKOu7bWxNGE5j8/OKda7bewvO6EAN zsm/i1OHsgHNEbVdVugdmH81y0IgiO8ElOtO6zvrjqDayb7g21PTOMHSyErGh6sQOKY6 IZ6g== X-Gm-Message-State: AOAM533yKtiKj1AzEZV21maJS1mdQtOTuBfdblcKzo/owPC9kR8Hd1m2 M3ANAlxvmrs8nyrPpgv/YAg= X-Google-Smtp-Source: ABdhPJyXLcJWspdxxC85cGVz3DqcR/9wg9c8kfedfQ8dfNy8Pa1VO0XikyhUkPWkKpZu+TQlSN57hQ== X-Received: by 2002:a05:600c:3795:: with SMTP id o21mr129236wmr.90.1627244339888; Sun, 25 Jul 2021 13:18:59 -0700 (PDT) Received: from powell.devork.be (dynamic-2k8goc3gcrahrymqky-pd01.res.v6.highway.a1.net. [2001:871:25f:34e8:4013:75e7:f870:6e12]) by smtp.gmail.com with ESMTPSA id o18sm40005106wrx.21.2021.07.25.13.18.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 25 Jul 2021 13:18:59 -0700 (PDT) Sender: Floris Bruynooghe Received: (nullmailer pid 475338 invoked by uid 1000); Sun, 25 Jul 2021 20:18:58 -0000 From: Floris Bruynooghe To: Ludovic LANGE , notmuch@notmuchmail.org Subject: Re: [PATCH 1/1] python/notmuch2: provide binding for database_get_directory() In-Reply-To: <20210725081602.81497-1-ll-notmuch@lange.nom.fr> References: <20210725081602.81497-1-ll-notmuch@lange.nom.fr> Date: Sun, 25 Jul 2021 22:18:58 +0200 Message-ID: <87mtqa2ll9.fsf@powell.devork.be> MIME-Version: 1.0 Message-ID-Hash: KQ5PADBQF7ATLNQHVVIBL6LMGHDGUMVU X-Message-ID-Hash: KQ5PADBQF7ATLNQHVVIBL6LMGHDGUMVU X-MailFrom: floris.bruynooghe@gmail.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; header-match-notmuch.notmuchmail.org-0; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header CC: Ludovic LANGE X-Mailman-Version: 3.2.1 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Help: List-Post: List-Subscribe: List-Unsubscribe: Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Migadu-Flow: FLOW_IN ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1627244352; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post:dkim-signature; bh=wIAZhTTmT0QGsvMbuKPRcEPFKhLdmLmqY6yKijDWw4w=; b=ENvGRUUoRrJhkdC3kSvuiedwPIvf+FQLSmp3RwLfaamJnwsWRcerZtfCdnHGpjnRMeT8b9 Fb5MUZ4gzpjQbCud/Gezmb5Ev6h6erjOfEb26AHGFFVmA+hzQaHiLRXLkrsfrwK05g/lQe aOslgxF+Tm1RmPgEKoGrDW6hLkJFiNNJTt+z+rayMJ8P698tilYa8+MKzBH0MRHo9zWh6w U6A7zwSsAxBJG5ndzlKkDia0QMkiWG0oZ/kEj4YSNOFQLmIBV1ne1aufRazTyUpk21PCk3 9vgwQZqnDV8Y2FRWKqVHrxUcGlAt2pHl/8wrm35AnOGmGahGgj/0cUUdOd6b2w== ARC-Seal: i=1; s=key1; d=yhetil.org; t=1627244352; a=rsa-sha256; cv=none; b=nodWIHdGkrA7uDXvVJw2wpkprZPrumCCswnkAJnABm2Z/dPBS2izFArjeY32vX4SsPRFVW OnYwwDzbusQEnZ2KjnSMQgFzosRUc346eHJmVMJulD3+tWyfQUYI+f/JoccwrGlAuUKJwE UPNMCRC/O7ZEvO6ECbutgqWIb4UkZx79zVwkhYVv7Up5OmqvYgcgF63Yp4GCEmMAuq90WC 4sq09L6yriN4HXbNXKN2PCmse/X4Qz5jW8K5zBLyClgajNu4X5DFhMNpg9W9ThgyJyTjQc 0+az6RfYrOYWbsRbmURPkjKqehJ9VxcjQBzbfLhWWUaoFcyl1IQ7PDrDc2YelQ== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=fail ("body hash did not verify") header.d=gmail.com header.s=20161025 header.b=IDGPBszk; dmarc=none; spf=pass (aspmx1.migadu.com: domain of notmuch-bounces@notmuchmail.org designates 2607:5300:201:3100::1657 as permitted sender) smtp.mailfrom=notmuch-bounces@notmuchmail.org X-Migadu-Spam-Score: -0.98 Authentication-Results: aspmx1.migadu.com; dkim=fail ("body hash did not verify") header.d=gmail.com header.s=20161025 header.b=IDGPBszk; dmarc=none; spf=pass (aspmx1.migadu.com: domain of notmuch-bounces@notmuchmail.org designates 2607:5300:201:3100::1657 as permitted sender) smtp.mailfrom=notmuch-bounces@notmuchmail.org X-Migadu-Queue-Id: 67965DAC2 X-Spam-Score: -0.98 X-Migadu-Scanner: scn0.migadu.com X-TUID: VqlKIwJn5YcC Hi Ludovic, Thanks for having a look at this! On Sun 25 Jul 2021 at 10:16 +0200, Ludovic LANGE wrote: > diff --git a/bindings/python-cffi/notmuch2/_database.py b/bindings/python-cffi/notmuch2/_database.py > index 868f4408..e48fa895 100644 > --- a/bindings/python-cffi/notmuch2/_database.py > +++ b/bindings/python-cffi/notmuch2/_database.py > @@ -13,6 +13,7 @@ import notmuch2._errors as errors > import notmuch2._message as message > import notmuch2._query as querymod > import notmuch2._tags as tags > +import notmuch2._directory as directory > > > __all__ = ['Database', 'AtomicContext', 'DbRevision'] > @@ -338,7 +339,46 @@ class Database(base.NotmuchObject): > return DbRevision(rev, capi.ffi.string(raw_uuid[0])) > > def get_directory(self, path): > - raise NotImplementedError > + """Returns a :class:`Directory` from the database for path, > + > + :param path: An unicode string containing the path relative to the path > + of database (see :attr:`path`), or else should be an absolute > + path with initial components that match the path of 'database'. Instead of a unicode string instead it should accept anything that `os.fspath` accepts: str, bytes or os.PathLike. This allows using e.g. pathlib.Path as well. > + :returns: :class:`Directory` or raises an exception. > + :raises: :exc:`FileError` if path is not relative database or absolute > + with initial components same as database. > + > + > + :raises XapianError: A Xapian exception occurred. > + :raises LookupError: The directory object referred to by ``pathname`` path or pathname? choose one ;) > + does not exist in the database. > + :raises FileNotEmailError: The file referreed to by > + ``pathname`` is not recognised as an email message. same > + :raises UpgradeRequiredError: The database must be upgraded > + first. > + """ > + if not hasattr(os, 'PathLike') and isinstance(path, pathlib.Path): > + path = bytes(path) I think `path = os.fspath(path)` should handle this fine. But I don't think you even need to do that as `os.fsencode()` will just do the right thing already. > + directory_pp = capi.ffi.new('notmuch_directory_t **') > + ret = capi.lib.notmuch_database_get_directory( > + self._db_p, os.fsencode(path), directory_pp) > + > + if ret != capi.lib.NOTMUCH_STATUS_SUCCESS: > + raise errors.NotmuchError(ret) > + > + directory_p = directory_pp[0] > + if directory_p == capi.ffi.NULL: > + raise LookupError > + > + if os.path.isabs(path): > + # we got an absolute path > + abs_dirpath = path > + else: > + #we got a relative path, make it absolute > + abs_dirpath = os.path.abspath(os.path.join(self.get_path(), path)) > + > + ret_dir = directory.Directory(abs_dirpath, directory_p, self) I think the code has been trying to use pathlib for path manipulations, so for consistency it'd be good to keep doing that This if condition becomes much simpler: path = pathlib.Path(path) # this is a no-op if it already is an instance ret_dir = directory.Directory(path.absolute(), directory_p, self) > + return ret_dir > > def default_indexopts(self): > """Returns default index options for the database. > diff --git a/bindings/python-cffi/notmuch2/_directory.py b/bindings/python-cffi/notmuch2/_directory.py > new file mode 100644 > index 00000000..1d48aa54 > --- /dev/null > +++ b/bindings/python-cffi/notmuch2/_directory.py > @@ -0,0 +1,164 @@ > +import os > +import pathlib > + > +import notmuch2._base as base > +import notmuch2._capi as capi > +import notmuch2._errors as errors > +from ._message import FilenamesIter > + > +__all__ = ["Directory"] > + > + > +class PurePathIter(FilenamesIter): > + """Iterator for pathlib.PurePath objects.""" > + > + def __next__(self): > + fname = super().__next__() > + return pathlib.PurePath(os.fsdecode(fname)) I'm surprised you explicitly need a PurePath here? > +class Directory(base.NotmuchObject): > + """Represents a directory entry in the notmuch directory > + > + Modifying attributes of this object will modify the > + database, not the real directory attributes. > + > + The Directory object is usually derived from another object > + e.g. via :meth:`Database.get_directory`, and will automatically be > + become invalid whenever that parent is deleted. You should > + therefore initialized this object handing it a reference to the > + parent, preventing the parent from automatically being garbage > + collected. > + """ > + > + _msg_p = base.MemoryPointer() > + > + def __init__(self, path, dir_p, parent): > + """ > + :param path: The absolute path of the directory object. For consistency I'd require this to be `pathlib.Path`. > + :param dir_p: The pointer to an internal notmuch_directory_t object. > + :param parent: The object this Directory is derived from > + (usually a :class:`Database`). We do not directly use > + this, but store a reference to it as long as > + this Directory object lives. This keeps the > + parent object alive. > + """ Just add this docstring the the class docstring, that's the usual way to document __init__. > + self._path = path > + self._dir_p = dir_p > + self._parent = parent > + > + @property > + def alive(self): > + if not self._parent.alive: > + return False > + try: > + self._dir_p > + except errors.ObjectDestroyedError: > + return False > + else: > + return True > + > + def __del__(self): > + """Close and free the Directory""" > + self._destroy() > + > + def _destroy(self): > + if self.alive: > + capi.lib.notmuch_directory_destroy(self._dir_p) > + self._dir_p = None > + > + def set_mtime(self, mtime): > + """Sets the mtime value of this directory in the database > + > + The intention is for the caller to use the mtime to allow efficient > + identification of new messages to be added to the database. The > + recommended usage is as follows: > + > + * Read the mtime of a directory from the filesystem > + > + * Call :meth:`Database.index_file` for all mail files in > + the directory > + > + * Call notmuch_directory_set_mtime with the mtime read from the > + filesystem. Then, when wanting to check for updates to the > + directory in the future, the client can call :meth:`get_mtime` > + and know that it only needs to add files if the mtime of the > + directory and files are newer than the stored timestamp. > + > + .. note:: > + > + :meth:`get_mtime` function does not allow the caller to > + distinguish a timestamp of 0 from a non-existent timestamp. So > + don't store a timestamp of 0 unless you are comfortable with > + that. Love the docs on how this is actually intended to be used! > + > + :param mtime: A (time_t) timestamp This description works for C-developers, but searching docs.python.org for that will get people very confused. I'd suggest: :param mtime: A POSIX timestamp. :type mtime: int or float I'm suggesting to accept float here because `time.time()` returns a float. `os.mtime(...).st_mtime` is int though, so you could argue to restrict it to int. > + :raises: :exc:`XapianError` a Xapian exception occurred, mtime > + not stored > + :raises: :exc:`ReadOnlyDatabaseError` the database was opened > + in read-only mode so directory mtime cannot be modified > + :raises: :exc:`NotInitializedError` the directory object has not > + been initialized > + """ > + ret = capi.lib.notmuch_directory_set_mtime(self._dir_p, mtime) int(mtime) if you follow my suggestion above > + > + if ret != capi.lib.NOTMUCH_STATUS_SUCCESS: > + raise errors.NotmuchError(ret) > + > + def get_mtime(self): > + """Gets the mtime value of this directory in the database > + > + Retrieves a previously stored mtime for this directory. > + > + :param mtime: A (time_t) timestamp I don't think there is a param ;) > + :raises: :exc:`NotmuchError`: > + > + :attr:`STATUS`.NOT_INITIALIZED > + The directory has not been initialized An old comment? > + """ > + return capi.lib.notmuch_directory_get_mtime(self._dir_p) > + > + # Make mtime attribute a property of Directory() > + mtime = property( > + get_mtime, > + set_mtime, > + doc="""Property that allows getting > + and setting of the Directory *mtime* (read-write) > + > + See :meth:`get_mtime` and :meth:`set_mtime` for usage and > + possible exceptions.""", > + ) Hmm, either we should have a get/set API or a property, not both in my opinion. I'm a bit conflicted on which is better, the code only has one setter property currently, Database.decrypt_policy and that feels somewhat similar so it could be good for consistency. But maybe that choice was a mistake. In general assigning to a property does not have the expectation of taking an action I think, and here you are directly updating the database. Anyway, I'm curious what other people think of this as well. > + def get_child_files(self): Style-wise I'd name this `def files(self):`, I think it follows the style of the API more. > + """Gets a Filenames iterator listing all the filenames of > + messages in the database within the given directory. > + > + The returned filenames will be the basename-entries only (not > + complete paths. > + """ > + fnames_p = capi.lib.notmuch_directory_get_child_files(self._dir_p) > + return PurePathIter(self, fnames_p) Hmm, seeing it's use I think this should probably be an iterator returning `pathlib.Path`, only giving a PurePath seems almost mean. I'd make sure they are already usable, instead of still having to get the path property to make them useful, so pass in the path of the directory to the iterator so it can return absolute paths. > + > + def get_child_directories(self): likewise `directories()` probably > + """Gets a :class:`Filenames` iterator listing all the filenames of > + sub-directories in the database within the given directory > + > + The returned filenames will be the basename-entries only (not > + complete paths. > + """ > + fnames_p = capi.lib.notmuch_directory_get_child_directories(self._dir_p) > + return PurePathIter(self, fnames_p) same as for the files i guess > + @property > + def path(self): > + """Returns the absolute path of this Directory (read-only)""" Some nitpicking, I'd document the return type (which should be changed to pathlib.Path), also the `read-only` part is already implicit in the signature, and it's a property so doesn't "return" things: """The path of this Directory as a pathlib.Path instance.""" > + return self._path > + > + def __repr__(self): > + """Object representation""" > + try: > + self._dir_p > + except errors.ObjectDestroyedError: > + return ''.format(self=self) The `(exhaused)` part seems odd, I'm even surprised you need this except clause for this object. > + else: > + return ''.format(self=self) Hope the many comments are acceptable, once again thanks for adding this and feel free to disagree with some suggestions :) Cheers, Floris