From: Floris Bruynooghe <flub@devork.be>
To: Ludovic LANGE <ll-notmuchmail@lange.nom.fr>, notmuch@notmuchmail.org
Subject: Re: [PATCH 1/1] python/notmuch2: provide binding for Database.directory()
Date: Sat, 31 Jul 2021 12:14:23 +0200 [thread overview]
Message-ID: <87tukaeqo0.fsf@powell.devork.be> (raw)
In-Reply-To: <20210726142056.25243-1-ll-notmuchmail@lange.nom.fr>
Hi Ludovic,
Thanks for following up and your patience waiting for my review.
Also, would you mind following the v2, v3 tagging of patches as
described on https://notmuchmail.org/contributing/#index15h2 I know it's
not straightforward and I also have to look it up every time, but it
does help avoiding confusion.
> In the (new) cffi bindings, the corresponding call was not yet implemented.
> We provide a naive implementation (renamed to Database.directory()), and the
> implementation of the needed (internal) Directory object.
> ---
>
> Hello,
>
> This my second try at updating the python-cffi bindings.
> Thanks to the feedback of @Floris and @Tomi, this new tentative includes most
> of the suggestions made:
> * Database.get_directory(..) renamed to Database.directory()
> * Database.directory() is properly documented
> * Database.directory() accepts, for its path argument, str, bytes or os.PathLike
> * Iterators returning directories or files are now returning Path (not PurePath)
> * Directory.__init__() accepts, for its path argument, str, bytes or os.PathLike
> * Directory class is properly documented
> * Directory.set_mtime() accepts int or float (uses int internally)
> * Directory.get_child_files() renamed to Directory.files()
> * Directory.files() can, optionally, return absolute paths (instead of, by
> default, relative paths) (*)
> * Directory.get_child_directories() renamed to Directory.directories()
> * Directory.directories() can, optionally, return absolute paths (instead of, by
> default, relative paths) (*)
> * Directory.__repr__ simplified
>
>
> Not (yet ?) implemented:
> * Suggestion to remove the code `if not hasattr(os, 'PathLike') and isinstance(
> path, pathlib.Path`, I'd like to understand if we should do it for all the 6
> other occurences in Database or if they are special cases. May be in another patch ?
Yes this can, and probably should, be done in all occurrences. I'd just
do it here in this patch and do the others in a separate patch as
that'll make reviewing easier.
> * Choosing betwen getters/setters OR property for Directory.set/get/mtime - some
> more discussion could be helping ?
So I think it's best to stick to what was introduced in
`Database.decrypt_policy` and go with the property, mostly for reasons
of consistency.
> (*) about relative vs absolute and default value:
> My use-case for this whole patch is related to [Alot MUA](https://github.com/pazz/alot)
> and more specifically trying to revive [an old patch](https://github.com/pazz/alot/pull/1170),
> which tries to display the folders in a TreeView.
> The previous python bindings, as well as the C-API, all return relative paths (
> relative to the current Directory).
> Having list of a relative paths is a plus for the abovementionned patch,
> as we won't have to iterate on the result and basename() those paths. It'll save
> some (precious) time on a lot of folders.
> That being said, I can see an interest for having a full, workable path (also
> without having to convert them).
> So I introduced an optional parameter (defaulting to 'relative' mode) to allow
> the end-user to choose absolute paths.
>
>
> Thank you in advance for your time reviewing this patchset !
>
> Regards,
>
> Ludovic.
>
> PS: I very slightly changed my email address since last patch, this one is the good one.
> PPS: Should we provide an update of the NEWS file ?
>
>
> bindings/python-cffi/notmuch2/__init__.py | 1 +
> bindings/python-cffi/notmuch2/_build.py | 18 ++
> bindings/python-cffi/notmuch2/_database.py | 39 ++++-
> bindings/python-cffi/notmuch2/_directory.py | 174 ++++++++++++++++++++
> 4 files changed, 230 insertions(+), 2 deletions(-)
> create mode 100644 bindings/python-cffi/notmuch2/_directory.py
>
> diff --git a/bindings/python-cffi/notmuch2/__init__.py b/bindings/python-cffi/notmuch2/__init__.py
> index f281edc1..32067ddc 100644
> --- a/bindings/python-cffi/notmuch2/__init__.py
> +++ b/bindings/python-cffi/notmuch2/__init__.py
> @@ -45,6 +45,7 @@ usually expect from Python containers.
> from notmuch2 import _capi
> from notmuch2._base import *
> from notmuch2._database import *
> +from notmuch2._directory import *
> from notmuch2._errors import *
> from notmuch2._message import *
> from notmuch2._tags import *
> diff --git a/bindings/python-cffi/notmuch2/_build.py b/bindings/python-cffi/notmuch2/_build.py
> index f712b6c5..0f0a0a46 100644
> --- a/bindings/python-cffi/notmuch2/_build.py
> +++ b/bindings/python-cffi/notmuch2/_build.py
> @@ -134,6 +134,10 @@ ffibuilder.cdef(
> notmuch_database_get_revision (notmuch_database_t *notmuch,
> const char **uuid);
> notmuch_status_t
> + notmuch_database_get_directory (notmuch_database_t *database,
> + const char *path,
> + notmuch_directory_t **directory);
> + notmuch_status_t
> notmuch_database_index_file (notmuch_database_t *database,
> const char *filename,
> notmuch_indexopts_t *indexopts,
> @@ -303,6 +307,20 @@ ffibuilder.cdef(
> void
> notmuch_tags_destroy (notmuch_tags_t *tags);
>
> + notmuch_status_t
> + notmuch_directory_set_mtime (notmuch_directory_t *directory,
> + time_t mtime);
> + time_t
> + notmuch_directory_get_mtime (notmuch_directory_t *directory);
> + notmuch_filenames_t *
> + notmuch_directory_get_child_files (notmuch_directory_t *directory);
> + notmuch_filenames_t *
> + notmuch_directory_get_child_directories (notmuch_directory_t *directory);
> + notmuch_status_t
> + notmuch_directory_delete (notmuch_directory_t *directory);
> + void
> + notmuch_directory_destroy (notmuch_directory_t *directory);
> +
> notmuch_bool_t
> notmuch_filenames_valid (notmuch_filenames_t *filenames);
> const char *
> diff --git a/bindings/python-cffi/notmuch2/_database.py b/bindings/python-cffi/notmuch2/_database.py
> index 868f4408..a42d5402 100644
> --- a/bindings/python-cffi/notmuch2/_database.py
> +++ b/bindings/python-cffi/notmuch2/_database.py
> @@ -9,6 +9,7 @@ import weakref
> import notmuch2._base as base
> import notmuch2._config as config
> import notmuch2._capi as capi
> +import notmuch2._directory as directory
> import notmuch2._errors as errors
> import notmuch2._message as message
> import notmuch2._query as querymod
> @@ -337,8 +338,42 @@ class Database(base.NotmuchObject):
> rev = capi.lib.notmuch_database_get_revision(self._db_p, raw_uuid)
> return DbRevision(rev, capi.ffi.string(raw_uuid[0]))
>
> - def get_directory(self, path):
> - raise NotImplementedError
> + def directory(self, path):
> + """Returns a :class:`Directory` from the database given a pathname.
> +
> + If a directory with the given pathname exists in the database
> + return the :class:`Directory` instance for it.
> + Otherwise raise a :exc:`LookupError` exception.
> +
> + :param path: A path relative to the path of database (see :attr:`path`),
> + or else an absolute path with initial components that match the
> + path of database.
> + :type path: str, bytes, os.PathLike or pathlib.Path
pathlib.Path is os.PathLike so no real need to mention it explicitly.
but it doesn't do any harm
> + :returns: :class:`Directory` or raises an exception.
> + :raises LookupError: The directory object referred to by ``path``
> + does not exist in the database.
> + :raises XapianError: A Xapian exception occurred.
> + :raises UpgradeRequiredError: The database must be upgraded
> + first.
> + :raises OutOfMemoryError: When there is no memory to allocate
> + the message instance.
> + """
> + if not hasattr(os, 'PathLike') and isinstance(path, pathlib.Path):
> + path = bytes(path)
As mentioned before, I'd still remove this, `os.fsencode()` below
handles this automatically.
> + directory_pp = capi.ffi.new('notmuch_directory_t **')
> + ret = capi.lib.notmuch_database_get_directory(
> + self._db_p, os.fsencode(path), directory_pp)
> +
> + if ret != capi.lib.NOTMUCH_STATUS_SUCCESS:
> + raise errors.NotmuchError(ret)
> +
> + directory_p = directory_pp[0]
> + if directory_p == capi.ffi.NULL:
> + raise LookupError
> +
> + path = pathlib.Path(path)
> + ret_dir = directory.Directory(path.resolve(), directory_p, self)
> + return ret_dir
>
> def default_indexopts(self):
> """Returns default index options for the database.
> diff --git a/bindings/python-cffi/notmuch2/_directory.py b/bindings/python-cffi/notmuch2/_directory.py
> new file mode 100644
> index 00000000..fea11e22
> --- /dev/null
> +++ b/bindings/python-cffi/notmuch2/_directory.py
> @@ -0,0 +1,174 @@
> +import os
> +import pathlib
> +
> +import notmuch2._base as base
> +import notmuch2._capi as capi
> +import notmuch2._errors as errors
> +from ._message import FilenamesIter
In general PEP8 prefers to import modules rather than things inside it,
so I'd `import notmuch2._message as message`. I also think this would
be the first relative import. It probably doesn't matter much but for
consistency I'd avoid it.
> +
> +__all__ = ["Directory"]
> +
> +
> +class PathIter(FilenamesIter):
> + """Iterator for pathlib.Path objects."""
> +
> + def __init__(self, parent, iter_p, basepath=""):
> + self._basepath = basepath
> + super().__init__(parent, iter_p)
> +
> + def __next__(self):
> + fname = super().__next__()
> + return pathlib.Path(self._basepath, os.fsdecode(fname))
> +
> +
> +class Directory(base.NotmuchObject):
> + """Represents a directory entry in the notmuch database.
> +
> + This should not be directly created, instead it will be returned
> + by calling :meth:`Database.directory`. A directory keeps a
> + reference to the database object since the database object can not
> + be released while the directory is in use.
> +
> + Modifying attributes of this object will modify the
> + database, not the real directory attributes.
Nitpicking, but I reformatted the below a bit to follow PEP
conventions a bit more.
> + :param path: The absolute path of the directory object.
> + :type path: str, bytes, os.PathLike or pathlib.Path
> + :param dir_p: The pointer to an internal notmuch_directory_t object.
> + :type dir_p: C-api notmuch_directory_t
> + :param parent: The object this Directory is derived from (usually a
> + :class:`Database`). We do not directly use this, but store a
> + reference to it as long as this Directory object lives. This
> + keeps the parent object alive.
> + :type parent: Database
> + """
> +
> + _msg_p = base.MemoryPointer()
> +
> + def __init__(self, path, dir_p, parent):
> + """
> + """
no point in an empty docstring :)
> + self._path = pathlib.Path(path)
> + self._dir_p = dir_p
> + self._parent = parent
> +
> + @property
> + def alive(self):
> + if not self._parent.alive:
> + return False
> + try:
> + self._dir_p
> + except errors.ObjectDestroyedError:
> + return False
> + else:
> + return True
> +
> + def __del__(self):
> + """Close and free the Directory"""
> + self._destroy()
> +
> + def _destroy(self):
> + if self.alive:
> + capi.lib.notmuch_directory_destroy(self._dir_p)
> + self._dir_p = None
> +
> + def set_mtime(self, mtime):
@mtime.setter
def mtime(self, mtime):
Basically make this in a getter/setter property using the same style as
Database.decrypt_policy is written. And probably should be moved to
below the getter.
> + """Sets the mtime value of this directory in the database
> +
> + The intention is for the caller to use the mtime to allow efficient
> + identification of new messages to be added to the database. The
> + recommended usage is as follows:
> +
> + * Read the mtime of a directory from the filesystem
> +
> + * Call :meth:`Database.index_file` for all mail files in
> + the directory
> +
> + * Call notmuch_directory_set_mtime with the mtime read from the
> + filesystem. Then, when wanting to check for updates to the
> + directory in the future, the client can call :meth:`get_mtime`
> + and know that it only needs to add files if the mtime of the
> + directory and files are newer than the stored timestamp.
> +
> + .. note::
> +
> + :meth:`get_mtime` function does not allow the caller to
> + distinguish a timestamp of 0 from a non-existent timestamp. So
> + don't store a timestamp of 0 unless you are comfortable with
> + that.
I think Id'd like to go further: do not allow setting 0, raising a
ValueError instead. To allow removing the mtime we would require an
explicit `None` value.
> + :param mtime: A POSIX timestamp
> + :type mtime: int or float
> + :raises: :exc:`XapianError` a Xapian exception occurred, mtime
> + not stored
> + :raises: :exc:`ReadOnlyDatabaseError` the database was opened
> + in read-only mode so directory mtime cannot be modified
> + """
> + ret = capi.lib.notmuch_directory_set_mtime(self._dir_p, int(mtime))
> +
> + if ret != capi.lib.NOTMUCH_STATUS_SUCCESS:
> + raise errors.NotmuchError(ret)
> +
> + def get_mtime(self):
@property
def mtime(self):
> + """Gets the mtime value of this directory in the database
> +
> + Retrieves a previously stored mtime for this directory.
> +
> + :returns: A POSIX timestamp
> + :rtype: int
> + """
> + return capi.lib.notmuch_directory_get_mtime(self._dir_p)
I reckon we should check the value and `raise AttributeError` if it is
`0`. This hides the low-level C-ism. And even if you are interacting
with another program that does use this 0 for something else you still
don't lose any information.
> +
> + # Make mtime attribute a property of Directory()
> + mtime = property(
> + get_mtime,
> + set_mtime,
> + doc="""Property that allows getting
> + and setting of the Directory *mtime* (read-write)
> +
> + See :meth:`get_mtime` and :meth:`set_mtime` for usage and
> + possible exceptions.""",
> + )
As said, let's use the existing @property and @prop.setter syntax and
remove this.
> + def files(self, absolute=False):
I don't think I'm a fan of the absolute parameter, it seems weird. Why
not always make it absolute? (or keep it relative and go back to
PurePath ;))
> + """Gets a :class:`PathIter` iterator listing all the filenames of
> + messages in the database within the given directory.
nitpicking, and being more explicit for people like me who get easily
confused by this API:
"""Returns iterator of all files in the directory.
The iterator yields all the files as known in the database, this is
not a filesystem query.
> +
> + :param absolute: `True` to return complete paths, and `False`
> + to return basename-entries only (not complete paths),
> + defaults to `False`
For some reason python docstrings use 3-space indents for continuations
instead of aligning. I which we had auto-formatters for this.
> + :type absolute: boolean, optional
> +
> + :returns: An iterator yielding :class:`pathlib.Path` instances.
> + :rtype: PathIter
> + """
> + fnames_p = capi.lib.notmuch_directory_get_child_files(self._dir_p)
> + return PathIter(self, fnames_p, self.path if absolute else "")
> +
> + def directories(self, absolute=False):
same wrt absolute parameter
> + """Gets a :class:`PathIter` iterator listing all the filenames of
> + sub-directories in the database within the given directory.
> +
> + :param absolute: `True` to return complete paths, and `False`
> + to return basename-entries only (not complete paths),
> + defaults to `False`
> + :type absolute: boolean, optional
> +
> + :returns: An iterator yielding :class:`pathlib.Path` instances.
> + :rtype: PathIter
> + """
> + fnames_p = capi.lib.notmuch_directory_get_child_directories(self._dir_p)
> + return PathIter(self, fnames_p, self.path if absolute else "")
> +
> + @property
> + def path(self):
> + """the absolute path of this Directory as a :class:`pathlib.Path` instance.
> +
> + :rtype: pathlib.Path
> + """
> + return self._path
> +
> + def __repr__(self):
> + """Object representation"""
> + return '<Directory(path={self.path})>'.format(self=self)
Cheers,
Floris
prev parent reply other threads:[~2021-07-31 10:14 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-26 14:20 [PATCH 1/1] python/notmuch2: provide binding for Database.directory() Ludovic LANGE
2021-07-31 10:14 ` Floris Bruynooghe [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://notmuchmail.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87tukaeqo0.fsf@powell.devork.be \
--to=flub@devork.be \
--cc=ll-notmuchmail@lange.nom.fr \
--cc=notmuch@notmuchmail.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://yhetil.org/notmuch.git/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).