unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH 1/1] python/notmuch2: provide binding for database_get_directory()
@ 2021-07-25  8:16 Ludovic LANGE
  2021-07-25 20:18 ` Floris Bruynooghe
  0 siblings, 1 reply; 5+ messages in thread
From: Ludovic LANGE @ 2021-07-25  8:16 UTC (permalink / raw)
  To: notmuch; +Cc: Ludovic LANGE

database_get_directory() is accessible in the legacy bindings as a method on the
database object. In the cffi bindings, it raises NotImplementedError, so we provide a
naive implementation, and the corresponding implementation of Directory object.
---

Hello,

This a my first try at updating the python-cffi bindings. The motivation for that is
related to `alot` which uses those bindings, and for which I'm missing the
notmuch_database_get_directory() call.

It may not be the cleanest implementation (I had to guess a few things) and I do not
have much experience with neither notmuch C API, nor python's CFFI. I'm waiting
for your comments in order to improve it, and hope it can be accepted.

Regards,

Ludovic.

 bindings/python-cffi/notmuch2/_build.py     |  18 +++
 bindings/python-cffi/notmuch2/_database.py  |  42 ++++-
 bindings/python-cffi/notmuch2/_directory.py | 164 ++++++++++++++++++++
 3 files changed, 223 insertions(+), 1 deletion(-)
 create mode 100644 bindings/python-cffi/notmuch2/_directory.py

diff --git a/bindings/python-cffi/notmuch2/_build.py b/bindings/python-cffi/notmuch2/_build.py
index f712b6c5..0f0a0a46 100644
--- a/bindings/python-cffi/notmuch2/_build.py
+++ b/bindings/python-cffi/notmuch2/_build.py
@@ -134,6 +134,10 @@ ffibuilder.cdef(
     notmuch_database_get_revision (notmuch_database_t *notmuch,
                                    const char **uuid);
     notmuch_status_t
+    notmuch_database_get_directory (notmuch_database_t *database,
+                                    const char *path,
+                                    notmuch_directory_t **directory);
+    notmuch_status_t
     notmuch_database_index_file (notmuch_database_t *database,
                                  const char *filename,
                                  notmuch_indexopts_t *indexopts,
@@ -303,6 +307,20 @@ ffibuilder.cdef(
     void
     notmuch_tags_destroy (notmuch_tags_t *tags);
 
+    notmuch_status_t
+    notmuch_directory_set_mtime (notmuch_directory_t *directory,
+                                 time_t mtime);
+    time_t
+    notmuch_directory_get_mtime (notmuch_directory_t *directory);
+    notmuch_filenames_t *
+    notmuch_directory_get_child_files (notmuch_directory_t *directory);
+    notmuch_filenames_t *
+    notmuch_directory_get_child_directories (notmuch_directory_t *directory);
+    notmuch_status_t
+    notmuch_directory_delete (notmuch_directory_t *directory);
+    void
+    notmuch_directory_destroy (notmuch_directory_t *directory);
+
     notmuch_bool_t
     notmuch_filenames_valid (notmuch_filenames_t *filenames);
     const char *
diff --git a/bindings/python-cffi/notmuch2/_database.py b/bindings/python-cffi/notmuch2/_database.py
index 868f4408..e48fa895 100644
--- a/bindings/python-cffi/notmuch2/_database.py
+++ b/bindings/python-cffi/notmuch2/_database.py
@@ -13,6 +13,7 @@ import notmuch2._errors as errors
 import notmuch2._message as message
 import notmuch2._query as querymod
 import notmuch2._tags as tags
+import notmuch2._directory as directory
 
 
 __all__ = ['Database', 'AtomicContext', 'DbRevision']
@@ -338,7 +339,46 @@ class Database(base.NotmuchObject):
         return DbRevision(rev, capi.ffi.string(raw_uuid[0]))
 
     def get_directory(self, path):
-        raise NotImplementedError
+        """Returns a :class:`Directory` from the database for path,
+
+        :param path: An unicode string containing the path relative to the path
+              of database (see :attr:`path`), or else should be an absolute
+              path with initial components that match the path of 'database'.
+        :returns: :class:`Directory` or raises an exception.
+        :raises: :exc:`FileError` if path is not relative database or absolute
+                 with initial components same as database.
+
+
+        :raises XapianError: A Xapian exception occurred.
+        :raises LookupError: The directory object referred to by ``pathname``
+           does not exist in the database.
+        :raises FileNotEmailError: The file referreed to by
+           ``pathname`` is not recognised as an email message.
+        :raises UpgradeRequiredError: The database must be upgraded
+           first.
+        """
+        if not hasattr(os, 'PathLike') and isinstance(path, pathlib.Path):
+            path = bytes(path)
+        directory_pp = capi.ffi.new('notmuch_directory_t **')
+        ret = capi.lib.notmuch_database_get_directory(
+          self._db_p, os.fsencode(path), directory_pp)
+
+        if ret != capi.lib.NOTMUCH_STATUS_SUCCESS:
+            raise errors.NotmuchError(ret)
+
+        directory_p = directory_pp[0]
+        if directory_p == capi.ffi.NULL:
+            raise LookupError
+
+        if os.path.isabs(path):
+            # we got an absolute path
+            abs_dirpath = path
+        else:
+            #we got a relative path, make it absolute
+            abs_dirpath = os.path.abspath(os.path.join(self.get_path(), path))
+
+        ret_dir = directory.Directory(abs_dirpath, directory_p, self)
+        return ret_dir
 
     def default_indexopts(self):
         """Returns default index options for the database.
diff --git a/bindings/python-cffi/notmuch2/_directory.py b/bindings/python-cffi/notmuch2/_directory.py
new file mode 100644
index 00000000..1d48aa54
--- /dev/null
+++ b/bindings/python-cffi/notmuch2/_directory.py
@@ -0,0 +1,164 @@
+import os
+import pathlib
+
+import notmuch2._base as base
+import notmuch2._capi as capi
+import notmuch2._errors as errors
+from ._message import FilenamesIter
+
+__all__ = ["Directory"]
+
+
+class PurePathIter(FilenamesIter):
+    """Iterator for pathlib.PurePath objects."""
+
+    def __next__(self):
+        fname = super().__next__()
+        return pathlib.PurePath(os.fsdecode(fname))
+
+
+class Directory(base.NotmuchObject):
+    """Represents a directory entry in the notmuch directory
+
+    Modifying attributes of this object will modify the
+    database, not the real directory attributes.
+
+    The Directory object is usually derived from another object
+    e.g. via :meth:`Database.get_directory`, and will automatically be
+    become invalid whenever that parent is deleted. You should
+    therefore initialized this object handing it a reference to the
+    parent, preventing the parent from automatically being garbage
+    collected.
+    """
+
+    _msg_p = base.MemoryPointer()
+
+    def __init__(self, path, dir_p, parent):
+        """
+        :param path:   The absolute path of the directory object.
+        :param dir_p:  The pointer to an internal notmuch_directory_t object.
+        :param parent: The object this Directory is derived from
+                       (usually a :class:`Database`). We do not directly use
+                       this, but store a reference to it as long as
+                       this Directory object lives. This keeps the
+                       parent object alive.
+        """
+        self._path = path
+        self._dir_p = dir_p
+        self._parent = parent
+
+    @property
+    def alive(self):
+        if not self._parent.alive:
+            return False
+        try:
+            self._dir_p
+        except errors.ObjectDestroyedError:
+            return False
+        else:
+            return True
+
+    def __del__(self):
+        """Close and free the Directory"""
+        self._destroy()
+
+    def _destroy(self):
+        if self.alive:
+            capi.lib.notmuch_directory_destroy(self._dir_p)
+        self._dir_p = None
+
+    def set_mtime(self, mtime):
+        """Sets the mtime value of this directory in the database
+
+        The intention is for the caller to use the mtime to allow efficient
+        identification of new messages to be added to the database. The
+        recommended usage is as follows:
+
+        * Read the mtime of a directory from the filesystem
+
+        * Call :meth:`Database.index_file` for all mail files in
+          the directory
+
+        * Call notmuch_directory_set_mtime with the mtime read from the
+          filesystem.  Then, when wanting to check for updates to the
+          directory in the future, the client can call :meth:`get_mtime`
+          and know that it only needs to add files if the mtime of the
+          directory and files are newer than the stored timestamp.
+
+          .. note::
+
+                :meth:`get_mtime` function does not allow the caller to
+                distinguish a timestamp of 0 from a non-existent timestamp. So
+                don't store a timestamp of 0 unless you are comfortable with
+                that.
+
+        :param mtime: A (time_t) timestamp
+        :raises: :exc:`XapianError` a Xapian exception occurred, mtime
+                 not stored
+        :raises: :exc:`ReadOnlyDatabaseError` the database was opened
+                 in read-only mode so directory mtime cannot be modified
+        :raises: :exc:`NotInitializedError` the directory object has not
+                 been initialized
+        """
+        ret = capi.lib.notmuch_directory_set_mtime(self._dir_p, mtime)
+
+        if ret != capi.lib.NOTMUCH_STATUS_SUCCESS:
+            raise errors.NotmuchError(ret)
+
+    def get_mtime(self):
+        """Gets the mtime value of this directory in the database
+
+        Retrieves a previously stored mtime for this directory.
+
+        :param mtime: A (time_t) timestamp
+        :raises: :exc:`NotmuchError`:
+
+                        :attr:`STATUS`.NOT_INITIALIZED
+                          The directory has not been initialized
+        """
+        return capi.lib.notmuch_directory_get_mtime(self._dir_p)
+
+    # Make mtime attribute a property of Directory()
+    mtime = property(
+        get_mtime,
+        set_mtime,
+        doc="""Property that allows getting
+                     and setting of the Directory *mtime* (read-write)
+
+                     See :meth:`get_mtime` and :meth:`set_mtime` for usage and
+                     possible exceptions.""",
+    )
+
+    def get_child_files(self):
+        """Gets a Filenames iterator listing all the filenames of
+        messages in the database within the given directory.
+
+        The returned filenames will be the basename-entries only (not
+        complete paths.
+        """
+        fnames_p = capi.lib.notmuch_directory_get_child_files(self._dir_p)
+        return PurePathIter(self, fnames_p)
+
+    def get_child_directories(self):
+        """Gets a :class:`Filenames` iterator listing all the filenames of
+        sub-directories in the database within the given directory
+
+        The returned filenames will be the basename-entries only (not
+        complete paths.
+        """
+        fnames_p = capi.lib.notmuch_directory_get_child_directories(self._dir_p)
+        return PurePathIter(self, fnames_p)
+
+    @property
+    def path(self):
+        """Returns the absolute path of this Directory (read-only)"""
+        return self._path
+
+    def __repr__(self):
+        """Object representation"""
+        try:
+            self._dir_p
+        except errors.ObjectDestroyedError:
+            return '<Directory(path={self.path}) (exhausted)>'.format(self=self)
+        else:
+            return '<Directory(path={self.path})>'.format(self=self)
-- 
2.32.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] python/notmuch2: provide binding for database_get_directory()
  2021-07-25  8:16 [PATCH 1/1] python/notmuch2: provide binding for database_get_directory() Ludovic LANGE
@ 2021-07-25 20:18 ` Floris Bruynooghe
  2021-07-25 20:46   ` Ludovic LANGE
  0 siblings, 1 reply; 5+ messages in thread
From: Floris Bruynooghe @ 2021-07-25 20:18 UTC (permalink / raw)
  To: Ludovic LANGE, notmuch; +Cc: Ludovic LANGE

Hi Ludovic,

Thanks for having a look at this!

On Sun 25 Jul 2021 at 10:16 +0200, Ludovic LANGE wrote:

> diff --git a/bindings/python-cffi/notmuch2/_database.py b/bindings/python-cffi/notmuch2/_database.py
> index 868f4408..e48fa895 100644
> --- a/bindings/python-cffi/notmuch2/_database.py
> +++ b/bindings/python-cffi/notmuch2/_database.py
> @@ -13,6 +13,7 @@ import notmuch2._errors as errors
>  import notmuch2._message as message
>  import notmuch2._query as querymod
>  import notmuch2._tags as tags
> +import notmuch2._directory as directory
>  
>  
>  __all__ = ['Database', 'AtomicContext', 'DbRevision']
> @@ -338,7 +339,46 @@ class Database(base.NotmuchObject):
>          return DbRevision(rev, capi.ffi.string(raw_uuid[0]))
>  
>      def get_directory(self, path):
> -        raise NotImplementedError
> +        """Returns a :class:`Directory` from the database for path,
> +
> +        :param path: An unicode string containing the path relative to the path
> +              of database (see :attr:`path`), or else should be an absolute
> +              path with initial components that match the path of 'database'.

Instead of a unicode string instead it should accept anything that
`os.fspath` accepts: str, bytes or os.PathLike.  This allows using
e.g. pathlib.Path as well.

> +        :returns: :class:`Directory` or raises an exception.
> +        :raises: :exc:`FileError` if path is not relative database or absolute
> +                 with initial components same as database.
> +
> +
> +        :raises XapianError: A Xapian exception occurred.
> +        :raises LookupError: The directory object referred to by ``pathname``

path or pathname?  choose one ;)

> +           does not exist in the database.
> +        :raises FileNotEmailError: The file referreed to by
> +           ``pathname`` is not recognised as an email message.

same

> +        :raises UpgradeRequiredError: The database must be upgraded
> +           first.
> +        """
> +        if not hasattr(os, 'PathLike') and isinstance(path, pathlib.Path):
> +            path = bytes(path)

I think `path = os.fspath(path)` should handle this fine.  But I don't
think you even need to do that as `os.fsencode()` will just do the right
thing already.

> +        directory_pp = capi.ffi.new('notmuch_directory_t **')
> +        ret = capi.lib.notmuch_database_get_directory(
> +          self._db_p, os.fsencode(path), directory_pp)
> +
> +        if ret != capi.lib.NOTMUCH_STATUS_SUCCESS:
> +            raise errors.NotmuchError(ret)
> +
> +        directory_p = directory_pp[0]
> +        if directory_p == capi.ffi.NULL:
> +            raise LookupError
> +
> +        if os.path.isabs(path):
> +            # we got an absolute path
> +            abs_dirpath = path
> +        else:
> +            #we got a relative path, make it absolute
> +            abs_dirpath = os.path.abspath(os.path.join(self.get_path(), path))
> +
> +        ret_dir = directory.Directory(abs_dirpath, directory_p, self)

I think the code has been trying to use pathlib for path manipulations,
so for consistency it'd be good to keep doing that  This if condition
becomes much simpler:

path = pathlib.Path(path)  # this is a no-op if it already is an instance
ret_dir = directory.Directory(path.absolute(), directory_p, self)

> +        return ret_dir
>  
>      def default_indexopts(self):
>          """Returns default index options for the database.
> diff --git a/bindings/python-cffi/notmuch2/_directory.py b/bindings/python-cffi/notmuch2/_directory.py
> new file mode 100644
> index 00000000..1d48aa54
> --- /dev/null
> +++ b/bindings/python-cffi/notmuch2/_directory.py
> @@ -0,0 +1,164 @@
> +import os
> +import pathlib
> +
> +import notmuch2._base as base
> +import notmuch2._capi as capi
> +import notmuch2._errors as errors
> +from ._message import FilenamesIter
> +
> +__all__ = ["Directory"]
> +
> +
> +class PurePathIter(FilenamesIter):
> +    """Iterator for pathlib.PurePath objects."""
> +
> +    def __next__(self):
> +        fname = super().__next__()
> +        return pathlib.PurePath(os.fsdecode(fname))

I'm surprised you explicitly need a PurePath here?


> +class Directory(base.NotmuchObject):
> +    """Represents a directory entry in the notmuch directory
> +
> +    Modifying attributes of this object will modify the
> +    database, not the real directory attributes.
> +
> +    The Directory object is usually derived from another object
> +    e.g. via :meth:`Database.get_directory`, and will automatically be
> +    become invalid whenever that parent is deleted. You should
> +    therefore initialized this object handing it a reference to the
> +    parent, preventing the parent from automatically being garbage
> +    collected.
> +    """
> +
> +    _msg_p = base.MemoryPointer()
> +
> +    def __init__(self, path, dir_p, parent):
> +        """
> +        :param path:   The absolute path of the directory object.

For consistency I'd require this to be `pathlib.Path`.

> +        :param dir_p:  The pointer to an internal notmuch_directory_t object.
> +        :param parent: The object this Directory is derived from
> +                       (usually a :class:`Database`). We do not directly use
> +                       this, but store a reference to it as long as
> +                       this Directory object lives. This keeps the
> +                       parent object alive.
> +        """

Just add this docstring the the class docstring, that's the usual way to
document __init__.

> +        self._path = path
> +        self._dir_p = dir_p
> +        self._parent = parent
> +
> +    @property
> +    def alive(self):
> +        if not self._parent.alive:
> +            return False
> +        try:
> +            self._dir_p
> +        except errors.ObjectDestroyedError:
> +            return False
> +        else:
> +            return True
> +
> +    def __del__(self):
> +        """Close and free the Directory"""
> +        self._destroy()
> +
> +    def _destroy(self):
> +        if self.alive:
> +            capi.lib.notmuch_directory_destroy(self._dir_p)
> +        self._dir_p = None
> +
> +    def set_mtime(self, mtime):
> +        """Sets the mtime value of this directory in the database
> +
> +        The intention is for the caller to use the mtime to allow efficient
> +        identification of new messages to be added to the database. The
> +        recommended usage is as follows:
> +
> +        * Read the mtime of a directory from the filesystem
> +
> +        * Call :meth:`Database.index_file` for all mail files in
> +          the directory
> +
> +        * Call notmuch_directory_set_mtime with the mtime read from the
> +          filesystem.  Then, when wanting to check for updates to the
> +          directory in the future, the client can call :meth:`get_mtime`
> +          and know that it only needs to add files if the mtime of the
> +          directory and files are newer than the stored timestamp.
> +
> +          .. note::
> +
> +                :meth:`get_mtime` function does not allow the caller to
> +                distinguish a timestamp of 0 from a non-existent timestamp. So
> +                don't store a timestamp of 0 unless you are comfortable with
> +                that.

Love the docs on how this is actually intended to be used!

> +
> +        :param mtime: A (time_t) timestamp

This description works for C-developers, but searching docs.python.org
for that will get people very confused.  I'd suggest:

:param mtime: A POSIX timestamp.
:type mtime: int or float

I'm suggesting to accept float here because `time.time()` returns a
float.  `os.mtime(...).st_mtime` is int though, so you could argue to
restrict it to int.

> +        :raises: :exc:`XapianError` a Xapian exception occurred, mtime
> +                 not stored
> +        :raises: :exc:`ReadOnlyDatabaseError` the database was opened
> +                 in read-only mode so directory mtime cannot be modified
> +        :raises: :exc:`NotInitializedError` the directory object has not
> +                 been initialized
> +        """
> +        ret = capi.lib.notmuch_directory_set_mtime(self._dir_p, mtime)

int(mtime) if you follow my suggestion above

> +
> +        if ret != capi.lib.NOTMUCH_STATUS_SUCCESS:
> +            raise errors.NotmuchError(ret)
> +
> +    def get_mtime(self):
> +        """Gets the mtime value of this directory in the database
> +
> +        Retrieves a previously stored mtime for this directory.
> +
> +        :param mtime: A (time_t) timestamp

I don't think there is a param ;)

> +        :raises: :exc:`NotmuchError`:
> +
> +                        :attr:`STATUS`.NOT_INITIALIZED
> +                          The directory has not been initialized

An old comment?

> +        """
> +        return capi.lib.notmuch_directory_get_mtime(self._dir_p)
> +
> +    # Make mtime attribute a property of Directory()
> +    mtime = property(
> +        get_mtime,
> +        set_mtime,
> +        doc="""Property that allows getting
> +                     and setting of the Directory *mtime* (read-write)
> +
> +                     See :meth:`get_mtime` and :meth:`set_mtime` for usage and
> +                     possible exceptions.""",
> +    )

Hmm, either we should have a get/set API or a property, not both in my
opinion.  I'm a bit conflicted on which is better, the code only has one
setter property currently, Database.decrypt_policy and that feels
somewhat similar so it could be good for consistency.  But maybe that
choice was a mistake.  In general assigning to a property does not
have the expectation of taking an action I think, and here you are
directly updating the database.

Anyway, I'm curious what other people think of this as well.

> +    def get_child_files(self):

Style-wise I'd name this `def files(self):`, I think it follows the
style of the API more.


> +        """Gets a Filenames iterator listing all the filenames of
> +        messages in the database within the given directory.
> +
> +        The returned filenames will be the basename-entries only (not
> +        complete paths.
> +        """
> +        fnames_p = capi.lib.notmuch_directory_get_child_files(self._dir_p)
> +        return PurePathIter(self, fnames_p)

Hmm, seeing it's use I think this should probably be an iterator
returning `pathlib.Path`, only giving a PurePath seems almost mean.  I'd
make sure they are already usable, instead of still having to get the
path property to make them useful, so pass in the path of the directory
to the iterator so it can return absolute paths.

> +
> +    def get_child_directories(self):

likewise `directories()` probably

> +        """Gets a :class:`Filenames` iterator listing all the filenames of
> +        sub-directories in the database within the given directory
> +
> +        The returned filenames will be the basename-entries only (not
> +        complete paths.
> +        """
> +        fnames_p = capi.lib.notmuch_directory_get_child_directories(self._dir_p)
> +        return PurePathIter(self, fnames_p)

same as for the files i guess

> +    @property
> +    def path(self):
> +        """Returns the absolute path of this Directory (read-only)"""

Some nitpicking, I'd document the return type (which should be changed
to pathlib.Path), also the `read-only` part is already implicit in the
signature, and it's a property so doesn't "return" things:

"""The path of this Directory as a pathlib.Path instance."""

> +        return self._path
> +
> +    def __repr__(self):
> +        """Object representation"""
> +        try:
> +            self._dir_p
> +        except errors.ObjectDestroyedError:
> +            return '<Directory(path={self.path}) (exhausted)>'.format(self=self)

The `(exhaused)` part seems odd, I'm even surprised you need this except
clause for this object.

> +        else:
> +            return '<Directory(path={self.path})>'.format(self=self)


Hope the many comments are acceptable, once again thanks for adding this
and feel free to disagree with some suggestions :)

Cheers,
Floris

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] python/notmuch2: provide binding for database_get_directory()
  2021-07-25 20:18 ` Floris Bruynooghe
@ 2021-07-25 20:46   ` Ludovic LANGE
  2021-07-26  7:55     ` Tomi Ollila
       [not found]     ` <87im0v31z8.fsf@powell.devork.be>
  0 siblings, 2 replies; 5+ messages in thread
From: Ludovic LANGE @ 2021-07-25 20:46 UTC (permalink / raw)
  To: notmuch

Hi Floris,

Thanks for taking the time to make a review, this is very appreciated !

I'll answer below, but please note : for the comments, I was being lazy,
and they come directly from the previous python bindings. I was
expecting that the whole API was exactly the same and only differing in
implementation, so I was hoping that I could copy/paste them. It seems I
was wrong.

I'll send an updated version soon.


Le 25/07/2021 à 22:18, Floris Bruynooghe a écrit :
> @@ -338,7 +339,46 @@ class Database(base.NotmuchObject):
>>          return DbRevision(rev, capi.ffi.string(raw_uuid[0]))
>>  
>>      def get_directory(self, path):
>> -        raise NotImplementedError
>> +        """Returns a :class:`Directory` from the database for path,
>> +
>> +        :param path: An unicode string containing the path relative to the path
>> +              of database (see :attr:`path`), or else should be an absolute
>> +              path with initial components that match the path of 'database'.
> Instead of a unicode string instead it should accept anything that
> `os.fspath` accepts: str, bytes or os.PathLike.  This allows using
> e.g. pathlib.Path as well.
agreed.
>> +        :returns: :class:`Directory` or raises an exception.
>> +        :raises: :exc:`FileError` if path is not relative database or absolute
>> +                 with initial components same as database.
>> +
>> +
>> +        :raises XapianError: A Xapian exception occurred.
>> +        :raises LookupError: The directory object referred to by ``pathname``
> path or pathname?  choose one ;)
agreed.
>> +           does not exist in the database.
>> +        :raises FileNotEmailError: The file referreed to by
>> +           ``pathname`` is not recognised as an email message.
> same
agreed.
>> +        :raises UpgradeRequiredError: The database must be upgraded
>> +           first.
>> +        """
>> +        if not hasattr(os, 'PathLike') and isinstance(path, pathlib.Path):
>> +            path = bytes(path)
> I think `path = os.fspath(path)` should handle this fine.  But I don't
> think you even need to do that as `os.fsencode()` will just do the right
> thing already.

This is a construct I saw multiple times in the same file (_database.py)
of the notmuch2 python bindings. Not familiar with it, I guessed I
should do the same also.

I'm ok to remove it, but for consistency a more general approach should
be taken for the rest of the ~6 times it occurs in this file. I'd be
happy to have more insights on why you (as you seem to be the author of
the new bindings, and of these constructs) had to put them here the
first time ?

>> +        directory_pp = capi.ffi.new('notmuch_directory_t **')
>> +        ret = capi.lib.notmuch_database_get_directory(
>> +          self._db_p, os.fsencode(path), directory_pp)
>> +
>> +        if ret != capi.lib.NOTMUCH_STATUS_SUCCESS:
>> +            raise errors.NotmuchError(ret)
>> +
>> +        directory_p = directory_pp[0]
>> +        if directory_p == capi.ffi.NULL:
>> +            raise LookupError
>> +
>> +        if os.path.isabs(path):
>> +            # we got an absolute path
>> +            abs_dirpath = path
>> +        else:
>> +            #we got a relative path, make it absolute
>> +            abs_dirpath = os.path.abspath(os.path.join(self.get_path(), path))
>> +
>> +        ret_dir = directory.Directory(abs_dirpath, directory_p, self)
> I think the code has been trying to use pathlib for path manipulations,
> so for consistency it'd be good to keep doing that  This if condition
> becomes much simpler:
>
> path = pathlib.Path(path)  # this is a no-op if it already is an instance
> ret_dir = directory.Directory(path.absolute(), directory_p, self)

I'll try this, thanks.


>> +class PurePathIter(FilenamesIter):
>> +    """Iterator for pathlib.PurePath objects."""
>> +
>> +    def __next__(self):
>> +        fname = super().__next__()
>> +        return pathlib.PurePath(os.fsdecode(fname))
> I'm surprised you explicitly need a PurePath here?

Not needed. My reasoning was : a Directory is returning real-paths
(directories, files) on the filesystem, but the API is only directed
towards manipulating the notmuch database, not the underlying filesystem
(otherwise we would have to re-run notmuch to synch the differences). So
by returning a PurePath (as opposed to a concrete Path) it could be a
signal to the user not to mess with the undeerlying filesystem.


>> +    def __init__(self, path, dir_p, parent):
>> +        """
>> +        :param path:   The absolute path of the directory object.
> For consistency I'd require this to be `pathlib.Path`.

Ok, the idea here was: this is an object returned by the API, and not
expected to be instantiated by the end-user.

I'll do that nevertheless.

>> +        :param dir_p:  The pointer to an internal notmuch_directory_t object.
>> +        :param parent: The object this Directory is derived from
>> +                       (usually a :class:`Database`). We do not directly use
>> +                       this, but store a reference to it as long as
>> +                       this Directory object lives. This keeps the
>> +                       parent object alive.
>> +        """
> Just add this docstring the the class docstring, that's the usual way to
> document __init__.
Why not - here also it's a copy/paste.
>> +
>> +        :param mtime: A (time_t) timestamp
> This description works for C-developers, but searching docs.python.org
> for that will get people very confused.  I'd suggest:
>
> :param mtime: A POSIX timestamp.
> :type mtime: int or float
>
> I'm suggesting to accept float here because `time.time()` returns a
> float.  `os.mtime(...).st_mtime` is int though, so you could argue to
> restrict it to int.
Let me check that.
>
>> +        :raises: :exc:`XapianError` a Xapian exception occurred, mtime
>> +                 not stored
>> +        :raises: :exc:`ReadOnlyDatabaseError` the database was opened
>> +                 in read-only mode so directory mtime cannot be modified
>> +        :raises: :exc:`NotInitializedError` the directory object has not
>> +                 been initialized
>> +        """
>> +        ret = capi.lib.notmuch_directory_set_mtime(self._dir_p, mtime)
> int(mtime) if you follow my suggestion above
Ok
>
>> +
>> +        if ret != capi.lib.NOTMUCH_STATUS_SUCCESS:
>> +            raise errors.NotmuchError(ret)
>> +
>> +    def get_mtime(self):
>> +        """Gets the mtime value of this directory in the database
>> +
>> +        Retrieves a previously stored mtime for this directory.
>> +
>> +        :param mtime: A (time_t) timestamp
> I don't think there is a param ;)
You're right. Copy/paste is such a bad thing.
>> +        :raises: :exc:`NotmuchError`:
>> +
>> +                        :attr:`STATUS`.NOT_INITIALIZED
>> +                          The directory has not been initialized
> An old comment?

I'll have a look at it.


>> +        """
>> +        return capi.lib.notmuch_directory_get_mtime(self._dir_p)
>> +
>> +    # Make mtime attribute a property of Directory()
>> +    mtime = property(
>> +        get_mtime,
>> +        set_mtime,
>> +        doc="""Property that allows getting
>> +                     and setting of the Directory *mtime* (read-write)
>> +
>> +                     See :meth:`get_mtime` and :meth:`set_mtime` for usage and
>> +                     possible exceptions.""",
>> +    )
> Hmm, either we should have a get/set API or a property, not both in my
> opinion.  I'm a bit conflicted on which is better, the code only has one
> setter property currently, Database.decrypt_policy and that feels
> somewhat similar so it could be good for consistency.  But maybe that
> choice was a mistake.  In general assigning to a property does not
> have the expectation of taking an action I think, and here you are
> directly updating the database.
>
> Anyway, I'm curious what other people think of this as well.

Copy/paste (!!) of the previous bindings. As I don't know how people are
using them, I was expecting to make as little changes as necessary, so
that code change would be kept to a minimal when porting from old
bindings to new bindings.

Waiting for other feedback also.


>> +    def get_child_files(self):
> Style-wise I'd name this `def files(self):`, I think it follows the
> style of the API more.

Same reasoning as before - not wanting to make too much changes from old
to new bindings.

But if the consensus is on having a new style, we will of course adapt.


>> +        """Gets a Filenames iterator listing all the filenames of
>> +        messages in the database within the given directory.
>> +
>> +        The returned filenames will be the basename-entries only (not
>> +        complete paths.
>> +        """
>> +        fnames_p = capi.lib.notmuch_directory_get_child_files(self._dir_p)
>> +        return PurePathIter(self, fnames_p)
> Hmm, seeing it's use I think this should probably be an iterator
> returning `pathlib.Path`, only giving a PurePath seems almost mean.  I'd
> make sure they are already usable, instead of still having to get the
> path property to make them useful, so pass in the path of the directory
> to the iterator so it can return absolute paths.

I'm mean :-)

I explained previously my approach, but don't mind giving a "workable"
path. Will do.


>
>> +
>> +    def get_child_directories(self):
> likewise `directories()` probably

Same idea as get_child_files().  I'm curious what other people think of
this as well.


>> +        """Gets a :class:`Filenames` iterator listing all the filenames of
>> +        sub-directories in the database within the given directory
>> +
>> +        The returned filenames will be the basename-entries only (not
>> +        complete paths.
>> +        """
>> +        fnames_p = capi.lib.notmuch_directory_get_child_directories(self._dir_p)
>> +        return PurePathIter(self, fnames_p)
> same as for the files i guess

OK

>> +    @property
>> +    def path(self):
>> +        """Returns the absolute path of this Directory (read-only)"""
> Some nitpicking, I'd document the return type (which should be changed
> to pathlib.Path), also the `read-only` part is already implicit in the
> signature, and it's a property so doesn't "return" things:
>
> """The path of this Directory as a pathlib.Path instance."""
Ok
>> +        return self._path
>> +
>> +    def __repr__(self):
>> +        """Object representation"""
>> +        try:
>> +            self._dir_p
>> +        except errors.ObjectDestroyedError:
>> +            return '<Directory(path={self.path}) (exhausted)>'.format(self=self)
> The `(exhaused)` part seems odd, I'm even surprised you need this except
> clause for this object.

Inspiration came from TagsIter / NotmuchIter - but I didn't try to
reproduced it on this object. I don't know which moves could make you
have dangling references or destroyed objects here.

If you prefer I'll simplify this repr.


> Hope the many comments are acceptable, once again thanks for adding this
> and feel free to disagree with some suggestions :)

Thanks again !

I'll submit a new patch soon.

Regards,

Ludovic
\r

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] python/notmuch2: provide binding for database_get_directory()
  2021-07-25 20:46   ` Ludovic LANGE
@ 2021-07-26  7:55     ` Tomi Ollila
       [not found]     ` <87im0v31z8.fsf@powell.devork.be>
  1 sibling, 0 replies; 5+ messages in thread
From: Tomi Ollila @ 2021-07-26  7:55 UTC (permalink / raw)
  To: Ludovic LANGE, notmuch

On Sun, Jul 25 2021, Ludovic LANGE wrote:

// stuff deleted //

>  Le 25/07/2021 à 22:18, Floris Bruynooghe a écrit :

// stuff deleted ... //

>>
>> Anyway, I'm curious what other people think of this as well.
>
> Copy/paste (!!) of the previous bindings. As I don't know how people are
> using them, I was expecting to make as little changes as necessary, so
> that code change would be kept to a minimal when porting from old
> bindings to new bindings.

If such an implementation is more inconsistent w/ the new bindings, then
-- for sure -- have implementation that differs more from old bindings.
(and from copy-paste point of view, mistakes lurk easily if changes are
applied w/o doing even more careful post-inspection before committing...) 
>
> Waiting for other feedback also.
>
>
>>> +    def get_child_files(self):
>> Style-wise I'd name this `def files(self):`, I think it follows the
>> style of the API more.
>
> Same reasoning as before - not wanting to make too much changes from old
> to new bindings.
>
> But if the consensus is on having a new style, we will of course adapt.

IMO the def files(self): (and def directories(self):) are the way to go !

Tomi

// stuff deleted //

>
> Regards,
>
> Ludovic
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] python/notmuch2: provide binding for database_get_directory()
       [not found]       ` <df94c8da-eba7-f420-8b9a-80b0f6b8a419@lange.nom.fr>
@ 2021-07-31  9:20         ` Floris Bruynooghe
  0 siblings, 0 replies; 5+ messages in thread
From: Floris Bruynooghe @ 2021-07-31  9:20 UTC (permalink / raw)
  To: Ludovic LANGE, notmuch

On Sat 31 Jul 2021 at 00:33 +0200, Ludovic LANGE wrote:
> (Btw, it seems your mail was sent directly to me, not the M/L, was it
> intentional or an accident ? Both are OK for me.)

No, that was accidental.  Apologies!

> While you review the new version of the patch, a few thoughts on your
> answers below:
>
> Le 27/07/2021 à 23:01, Floris Bruynooghe a écrit :
>>>>> +        :raises UpgradeRequiredError: The database must be upgraded
>>>>> +           first.
>>>>> +        """
>>>>> +        if not hasattr(os, 'PathLike') and isinstance(path, pathlib.Path):
>>>>> +            path = bytes(path)
>>>> I think `path = os.fspath(path)` should handle this fine.  But I don't
>>>> think you even need to do that as `os.fsencode()` will just do the right
>>>> thing already.
>>> This is a construct I saw multiple times in the same file (_database.py)
>>> of the notmuch2 python bindings. Not familiar with it, I guessed I
>>> should do the same also.
>>>
>>> I'm ok to remove it, but for consistency a more general approach should
>>> be taken for the rest of the ~6 times it occurs in this file. I'd be
>>> happy to have more insights on why you (as you seem to be the author of
>>> the new bindings, and of these constructs) had to put them here the
>>> first time ?
>> Heh, I guess I just didn't write that good code and read the docs better
>> this time round?  Pathlib was fairly new back then.  I think the other 6
>> times should probably change because it seems like these are the python
>> APIs to do this with and are less error prone.
>
> OK, that's fine - it's just that I don't have much experience with this
> PathLib API.
>
> I'm OK to remove this construct, no issue, will do it. I'll let you do
> the others or I can do it, as you wish.
>
>
>>>>> +class PurePathIter(FilenamesIter):
>>>>> +    """Iterator for pathlib.PurePath objects."""
>>>>> +
>>>>> +    def __next__(self):
>>>>> +        fname = super().__next__()
>>>>> +        return pathlib.PurePath(os.fsdecode(fname))
>>>> I'm surprised you explicitly need a PurePath here?
>>> Not needed. My reasoning was : a Directory is returning real-paths
>>> (directories, files) on the filesystem, but the API is only directed
>>> towards manipulating the notmuch database, not the underlying filesystem
>>> (otherwise we would have to re-run notmuch to synch the differences). So
>>> by returning a PurePath (as opposed to a concrete Path) it could be a
>>> signal to the user not to mess with the undeerlying filesystem.
>> I like this reasoning :)
>>
>> I wonder if we can go even further and not supply the `.files()` and
>> `.directories()` functions at all.  After all there is `Directory.path`
>> and in python you can just do `[p for p in Direcotry.path.iterdir() if
>> p.is_file()]` o get the equivalent.  OTOH supplying these allows you to
>> verify that notmuch sees the same things.
>>
>> So maybe it makes sense to keep this as PurePath and keep .files() and
>> .directories() but explain in their docstring they are more advanced and
>> point users to the list comprehension instead for most uses.
>>
>> Maybe someone who knows the C API better can comment on whether this
>> seems reasonable or whether I have this completely wrong.
>
> I'm a little embarrassed removing those functions (`.files()` and
> `.directories()`):
>
> * They're part of the C API, and in my vision the python API should be a
> (maybe agnostic) wrapper around the C API, not removing functions. I'm
> OK to be a little more "Pythonic", but I'd prefer not to diverge from
> the C roots.

This I'm not too swayed by, if you want a faithful API then you should
use the notmuch._capi module directly but you have to deal with all the
hurdles.  So I think it's reasonable to interpret things in a suitable
way.

> * In this specific use-case, asking the Database its vision of the
> filesystem is what I want - because I'm dealing with mails (that are in
> some folders) indexed by the database. If there is an inconsistency
> (folders not indexed for whatever reason) between the filesystem and
> notmuch's view ; I prefer to have notmuch's view because after I'll
> query notmuch based on those folders - they're better exist in the database.

Great, I missed this.  I checked the implementation in C and indeed it
queries the database and does not do any filesystem operations.  So yes,
it does make sense to include these.

Slightly off-topic: this is why I was reluctant to work on this API
myself, I haven't used it and don't fully understand it's uses.  So I
appreciate this discussion and you taking the time to explain me this
kind of thing.

> * Also, regarding PurePath, you're initial reaction was sound, and I'm
> OK to remove PurePath and have a (valid, functional) Path instead.
> That's what I put in the second version of the patch.


Cheers,
Floris

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-07-31  9:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-25  8:16 [PATCH 1/1] python/notmuch2: provide binding for database_get_directory() Ludovic LANGE
2021-07-25 20:18 ` Floris Bruynooghe
2021-07-25 20:46   ` Ludovic LANGE
2021-07-26  7:55     ` Tomi Ollila
     [not found]     ` <87im0v31z8.fsf@powell.devork.be>
     [not found]       ` <df94c8da-eba7-f420-8b9a-80b0f6b8a419@lange.nom.fr>
2021-07-31  9:20         ` Floris Bruynooghe

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).