unofficial mirror of notmuch@notmuchmail.org
* question about deletion and counts
From: Jeff Templon @ 2018-11-03 11:39 UTC
  To: notmuch


simeto:~> notmuch search --output=files tag:deleted | wc -l
     666
simeto:~> notmuch search --format=text0 --output=files tag:deleted | xargs -0 rm

afterwards, from notmuch new:

No new mail. Removed 577 messages. Detected 89 file renames.

577 + 89 = 666 ... my guess is that there were 577 messages and 89 files
that represented duplicates of messages.  But I didn't rename the files,
I deleted them.  Should I worry?  Why is the message inaccurate?

JT


* Re: question about deletion and counts
From: Carl Worth @ 2018-11-07 22:05 UTC
  To: Jeff Templon, notmuch


On Sat, Nov 03 2018, Jeff Templon wrote:
> No new mail. Removed 577 messages. Detected 89 file renames.
>
> 577 + 89 = 666 ... my guess is that there were 577 messages and 89 files
> that represented duplicates of messages.

Yes. Among the messages you had tagged as deleted you had 577 unique
message IDs. In addition, you had another 89 files with message IDs that
were duplicates of one of the 577.

> But I didn't rename the files, I deleted them.  Should I worry?

Nope. Nothing to worry about here.

> Why is the message inaccurate?

Because notmuch has a more narrow view of what a "rename" is than you
do.

A file rename is a high-level operation that notmuch sees as two
separate operations over the course of a single run of notmuch new:

  1. A new file is added with a message ID that already exists in the
     database

  2. A file is removed with a message ID for which there are multiple
     files in the database

But notmuch doesn't check whether both of these operations occur in a
single pass in order to detect a rename. Instead, it counts every
occurrence of (2) above as a rename. Here's what the code looks like
(notmuch-new.c:remove_filename):

    status = notmuch_database_remove_message (notmuch, path);
    if (status == NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID) {
        add_files_state->renamed_messages++;
        if (add_files_state->synchronize_flags == true)
            notmuch_message_maildir_flags_to_tags (message);
        status = NOTMUCH_STATUS_SUCCESS;
    } else if (status == NOTMUCH_STATUS_SUCCESS) {
        add_files_state->removed_messages++;
    }

So, whenever a filename is removed, it will either get counted as a
rename (if there is still at least one other filename in the database
with the same message ID), or it will get counted as a removal (if this
was the last filename for that message ID).
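
As a rough illustration of that rule, here is a toy model. It is not
notmuch's actual code: the names toy_remove_file and toy_remove_all, the
toy_status values, and the files_left array are all invented for this
sketch. It deletes every file of every message ID and tallies the result
the way the summary line does:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the two status codes that matter here (hypothetical). */
enum toy_status { TOY_SUCCESS, TOY_DUPLICATE_MESSAGE_ID };

struct toy_counts {
    int removed_messages;
    int renamed_messages;
};

/* files_left[id] holds how many files still carry that message ID.
 * Removing one file reports "duplicate" if other files remain. */
enum toy_status
toy_remove_file (int *files_left, size_t id)
{
    assert (files_left[id] > 0);
    files_left[id]--;
    return files_left[id] > 0 ? TOY_DUPLICATE_MESSAGE_ID : TOY_SUCCESS;
}

/* Delete all files for all message IDs and count each outcome. */
struct toy_counts
toy_remove_all (int *files_left, size_t n_ids)
{
    struct toy_counts counts = { 0, 0 };

    for (size_t id = 0; id < n_ids; id++) {
        while (files_left[id] > 0) {
            if (toy_remove_file (files_left, id) == TOY_DUPLICATE_MESSAGE_ID)
                counts.renamed_messages++;   /* other copies still exist */
            else
                counts.removed_messages++;   /* that was the last copy */
        }
    }
    return counts;
}
```

Feeding this 577 message IDs, 89 of which have two files each (an
assumed distribution; only the totals are known), gives 577 removals and
89 "renames". However the duplicates are spread, the "rename" count
comes out as the number of files minus the number of unique message IDs.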

I suppose you could come up with some other name for what it is
counting, such as "removals of duplicate messages" instead of "rename",
but that's what's happening.
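
For contrast, a stricter notion of "rename" would pair steps 1 and 2
per message ID within one pass. The sketch below is purely hypothetical:
notmuch does not do this, and toy_pass, toy_paired_renames, and
toy_unpaired_removals are made-up names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Per-message-ID tallies gathered during one hypothetical pass. */
struct toy_pass {
    int added_with_known_id;   /* step 1: new file, ID already in database */
    int removed_with_others;   /* step 2: file gone, other files remain */
};

static int
toy_min (int a, int b)
{
    return a < b ? a : b;
}

/* Count a rename only where a step-1 add is matched by a step-2 removal. */
int
toy_paired_renames (const struct toy_pass *ids, size_t n_ids)
{
    int renames = 0;

    for (size_t i = 0; i < n_ids; i++) {
        assert (ids[i].added_with_known_id >= 0);
        assert (ids[i].removed_with_others >= 0);
        renames += toy_min (ids[i].added_with_known_id,
                            ids[i].removed_with_others);
    }
    return renames;
}

/* Step-2 removals with no matching add: plain removals of duplicates. */
int
toy_unpaired_removals (const struct toy_pass *ids, size_t n_ids)
{
    int total = 0;

    for (size_t i = 0; i < n_ids; i++)
        total += ids[i].removed_with_others
                 - toy_min (ids[i].added_with_known_id,
                            ids[i].removed_with_others);
    return total;
}
```

Under this scheme, deleting duplicate files without adding anything
would report zero renames, while moving a file (one add and one removal
for the same message ID) would report one.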

I hope that helps.

-Carl



* Re: question about deletion and counts
From: Jeff Templon @ 2018-11-08 10:01 UTC
  To: Carl Worth, notmuch

Hi Carl,

Thanks for your answer.

Carl Worth <cworth@cworth.org> writes:

> A file rename is a high-level operation that notmuch sees as two
> separate operations over the course of a single run of notmuch new:
>
>   1. A new file is added with a message ID that already exists in the
>      database
>
>   2. A file is removed with a message ID for which there are multiple
>      files in the database
>
> But notmuch doesn't check whether both of these operations occur in a
> single pass in order to detect a rename. Instead, it counts every
> occurrence of (2) above as a rename. Here's what the code looks like
> (notmuch-new.c:remove_filename):
>
>     status = notmuch_database_remove_message (notmuch, path);
>     if (status == NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID) {
>         add_files_state->renamed_messages++;
>         if (add_files_state->synchronize_flags == true)
>             notmuch_message_maildir_flags_to_tags (message);
>         status = NOTMUCH_STATUS_SUCCESS;
>     } else if (status == NOTMUCH_STATUS_SUCCESS) {
>         add_files_state->removed_messages++;
>     }


Perfect explanation, thanks.

> I suppose you could come up with some other name for what it is
> counting, such as "removals of duplicate messages" instead of "rename",
> but that's what's happening.

Yes, that'd be my suggestion :-) A misleading name is one of my personal
buttons that sometimes gets pushed.  If you seriously consider it, I'd
suggest "file reassignments" instead of "file renames".  A file rename
to me is

       mv jeff.txt carl.txt

that is, the file was named jeff.txt and was renamed to carl.txt.  In
the case you describe, a file with a certain name is either assigned to
a message ID or de-assigned from that message ID; the actual file name
is not changed, as I understand it.

Anyway thanks for the explanation!  Good that I don't need to worry.

BTW I've got integration between org and notmuch up and running now, and
I'm really liking this capability.

JT

