unofficial mirror of emacs-devel@gnu.org 
* [PATCH 2/2] nnmaildir: Use a 'num' file, instead of a directory
@ 2010-05-27 21:09 John Wiegley
  2010-06-24 17:52 ` Ted Zlatanov
  2010-06-26 17:33 ` Paul Jarc
  0 siblings, 2 replies; 4+ messages in thread
From: John Wiegley @ 2010-05-27 21:09 UTC
  To: emacs-devel@gnu.org devel

[-- Attachment #1: Type: text/plain, Size: 411 bytes --]

This patch causes nnmaildir to use a "num" file to record the most recently allocated article number, rather than creating N empty files.  However, this applies only to new groups; existing groups, which use the num/ directory, will continue to do so.  If you want to switch an existing group to file-based numbering, you must convert your Maildir using the attached script (it's also copied to the commit description in the patch).
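
For readers who want the idea without the Lisp, the counter-file scheme can be sketched in shell roughly as follows.  This is only an illustration: `alloc_number` is a hypothetical helper, numbering is assumed to start at 1, and refusing to touch an old-style num/ directory is a simplification of the fallback in the patch.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: allocate the next article
# number from a plain "num" file inside the given .nnmaildir directory.
alloc_number() {
    numfile="$1/num"
    if [ -d "$numfile" ]; then
        # Old layout: num is a directory of empty files, leave it alone.
        echo "alloc_number: $numfile is a directory (old layout)" >&2
        return 1
    fi
    last=0
    # An absent or empty file means nothing has been allocated yet.
    if [ -s "$numfile" ]; then
        read -r last < "$numfile" || true
    fi
    next=$((last + 1))
    # Write a temporary file and rename it, so a reader never sees a
    # half-written counter.  This does NOT protect the read-increment-write
    # sequence against a second writer running at the same time.
    printf '%s\n' "$next" > "$numfile.tmp" && mv "$numfile.tmp" "$numfile"
    echo "$next"
}
```

Calling `alloc_number /path/to/.nnmaildir` twice on a fresh directory prints 1 and then 2.  Two processes doing this at once could both read the same value, which is the trade-off against the directory scheme.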

John


[-- Attachment #2: 0002-nnmaildir-Use-a-num-file-instead-of-a-directory.patch --]
[-- Type: application/octet-stream, Size: 3449 bytes --]

From d80163c6686b7b8a9a24ff94d61baddbcdd0364c Mon Sep 17 00:00:00 2001
From: John Wiegley <johnw@newartisans.com>
Date: Thu, 27 May 2010 15:03:10 -0600
Subject: [PATCH 2/2] nnmaildir: Use a 'num' file, instead of a directory

---
 nnmaildir.el |   40 +++++++++++++++++++++++++++++++++++++---
 1 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/nnmaildir.el b/nnmaildir.el
index fde1fc8..1991321 100644
--- a/nnmaildir.el
+++ b/nnmaildir.el
@@ -236,6 +236,9 @@ by nnmaildir-request-article.")
 (defmacro nnmaildir--marks-dir (dir) `(nnmaildir--subdir ,dir "marks"))
 (defmacro nnmaildir--num-dir   (dir) `(nnmaildir--subdir ,dir "num"))
 
+(defun nnmaildir--num-file  (dir)
+  (expand-file-name "num" dir))
+
 (defmacro nnmaildir--unlink (file-arg)
   `(let ((file ,file-arg))
      (if (file-attributes file) (delete-file file))))
@@ -249,7 +252,7 @@ by nnmaildir-request-article.")
     (mapc 'delete-file (funcall ls dir 'full "\\`[^.]" 'nosort))
     (delete-directory dir)))
 
-(defun nnmaildir--group-maxnum (server group)
+(defun nnmaildir--group-maxnum-from-dir (server group)
   (catch 'return
     (if (zerop (nnmaildir--grp-count group)) (throw 'return 0))
     (let ((dir (nnmaildir--srvgrp-dir (nnmaildir--srv-dir server)
@@ -273,6 +276,21 @@ by nnmaildir-request-article.")
 	(unless (equal ino-opened (nth 10 attr))
 	  (setq number-opened number-linked))))))
 
+(defun nnmaildir--group-maxnum (server group)
+  (if (zerop (nnmaildir--grp-count group))
+      0
+    (let ((file (nnmaildir--num-file
+		 (nnmaildir--nndir
+		  (nnmaildir--srvgrp-dir (nnmaildir--srv-dir server)
+					 (nnmaildir--grp-name group))))))
+      (if (file-directory-p file)
+	  (nnmaildir--group-maxnum-from-dir server group)
+	(if (file-exists-p file)
+	    (with-temp-buffer
+	      (insert-file-contents file)
+	      (read (current-buffer)))
+	  0)))))
+
 ;; Make the given server, if non-nil, be the current server.  Then make the
 ;; given group, if non-nil, be the current group of the current server.  Then
 ;; return the group object for the current group.
@@ -324,8 +342,7 @@ by nnmaildir-request-article.")
 (defun nnmaildir--eexist-p (err)
   (eq (car err) 'file-already-exists))
 
-(defun nnmaildir--new-number (nndir)
-  "Allocate a new article number by atomically creating a file under NNDIR."
+(defun nnmaildir--new-number-from-dir (nndir)
   (let ((numdir (nnmaildir--num-dir nndir))
 	(make-new-file t)
 	(number-open 1)
@@ -366,6 +383,23 @@ by nnmaildir-request-article.")
 		      number-link 0))))
 	   (t (signal (car err) (cdr err)))))))))
 
+(defun nnmaildir--new-number (nndir)
+  "Allocate a new article number by incrementing the count in NNDIR's `num' file."
+  (let ((numfile (nnmaildir--num-file nndir))
+	(number 1))
+    (if (file-directory-p numfile)
+	(nnmaildir--new-number-from-dir nndir)
+      (with-current-buffer (find-file-noselect numfile t t)
+	(goto-char (point-min))
+	(unless (eobp)
+	  (setq number (1+ (read (current-buffer))))
+	  (delete-region (point-min) (point-max)))
+	(insert (number-to-string number))
+	(write-region (point-min) (point-max) numfile nil 'no-message)
+	(set-buffer-modified-p nil)
+	(kill-buffer (current-buffer)))
+      number)))
+
 (defun nnmaildir--update-nov (server group article)
   (let ((nnheader-file-coding-system 'binary)
 	(srv-dir (nnmaildir--srv-dir server))
-- 
1.7.1


[-- Attachment #3: Type: text/plain, Size: 1 bytes --]



[-- Attachment #4: gnus-renumber --]
[-- Type: application/octet-stream, Size: 422 bytes --]

#!/bin/sh

dryrun=false
if [ "$1" = "-n" ]; then
    dryrun=true
    shift 1
fi

find "${1:-.}" -name .nnmaildir -type d | \
while IFS= read -r nndir; do
    if [ -d "$nndir"/num ]; then
	last=$(find "$nndir"/num -type f | sed -e 's/.*\///' | sort -n | tail -1)
	if [ $dryrun = true ]; then
	    echo rm -fr "$nndir"/num
	    echo echo $last '>' "$nndir"/num
	else
	    rm -fr "$nndir"/num
	    echo $last > "$nndir"/num
	fi
    fi
done
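
One edge case in the script above: if a num/ directory contains no files, `$last` is empty and a blank line is written to the new num file, which the Lisp reader in the patched code would probably not accept.  A hedged sketch of a helper that falls back to 0 instead (`highest_number` is a hypothetical name, and 0 is assumed to mean "nothing allocated yet", matching what nnmaildir--group-maxnum returns when the file is missing):

```shell
#!/bin/sh
# Hypothetical helper: print the highest purely numeric file name found
# directly inside the directory given as $1, or 0 when there is none.
highest_number() {
    max=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        case $name in
            ''|*[!0-9]*) continue ;;   # skip names that are not all digits
        esac
        if [ "$name" -gt "$max" ]; then
            max=$name
        fi
    done
    echo "$max"
}
```

In the conversion loop this could stand in for the find/sed/sort pipeline, for example `last=$(highest_number "$nndir"/num)`.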


* Re: [PATCH 2/2] nnmaildir: Use a 'num' file, instead of a directory
  2010-05-27 21:09 [PATCH 2/2] nnmaildir: Use a 'num' file, instead of a directory John Wiegley
@ 2010-06-24 17:52 ` Ted Zlatanov
  2010-06-26 17:33 ` Paul Jarc
  1 sibling, 0 replies; 4+ messages in thread
From: Ted Zlatanov @ 2010-06-24 17:52 UTC
  To: ding; +Cc: emacs-devel

On Thu, 27 May 2010 15:09:08 -0600 John Wiegley <jwiegley@gmail.com> wrote: 

JW> This patch causes nnmaildir to use a "num" file to track the next
JW> available article number, rather than creating N empty files.
JW> However, this only works for new groups.  Existing groups, which use
JW> the directory, will continue to.  If you want to switch to using
JW> file-based numbering, you must convert your Maildir using the
JW> attached script (it's also copied to the commit description in the
JW> patch).

IIRC (and Paul Jarc may want to comment) something similar was discussed
in http://thread.gmane.org/gmane.emacs.gnus.general/66245/focus=66247

I'm not sure I want to give up concurrent spool access in nnmaildir so I
left things alone back then.  I have recently experienced slow nnmaildir
performance too (see
http://article.gmane.org/gmane.emacs.gnus.general/69723).  Maybe we can
make this a backend parameter, off by default, and ask whenever someone
creates a new nnmaildir backend in Gnus?  I don't think we need to
support the conversion you described; we should only support the num
file for new nnmaildirs.  What do you think?

Ted





* Re: [PATCH 2/2] nnmaildir: Use a 'num' file, instead of a directory
  2010-05-27 21:09 [PATCH 2/2] nnmaildir: Use a 'num' file, instead of a directory John Wiegley
  2010-06-24 17:52 ` Ted Zlatanov
@ 2010-06-26 17:33 ` Paul Jarc
  2010-08-13  0:15   ` Ted Zlatanov
  1 sibling, 1 reply; 4+ messages in thread
From: Paul Jarc @ 2010-06-26 17:33 UTC
  To: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> wrote:
> On Thu, 27 May 2010 15:09:08 -0600 John Wiegley <jwiegley@gmail.com> wrote: 
>
> JW> This patch causes nnmaildir to use a "num" file to track the next
> JW> available article number, rather than creating N empty files.
> JW> However, this only works for new groups.  Existing groups, which use
> JW> the directory, will continue to.  If you want to switch to using
> JW> file-based numbering, you must convert your Maildir using the
> JW> attached script (it's also copied to the commit description in the
> JW> patch).
>
> IIRC (and Paul Jarc may want to comment) something similar was discussed
> in http://thread.gmane.org/gmane.emacs.gnus.general/66245/focus=66247

Yes, although John hasn't said whether performance is his motivation.

> I have recently experienced slow nnmaildir performance

I suspect the performance problem is with the nov/ directory, rather
than num/.  With the current nov/ structure, it's quick and easy to
map from filenames to article numbers, but I think we usually need the
inverse operation.  As it is, to map from an article number to a
filename, we need to read the contents of all the nov/ files.  That's
done just once and the results are cached, so it's not too horrific,
but we still need to check timestamps to see if the files have
changed, and it takes a lot of memory for large groups.
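
To make the asymmetry concrete, here is a rough shell sketch.  It assumes a simplified layout in which each file under nov/ is named after its message file and holds only the article number on its first line; the real nov/ file contents are richer than that, and both function names are hypothetical.

```shell
#!/bin/sh
# Assumed, simplified layout: NOVDIR/NAME contains the article number
# on its first line.  This is not the actual nnmaildir file format.

# Name -> number: open exactly one file.
number_for_name() {
    [ -f "$1/$2" ] || return 1
    num=
    read -r num < "$1/$2" || true
    echo "$num"
}

# Number -> name: no index exists, so every file has to be read
# until one with a matching number turns up.
name_for_number() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        num=
        read -r num < "$f" || true
        if [ "$num" = "$2" ]; then
            echo "${f##*/}"
            return 0
        fi
    done
    return 1
}
```

The second function costs one file read per article in the group, which is why the result is worth caching and why the cache grows with the group.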

It's been a while since I looked at it, though--there may be some
operations where we do need to go from the filename to the number.  So
then it might be useful to add hard links so each nov/ file could be
accessed by either its article number or filename.  The filename would
also have to be added to the contents of those files somehow.
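
A minimal sketch of the hard-link idea, under assumptions that are mine rather than anything nnmaildir does today: links keyed by number live in a hypothetical by-num/ subdirectory, and the file stores the number on line 1 and the name on line 2.

```shell
#!/bin/sh
# Hypothetical layout (not nnmaildir's current one):
#   NOVDIR/NAME        NOV data, keyed by message file name
#   NOVDIR/by-num/N    hard link to the same file, keyed by article number

store_nov() {
    novdir=$1; name=$2; num=$3
    mkdir -p "$novdir/by-num" || return 1
    # Keep both keys inside the file, so each can be recovered from the other.
    printf '%s\n%s\n' "$num" "$name" > "$novdir/$name" || return 1
    # ln refuses to replace an existing link, so a number cannot be reused.
    ln "$novdir/$name" "$novdir/by-num/$num" || {
        rm -f "$novdir/$name"
        return 1
    }
}

name_by_number() {
    # Line 2 of the linked file holds the name.
    sed -n 2p "$1/by-num/$2"
}
```

With this layout, looking up either key opens a single file, at the price of one extra directory entry per article.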


paul




* Re: [PATCH 2/2] nnmaildir: Use a 'num' file, instead of a directory
  2010-06-26 17:33 ` Paul Jarc
@ 2010-08-13  0:15   ` Ted Zlatanov
  0 siblings, 0 replies; 4+ messages in thread
From: Ted Zlatanov @ 2010-08-13  0:15 UTC
  To: emacs-devel; +Cc: ding

On Sat, 26 Jun 2010 13:33:42 -0400 prj@po.cwru.edu (Paul Jarc) wrote: 

>> I have recently experienced slow nnmaildir performance

PJ> I suspect the performance problem is with the nov/ directory, rather
PJ> than num/.  With the current nov/ structure, it's quick and easy to
PJ> map from filenames to article numbers, but I think we usually need the
PJ> inverse operation.  As it is, to map from an article number to a
PJ> filename, we need to read the contents of all the nov/ files.  That's
PJ> done just once and the results are cached, so it's not too horrific,
PJ> but we still need to check timestamps to see if the files have
PJ> changed, and it takes a lot of memory for large groups.

PJ> It's been a while since I looked at it, though--there may be some
PJ> operations where we do need to go from the filename to the number.  So
PJ> then it might be useful to add hard links so each nov/ file could be
PJ> accessed by either its article number or filename.  The filename would
PJ> also have to be added to the contents of those files somehow.

Thanks for explaining, Paul.  I wanted to respond to you and John
carefully so it took me a while.  Sorry about that.

I looked at the nnmaildir code.  Keeping in mind the majority of Gnus
users don't need concurrent access to their Maildirs, I have a proposal.

Regarding John's patch, I think it's good to avoid creating many extra
files.  Inodes can be expensive and many filesystems handle directories
with many files poorly.  However, it should be a user option on the
nnmaildir backend (called `make-concurrent', for instance), not a
complete switchover as it is now.  Even then, a `num' file seems
superfluous: if we know concurrent access is not an issue, we can keep
a single database.  We also don't have to worry about users going back
and forth between concurrent and non-concurrent access.  If they do, we
can complain loudly and maybe provide a slow bidirectional switchover
function.

Regarding the NOV database in .nnmaildir/nov/MESSAGE-ID, the goal is to
map it to the number N that's currently inside that file.  Links would
also burden the filesystem and are IMO not a good improvement since
scanning the directory repeatedly is expensive.  I think the current
strategy should be kept as is and turned on only if the user asks for
concurrency (as above).  

The non-concurrent alternative should be to keep a single NOV and num
database in memory for the active group and flush it to disk as needed.
The database can be as simple as one line at the beginning for the
version and then just the NOV vectors in order, one per line.  Appending
is trivial (read last line to get max number, append line) and rewriting
the NOV is only necessary when deleting an article.  I think this would
speed up nnmaildir operations significantly.
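
To illustrate the file format only (the in-memory part is left out), here is a rough shell sketch.  The details are assumptions: the version line reads "version 1", each later line is the article number, a tab, and then the NOV data, and the names `nov_append` and `nov_delete` are hypothetical.

```shell
#!/bin/sh
# Assumed format: line 1 is "version 1"; every later line is
# NUMBER<TAB>NOV-DATA, kept in increasing NUMBER order.

nov_append() {
    db=$1; data=$2
    [ -s "$db" ] || printf 'version 1\n' > "$db"
    # Lines are in order, so the last one carries the highest number.
    last=$(tail -n 1 "$db" | cut -f 1)
    case $last in
        ''|*[!0-9]*) last=0 ;;   # only the version line exists so far
    esac
    next=$((last + 1))
    printf '%s\t%s\n' "$next" "$data" >> "$db"
    echo "$next"
}

nov_delete() {
    db=$1; num=$2
    # Deleting is the one operation that rewrites the whole file.
    awk -F '\t' -v n="$num" 'NR == 1 || $1 != n' "$db" > "$db.tmp" &&
        mv "$db.tmp" "$db"
}
```

One thing the sketch makes visible: if the article on the last line is deleted, the next append hands out its number again, so a real implementation might want to store the high-water mark separately.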

I'd like to know your opinion since you wrote so much of nnmaildir.el
and have experience supporting it.  I am certain that for the majority
of Gnus users today concurrent access is not an issue based on what I've
heard in the Gnus mailing lists over the last 8 years.  But do you think
the current concurrent system can be improved significantly rather than
doing what I propose?  Should we look at different storage, maybe SQLite
or Berkeley DB or sparse files, for those databases?  Is there anything
in the Emacs core that can help us (thus the CC to emacs-devel) or can
anything be added to Emacs to that end?

Thanks for your help
Ted



