From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Monnier
Newsgroups: gmane.emacs.help
Subject: Re: A couple of lisp questions
Date: Wed, 12 Nov 2003 18:28:27 GMT
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50
Original-To: help-gnu-emacs@gnu.org
Stefan> Take a look at how flyspell does it.  Or maybe auto-fill.

> I will.  I think auto-fill cheats though, as it's tied directly into
> the command loop.  I seem to remember reading that somewhere.

Not the command loop, just the `self-insert-command' command (which is
implemented in C).  You can hijack the auto-fill-function for your own
non-auto-fill use.

> usage-hash:  "the" --> ("the" . 4)
>              "and" --> ("and" . 6)

Why not just

    "the" --> 4
    "and" --> 6

> Then a suffix hash
> suffix-hash: "t"   --> (("the" . 4) ("then" . 3) ("talk" . 2) etc)
>              "th"  --> (("the" . 4) etc )
>              "the" --> (("the" . 4) etc )

Is `try-completion' too slow (because the usage-hash is too large?) to
build the suffixes on the fly?

> In this case the cons cells for each word are shared between the
> hashes, so this is not a massive memory waste as the written version
> appears.

Each word of N letters has:
- one string (i.e. N + 16 bytes)
- one cons-cell (8 bytes)
- one hash-table entry (16 bytes)
in usage-hash, plus:
- N cons-cells (N*8 bytes)
- N hash entries shared with other words (at least 16 bytes).
For a total of 9*N + 56 bytes per word.  Probably not a big deal.

> Ideally I would want to build up these word usage statistics as they
> are typed, but as you say it's hard to do this.  I think a
> flyspell-like approach combined with text properties should work okay.

How do you avoid counting the same instance of a word several times?
Oh, you mark them with a text property, I see.  More like font-lock
than flyspell.

> Anyway the idea with the weakness is that I want to garbage collect
> the dictionary periodically, throwing away old, or rarely used words.

I don't think weakness gives you that.  It seems difficult to use
weakness here to get even a vague approximation of what you want.
You can use a gc-hook to flush stuff every once in a while, but you
could just as well use an idle-timer for that.

> The serialization would be to enable saving across sessions.
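Returning to the garbage-collection point above, the idle-timer
alternative can be sketched roughly as follows (a minimal sketch; the
variable `usage-hash', the function names, and the thresholds are all
hypothetical, not part of the original discussion):

```elisp
(defvar usage-hash (make-hash-table :test 'equal)
  "Hypothetical word-usage table mapping a word string to its count.")

(defun usage-flush-rare-words (threshold)
  "Drop entries of `usage-hash' whose count is below THRESHOLD.
Keys are collected first, then removed, to avoid modifying the
table while `maphash' is traversing it."
  (let (victims)
    (maphash (lambda (word count)
               (when (< count threshold)
                 (push word victims)))
             usage-hash)
    (dolist (word victims)
      (remhash word usage-hash))))

;; After 60 seconds of idle time, repeatedly flush words seen
;; fewer than 2 times.
(run-with-idle-timer 60 t #'usage-flush-rare-words 2)
```

Unlike hash-table weakness, which only discards entries that have
become unreachable, this lets you pick the eviction policy explicitly.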
> Most of the packages I know that do this depend on their objects
> having a read syntax, which doesn't work with hashes.  I think the
> solution here is to convert the thing into a big alist to save it,
> and then reconstruct the hashes on loading.

Why not reconstruct the suffix hash upon loading?  This way you have
no sharing to worry about and you can just dump the usage hash via
maphash & pp.

> Anyway the idea for all of this was to do a nifty version of
> abbreviation expansion, something like dabbrev-expand, but instead of
> searching local buffers, it would grab word stats as it's going, and
> use these to offer appropriate suggestions.  I was thinking of a user
> interface a little bit like the buffer/file switching of ido.el, of
> which I have become a committed user.

Sounds neat.

> By the way, building a decent UI around this will probably take 10
> times as much code!

And even more time,


        Stefan
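The maphash & pp save/load idea above might look something like this
(a minimal sketch; `usage-hash' and both function names are
hypothetical, and only the usage table is saved, with any derived
suffix table assumed to be rebuilt separately after loading):

```elisp
(defun usage-save (file)
  "Write the contents of `usage-hash' to FILE as a printed alist.
Since the alist has a plain read syntax, `pp' can dump it directly."
  (with-temp-file file
    (let (alist)
      (maphash (lambda (word count)
                 (push (cons word count) alist))
               usage-hash)
      (pp alist (current-buffer)))))

(defun usage-load (file)
  "Read the alist saved in FILE and rebuild `usage-hash' from it."
  (with-temp-buffer
    (insert-file-contents file)
    (clrhash usage-hash)
    (dolist (pair (read (current-buffer)))
      (puthash (car pair) (cdr pair) usage-hash))))
```

Because only the flat word-to-count table is serialized, the shared
cons cells between the two hash tables never need a read syntax at
all.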