unofficial mirror of help-gnu-emacs@gnu.org
From: Phillip Lord <p.lord@russet.org.uk>
Subject: Re: A couple of lisp questions
Date: 12 Nov 2003 19:00:34 +0000
Message-ID: <vfad714dkd.fsf@rpc71.cs.man.ac.uk>
In-Reply-To: jwvu159bh1y.fsf-monnier+gnu.emacs.help@vor.iro.umontreal.ca

>>>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:

  Stefan> Take a look at how flyspell does it.  Or maybe auto-fill.

  >> I will. I think auto-fill cheats though, as it's tied directly in
  >> to the command loop. I seem to remember reading that somewhere.

  Stefan> Not the command loop, just the self-insert-command command
  Stefan> (which is implemented in C). 

Yes, you are right. I was a little confused by

"In general, this is the only way to do that, since the facilities for
customizing `self-insert-command' are limited to special cases
(designed for abbrevs and Auto Fill mode). Do not try substituting
your own definition of `self-insert-command' for the standard one.
The editor command loop handles this function specially."

So auto-fill is tied in slightly indirectly. 


  Stefan> You can hijack the auto-fill-function for your own
  Stefan> non-auto-fill use.

I would not want it to interfere with auto-fill though. I think I have
it working reasonably well now. 

  >> usage-hash: "the" --> ("the" . 4) "and" --> ("and" . 6)

  Stefan> Why not just

  Stefan>    "the" --> 4 "and" --> 6

It makes no difference. The suffix hash must contain cons cells, and I
share those cells with the usage hash. For the usage hash itself, you
are correct: the car of the cons cell is not used. 

  >> Then a suffix hash

  >> suffix-hash: "t" --> (("the" . 4) ("then" . 3) ("talk" . 2) etc)
  >> "th" --> (("the" . 4) etc ) "the" --> (("the" . 4) etc )

  Stefan> Is `try-completion' too slow (because the usage-hash is too
  Stefan> large?) to build the suffixes on the fly ?

I'm not convinced it does what I want. Perhaps I am wrong. 

When the letter "t" is pressed I get an alist back. The alist is
actually ordered, with the most commonly occurring words first. So I
pick the preferred usage straight off the front. So I have constant
time access to the hash, and constant time access to the list. 
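
A minimal sketch of that layout, assuming two `equal`-tested hash
tables. All of the `my-` names are hypothetical and not taken from the
actual package. Note that the keys of the suffix hash, as shown in the
quoted example, are the leading substrings of each word. Re-sorting the
affected lists on every update is what makes updating slower than
lookup.

```elisp
;; Hypothetical names throughout; a sketch, not the package's real code.
(defvar my-usage-hash (make-hash-table :test 'equal)
  "Maps a word to its (WORD . COUNT) cons cell.")

(defvar my-suffix-hash (make-hash-table :test 'equal)
  "Maps each leading substring of a word to a list of (WORD . COUNT) cells.")

(defun my-more-used-p (a b)
  "Return non-nil if cell A has a higher count than cell B."
  (> (cdr a) (cdr b)))

(defun my-add-word (word)
  "Count one use of WORD, creating and indexing its cell if needed."
  (let ((cell (gethash word my-usage-hash)))
    (if cell
        (setcdr cell (1+ (cdr cell)))
      (setq cell (cons word 1))
      (puthash word cell my-usage-hash)
      ;; Index the very same cell under every leading substring.
      (dotimes (i (length word))
        (let ((key (substring word 0 (1+ i))))
          (puthash key (cons cell (gethash key my-suffix-hash))
                   my-suffix-hash))))
    ;; Keep each affected list ordered, most used word first.
    (dotimes (i (length word))
      (let ((key (substring word 0 (1+ i))))
        (puthash key (sort (gethash key my-suffix-hash) #'my-more-used-p)
                 my-suffix-hash)))
    cell))

(defun my-best-completion (prefix)
  "Return the most used word starting with PREFIX, or nil if none."
  (car (car (gethash prefix my-suffix-hash))))
```

After calling `my-add-word` four times on "the", three times on "then"
and twice on "talk", `(my-best-completion "t")` returns "the": one hash
lookup plus taking the head of the list.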

Updating takes a bit longer....



  >> In this case the cons cells for each word are shared between the
  >> hashes, so this is not a massive memory waste as the written
  >> version appears.

  Stefan> Each word of N letters has:
  Stefan> - one string (i.e. N + 16 bytes)
  Stefan> - one cons-cell (8 bytes)
  Stefan> - one hash-table entry (16 bytes)
  Stefan> in usage-hash, plus:
  Stefan> - N cons-cells (N*8 bytes)
  Stefan> - N hash entries shared with other words (at least 16 bytes).
  Stefan> For a total of 9*N + 56 bytes per word.  Probably not a big
  Stefan> deal.

Well, there are other reasons as well. When I update the cons in the
usage hash, it's automatically updated in the suffix hash as well. That
was the main reason. 
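
To illustrate why sharing helps, here is a small self-contained sketch.
The function name is hypothetical. Because both tables hold the same
cons cell (`eq`, not merely `equal`), changing the count through one
table is visible through the other without any second update.

```elisp
(defun my-shared-cell-demo ()
  "Show that a count updated via one table is seen via the other.
Hypothetical helper for illustration only."
  (let ((usage (make-hash-table :test 'equal))
        (suffix (make-hash-table :test 'equal))
        (cell (cons "the" 4)))
    ;; One cons cell, reachable from three places.
    (puthash "the" cell usage)
    (puthash "t" (list cell) suffix)
    (puthash "th" (list cell) suffix)
    ;; Bump the count through the usage table only.
    (setcdr (gethash "the" usage) 5)
    (list (cdr (car (gethash "t" suffix)))
          (eq (car (gethash "t" suffix))
              (car (gethash "th" suffix))))))
```

Calling `(my-shared-cell-demo)` returns `(5 t)`: the suffix entry sees
the new count, and both suffix lists point at the same cell.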

  >> Ideally I would want to build up these word usage statistics as
  >> they are typed, but as you say it's hard to do this. I think a
  >> flyspell like approach combined with text properties should work
  >> okay.

  Stefan> How do you avoid counting the same instance of a word
  Stefan> several times?  Oh, you mark them with a text-property, I
  Stefan> see.  More like font-lock than flyspell.

Just so. 
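
A rough sketch of the marking idea, assuming a hypothetical text
property called `my-counted` and a plain word-to-count hash table. It
scans the current buffer, counts only words that do not yet carry the
property, and then marks them, so a second pass over the same text
adds nothing.

```elisp
(defun my-count-new-words (counts)
  "Count each not-yet-marked word of the current buffer into COUNTS.
COUNTS is a hash table mapping a word to its number of uses.  Counted
words are tagged with the (hypothetical) `my-counted' text property."
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward "\\w+" nil t)
      (unless (get-text-property (match-beginning 0) 'my-counted)
        (let ((word (match-string-no-properties 0)))
          (puthash word (1+ (gethash word counts 0)) counts))
        ;; Mark the word without flagging the buffer as modified.
        (with-silent-modifications
          (put-text-property (match-beginning 0) (match-end 0)
                             'my-counted t))))))
```

This sketch ignores what should happen when a marked word is later
edited; a real implementation would need to decide that.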


  >> The serialization would be to enable saving across sessions. Most
  >> of the packages I know that do this depend on their objects
  >> having a read syntax, which doesn't work with hashes. I think the
  >> solution here is to convert the thing into a big alist to save
  >> it, and then reconstruct the hashes on loading.

  Stefan> Why not reconstruct the suffix upon loading?  This way you
  Stefan> have no sharing to worry about and you can just dump the
  Stefan> hash via maphash & pp.

Yes, I think that's going to be my plan. Normally I sort the alist in
the suffix hash after every update, but if I disable this, and then do
them all at once, it should be quicker....
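
One way that plan might look, as a sketch with hypothetical function
names: dump only the usage data as an alist with `maphash` and `pp`,
then on loading rebuild both tables and sort each suffix list once at
the end instead of after every insertion.

```elisp
(defun my-usage-to-alist (usage)
  "Return hash table USAGE as an alist of (WORD . COUNT) pairs."
  (let (alist)
    (maphash (lambda (word cell) (push (cons word (cdr cell)) alist))
             usage)
    alist))

(defun my-save-usage (usage file)
  "Write the word counts in hash table USAGE to FILE in read syntax."
  (with-temp-file file
    (pp (my-usage-to-alist usage) (current-buffer))))

(defun my-load-usage (file)
  "Read FILE and return (USAGE . SUFFIX), two freshly built hash tables."
  (let ((usage (make-hash-table :test 'equal))
        (suffix (make-hash-table :test 'equal))
        (alist (with-temp-buffer
                 (insert-file-contents file)
                 (read (current-buffer)))))
    (dolist (entry alist)
      (let ((word (car entry))
            (cell (cons (car entry) (cdr entry))))
        (puthash word cell usage)
        ;; Share the new cell under every leading substring, unsorted.
        (dotimes (i (length word))
          (let ((key (substring word 0 (1+ i))))
            (puthash key (cons cell (gethash key suffix)) suffix)))))
    ;; Sort every suffix list exactly once, most used word first.
    (maphash (lambda (key cells)
               (puthash key
                        (sort cells (lambda (a b) (> (cdr a) (cdr b))))
                        suffix))
             suffix)
    (cons usage suffix)))
```

Only the current key's value is replaced inside the final `maphash`,
which is the kind of modification `maphash` permits during iteration.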

  >> Anyway the idea for all of this was to do a nifty version of
  >> abbreviation expansion, something like dabbrev-expand, but
  >> instead of searching local buffers, it would grab word stats as
  >> it's going, and use these to offer appropriate suggestions. I was
  >> thinking of a user interface a little bit like the buffer/file
  >> switching of ido.el, of which I have become a committed user.

  Stefan> Sounds neat.

  >> By the way, building a decent UI around this will probably take 10
  >> times as much code!

  Stefan> And even more time,

Just so. 

I've almost got a nasty version (where you build the dictionary
explicitly rather than automatically) working. 

Cheers

Phil

