* Re: correct way to do state machines in elisp?
[not found] ` <m3elcoukc7.fsf@ID-87814.user.dfncis.de>
@ 2002-08-24 19:34 ` Jonathan Walther
0 siblings, 0 replies; 5+ messages in thread
From: Jonathan Walther @ 2002-08-24 19:34 UTC (permalink / raw)
Cc: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 10050 bytes --]
Thank you! The enclosed explanation is very helpful.
Jonathan
On Sat, Aug 24, 2002 at 09:29:12PM +0200, Oliver Scholz wrote:
>It seems that seems that my follow-up did not get through to the
>mailing-list. So I send it to you privately.
>
>Jonathan Walther <krooger@debian.org> writes:
>[...]
>> As I was writing it, I couldn't help thinking, there must be a way to
>> write the same thing in LISP, with 1/3 the lines of code. The largest
>> guzzler of lines of code I think is the state machine that parses every
>> RFC compliant IRC message into a suitable structure.
>>
>> Can anyone give me hints about how the same thing could be done better
>> in elisp? I'm reluctant to use regular expressions for this.
>[...]
>
>I once asked for advice on state machines in comp.lang.lisp and got a
>very helpful answer by Paul Foley. Now I asked for and got his
>permission to forward his mail. The first part explains the basics of
>state machines for a clueless non-programmer without CS-background
>like me. But later on there are some examples in Lisp (CL, actually).
>
>I hope this helps.
>
>To: Oliver Scholz <alkibiades@gmx.de>
>Subject: Re: [newbie] state machines
>Content-Type: text/plain; charset=US-ASCII
>From: Paul Foley <mycroft@actrix.gen.nz>
>Date: 12 Jun 2002 03:23:28 +1200
>
>On Tue, 11 Jun 2002 14:16:23 +0200, Oliver Scholz wrote:
>
>> I found state machines sometimes mentioned in usenet articles or on
>> the web and I understand that there are different kinds (?) of them:
>> finite and infinite, deterministic and non-deterministic. I don't
>> really understand this issue, but I want to learn more about this.
>
>A state machine is a "machine" (usually simulated by a computer
>program) that has "states". It takes some input, figures out what
>state it should go into next, based on the input(s), and goes into
>that state, then repeats.
>
>A deterministic state machine is one in which the "next state"
>function is deterministic -- i.e., for any given input and current
>state, there's only one possible next state.
>
>A non-deterministic state machine is one in which the "next state"
>function is non-deterministic -- i.e., there's more than one choice
>for next state, for at least one combination of input and current
>state.
>
>A finite state machine is one that has a finite number of states.
>[Which is all of them -- there are no infinite state machines]
>
>A very simple example is a traffic light: it has 3 states: green,
>orange, and red, and a single-bit "change colour now" input. When
>it's in the "green" state, the input changes it to the "orange" state;
>when it's in the "orange" state, the input changes it to the "red"
>state, and when it's in the "red" state, the input changes it to the
>"green" state.
>
>A computer-related example with which you're probably familiar is the
>simple regular expression [real regexps are DFSMs (deterministic
>finite state machines); the things called "regexp" in languages like
>Perl are not actually regexps -- they're a more powerful class of
>languages]. Let's say you have the regexp "a(bx|cy)*d": that means it
>matches input consisting of an "a" followed by any number of "bx" or
>"cy" pairs, in any order, followed by a "d". The state machine has 5
>states: let's number them starting from 1 -- it starts off in state 1.
>In state 1, an input of "a" moves it to state 2; any other input is an
>error (i.e., doesn't match the regexp). In state 2, if the input is a
>"b" it moves to state 3, and if it's a "c" is moves to state 4, and if
>it's a "d" it moves to state 5 (anything else is an error). In state
>3, an input of "x" moves it to state 2. In state 4, an input of "y"
>moves it to state 2. If you reach state 5, you're done.
>
>You can draw a picture of it like this:
>
> x
> ,-----------.
> | |
> | b |
> | ,-->[3]--'
> a v / d
> -->[1]--->[2]---->[[5]]
> ^ \
> | `-->[4]--.
> | c |
> | y |
> `-----------'
>
> [n] represents a state, and the arrows represent transitions; the
> letters on the arrows tell you what the input has to be for that
> transition to trigger. [[n]] represents a "final" state (there can
> possibly be more than one, but for a regexp you can always draw it
> with only one final state)
>
>A NDFSM (non-deterministic ...) has some states that look like
>
> x
> ,--->[B]
> /
> [A]
> \ x
> `--->[C]
>
>with the same letter on two or more transitions -- you can't tell
>which one is the 'right one' [if you take the top one, to state B, and
>later hit an "error" condition (where there's no suitable transition
>and you're not in a final state), you have to back up and try the
>lower one, because it might have led you to a final state without
>error]
>
>> My notion is that state machines are useful whenever I have to parse
>> input from a stream. I don't know if this statement is correct. My
>
>They're useful for everything you do on a computer -- a computer is
>really just a finite state machine (with a very large number of
>potential states).
>
>> I want to learn a) the (fundamentals of) the concept, b) when to use
>> them, c) how to construct them in a proper way and d) how to implement
>> them in Lisp.
>
>There are several ways to implement them. For reasonably small
>numbers of states, you can write a TAGBODY with the states as tags,
>using GO to change states. Or you can use a variable for the state
>number and a CASE statement dispatching on that variable for the code,
>or you can write the states and transitions as data, and have a
>function that walks over that data -- that's better for a large number
>of states...
>
>For example, the "a(bx|cy)*d" machine could be written
>
> (defun match-expr-1 (stream &aux input (state 1))
> (loop
> (setq input (read-char stream nil nil))
> (case state
> (1 (ecase input
> (#\a (setq state 2)))
> (2 (ecase input
> (#\b (setq state 3))
> (#\c (setq state 4))
> (#\d (return t)))) ; i.e., state 5 -- done
> (3 (ecase input
> (#\x (setq state 2))))
> (4 (ecase input
> (#\y (setq state 2))))))))
>
>or
>
> (defstruct state
> number
> transitions
> final-p)
>
> (defstruct transition
> input
> next-state)
>
> (defun run-fsm (stream state)
> (loop
> (if (state-final-p state)
> (return t)
> (let ((transition (find (read-char stream nil nil)
> (state-transitions state)
> :key #'transition-input)))
> (if (null transition)
> (return nil)
> (setq state (transition-next-state transition)))))))
>
>
> (defun add-transition (input from-state to-state)
> (push (make-transition :input input :next-state to-state)
> (state-transitions from-state)))
>
> (defun match-expr-2 (stream)
> (let ((state1 (make-state :number 1))
> (state2 (make-state :number 2))
> (state3 (make-state :number 3))
> (state4 (make-state :number 4))
> (state5 (make-state :number 5 :final-p t)))
> (add-transition #\a state1 state2)
> (add-transition #\b state2 state3)
> (add-transition #\c state2 state4)
> (add-transition #\d state2 state5)
> (add-transition #\x state3 state2)
> (add-transition #\y state4 state2)
> (run-fsm stream state1)))
>
>or whatever [of course, there's no need to rebuild all the states
>every time, in the last example]
>
>A similar solution -- and the usual way of implementing it in languages
>like C -- is to use a table. Assume a, b, c, d, x and y are the only
>valid inputs, for simplicity; then a table like
>
> State "a" "b" "c" "d" "x" "y"
> 1 2 NIL NIL NIL NIL NIL
> 2 NIL 3 4 T NIL NIL
> 3 NIL NIL NIL NIL 2 NIL
> 4 NIL NIL NIL NIL NIL 2
>
>describes the machine (a number means go to that state, T and NIL mean
>return T or NIL, respectively), so you could write
>
> (defun read-input (stream)
> (ecase (read-char stream nil nil)
> (#\a 0) (#\b 1) (#\c 2) (#\d 3) (#\x 4) (#\y 5)))
>
> (defun run-fsm2 (stream table &aux (state 0))
> (loop
> (let ((value (aref table state (read-input stream))))
> (if (integerp value)
> (setq state value)
> (return value)))))
>
> (defun match-expr-3 (stream)
> (run-fsm2 #2A(( 1 nil nil nil nil nil)
> (nil 2 3 t nil nil)
> (nil nil nil nil 1 nil)
> (nil nil nil nil nil 1))))
>
>[Note the renumbering of states to count from 0 instead of 1]
>
>Does that help?
>
>--
>Oh dear god. In case you weren't aware, "ad hominem" is not latin for
>"the user of this technique is a fine debater."
> -- Thomas F. Burdick
>(setq reply-to
> (concatenate 'string "Paul Foley " "<mycroft" '(#\@) "actrix.gen.nz>"))
>
>
>
> -- Oliver
>
>--
>Oliver Scholz 7 Fructidor an 210 de la R?volution
>Taunusstr. 25 Libert?, Egalit?, Fraternit?!
>60329 Frankfurt a. M. http://www.jungdemokratenhessen.de
>Tel. (069) 97 40 99 42 http://www.jdjl.org
--
Geek House Productions, Ltd.
Providing Unix & Internet Contracting and Consulting,
QA Testing, Technical Documentation, Systems Design & Implementation,
General Programming, E-commerce, Web & Mail Services since 1998
Phone: 604-435-1205
Email: djw@reactor-core.org
Webpage: http://reactor-core.org
Address: 2459 E 41st Ave, Vancouver, BC V5R2W2
[-- Attachment #2: Type: application/pgp-signature, Size: 307 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread