From: pjb@informatimago.com (Pascal J. Bourguignon)
Newsgroups: gmane.emacs.help
Subject: Re: Seeking Understanding about Symbols, Strings, Structure and Storage
Date: Fri, 06 Nov 2009 13:03:27 +0100
Organization: Informatimago
Message-ID: <87tyx7dfn4.fsf@galatea.local>
To: help-gnu-emacs@gnu.org
User-Agent:
    Gnus/5.1008 (Gnus v5.10.8) Emacs/22.3 (darwin)

Nordlöw writes:

> I am trying really hard to get a unified view about all the atomic
> types and containers/structures/links (e)lisp provides me with:

First, sorry: in my previous answer I said that an obarray is a tree,
but it's actually a hash table.

> What is the best (efficient) way to store symbols (references)
> extracted from an obarray?
> Should I intern them in a new array or should I store them in a
> sequence?

You cannot intern an existing symbol: intern takes a name (a string),
not a symbol (see the doc of the function intern).

The emacs lisp reference manual says:

    Do not try to put symbols in an obarray yourself.  This does not
    work--only intern can enter a symbol in an obarray properly.

> Does interning symbols in an obarray require more memory than storing
> them in a vector?

Why do you care?  An obarray is not a user-level data structure; it's
an internal data structure used by intern and intern-soft.  You
shouldn't use it directly.  It leaks its implementation, because you
have to know that to allocate a new, empty obarray you must use
make-vector, but an obarray IS NOT a vector.  You cannot store
anything directly in an obarray yourself; you must use intern.

> When should I intern symbols in an obarray?; My guess: When I need
> fast (hashed) lookup of them.

Notice that in emacs lisp, a symbol can belong to only one obarray.
The reason is that each symbol contains a hidden reference to the
next symbol in its obarray bucket, and only a single such reference.
Therefore a symbol can be stored in only one obarray; otherwise
buckets could end up shared between obarrays and wreak havoc.

> If I store symbols in a list, say '(a b), are these symbols "free",
> compared to if I first intern them?

What do you mean by "free"?

> I have heard the we should always prefer symbols before strings, when
> we can. Why?;

Don't listen to old optimization advice.  Computers have changed a
lot over time.  You cannot use the same rules of thumb for a 3 ton
computer with 32,768 words of memory, able to do 100,000 additions a
second, as for a 0.25 gram computer that has 17,179,869,184 bytes of
RAM and is able to do 3,100,000,000 additions per second.

> I guess symbols require less memory because they don't carry string
> attributes. Are the any other reasons?

A symbol has a name, which is a string; therefore, if you keep to a
naive approach, a symbol needs strictly more memory than a string.

But when you have a language such as lisp, which doesn't actively
prevent you from using the functional programming style (i.e. without
mutation), data can be shared amongst several data structures, and
therefore it is very difficult to attribute memory to a single
structure: since that structure may be shared, its memory cost may be
shared too.

Really, you should not care about these questions while you have
gigabytes of free RAM!

> If I want to build a tree or even a cyclic graph (I guess this is
> possible because there are rings) do we always need to construct these
> from cons-cells?

Of course not.  You can use vectors, structures, (eieio) objects,
anything.

Unfortunately, emacs lisp (as of Emacs 22) has no closures, since it
has only dynamic binding and no lexical binding; therefore it is not
possible to use lambdas to implement data structures as in Scheme or
Common Lisp.  But you can still use the other kinds of data types to
build your data abstractions.

> If so how do we realize this?: I know that setcar,setcdr,setf can play
> a key-role here.
> What happens if I try to print such an expression
> using princ(), prin1()?

Why don't you try it yourself?

    (require 'cl)           ; setf and defstruct come from the cl package
    (setq print-circle t)   ; print cyclic structure with #N= and #N#

    (let ((a (cons 1 2)))
      (setf (cdr a) a)      ; the cdr of the cell now points to the cell
      a)
    --> #1=(1 . #1#)

    (let ((a (vector 1 2)))
      (setf (aref a 1) a)   ; slot 1 of the vector now holds the vector
      a)
    --> #1=[1 #1#]

#n= and #n# are the notation used to read and print cyclic references.

> For example how should we efficiently store a parse tree of tokens
> having properties?

    (defstruct token
      children
      properties)

    (make-token :children '() :properties '(blue))
    --> [cl-struct-token nil (blue)]

> My suggestion: As a cons-tree of symbols were the symbols are interned
> in an obarray having properties accessed using get() and put().

If you want to use conses to implement the above structure, add
:type list:

    (defstruct (token (:type list))
      children
      properties)

    (make-token :children '() :properties '(blue))
    --> (nil (blue))

Notice that in emacs lisp, structures are not a native data type;
they're always built upon vectors or cons cells.

If you use the eieio package (from http://cedet.sourceforge.net), you
will get CLOS-like objects (also built upon vectors, but you
shouldn't care).

> If I assign the same long symbol

Symbols are not short or long.  Symbols are.  They have:

- a name (which is a string), symbol-name
- an optional value (anything), symbol-value
- an optional function, symbol-function
- a property list, symbol-plist
- various source files, symbol-file

(Which is not to say that they use 5 slots, or more or less.)  As I
wrote above, a naive implementation could use a structure to
implement symbols as:

    (defstruct symbol
      name
      value
      function
      plist
      files
      %hidden-next-in-bucket
      ;; ...
      )

But it is also possible to store things differently.  For example,
for symbol-file, instead of keeping with each symbol the list of
files where the various aspects of that symbol are defined, we could
have a list of aspects and, for each aspect, a list of files with the
symbols defined there.

> to many variables is it assigned by
> reference, kind of like pointers in C?
> Example:
> (setq x 'loooooooooooooooooooong
>       y 'loooooooooooooooooooong)

Yes.  And it'd make no difference if it was (setq x 'a y 'a).

> Is there some web-page out there that highlights the choices the (e)
> lisp designers made concerning these things?

There are tons of web pages about lisp.  Lisp is the oldest (family
of) programming language(s) still in use apart from Fortran;
therefore you will be able to find a lot of papers and documentation
about the many ways to implement lisp, which design choices can be
made, and what their trade-offs are.

If you're asking about a specific version of GNU Emacs, then the best
thing is to have a look at the source code of emacs.

-- 
__Pascal Bourguignon__
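
The points about assignment by reference and about intern can be
checked directly.  What follows is only a minimal sketch, not a
recommendation to use obarrays: the names my-obarray, s1 and s2 and
the symbol names foo and bar are made up for illustration, and the
private obarray is assumed to be created with make-vector as
described in the message.

```elisp
;; Hypothetical names throughout (my-obarray, s1, s2, foo, bar).
(setq x 'loooooooooooooooooooong
      y 'loooooooooooooooooooong)

;; Both variables refer to the very same symbol object; nothing is
;; copied, whatever the length of the symbol's name.
(eq x y)                          ; --> t

;; A private obarray.  intern takes a NAME (a string), not a symbol,
;; and returns the one symbol of that name in the given obarray.
(setq my-obarray (make-vector 31 0))
(setq s1 (intern "foo" my-obarray))
(setq s2 (intern "foo" my-obarray))

(eq s1 s2)                        ; --> t, same symbol in my-obarray
(eq s1 'foo)                      ; --> nil, 'foo lives in the global obarray
(intern-soft "bar" my-obarray)    ; --> nil, looks up without creating
```

The size 31 is arbitrary; it only sets the number of buckets of the
hypothetical obarray.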