* [corrector(s) needed] doc/lispref/internals.texi tweaks
From: Dmitry Antipov @ 2012-11-12 11:20 UTC
To: Emacs development discussions

[-- Attachment #1: Type: text/plain, Size: 105 bytes --]

There are some bits for doc/lispref/internals.texi, which looks a bit
outdated and incomplete...

Dmitry

[-- Attachment #2: internals.patch --]
[-- Type: text/plain, Size: 12095 bytes --]

=== modified file 'doc/lispref/internals.texi'
--- doc/lispref/internals.texi 2012-06-27 05:21:15 +0000
+++ doc/lispref/internals.texi 2012-11-12 11:18:55 +0000
@@ -226,12 +226,11 @@
 Beyond the basic vector, a lot of objects like window, buffer, and
 frame are managed as if they were vectors. The corresponding C data
 structures include the @code{struct vectorlike_header} field whose
-@code{next} field points to the next object in the chain:
-@code{header.next.buffer} points to the next buffer (which could be
-a killed buffer), and @code{header.next.vector} points to the next
-vector in a free list. If a vector is small (smaller than or equal to
-@code{VBLOCK_BYTES_MAX} bytes, see @file{alloc.c}), then
-@code{header.next.nbytes} contains the vector size in bytes.
+@code{size} member contains the subtype enumerated by @code{enum pvec_type}
+and an information about how many Lisp_Object fields this structure
+contains and what the size of the rest data is. This information is
+needed to calculate the memory footprint of an object, and used
+by the vector allocation code while iterating over the vector blocks.
 
 @cindex garbage collection
 It is quite common to use some storage for a while, then release it
@@ -284,89 +283,35 @@
 spontaneously if you use more than @code{gc-cons-threshold} bytes of
 Lisp data since the previous garbage collection.)
 
-@code{garbage-collect} returns a list containing the following
-information:
-
-@example
-@group
-((@var{used-conses} . @var{free-conses})
- (@var{used-syms} . @var{free-syms})
-@end group
- (@var{used-miscs} . @var{free-miscs})
- @var{used-string-chars}
- @var{used-vector-slots}
- (@var{used-floats} . @var{free-floats})
- (@var{used-intervals} . @var{free-intervals})
- (@var{used-strings} . @var{free-strings}))
-@end example
-
-Here is an example:
-
-@example
-@group
-(garbage-collect)
- @result{} ((106886 . 13184) (9769 . 0)
- (7731 . 4651) 347543 121628
- (31 . 94) (1273 . 168)
- (25474 . 3569))
-@end group
-@end example
-
-Here is a table explaining each element:
-
-@table @var
-@item used-conses
-The number of cons cells in use.
-
-@item free-conses
-The number of cons cells for which space has been obtained from the
-operating system, but that are not currently being used.
-
-@item used-syms
-The number of symbols in use.
-
-@item free-syms
-The number of symbols for which space has been obtained from the
-operating system, but that are not currently being used.
-
-@item used-miscs
-The number of miscellaneous objects in use. These include markers and
-overlays, plus certain objects not visible to users.
-
-@item free-miscs
-The number of miscellaneous objects for which space has been obtained
-from the operating system, but that are not currently being used.
-
-@item used-string-chars
-The total size of all strings, in characters.
-
-@item used-vector-slots
-The total number of elements of existing vectors.
-
-@item used-floats
-The number of floats in use.
-
-@item free-floats
-The number of floats for which space has been obtained from the
-operating system, but that are not currently being used.
-
-@item used-intervals
-The number of intervals in use. Intervals are an internal
-data structure used for representing text properties.
-
-@item free-intervals
-The number of intervals for which space has been obtained
-from the operating system, but that are not currently being used.
-
-@item used-strings
-The number of strings in use.
-
-@item free-strings
-The number of string headers for which the space was obtained from the
-operating system, but which are currently not in use. (A string
-object consists of a header and the storage for the string text
-itself; the latter is only allocated when the string is created.)
-@end table
+@code{garbage-collect} returns a list with information on amount of space
+in use, where each entry has the form @samp{(name size used free)}. In the
+entry, @samp{name} is a symbol describing the kind of objects this entry
+represents, @samp{size} is the number of bytes used by each one, @samp{used}
+is the number of those objects that were found live in the heap, and
+@samp{free} is the number of those objects that are not live but that Emacs
+keeps around for future allocations. Here is an example:
+
+@example
+((@var{conses} 16 50589 8907) (@var{symbols} 48 14759 0)
+ (@var{miscs} 40 37 119) (@var{strings} 32 3610 4481)
+ (@var{string-bytes} 1 96823) (@var{vectors} 16 7471)
+ (@var{vector-slots} 8 344767 27849) (@var{floats} 8 76 111)
+ (@var{intervals} 56 49 18) (@var{buffers} 944 9)
+ (@var{heap} 1024 14654 2363))
+@end example
+
+First entry means that the internal size of a cons cell is 16 bytes, there
+are 50589 used cons cells and 8907 conses are on the free list. Likewise
+for symbols, floats and intervals. Freed buffers aren't collected in the
+free list, and the corresponding entry has just two numbers (internal size
+of @code{struct buffer} and amount of buffers in @code{all_buffers} list).
+Miscellaneous objects at @var{misc} includes markers and overlays plus
+certain objects not visible to users. Since string objects consists of
+a header and the storage for the string text itself, there are two entries
+for them: @var{strings} counts headers and @var{string-bytes} counts
+the total number of bytes in the strings. The same applies to the vectors
+with @var{vectors} and @var{vector-slots}. Finally, the last member
+means that the total heap size is 14654 Kb, and 2363 Kb of them are free.
 
 If there was overflow in pure space (@pxref{Pure Storage}),
 @code{garbage-collect} returns @code{nil}, because a real garbage
@@ -639,7 +584,12 @@
 the number of Lisp arguments, it must have exactly two C arguments: the
 first is the number of Lisp arguments, and the second is the address of
 a block containing their values. These have types
-@code{int} and @w{@code{Lisp_Object *}} respectively.
+@code{int} and @w{@code{Lisp_Object *}} respectively. Since
+@code{Lisp_Object} can hold any Lisp object of any data type, you
+can determine the actual data type only at run time; so if you want
+a primitive to accept only a certain type of argument, you must check
+the type explicitly using a suitable predicate (@pxref{Type Predicates}).
+@cindex type checking internals
 
 @cindex @code{GCPRO} and @code{UNGCPRO}
 @cindex protect C variables from garbage collection
@@ -820,23 +770,69 @@
 @section Object Internals
 @cindex object internals
 
-@c FIXME Is this still true? Does --with-wide-int affect anything?
- GNU Emacs Lisp manipulates many different types of data. The actual
-data are stored in a heap and the only access that programs have to it
-is through pointers. Each pointer is 32 bits wide on 32-bit machines,
-and 64 bits wide on 64-bit machines; three of these bits are used for
-the tag that identifies the object's type, and the remainder are used
-to address the object.
-
- Because Lisp objects are represented as tagged pointers, it is always
-possible to determine the Lisp data type of any object. The C data type
-@code{Lisp_Object} can hold any Lisp object of any data type. Ordinary
-variables have type @code{Lisp_Object}, which means they can hold any
-type of Lisp value; you can determine the actual data type only at run
-time. The same is true for function arguments; if you want a function
-to accept only a certain type of argument, you must check the type
-explicitly using a suitable predicate (@pxref{Type Predicates}).
-@cindex type checking internals
+ Emacs Lisp provides a rich set of the data types. Some of them, like cons
+cells, integers and strings, are common to nearly all Lisp dialects. Some
+others, like markers and buffers, are quite special and needed to provide
+the basic support to write an editor commands in Lisp. To implement such
+a variety of object types and provide an efficient way to pass objects between
+the subsystems of an interpreter, there is a set of C data structures and
+a special type to represent the pointers to all of them, which is known as
+tagged pointer.
+
+ In C, the tagged pointer is an object of type @code{Lisp_Object}. Any
+initialized variable of such a type always holds the value of one of the
+following basic data types: integer, symbol, string, cons cell, float,
+vectorlike or miscellaneous object. Each of these data types has the
+corresponding tag value. All tags are enumerated by @code{enum Lisp_Type}
+and placed into a 3-bits bitfield of the @code{Lisp_Object}. The rest bits of
+@code{Lisp_Object} is the value itself. Integer values are immediate, e.g.
+directly represented by the rest bits, and all other objects are represented by
+the C pointers to a corresponding object allocated from the heap. Width of the
+@code{Lisp_Object} is platform- and configuration-dependent: usually it's equal
+to the width of an underlying platform pointer (e.g. 32-bit on a 32-bit machine
+and 64-bit on a 64-bit one), but also there is a special configuration where
+@code{Lisp_Object} is 64-bit but all pointers are 32-bit. The latter trick
+was designed to overcome the limited range of values for Lisp integers on
+a 32-bit system by using 64-bit @code{long long} type for @code{Lisp_Object}.
+
+ The following C data structures are defined in @file{lisp.h} to represent
+the basic data types beyond integers:
+
+@table @code
+@item struct Lisp_Cons
+Cons cell, an object used to construct lists.
+
+@item struct Lisp_String
+String, the basic object to represent a sequence of characters.
+
+@item struct Lisp_Vector
+Array, a fixed-size set of Lisp_Objects which may be accessed by an index.
+
+@item struct Lisp_Symbol
+Symbol, the unique-named entity commonly used as an identifier.
+
+@item struct Lisp_Float
+Floating point value.
+
+@item union Lisp_Misc
+Miscellaneous kinds of objects which doesn't fit into any of the above.
+@end table
+
+ These types are the first-class citizens of an internal type system.
+Since the tag space is limited, all other types are the subtypes of either
+@code{Lisp_Vectorlike} or @code{Lisp_Misc}. Vector subtypes are enumerated
+by @code{enum pvec_type}, and nearly all complex objects like windows, buffers,
+frames, and processes falls into this category. The rest of special types,
+including markers and overlays, are enumerated by @code{enum Lisp_Misc_Type}
+and forms the set of subtypes of @code{Lisp_Misc}.
+
+ Below there is a description of a few subtypes of @code{Lisp_Vectorlike}.
+Buffer object represents the text to display and edit. Window is the part
+of display structure which shows the buffer or used as a container to
+recursively place other windows on the same frame. (Do not confuse Emacs Lisp
+window object with the window as an entity managed by the user interface
+system like X; in Emacs terminology, the latter is called frame). Finally,
+process object is used to manage the subprocesses.
 
 @menu
 * Buffer Internals:: Components of a buffer structure.
@@ -912,12 +908,8 @@
 
 @table @code
 @item header
-A @code{struct vectorlike_header} structure where @code{header.next}
-points to the next buffer, in the chain of all buffers (including
-killed buffers). This chain is used only for garbage collection, in
-order to collect killed buffers properly. Note that vectors, and most
-kinds of objects allocated as vectors, are all on one chain, but
-buffers are on a separate chain of their own.
+A header of type @code{struct vectorlike_header} is common to all
+vectorlike objects.
 
 @item own_text
 A @code{struct buffer_text} structure that ordinarily holds the buffer
@@ -928,6 +920,11 @@
 ordinary buffer, this is the @code{own_text} field above. In an
 indirect buffer, this is the @code{own_text} field of the base buffer.
 
+@item next
+A pointer to the next buffer, in the chain of all buffers, including
+killed buffers. This chain is used only for allocation and garbage
+collection, in order to collect killed buffers properly.
+
 @item pt
 @itemx pt_byte
 The character and byte positions of point in a buffer.
* Re: [corrector(s) needed] doc/lispref/internals.texi tweaks
From: Eli Zaretskii @ 2012-11-13 13:47 UTC
To: Dmitry Antipov; +Cc: emacs-devel

> Date: Mon, 12 Nov 2012 15:20:47 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
>
> There are some bits for doc/lispref/internals.texi, which looks a bit
> outdated and incomplete...

Thanks, some comments below.

> +@code{size} member contains the subtype enumerated by @code{enum pvec_type}
> +and an information about how many Lisp_Object fields this structure
                                     ^^^^^^^^^^^
"@code{Lisp_Object}", since that's a C symbol.

> +First entry means that the internal size of a cons cell is 16 bytes, there
> +are 50589 used cons cells and 8907 conses are on the free list. Likewise

I don't understand why you replaced a @table with free text. A table
is much easier to read and traverse. I suggest to use a table, just
with updated info.

> +Miscellaneous objects at @var{misc} includes markers and overlays plus

@var is inappropriate here (and in the @example), as "misc" etc. are
not formal arguments or references to other symbols; they are literal
strings that appear in the output (unlike in the original @example,
where they stood for numbers).

> Finally, the last member
> +means that the total heap size is 14654 Kb, and 2363 Kb of them are free.

I think you should mention that this part appears only on some
platforms, otherwise some readers might think the manual is in error.

> +others, like markers and buffers, are quite special and needed to provide
> +the basic support to write an editor commands in Lisp. To implement such
                              ^^
Lose the "an" part here.

> +a variety of object types and provide an efficient way to pass objects between
> +the subsystems of an interpreter, there is a set of C data structures and
> +a special type to represent the pointers to all of them, which is known as
> +tagged pointer.

Whenever you introduce a new term, it is best to use @dfn, as in
"@dfn{tagged pointer}", the first time you use the term. This makes
the term stand out.

> +and placed into a 3-bits bitfield of the @code{Lisp_Object}.
                     ^^^^^^
"3-bit", without "s".

> The rest bits of
> +@code{Lisp_Object} is the value itself.
  ^^^^^^^^^^^^^
"The rest of the bits"

> Integer values are immediate, e.g.
> +directly represented by the rest bits, and all other objects are represented by

I would use "value bits" here, like this:

  Integer values are immediate, i.e.@: directly represented by those
  @dfn{value bits}

> +@item union Lisp_Misc
> +Miscellaneous kinds of objects which doesn't fit into any of the above.
                                        ^^^^^^^
"don't"

> +frames, and processes falls into this category. The rest of special types,
                         ^^^^^
"fall"

> +including markers and overlays, are enumerated by @code{enum Lisp_Misc_Type}
> +and forms the set of subtypes of @code{Lisp_Misc}.
       ^^^^^
"form"

> +recursively place other windows on the same frame. (Do not confuse Emacs Lisp
> +window object with the window as an entity managed by the user interface
> +system like X; in Emacs terminology, the latter is called frame).
                                                                  ^^
This period should be inside the parentheses.
* Re: [corrector(s) needed] doc/lispref/internals.texi tweaks 2012-11-13 13:47 ` Eli Zaretskii @ 2012-11-14 16:30 ` Dmitry Antipov 2012-11-14 17:10 ` Eli Zaretskii 0 siblings, 1 reply; 4+ messages in thread From: Dmitry Antipov @ 2012-11-14 16:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Emacs development discussions [-- Attachment #1: Type: text/plain, Size: 341 bytes --] On 11/13/2012 05:47 PM, Eli Zaretskii wrote: >> Date: Mon, 12 Nov 2012 15:20:47 +0400 >> From: Dmitry Antipov <dmantipov@yandex.ru> >> >> There are some bits for doc/lispref/internals.texi, which looks a bit >> outdated and incomplete... > > Thanks, some comments below. Great thanks! Do you have the patience for one more round? Dmitry [-- Attachment #2: internals.patch --] [-- Type: text/plain, Size: 17193 bytes --] === modified file 'doc/lispref/internals.texi' --- doc/lispref/internals.texi 2012-06-27 05:21:15 +0000 +++ doc/lispref/internals.texi 2012-11-14 16:20:06 +0000 @@ -226,12 +226,11 @@ Beyond the basic vector, a lot of objects like window, buffer, and frame are managed as if they were vectors. The corresponding C data structures include the @code{struct vectorlike_header} field whose -@code{next} field points to the next object in the chain: -@code{header.next.buffer} points to the next buffer (which could be -a killed buffer), and @code{header.next.vector} points to the next -vector in a free list. If a vector is small (smaller than or equal to -@code{VBLOCK_BYTES_MAX} bytes, see @file{alloc.c}), then -@code{header.next.nbytes} contains the vector size in bytes. +@code{size} member contains the subtype enumerated by @code{enum pvec_type} +and an information about how many @code{Lisp_Object} fields this structure +contains and what the size of the rest data is. This information is +needed to calculate the memory footprint of an object, and used +by the vector allocation code while iterating over the vector blocks. 
@cindex garbage collection It is quite common to use some storage for a while, then release it @@ -284,88 +283,147 @@ spontaneously if you use more than @code{gc-cons-threshold} bytes of Lisp data since the previous garbage collection.) -@code{garbage-collect} returns a list containing the following -information: +@code{garbage-collect} returns a list with information on amount of space +in use, where each entry has the form @samp{(name size used)} or +@samp{(name size used free)}. In the entry, @samp{name} is a symbol +describing the kind of objects this entry represents, @samp{size} is the +number of bytes used by each one, @samp{used} is the number of those objects +that were found live in the heap, and optional @samp{free} is the number of +those objects that are not live but that Emacs keeps around for future +allocations. So an overall result is: @example -@group -((@var{used-conses} . @var{free-conses}) - (@var{used-syms} . @var{free-syms}) -@end group - (@var{used-miscs} . @var{free-miscs}) - @var{used-string-chars} - @var{used-vector-slots} - (@var{used-floats} . @var{free-floats}) - (@var{used-intervals} . @var{free-intervals}) - (@var{used-strings} . 
@var{free-strings})) +((@code{conses} @var{cons-size} @var{used-conse} @var{free-conses}) + (@code{symbols} @var{symbol-size} @var{used-symbols} @var{free-symbols}) + (@code{miscs} @var{misc-size} @var{used-miscs} @var{free-miscs}) + (@code{strings} @var{string-size} @var{used-strings} @var{free-strings}) + (@code{string-bytes} @var{byte-size} @var{used-bytes}) + (@code{vectors} @var{vector-size} @var{used-vectors}) + (@code{vector-slots} @var{slot-size} @var{used-slots} @var{free-slots}) + (@code{floats} @var{float-size} @var{used-floats} @var{free-floats}) + (@code{intervals} @var{interval-size} @var{used-intervals} @var{free-intervals}) + (@code{buffers} @var{buffer-size} @var{used-buffers}) + (@code{heap} @var{unit-size} @var{total-size} @var{free-size})) @end example Here is an example: @example -@group (garbage-collect) - @result{} ((106886 . 13184) (9769 . 0) - (7731 . 4651) 347543 121628 - (31 . 94) (1273 . 168) - (25474 . 3569)) -@end group + @result{} ((conses 16 49126 8058) (symbols 48 14607 0) + (miscs 40 34 56) (strings 32 2942 2607) + (string-bytes 1 78607) (vectors 16 7247) + (vector-slots 8 341609 29474) (floats 8 71 102) + (intervals 56 27 26) (buffers 944 8) + (heap 1024 11715 2678)) @end example -Here is a table explaining each element: +Below is a table explaining each element. Note that last @code{heap} entry +is optional and present only if an underlying @code{malloc} implementation +provides @code{mallinfo} function. @table @var +@item cons-size +Internal size of a cons cell, e.g. @code{sizeof (struct Lisp_Cons)}. + @item used-conses The number of cons cells in use. @item free-conses -The number of cons cells for which space has been obtained from the -operating system, but that are not currently being used. - -@item used-syms +The number of cons cells for which space has been obtained from +the operating system, but that are not currently being used. + +@item symbol-size +Internal size of a symbol, e.g. @code{sizeof (struct Lisp_Symbol)}. 
+ +@item used-symbols The number of symbols in use. -@item free-syms -The number of symbols for which space has been obtained from the -operating system, but that are not currently being used. +@item free-symbols +The number of symbols for which space has been obtained from +the operating system, but that are not currently being used. + +@item misc-size +Internal size of a miscellaneous entity, e.g. +@code{sizeof (union Lisp_Misc)}, which is a size of the +largest type enumerated in @code{enum Lisp_Misc_Type}. @item used-miscs -The number of miscellaneous objects in use. These include markers and -overlays, plus certain objects not visible to users. +The number of miscellaneous objects in use. These include markers +and overlays, plus certain objects not visible to users. @item free-miscs The number of miscellaneous objects for which space has been obtained from the operating system, but that are not currently being used. -@item used-string-chars -The total size of all strings, in characters. - -@item used-vector-slots -The total number of elements of existing vectors. +@item string-size +Internal size of a string header, e.g. @code{sizeof (struct Lisp_String)}. + +@item used-strings +The number of string headers in use. + +@item free-strings +The number of string headers for which space has been obtained +from the operating system, but that are not currently being used. + +@item byte-size +This is used for convenience and equals to @code{sizeof (char)}. + +@item used-bytes +The total size of all string data in bytes. + +@item vector-size +Internal size of a vector header, e.g. @code{sizeof (struct Lisp_Vector)}. + +@item used-vectors +The number of vector headers allocated from the vector blocks. + +@item slot-size +Internal size of a vector slot, always equal to @code{sizeof (Lisp_Object)}. + +@item used-slots +The number of slots in all used vectors. + +@item free-slots +The number of free slots in all vector blocks. 
+ +@item float-size +Internal size of a float object, e.g. @code{sizeof (struct Lisp_Float)}. +(Do not confuse it with the native platform @code{float} or @code{double}.) @item used-floats The number of floats in use. @item free-floats -The number of floats for which space has been obtained from the -operating system, but that are not currently being used. +The number of floats for which space has been obtained from +the operating system, but that are not currently being used. + +@item interval-size +Internal size of an interval object, e.g. @code{sizeof (struct interval)}. @item used-intervals -The number of intervals in use. Intervals are an internal -data structure used for representing text properties. +The number of intervals in use. @item free-intervals -The number of intervals for which space has been obtained -from the operating system, but that are not currently being used. - -@item used-strings -The number of strings in use. - -@item free-strings -The number of string headers for which the space was obtained from the -operating system, but which are currently not in use. (A string -object consists of a header and the storage for the string text -itself; the latter is only allocated when the string is created.) +The number of intervals for which space has been obtained from +the operating system, but that are not currently being used. + +@item buffer-size +Internal size of a buffer, e.g. @code{sizeof (struct buffer)}. +(Do not confuse with the value returned by @code{buffer-size} function.) + +@item used-buffers +The number of buffer objects in use. This includes killed buffers +invisible to users, e.g. all buffers in @code{all_buffers} list. + +@item unit-size +The unit of heap space measurement, always equal to 1024 bytes. + +@item total-size +Total heap size, in @var{unit-size} units. + +@item free-size +Heap space which is not currently used, in @var{unit-size} units. 
@end table If there was overflow in pure space (@pxref{Pure Storage}), @@ -388,23 +446,25 @@ @defopt gc-cons-threshold The value of this variable is the number of bytes of storage that must be allocated for Lisp objects after one garbage collection in order to -trigger another garbage collection. A cons cell counts as eight bytes, -a string as one byte per character plus a few bytes of overhead, and so -on; space allocated to the contents of buffers does not count. Note -that the subsequent garbage collection does not happen immediately when -the threshold is exhausted, but only the next time the Lisp evaluator is -called. - -The initial threshold value is 800,000. If you specify a larger -value, garbage collection will happen less often. This reduces the -amount of time spent garbage collecting, but increases total memory use. -You may want to do this when running a program that creates lots of -Lisp data. - -You can make collections more frequent by specifying a smaller value, -down to 10,000. A value less than 10,000 will remain in effect only -until the subsequent garbage collection, at which time -@code{garbage-collect} will set the threshold back to 10,000. +trigger another garbage collection. You can use the result returned by +@code{garbage-collect} to get an information about size of the particular +object type; space allocated to the contents of buffers does not count. +Note that the subsequent garbage collection does not happen immediately +when the threshold is exhausted, but only the next time the Lisp evaluator +is called. + +The initial threshold value is @code{GC_DEFAULT_THRESHOLD}, defined in +@file{alloc.c}. Since it's defined in @code{word_size} units, the value +is 400,000 for the default 32-bit configuration and 800,000 for the 64-bit +one. If you specify a larger value, garbage collection will happen less +often. This reduces the amount of time spent garbage collecting, but +increases total memory use. 
You may want to do this when running a program +that creates lots of Lisp data. + +You can make collections more frequent by specifying a smaller value, down +to 1/10th of @code{GC_DEFAULT_THRESHOLD}. A value less than this minimum +will remain in effect only until the subsequent garbage collection, at which +time @code{garbage-collect} will set the threshold back to the minimum. @end defopt @defopt gc-cons-percentage @@ -639,7 +699,12 @@ the number of Lisp arguments, it must have exactly two C arguments: the first is the number of Lisp arguments, and the second is the address of a block containing their values. These have types -@code{int} and @w{@code{Lisp_Object *}} respectively. +@code{int} and @w{@code{Lisp_Object *}} respectively. Since +@code{Lisp_Object} can hold any Lisp object of any data type, you +can determine the actual data type only at run time; so if you want +a primitive to accept only a certain type of argument, you must check +the type explicitly using a suitable predicate (@pxref{Type Predicates}). +@cindex type checking internals @cindex @code{GCPRO} and @code{UNGCPRO} @cindex protect C variables from garbage collection @@ -820,23 +885,70 @@ @section Object Internals @cindex object internals -@c FIXME Is this still true? Does --with-wide-int affect anything? - GNU Emacs Lisp manipulates many different types of data. The actual -data are stored in a heap and the only access that programs have to it -is through pointers. Each pointer is 32 bits wide on 32-bit machines, -and 64 bits wide on 64-bit machines; three of these bits are used for -the tag that identifies the object's type, and the remainder are used -to address the object. - - Because Lisp objects are represented as tagged pointers, it is always -possible to determine the Lisp data type of any object. The C data type -@code{Lisp_Object} can hold any Lisp object of any data type. 
Ordinary -variables have type @code{Lisp_Object}, which means they can hold any -type of Lisp value; you can determine the actual data type only at run -time. The same is true for function arguments; if you want a function -to accept only a certain type of argument, you must check the type -explicitly using a suitable predicate (@pxref{Type Predicates}). -@cindex type checking internals + Emacs Lisp provides a rich set of the data types. Some of them, like cons +cells, integers and stirngs, are common to nearly all Lisp dialects. Some +others, like markers and buffers, are quite special and needed to provide +the basic support to write editor commands in Lisp. To implement such +a variety of object types and provide an efficient way to pass objects between +the subsystems of an interpreter, there is a set of C data structures and +a special type to represent the pointers to all of them, which is known as +@dfn{tagged pointer}. + + In C, the tagged pointer is an object of type @code{Lisp_Object}. Any +initialized variable of such a type always holds the value of one of the +following basic data types: integer, symbol, string, cons cell, float, +vectorlike or miscellaneous object. Each of these data types has the +corresponding tag value. All tags are enumerated by @code{enum Lisp_Type} +and placed into a 3-bit bitfield of the @code{Lisp_Object}. The rest of the +bits is the value itself. Integer values are immediate, e.g. directly +represented by those @dfn{value bits}, and all other objects are represented +by the C pointers to a corresponding object allocated from the heap. Width +of the @code{Lisp_Object} is platform- and configuration-dependent: usually +it's equal to the width of an underlying platform pointer (e.g. 32-bit on +a 32-bit machine and 64-bit on a 64-bit one), but also there is a special +configuration where @code{Lisp_Object} is 64-bit but all pointers are 32-bit. 
+The latter trick was designed to overcome the limited range of values for +Lisp integers on a 32-bit system by using 64-bit @code{long long} type for +@code{Lisp_Object}. + + The following C data structures are defined in @file{lisp.h} to represent +the basic data types beyond integers: + +@table @code +@item struct Lisp_Cons +Cons cell, an object used to construct lists. + +@item struct Lisp_String +String, the basic object to represent a sequence of characters. + +@item struct Lisp_Vector +Array, a fixed-size set of Lisp objects which may be accessed by an index. + +@item struct Lisp_Symbol +Symbol, the unique-named entity commonly used as an identifier. + +@item struct Lisp_Float +Floating point value. + +@item union Lisp_Misc +Miscellaneous kinds of objects which don't fit into any of the above. +@end table + + These types are the first-class citizens of an internal type system. +Since the tag space is limited, all other types are the subtypes of either +@code{Lisp_Vectorlike} or @code{Lisp_Misc}. Vector subtypes are enumerated +by @code{enum pvec_type}, and nearly all complex objects like windows, buffers, +frames, and processes fall into this category. The rest of special types, +including markers and overlays, are enumerated by @code{enum Lisp_Misc_Type} +and form the set of subtypes of @code{Lisp_Misc}. + + Below there is a description of a few subtypes of @code{Lisp_Vectorlike}. +Buffer object represents the text to display and edit. Window is the part +of display structure which shows the buffer or used as a container to +recursively place other windows on the same frame. (Do not confuse Emacs Lisp +window object with the window as an entity managed by the user interface +system like X; in Emacs terminology, the latter is called frame.) Finally, +process object is used to manage the subprocesses. @menu * Buffer Internals:: Components of a buffer structure. 
@@ -912,12 +1024,8 @@

 @table @code
 @item header
-A @code{struct vectorlike_header} structure where @code{header.next}
-points to the next buffer, in the chain of all buffers (including
-killed buffers). This chain is used only for garbage collection, in
-order to collect killed buffers properly. Note that vectors, and most
-kinds of objects allocated as vectors, are all on one chain, but
-buffers are on a separate chain of their own.
+A header of type @code{struct vectorlike_header} is common to all
+vectorlike objects.

 @item own_text
 A @code{struct buffer_text} structure that ordinarily holds the buffer
@@ -928,6 +1036,11 @@
 ordinary buffer, this is the @code{own_text} field above. In an
 indirect buffer, this is the @code{own_text} field of the base buffer.

+@item next
+A pointer to the next buffer, in the chain of all buffers, including
+killed buffers. This chain is used only for allocation and garbage
+collection, in order to collect killed buffers properly.
+
 @item pt
 @itemx pt_byte
 The character and byte positions of point in a buffer.
* Re: [corrector(s) needed] doc/lispref/internals.texi tweaks
  2012-11-14 16:30 ` Dmitry Antipov
@ 2012-11-14 17:10 ` Eli Zaretskii
  0 siblings, 0 replies; 4+ messages in thread
From: Eli Zaretskii @ 2012-11-14 17:10 UTC (permalink / raw)
To: Dmitry Antipov; +Cc: emacs-devel

> Date: Wed, 14 Nov 2012 20:30:57 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Emacs development discussions <emacs-devel@gnu.org>
>
> >> Date: Mon, 12 Nov 2012 15:20:47 +0400
> >> From: Dmitry Antipov <dmantipov@yandex.ru>
> >>
> >> There are some bits for doc/lispref/internals.texi, which looks a bit
> >> outdated and incomplete...
> >
> > Thanks, some comments below.
>
> Great thanks! Do you have the patience for one more round?

Yep.

> +@code{garbage-collect} returns a list with information on amount of space
> +in use, where each entry has the form @samp{(name size used)} or
> +@samp{(name size used free)}. In the entry, @samp{name} is a symbol

Here, "name", "size", "used", etc. stand for something else, so they
should be in @var, both inside @samp and in the text that describes
them.

> +@item cons-size
> +Internal size of a cons cell, e.g. @code{sizeof (struct Lisp_Cons)}.

If you have a period that doesn't end a sentence and is followed by a
space, put a @: between the period and the space, so that the
typesetter will know this isn't the end of a sentence, and typesets
the space correctly. IOW, "e.g.@:".

And I think you mean "i.e.", not "e.g." here (and elsewhere in this
table).

> +when the threshold is exhausted, but only the next time the Lisp evaluator
> +is called.

"Lisp interpreter", I think.

> +bits is the value itself. Integer values are immediate, e.g. directly
                                                           ^^^^
"i.e.@:"

> +it's equal to the width of an underlying platform pointer (e.g. 32-bit on
                                                              ^^^^
"i.e.@:"

Thanks.