unofficial mirror of emacs-devel@gnu.org 
* strings referred to by Vsjis_coding_system, Vbig5_coding_system
@ 2009-11-11 21:42 Dan Nicolaescu
  2009-11-16 13:01 ` Kenichi Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Nicolaescu @ 2009-11-11 21:42 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

Handa-san,

Strings like this:

ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ

are found in GC memory in Emacs; they are referred to by
Vsjis_coding_system, Vbig5_coding_system, and maybe other variables.

Is there any chance that such strings can be put in pure memory?
Not sure where they are created, and what they do...

Thanks

        --dan





* Re: strings referred to by Vsjis_coding_system, Vbig5_coding_system
  2009-11-11 21:42 strings referred to by Vsjis_coding_system, Vbig5_coding_system Dan Nicolaescu
@ 2009-11-16 13:01 ` Kenichi Handa
  2009-11-16 15:00   ` Dan Nicolaescu
  0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2009-11-16 13:01 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: emacs-devel

In article <200911112142.nABLgZwS024554@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:

> Handa-san,
> Strings like this:

> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ

> are found in GC memory in Emacs; they are referred to by
> Vsjis_coding_system, Vbig5_coding_system, and maybe other variables.

> Is there any chance that such strings can be put in pure memory?
> Not sure where they are created, and what they do...

I have no idea.  Are you sure that those variables refer
to that string?  According to the following gdb session, at
least Vbig5_coding_system doesn't refer to such a string
directly.

(gdb) p Vbig5_coding_system
$1 = 139728474
(gdb) xtype
Lisp_Symbol
(gdb) xsymbol
$2 = (struct Lisp_Symbol *) 0x8541658
"chinese-big5"
(gdb) p *$2
$3 = {
  gcmarkbit = 0, 
  indirect_variable = 0, 
  constant = 0, 
  interned = 2, 
  xname = 137571105, 
  value = 138911026, 
  function = 138911026, 
  plist = 138911002, 
  next = 0x85356b0
}
(gdb) p $3->value
$4 = 138911026
(gdb) xtype
Lisp_Symbol
(gdb) xsymbol
$5 = (struct Lisp_Symbol *) 0x8479d30
"unbound"
(gdb) p $3->function
$6 = 138911026
(gdb) p $3->plist
$7 = 138911002
(gdb) xtype
Lisp_Symbol
(gdb) xsymbol
$8 = (struct Lisp_Symbol *) 0x8479d18
"nil"

By the way, I've just found that the "pr" command doesn't work
with the latest code:

(gdb) p Vbig5_coding_system
$1 = 139728474
(gdb) pr
Cannot access memory at address 0x842e030

---
Kenichi Handa
handa@m17n.org





* Re: strings referred to by Vsjis_coding_system, Vbig5_coding_system
  2009-11-16 13:01 ` Kenichi Handa
@ 2009-11-16 15:00   ` Dan Nicolaescu
  2009-11-17  4:52     ` Kenichi Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Nicolaescu @ 2009-11-16 15:00 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

  > In article <200911112142.nABLgZwS024554@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
  > 
  > > Handa-san,
  > > Strings like this:
  > 
  > > ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
  > 
  > > are found in GC memory in Emacs; they are referred to by
  > > Vsjis_coding_system, Vbig5_coding_system, and maybe other variables.
  > 
  > > Is there any chance that such strings can be put in pure memory?
  > > Not sure where they are created, and what they do...
  > 
  > I have no idea.  Are you sure that those variables refer
  > to that string?  According to the following gdb session, at
  > least Vbig5_coding_system doesn't refer to such a string
  > directly.
  > 

I can't find one at the moment, and unfortunately I did not save a
debugging session.
All the ones that I find at the moment are referred to from Vcoding_system_hash_table.
Can you please figure out this one?

[Note that I have some instrumentation code in alloc.c, so the line
numbers will not match, but none of the semantics are changed]


Breakpoint 3, mark_object (arg=0x84d8e99) at alloc.c:5538
5538		MARK_STRING (ptr);
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.21-3.fc11.i586 atk-1.25.2-2.fc11.i586 bzip2-libs-1.0.5-5.fc11.i586 cairo-1.8.8-1.fc11.i586 dbus-libs-1.2.12-2.fc11.i586 e2fsprogs-libs-1.41.4-12.fc11.i586 expat-2.0.1-6.fc11.1.i586 fontconfig-2.7.1-1.fc11.i586 freetype-2.3.9-5.fc11.i586 giflib-4.1.6-2.fc11.i586 glib2-2.20.5-1.fc11.i586 glibc-2.10.1-5.i686 gpm-libs-1.20.6-3.fc11.i586 gtk-nodoka-engine-0.7.2-5.fc11.i586 gtk2-2.16.6-2.fc11.i586 libICE-1.0.4-7.fc11.i586 libSM-1.1.0-4.fc11.i586 libX11-1.2.2-1.fc11.i586 libXau-1.0.4-5.fc11.i586 libXcomposite-0.4.0-7.fc11.i586 libXcursor-1.1.9-4.fc11.i586 libXdamage-1.1.1-6.fc11.i586 libXext-1.0.99.1-3.fc11.i586 libXfixes-4.0.3-5.fc11.i586 libXft-2.1.13-2.fc11.i586 libXi-1.2.1-1.fc11.i586 libXinerama-1.0.3-4.fc11.i586 libXpm-3.5.7-5.fc11.i586 libXrandr-1.2.99.4-3.fc11.i586 libXrender-0.9.4-5.fc11.i586 libattr-2.4.43-3.fc11.i586 libcap-2.16-4.fc11.1.i586 libcroco-0.6.2-2.fc11.i586 libgsf-1.14.11-3.fc11.i586 libjpeg-6b-45.fc11.i586 libpng-1.2.37-1.fc11.i586 librsvg2-2.26.0-1.fc11.i586 libselinux-2.0.80-1.fc11.i586 libtiff-3.8.2-14.fc11.i586 libxcb-1.2-4.fc11.i586 libxml2-2.7.6-1.fc11.i586 ncurses-libs-5.7-2.20090207.fc11.i586 pango-1.24.5-1.fc11.i586 pixman-0.14.0-2.fc11.i586 zlib-1.2.3-22.fc11.i586
(gdb) pp arg
"ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ\0"
(gdb) up
#1  0x0817c5af in mark_vectorlike (ptr=0x862bf10) at alloc.c:5419
5419	    mark_object (ptr->contents[i]);
(gdb) up
#2  0x0817c974 in mark_object (arg=0x862bf15) at alloc.c:5640
5640		mark_vectorlike (XVECTOR (obj));
(gdb) pp arg
[]
(gdb) up
#3  0x0817c5af in mark_vectorlike (ptr=0x8552298) at alloc.c:5419
5419	    mark_object (ptr->contents[i]);
(gdb) pp arg
#4  0x0817c974 in mark_object (arg=0x855229d) at alloc.c:5640
5640		mark_vectorlike (XVECTOR (obj));
(gdb) pp arg
[]
(gdb) up
#5  0x0817c5af in mark_vectorlike (ptr=0x864ab48) at alloc.c:5419
5419	    mark_object (ptr->contents[i]);
(gdb) pp arg
No symbol "arg" in current context.
(gdb) up
#6  0x0817c974 in mark_object (arg=0x864ab4d) at alloc.c:5640
5640		mark_vectorlike (XVECTOR (obj));
(gdb) pp arg
[]
(gdb) up
#7  0x0817c910 in mark_object (arg=0x8429f2d) at alloc.c:5629
5629			MAYBE_MARK_OBJECT (h->key_and_value);
(gdb) pp arg
#s(hash-table size -2147482553 test eq rehash-size 1.5 rehash-threshold 0.8 data ())
(gdb) up
#8  0x0817be26 in Fgarbage_collect () at alloc.c:5108
5108	    mark_object (*staticvec[i]);
(gdb) p i
$1 = 0x1a4

0x1a4 in staticvec corresponds to Vcoding_system_hash_table.
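
For readers unfamiliar with the idea, here is a minimal sketch of how an
index into a table of GC roots can be mapped back to a variable.  It only
assumes what the trace shows, namely that each staticvec slot is
dereferenced before being passed to mark_object.  Every toy_ name, the
two-slot table and the values are hypothetical and are not taken from
alloc.c.

```c
#include <assert.h>
#include <stddef.h>

typedef long toy_lisp_object;   /* hypothetical stand-in for Lisp_Object */

/* Two hypothetical root variables.  */
static toy_lisp_object toy_Vfoo = 1;
static toy_lisp_object toy_Vcoding_system_hash_table = 2;

/* Like staticvec in the trace, each slot holds the address of a root
   variable, so the collector can call mark on *toy_staticvec[i].  */
static toy_lisp_object *toy_staticvec[] = {
  &toy_Vfoo,
  &toy_Vcoding_system_hash_table,
};

/* Return the slot that points at VAR, or -1 if VAR is not a root.  */
static ptrdiff_t
toy_staticvec_index (const toy_lisp_object *var)
{
  size_t n = sizeof toy_staticvec / sizeof toy_staticvec[0];

  assert (var != NULL);
  for (size_t i = 0; i < n; i++)
    if (toy_staticvec[i] == var)
      return (ptrdiff_t) i;
  return -1;
}
```

In a real session the same question is usually answered the other way
round, by printing the slot in the debugger and comparing the address with
that of the suspected variable.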

  > By the way, I've just found that the "pr" command doesn't work
  > with the latest code:
  > 
  > (gdb) p Vbig5_coding_system
  > $1 = 139728474
  > (gdb) pr
  > Cannot access memory at address 0x842e030

I've found many instances where "pr" crashes the debugger...





* Re: strings referred to by Vsjis_coding_system, Vbig5_coding_system
  2009-11-16 15:00   ` Dan Nicolaescu
@ 2009-11-17  4:52     ` Kenichi Handa
  2009-11-17  5:08       ` Dan Nicolaescu
  0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2009-11-17  4:52 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: emacs-devel

In article <200911161500.nAGF0KYN015109@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:

> I can't find one at the moment, and unfortunately I did not save a
> debugging session.
> All the ones that I find at the moment are referred to from Vcoding_system_hash_table.
> Can you please figure out this one?

Ah, then I think that \377 string records the safely
encodable character sets for each coding system.  If the Nth
element is not \377, the charset whose ID is N is encodable
by that coding system.

And, yes, those strings can be in pure memory when created
by temacs.  But note that one can change the definition of a
coding system at runtime (though that probably happens rarely),
which leaves the string in pure memory referenced from nowhere.
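
The lookup rule described above can be sketched in a few lines of C.  This
is only an illustration of "element N is not \377 means charset N is
encodable"; the function names are hypothetical and are not the ones used
in coding.c, and treating out-of-range IDs as not encodable is an
assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the Emacs API.  SAFE is a byte string in
   which element N describes the charset whose ID is N: 0xFF (\377)
   means "not encodable", any other value means "encodable".  */
static bool
charset_encodable_p (const unsigned char *safe, size_t len,
                     size_t charset_id)
{
  assert (safe != NULL);
  /* Assumption of this sketch: IDs past the end are not encodable.  */
  if (charset_id >= len)
    return false;
  return safe[charset_id] != 0xFF;
}

/* Count how many charsets SAFE records as encodable.  */
static size_t
count_encodable (const unsigned char *safe, size_t len)
{
  size_t n = 0;

  assert (safe != NULL);
  for (size_t i = 0; i < len; i++)
    if (safe[i] != 0xFF)
      n++;
  return n;
}
```

A string that is all \377, like the one shown at the start of the thread,
would therefore describe a coding system for which none of the first
charset IDs is recorded as safely encodable.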

---
Kenichi Handa
handa@m17n.org





* Re: strings referred to by Vsjis_coding_system, Vbig5_coding_system
  2009-11-17  4:52     ` Kenichi Handa
@ 2009-11-17  5:08       ` Dan Nicolaescu
  2009-11-17  8:10         ` Kenichi Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Nicolaescu @ 2009-11-17  5:08 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

  > In article <200911161500.nAGF0KYN015109@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
  > 
  > > I can't find one at the moment, and unfortunately I did not save a
  > > debugging session.
  > > All the ones that I find at the moment are referred to from Vcoding_system_hash_table.
  > > Can you please figure out this one?
  > 
  > Ah, then I think that \377 string records the safely
  > encodable character sets for each coding system.  If the Nth
  > element is not \377, the charset whose ID is N is encodable
  > by that coding system.
  > 
  > And, yes, those strings can be in pure memory when created
  > by temacs.  But note that one can change the definition of a
  > coding system at runtime (though that probably happens rarely),
  > which leaves the string in pure memory referenced from nowhere.

That's fine: instead of everyone paying the price of GCing all
those strings, it's not too bad if they are dead in the very rare case
that a coding system is changed at runtime.
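
To make the trade-off concrete, here is a toy sketch of a mark pass that
skips pure objects, which is the saving being argued for.  It is not the
code in alloc.c; the struct, the pure flag and the function name are all
hypothetical, and a real collector would also follow references out of
each object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object header for this sketch only.  */
struct toy_object
{
  bool pure;    /* lives in pure memory, never collected */
  bool marked;  /* set by the mark pass */
};

/* Mark every non-pure object reachable from ROOTS and return how many
   objects had to be marked.  Pure objects cost nothing here, but they
   also stay allocated even if nothing refers to them any more.  */
static size_t
toy_mark_roots (struct toy_object **roots, size_t n)
{
  size_t work = 0;

  assert (roots != NULL);
  for (size_t i = 0; i < n; i++)
    {
      struct toy_object *obj = roots[i];

      if (obj == NULL || obj->pure)
        continue;               /* nothing to do for pure objects */
      if (!obj->marked)
        {
          obj->marked = true;
          work++;
        }
    }
  return work;
}
```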

Are there any other parts of the coding system that are in GC memory and
can be put in pure memory?





* Re: strings referred to by Vsjis_coding_system, Vbig5_coding_system
  2009-11-17  5:08       ` Dan Nicolaescu
@ 2009-11-17  8:10         ` Kenichi Handa
  0 siblings, 0 replies; 6+ messages in thread
From: Kenichi Handa @ 2009-11-17  8:10 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: emacs-devel

In article <200911170508.nAH585Rh021954@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:

> That's fine: instead of everyone paying the price of GCing all
> those strings, it's not too bad if they are dead in the very rare case
> that a coding system is changed at runtime.

> Are there any other parts of the coding system that are in GC memory and
> can be put in pure memory?

Each value in Vcoding_system_hash_table is the attributes
vector of a coding system.  The elements of this vector are
indexed by enum coding_attr_index (in coding.h).  Among those
elements, I think these can be in pure memory:

coding_attr_docstring (string)
coding_attr_safe_charsets (string)
coding_attr_charset_valids (vector)
coding_attr_iso_initial (vector)
coding_attr_iso_usage (cons)
coding_attr_iso_request (cons)
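
As a rough sketch of what "indexed by an enum" means for the list above,
the following C fragment models a reduced attributes vector and a
predicate for the six slots named.  The sketch_ names are hypothetical:
the real enum coding_attr_index has more members and its own ordering,
and nothing here is copied from coding.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative subset only, in an arbitrary order.  */
enum sketch_attr_index
{
  sketch_attr_docstring,        /* string */
  sketch_attr_safe_charsets,    /* string */
  sketch_attr_charset_valids,   /* vector */
  sketch_attr_iso_initial,      /* vector */
  sketch_attr_iso_usage,        /* cons */
  sketch_attr_iso_request,      /* cons */
  sketch_attr_other,            /* placeholder for every other slot */
  sketch_attr_count
};

/* True for the slots listed above as candidates for pure memory.  */
static bool
attr_pure_candidate_p (enum sketch_attr_index idx)
{
  assert (idx < sketch_attr_count);
  switch (idx)
    {
    case sketch_attr_docstring:
    case sketch_attr_safe_charsets:
    case sketch_attr_charset_valids:
    case sketch_attr_iso_initial:
    case sketch_attr_iso_usage:
    case sketch_attr_iso_request:
      return true;
    default:
      return false;
    }
}

/* Count the candidate slots in the reduced vector.  */
static int
count_pure_candidates (void)
{
  int n = 0;

  for (int i = 0; i < sketch_attr_count; i++)
    if (attr_pure_candidate_p ((enum sketch_attr_index) i))
      n++;
  return n;
}
```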

---
Kenichi Handa
handa@m17n.org



