From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Eli Zaretskii <eliz@gnu.org>
Cc: schwab@linux-m68k.org, rms@gnu.org, 11519@debbugs.gnu.org,
lekktu@gmail.com
Subject: bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping
Date: Wed, 23 May 2012 16:07:05 -0400
Message-ID: <jwv396qheah.fsf-monnier+emacs@gnu.org>
In-Reply-To: <838vgiyh4q.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 23 May 2012 19:52:21 +0300")
>> > Which other places use C pointers to buffer text and call functions
>> > that can allocate memory?
>> IIUC any place that uses STRING_CHAR_AND_LENGTH on buffer text is
>> vulnerable to the problem.
> That's not true. As long as you access buffer text through character
> position, you are safe.
Right, some of those uses might be safe, indeed. Of course it's not
only STRING_CHAR_AND_LENGTH but STRING_CHAR_ADVANCE as well, together
with FETCH_* macros which use those, etc...
Grepping for those macros shows they're used in *many* places, and I'd
be amazed if re_search were the only place where we don't go through the
BYTE_POS_ADDR rigmarole.
Let's see ...hmmm... yup, set-buffer-multibyte is another example,
find_charsets_in_text yet another, and I'm not even trying hard.
Just grep for "STRING_CHAR_" and see for yourself.
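
To make the hazard concrete, here is a minimal sketch in plain C. It is
not Emacs code: struct toy_buffer, toy_relocate and toy_fetch_byte are
hypothetical names invented for illustration, and the relocation is
simulated with malloc/memcpy/free rather than ralloc.c. It only shows
why a raw pointer into the text cannot be kept across a call that may
move the text, while a byte position can.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for relocatable buffer text (hypothetical, not the
   Emacs data structure).  */
struct toy_buffer
{
  unsigned char *text;   /* may move whenever the buffer is relocated */
  size_t size;
};

/* Hypothetical helper: move the text to a freshly allocated block of
   NEW_SIZE bytes, the way a relocating allocator may move buffer text.
   Returns 0 on success, -1 if allocation fails.  */
static int
toy_relocate (struct toy_buffer *b, size_t new_size)
{
  unsigned char *moved = malloc (new_size);
  size_t keep = b->size < new_size ? b->size : new_size;

  if (!moved)
    return -1;
  memcpy (moved, b->text, keep);
  memset (moved + keep, 0, new_size - keep);
  free (b->text);        /* every old pointer into the text now dangles */
  b->text = moved;
  b->size = new_size;
  return 0;
}

/* Safe pattern: the caller keeps a byte position, and the address is
   re-derived from b->text on every access.  */
static unsigned char
toy_fetch_byte (const struct toy_buffer *b, size_t bytepos)
{
  assert (bytepos < b->size);
  return b->text[bytepos];
}
```

A caller that saved `b->text + 2` before toy_relocate would be reading
freed memory afterwards; a caller that saved the position 2 and calls
toy_fetch_byte afterwards still gets the right byte.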
>> If you really want to install your workaround on the emacs-24 branch, go
>> for it but let's try to find a real fix for the trunk.
> What kind of real fix are you looking for?
One that lets us write code without having to worry about such
corner cases. E.g. changing STRING_CHAR_ADVANCE so it can't cause
relocation. Not using ralloc.c any more would be another good option.
Otherwise, changing our macros so they do the BYTE_POS_ADDR
internally, discouraging the use of direct pointers into the
buffer's content.
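
As a rough sketch of that last option (an accessor that does the address
lookup internally), the following plain C function takes a byte position
instead of a pointer. The names struct toy_text and toy_char_advance are
hypothetical, and the decoder assumes well-formed standard UTF-8 with no
validation, which is a simplification of what the real macros handle.

```c
#include <assert.h>
#include <stddef.h>

/* Toy view of some text (hypothetical names).  */
struct toy_text
{
  const unsigned char *beg;   /* start of the text; may change between calls */
  size_t nbytes;
};

/* Decode the UTF-8 sequence starting at *BYTEPOS and advance *BYTEPOS
   past it.  The address is recomputed from t->beg on every call, so the
   caller never holds a raw pointer into the text between calls.
   Assumes well-formed input; performs no validation.  */
static unsigned int
toy_char_advance (const struct toy_text *t, size_t *bytepos)
{
  const unsigned char *p;
  unsigned int c;
  size_t len = 1;
  size_t i;

  assert (*bytepos < t->nbytes);
  p = t->beg + *bytepos;
  c = p[0];
  if (c >= 0xF0)
    {
      c &= 0x07;
      len = 4;
    }
  else if (c >= 0xE0)
    {
      c &= 0x0F;
      len = 3;
    }
  else if (c >= 0xC0)
    {
      c &= 0x1F;
      len = 2;
    }
  assert (*bytepos + len <= t->nbytes);
  for (i = 1; i < len; i++)
    c = (c << 6) | (p[i] & 0x3F);   /* append 6 payload bits per trail byte */
  *bytepos += len;
  return c;
}
```

The cost is one extra address computation per character, in exchange
for callers not having to reason about whether anything in between
could have moved the text.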
> Why shouldn't it be the fix in this case, and what better fix can we
> invent when we use an essentially externally maintained code (AFAIR,
> regex will at some point be re-sync'ed with gnulib) that cannot be
> expected to change its code radically so as not to access buffer text
> through C pointers?
To me, it's clearly a workaround. It's an OK workaround if/when we use
a "blackbox" (i.e. externally maintained) regexp engine and keep using
ralloc.c. But better would be to eliminate the problem altogether.
>> But on other platforms where we use mmap, we do suffer from this
>> fragmentation, and yet it doesn't seem to be a real source of problem.
> I don't know enough about mmap to answer that. I vaguely recollect
> that mmap avoids such fragmentation as an inherent feature, but I may
> be wrong.
No, fragmentation is a property of the address space, so without
relocation you can't avoid it.
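
A toy model of this point, assuming a tiny "address space" of eight
equal slots (TOY_SLOTS, toy_largest_free_run and toy_compact are
hypothetical names, and real allocators are of course more involved):
half the slots can be free while the largest contiguous free run is a
single slot, and only moving the live blocks makes the free space
contiguous.

```c
#include <stddef.h>

#define TOY_SLOTS 8

/* Length of the longest run of free slots; 0 marks a free slot, any
   other value is the id of the block occupying it.  */
static size_t
toy_largest_free_run (const int slots[TOY_SLOTS])
{
  size_t best = 0, run = 0, i;

  for (i = 0; i < TOY_SLOTS; i++)
    {
      if (slots[i] == 0)
        {
          run++;
          if (run > best)
            best = run;
        }
      else
        run = 0;
    }
  return best;
}

/* Relocation: slide every used slot toward the start, preserving
   order, so that all free slots end up contiguous at the end.  */
static void
toy_compact (int slots[TOY_SLOTS])
{
  size_t dst = 0, src;

  for (src = 0; src < TOY_SLOTS; src++)
    if (slots[src] != 0)
      slots[dst++] = slots[src];
  while (dst < TOY_SLOTS)
    slots[dst++] = 0;
}
```

With the layout {1, 0, 2, 0, 3, 0, 4, 0}, four slots are free but a
request for two contiguous slots cannot be satisfied until toy_compact
has moved the live blocks.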
>> I guess my question turns into "why do we use gmalloc.c instead of
>> a malloc library that uses mmap (or some other mechanism that lets it
>> return large free chunks to the OS)"?
> Use of gmalloc is a different issue. We were talking about ralloc.c.
> You could use one, but not the other.
Well, we still use ralloc because we don't use mmap, so the question to
me is: why don't we use mmap (either via a malloc that does, or
directly via USE_MMAP_FOR_BUFFERS) and get rid of ralloc.c?
>> AFAIK, Windows is pretty much the only system where we use gmalloc.c and
>> ralloc.c nowadays.
> My reading of configure is that we use it on more than just Windows
> (and MS-DOS). Basically, any platform that uses gmalloc.c (which is
> the default, turned off only for GNU/Linux and Darwin) also uses
> ralloc.c.
To me "all minus GNU/Linux, Mac OS X, and Cygwin (which apparently uses
gmalloc but not ralloc)" is pretty close to "just Windows" nowadays.
>> Does anyone remember why we don't use the system malloc under
>> Windows (and Cygwin)?
> I find it hard to believe that going through system malloc on
> MS-Windows will let us use buffers as large as 1.5 GB (on a 32-bit
> machine). To achieve this today, we reserve a 2GB contiguous chunk of
> address space at startup, and then commit and uncommit parts of it as
> needed (see w32heap.c). ralloc.c has an important part in this
> arrangement.
You mean that the memory managed by Windows's system malloc library is
too fragmented to be able to allocate a single 1.5GB chunk? Why?
[ I know next to nothing about the w32 API and plead guilty to
POSIX-only thinking, so please bear with me. ]
Stefan