unofficial mirror of bug-gnu-emacs@gnu.org 
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Eli Zaretskii <eliz@gnu.org>
Cc: schwab@linux-m68k.org, rms@gnu.org, 11519@debbugs.gnu.org,
	lekktu@gmail.com
Subject: bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping
Date: Wed, 23 May 2012 16:07:05 -0400
Message-ID: <jwv396qheah.fsf-monnier+emacs@gnu.org>
In-Reply-To: <838vgiyh4q.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 23 May 2012 19:52:21 +0300")

>> > Which other places use C pointers to buffer text and call functions
>> > that can allocate memory?
>> IIUC any place that uses STRING_CHAR_AND_LENGTH on buffer text is
>> vulnerable to the problem.
> That's not true.  As long as you access buffer text through character
> position, you are safe.

Right, some of those uses might be safe, indeed.  Of course it's not
only STRING_CHAR_AND_LENGTH but STRING_CHAR_ADVANCE as well, together
with FETCH_* macros which use those, etc...

Grepping for those macros shows they're used in *many* places, and I'd
be amazed if re_search is the only place where we don't go through the
BYTE_POS_ADDR rigmarole.

Let's see ...hmmm... yup, set-buffer-multibyte is another example,
find_charsets_in_text yet another, and I'm not even trying hard.
Just grep for "STRING_CHAR_" and see for yourself.

>> If you really want to install your workaround on the emacs-24 branch, go
>> for it but let's try to find a real fix for the trunk.
> What kind of real fix are you looking for?

One that lets us write code without having to worry about such
corner cases.  E.g. changing STRING_CHAR_ADVANCE so it can't cause
relocation.  Not using ralloc.c any more would be another good option.
Otherwise, changing our macros so they do the BYTE_POS_ADDR
internally, discouraging the use of direct pointers into the
buffer's content.
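
[ A minimal sketch of that last option, under loud assumptions: `struct
  text_buf`, `text_buf_relocate` and `text_buf_fetch` are hypothetical
  stand-ins, not Emacs's real buffer structure or macros, and relocation
  is simulated with plain malloc/free rather than ralloc.c.  The point is
  only that the accessor takes a byte position and derives the address
  from the current base on every call, so no pointer survives a call
  that might move the text.  ]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a buffer whose text may be moved by the
   allocator; this is NOT Emacs's real struct buffer.  */
struct text_buf
{
  unsigned char *text;  /* current base address; may change */
  size_t size;          /* number of bytes in use */
};

/* Initialize TB with a copy of the SIZE bytes at SRC.
   Return 0 on success, -1 if allocation fails.  */
static int
text_buf_init (struct text_buf *tb, const char *src, size_t size)
{
  tb->text = malloc (size ? size : 1);
  if (!tb->text)
    return -1;
  memcpy (tb->text, src, size);
  tb->size = size;
  return 0;
}

/* Simulate a relocation: copy the text to a fresh block and release
   the old one.  Any raw pointer into the old block is invalid after
   this returns.  Return 0 on success, -1 if allocation fails.  */
static int
text_buf_relocate (struct text_buf *tb)
{
  unsigned char *fresh = malloc (tb->size ? tb->size : 1);
  if (!fresh)
    return -1;
  memcpy (fresh, tb->text, tb->size);
  free (tb->text);
  tb->text = fresh;
  return 0;
}

/* Fetch the byte at BYTEPOS.  The address is derived from the current
   base on every call, so the caller never holds a pointer into the
   text.  */
static int
text_buf_fetch (const struct text_buf *tb, size_t bytepos)
{
  assert (bytepos < tb->size);
  return tb->text[bytepos];
}

/* Count occurrences of byte C, forcing a relocation after every fetch
   to mimic a callee that allocates memory.  Return -1 if a relocation
   fails.  */
static long
count_byte_with_relocation (struct text_buf *tb, int c)
{
  long n = 0;
  for (size_t pos = 0; pos < tb->size; pos++)
    {
      if (text_buf_fetch (tb, pos) == c)
        n++;
      if (text_buf_relocate (tb) != 0)
        return -1;
    }
  return n;
}
```

[ The loop keeps only `pos`, an offset, across the relocating call.  Had
  it kept an `unsigned char *` obtained before the first relocation, the
  second iteration would read freed memory.  ]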

> Why shouldn't it be the fix in this case, and what better fix can we
> invent when we use an essentially externally maintained code (AFAIR,
> regex will at some point be re-sync'ed with gnulib) that cannot be
> expected to change its code radically so as not to access buffer text
> through C pointers?

To me, it's clearly a workaround.  It's an OK workaround if/when we use
a "blackbox" (i.e. externally maintained) regexp engine and keep using
ralloc.c.  But better would be to eliminate the problem altogether.

>> But on other platforms where we use mmap, we do suffer from this
>> fragmentation, and yet it doesn't seem to be a real source of problems.
> I don't know enough about mmap to answer that.  I vaguely recollect
> that mmap avoids such fragmentation as an inherent feature, but I may
> be wrong.

No, fragmentation is a property of the address space, so without
relocation you can't avoid it.
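
[ To illustrate, a toy model, assuming an "address space" of 8 equal
  slots; `struct arena` and its helpers are hypothetical names made up
  for this sketch and model no real allocator.  With live blocks
  scattered, the total free space can exceed the largest contiguous free
  run, and only moving the live blocks (i.e. relocation) closes that
  gap.  ]

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SLOTS 8

/* Toy address space: each slot holds 0 when free, or the nonzero id
   of the block that owns it.  Purely illustrative.  */
struct arena
{
  int slot[ARENA_SLOTS];
};

/* Return the total number of free slots.  */
static size_t
arena_total_free (const struct arena *a)
{
  size_t n = 0;
  for (size_t i = 0; i < ARENA_SLOTS; i++)
    if (a->slot[i] == 0)
      n++;
  return n;
}

/* Return the length of the longest run of adjacent free slots, i.e.
   the largest request that can be satisfied without moving anything.  */
static size_t
arena_largest_free_run (const struct arena *a)
{
  size_t best = 0, run = 0;
  for (size_t i = 0; i < ARENA_SLOTS; i++)
    {
      run = a->slot[i] == 0 ? run + 1 : 0;
      if (run > best)
        best = run;
    }
  return best;
}

/* Relocate every live slot toward the low end, preserving order, so
   that all free slots end up adjacent at the high end.  */
static void
arena_compact (struct arena *a)
{
  size_t dst = 0;
  for (size_t i = 0; i < ARENA_SLOTS; i++)
    if (a->slot[i] != 0)
      a->slot[dst++] = a->slot[i];
  while (dst < ARENA_SLOTS)
    a->slot[dst++] = 0;
}
```

[ Starting from the layout {1,1,0,0,3,3,0,0}, four slots are free but
  the longest free run is two; after `arena_compact` the same four free
  slots form a single run.  ]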

>> I guess my question turns into "why do we use gmalloc.c instead of
>> a malloc library that uses mmap (or some other mechanism that lets it
>> return large free chunks to the OS)"?
> Use of gmalloc is a different issue.  We were talking about ralloc.c.
> You could use one, but not the other.

Well, still we use ralloc because we don't use mmap, so the question to
me is: why don't we use mmap (either via a malloc that does, or
directly via USE_MMAP_FOR_BUFFERS) and get rid of ralloc.c?

>> AFAIK, Windows is pretty much the only system where we use gmalloc.c and
>> ralloc.c nowadays.
> My reading of configure is that we use it on more than just Windows
> (and MS-DOS).  Basically, any platform that uses gmalloc.c (which is
> the default, turned off only for GNU/Linux and Darwin) also uses
> ralloc.c.

To me "all minus GNU/Linux, Mac OS X, and Cygwin (which apparently uses
gmalloc but not ralloc)" is pretty close to "just Windows" nowadays.

>> Does anyone remember why we don't use the system malloc under
>> Windows (and Cygwin)?
> I find it hard to believe that going through system malloc on
> MS-Windows will let us use buffers as large as 1.5 GB (on a 32-bit
> machine).  To achieve this today, we reserve a 2GB contiguous chunk of
> address space at startup, and then commit and uncommit parts of it as
> needed (see w32heap.c).  ralloc.c has an important part in this
> arrangement.

You mean that Windows's system malloc library has a memory that's too
fragmented to be able to allocate a single 1.5G chunk?  Why?
[ I know next to nothing about the w32 API and plead guilty to
  POSIX-only thinking, so please bear with me.  ]


        Stefan




