unofficial mirror of help-gnu-emacs@gnu.org
* Mysterious emacs failure
@ 2005-10-17 16:47 Denny Dahl
  2005-10-17 20:12 ` Denny Dahl
  0 siblings, 1 reply; 4+ messages in thread
From: Denny Dahl @ 2005-10-17 16:47 UTC




I am a data warehouse consultant who has been working at a large insurance company
for 18 months.  On my first week of work, I downloaded, configured and compiled
emacs for several of the AIX boxes here.  I have been using emacs on one particular
server quite productively for the last 18 months without any problems.  Up until
Thursday of last week.

I took off Friday to attend my 30th high school reunion and returned to work this
morning.  The box had been booted during my absence: a pre-meditated re-boot so
that a particular software package (Ab Initio) could be upgraded.  Now, emacs no
longer works.  In fact, it dies a horrible death at start-up like this:
  tlmitnu1:ddahl> emacs
  Segmentation fault(coredump)
  tlmitnu1:ddahl> ls -l core
  -rw-r-----   1 ddahl    abinitio   11360307 Oct 17 12:33 core

If you can provide me any clues or ideas about things to investigate, I would be
very appreciative.  Here is some additional information.

OS particulars:
  tlmitnu1:ddahl> uname -rv
  1 5
  tlmitnu1:ddahl> uname -a
  AIX tlmitnu1 1 5 0029334A4C00
  tlmitnu1:ddahl> cat /etc/motd
  *******************************************************************************
  *                                                                             *
  *                                                                             *
  *  Welcome to AIX Version 5.1!                                                *
  *                                                                             *
  *                                                                             *
  *  Please see the README file in /usr/lpp/bos for information pertinent to    *
  *  this release of the AIX Operating System.                                  *
  *                                                                             *
  *                                                                             *
  *******************************************************************************
  DEOS Infrastructure Version 2002-Q2-aix installed (08/13/2003)
  DEOS v2002-3-aixuvscan (09/23/2002)
  deos_unix_ECCmstagt_prod_s-2003.1 installed on 10/14/2003
  DEOS Infrastructure Version v2004-3-alerts installed  (09/14/2004)

I've run emacs under the debugger and single-stepped my way to the general area
where the signal happens.  In rough outline, there is a variable (__malloc_hook) found
in src/gmalloc.c and this is supposed to contain the address of a function.  But when
the program attempts to execute code at this dereferenced address, it finds unreadable
instructions:

  tlmitnu1:emacs-21.3> dbx src/emacs
  Type 'help' for help.
  reading symbolic information ...
  (dbx) stop in main
  [1] stop in main
  (dbx) run
  [1] stopped in main at line 714 in file "src/emacs.c" ($t1)
    714     int skip_args = 0;
  (dbx) step
  stopped in main at line 737 in file "src/emacs.c" ($t1)
    737     sort_args (argc, argv);
  (dbx) step
  stopped in sort_args at line 1651 in file "src/emacs.c" ($t1)
   1651     char **new = (char **) xmalloc (sizeof (char *) * argc);
  (dbx) step
  stopped in xmalloc at line 519 in file "src/alloc.c" ($t1)
    519     BLOCK_INPUT;
  (dbx) step
  stopped in xmalloc at line 520 in file "src/alloc.c" ($t1)
    520     val = (POINTER_TYPE *) malloc (size);
  (dbx) step
  stopped in gmalloc.malloc at line 891 in file "src/gmalloc.c" ($t1)
    891     if (!__malloc_initialized && !__malloc_initialize ())
  (dbx) step
  stopped in gmalloc.malloc at line 894 in file "src/gmalloc.c" ($t1)
    894     return (__malloc_hook != NULL ? *__malloc_hook : _malloc_internal) (size);
  (dbx) print __malloc_hook
  0x200e0b9c 
  (dbx) print *__malloc_hook
  0x2c030000 
  (dbx) step
  Unreadable instruction at address 0x2c030000
  (dbx) where
  ptrgl.$PTRGL(??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ??, ) at 0x100adf64
  gmalloc.malloc(size = 0), line 894 in "gmalloc.c"
  xmalloc(size = 0), line 520 in "alloc.c"
  sort_args(argc = 42308, argv = 0x000c0002), line 1651 in "emacs.c"
  main(argc = 0, argv = (nil), envp = (nil)), line 737 in "emacs.c"
  (dbx) quit
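
For readers following the trace, here is a minimal C sketch of the dispatch pattern seen at line 894 of src/gmalloc.c: if the hook pointer is non-NULL the call goes through it, otherwise the internal allocator is used. The names demo_malloc_hook, demo_malloc_internal, counting_hook and demo_run are hypothetical stand-ins and this is not the Emacs source. It only illustrates that a non-NULL hook is called with no further validity check, so a hook holding a bad address would fail at exactly this point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the names in src/gmalloc.c; not Emacs code. */
typedef void *(*malloc_hook_fn)(size_t);

static malloc_hook_fn demo_malloc_hook = NULL; /* plays the role of __malloc_hook */
static int hook_calls = 0;                     /* counts calls routed via the hook */

/* Fallback allocator, used when no hook is installed. */
static void *demo_malloc_internal(size_t size)
{
    return malloc(size);
}

/* A well-behaved hook: records the call, then allocates. */
static void *counting_hook(size_t size)
{
    hook_calls++;
    return malloc(size);
}

/* Same shape as the statement at line 894: only NULL is checked,
   so any other value in the hook is called as a function. */
static void *demo_malloc(size_t size)
{
    return (demo_malloc_hook != NULL ? *demo_malloc_hook
                                     : demo_malloc_internal)(size);
}

/* Allocate once with the hook installed and once without it;
   returns how many calls went through the hook so far. */
static int demo_run(void)
{
    void *a, *b;

    demo_malloc_hook = counting_hook;
    a = demo_malloc(16);
    demo_malloc_hook = NULL;
    b = demo_malloc(16);

    assert(a != NULL && b != NULL);
    free(a);
    free(b);
    return hook_calls;
}
```

Calling demo_run() once sends exactly one allocation through the hook and one through the fallback.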
   
The version of emacs that I've been using is 21.3.  In desperation, I downloaded, configured and
compiled a new emacs-21.3 but this newly compiled version failed in the same way as the original.

Thanks in advance for any help that you might be able to provide!

-Denny Dahl



_______________________________________________
Help-gnu-emacs mailing list
Help-gnu-emacs@gnu.org
http://lists.gnu.org/mailman/listinfo/help-gnu-emacs


* Re: Mysterious emacs failure
  2005-10-17 16:47 Mysterious emacs failure Denny Dahl
@ 2005-10-17 20:12 ` Denny Dahl
  2005-10-18 18:17   ` Kevin Rodgers
  2005-10-22  6:40   ` Tim X
  0 siblings, 2 replies; 4+ messages in thread
From: Denny Dahl @ 2005-10-17 20:12 UTC




Have been hacking at this problem all day and have learned some interesting things
but do not have a solution yet.  I was able to successfully start the "bare impure emacs"
executable named temacs.  This runs successfully for a few minutes, then it goes
nuts attempting (unsuccessfully) to create the directory /.emacs.d over and over
again.

I also built an emacs using "GNU_MALLOC=no ./configure" but this didn't do what
I expected it to do.  I've been looking through the INSTALL file and etc/PROBLEMS
to try to find some alternate ways to build the malloc() code.  I've seen interesting
references to DOUG_LEA_MALLOC, but no indication of how to turn this on at
configure time.

Currently, my best alternative is to run emacs on another machine and just start
making heavy use of ange-ftp and M-x telnet.  I'd rather not, since M-x telnet does
not do filename completion.  And I'd really rather run my emacs on the same machine
where all my files are...  But I've already blown an entire day trying to get a working
build...  I know I'm going to lose patience soon...


* Re: Mysterious emacs failure
  2005-10-17 20:12 ` Denny Dahl
@ 2005-10-18 18:17   ` Kevin Rodgers
  2005-10-22  6:40   ` Tim X
  1 sibling, 0 replies; 4+ messages in thread
From: Kevin Rodgers @ 2005-10-18 18:17 UTC


Denny Dahl wrote:
 > Have been hacking at this problem all day and have learned some
 > interesting things but do not have a solution yet.  I was able to
 > successfully start the "bare impure emacs" executable named temacs.
 > This runs successfully for a few minutes, then it goes nuts attempting
 > (unsuccessfully) to create the directory /.emacs.d over and over
 > again.

That suggests that your HOME environment variable is set to "/" or not
set at all.
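
To illustrate the reasoning, here is a minimal C sketch, assuming the per-user directory is formed by joining the value of HOME with ".emacs.d". The helper user_emacs_dir is hypothetical and is not how Emacs itself is written. Treating an unset HOME as an empty string is likewise an assumption made for the sketch. It shows that both a HOME of "/" and a missing HOME lead to the path /.emacs.d, which an ordinary user normally cannot create.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not Emacs code: joins a home directory and
   ".emacs.d" into buf.  Returns 0 on success, -1 if buf is too small. */
static int user_emacs_dir(const char *home, char *buf, size_t buflen)
{
    size_t n;
    const char *sep;
    int written;

    assert(buf != NULL && buflen > 0);

    /* Assumption for this sketch: an unset HOME behaves like "". */
    if (home == NULL)
        home = "";

    /* Avoid a doubled slash when home already ends in '/'. */
    n = strlen(home);
    sep = (n > 0 && home[n - 1] == '/') ? "" : "/";

    written = snprintf(buf, buflen, "%s%s.emacs.d", home, sep);
    return (written >= 0 && (size_t) written < buflen) ? 0 : -1;
}
```

Passing getenv("HOME") as the first argument shows what the current environment would produce; comparing the result with "/.emacs.d" is a quick check for the situation described above.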

-- 
Kevin Rodgers


* Re: Mysterious emacs failure
  2005-10-17 20:12 ` Denny Dahl
  2005-10-18 18:17   ` Kevin Rodgers
@ 2005-10-22  6:40   ` Tim X
  1 sibling, 0 replies; 4+ messages in thread
From: Tim X @ 2005-10-22  6:40 UTC


"Denny Dahl" <ddahl@travelers.com> writes:

> Have been hacking at this problem all day and have learned some
> interesting things but do not have a solution yet.  I was able to
> successfully start the "bare impure emacs" executable named temacs.
> This runs successfully for a few minutes, then it goes nuts attempting
> (unsuccessfully) to create the directory /.emacs.d over and over
> again.
>
> I also built an emacs using "GNU_MALLOC=no ./configure" but this didn't
> do what I expected it to do.  I've been looking through the INSTALL file
> and etc/PROBLEMS

I think you may be approaching this in the wrong way, and I don't think
you will easily identify the problem by just looking at emacs. Given that

1. Emacs was working fine for some time
2. Emacs started core dumping after the installation of a new piece of
software and a reboot

and assuming the new software is the only recent change (i.e. no
libraries or other packages have been updated), it's likely that
something changed as a result of the new package that was
installed.

I would:

1. Verify exactly what actions were taken in the installation of the
   new package. Make sure no libraries or system settings were changed
   for the new package.

2. Use GDB or some other debugger to inspect the core file and find
   out at what point the system crashes.

3. Use ldd or a similar utility to list the shared libraries used by
   emacs and the new software. This will verify all shared libraries
   with the correct versions are still available and possibly identify
   points of commonality between the two packages.

4. Use something like strace on emacs to get a more precise idea of
   where the system crashes and what system calls are being processed
   at that time.

The fact you appear to only be getting the problem on the system which
has had the new package installed makes it highly likely it is either
something directly relating to the installation of that new package or
something that was modified in the process by the sys admin who
installed the package. Until this is identified, changing configure
settings, malloc routines or anything else is really just shooting in
the dark - you may get lucky, but the odds are against it.

Tim

-- 
Tim Cross
The e-mail address on this message is FALSE (obviously!). My real e-mail is
to a company in Australia called rapttech and my login is tcross - if you 
really need to send mail, you should be able to work it out!
