gst 2.95g aborts when `make check' is run on i386

Tagged:
Project:GNU Smalltalk
Component:VM
Category:bug
Priority:normal
Assigned:Unassigned
Status:fixed
Description

It works fine on amd64. It seems to work fine with smalltalk@sv.gnu.org/devo--2.2--patch-552

Here's the output on clean, unmodified gst 2.95g, configured with ./configure --with-tcl=/usr/lib/tcl8.4 --with-tk=/usr/lib/tk8.4:

make[2]: Entering directory `/home/tgg/src/smalltalk-2.95g/tests'
echo "PackageLoader fileInPackage: #SUnit. ObjectMemory snapshot: 'gst.im'" | ./gst --image=../gst.im -
"Global garbage collection... done"
Loading package SUnit
"Global garbage collection... done"
cd . && /home/tgg/src/smalltalk-2.95g/tests/gst -S --image=/home/tgg/src/smalltalk-2.95g/tests/gst.im AnsiLoad.st
gst: Aborted
(ip 4)[] in ProcessorScheduler>>#initialize
(ip 22)Array(SequenceableCollection)>>#do:
(ip 6)[] in ProcessorScheduler>>#initialize
(ip 4)[] in BlockClosure>>#newProcessWith:
(ip 52)[] in Process>>#onBlock:at:suspend:
(ip 10) BlockClosure>>#on:do:
(ip 14)[] in Process>>#onBlock:at:suspend:
(ip 2) BlockClosure>>#ensure:
(ip 10)[] in Process>>#onBlock:at:suspend:
(ip 46)[] in BlockClosure>>#asContext:
(ip 14)BlockContext class>>#fromClosure:parent:
/bin/sh: line 1: 19807 Aborted                 /home/tgg/src/smalltalk-2.95g/tests/gst -S --image=/home/tgg/src/smalltalk-2.95g/tests/gst.im AnsiLoad.st

Updates

#1 submitted by Paolo Bonzini on Thu, 12/13/2007 - 13:34

You mean patch-551 (2.95g) fails and patch-552 passes?

#2 submitted by Thomas Girard on Thu, 12/13/2007 - 20:17

Well, no. patch-551 works, the tarball extracted from 2.95g does not.


Diffing an export of smalltalk@sv.gnu.org/smalltalk--devo--2.2--patch-551 (from savannah) against the gst 2.95g tarball, I get some differences: ChangeLog entries, examples, snprintf tests and Unicode methods are missing from the tarball. Conversely, match.h, prims.inl and vm.inl are not in the tla repo, so I suppose they get regenerated at compile time. Maybe that's the culprit? Or maybe it's an autotools issue?

#3 submitted by Thomas Girard on Thu, 12/13/2007 - 20:30

Removing *.inl and match.h does not help.

#4 submitted by Paolo Bonzini on Thu, 12/13/2007 - 20:41

I'll look at the diffs. Can you please check whether there are differences in config.log or similar between patch-551 and 2.95g?

#5 submitted by Thomas Girard on Thu, 12/13/2007 - 20:43

autoreconf'ing does not help either.

#6 submitted by Thomas Girard on Thu, 12/13/2007 - 22:31

The diff is available here: http://thomas.g.girard.free.fr/gst/config.log_tarball_to_patch-551.diff

(Because of the autoreconf, line numbers are different. I'll do it again tomorrow.)

Full config.log files available from the same directory.

#7 submitted by Thomas Girard on Fri, 12/14/2007 - 08:41

I've updated the diff, there's nothing interesting in it.


Here's the backtrace I get in gdb after a `handle SIGSEGV nostop noprint pass', so as not to interfere with libsigsegv:

#0  0xb7d0aea6 in raise () from /lib/libc.so.6
No symbol table info available.
#1  0xb7d0c7b1 in abort () from /lib/libc.so.6
No symbol table info available.
#2  0xb7ee52df in oldspace_sigsegv_handler (fault_address=0x4, serious=0)
    at oop.c:942
        page = (void *) 0x0
        reentering = 0
        reentered = 0
#3  0xb7e2dcf4 in sigsegv_handler (sig=11, more=51) at handler-unix.c:134
        address = (void *) 0x4
#4  <signal handler called>
No symbol table info available.
#5  0xb7f222de in _gst_interpret (processOOP=0x40317760) at vm.def:640
        _receiver = (OOP) 0x40352b58
        _stack0 = (OOP) 0xb7b00748
        _stack1 = (OOP) 0x40352b58
        ip = (ip_type) 0xb7b0074c "3"
        sp = (OOP *) 0x8064118
        arg = 3
        monitored_byte_codes = {0xb7f1bca2 <repeats 256 times>}
        true_byte_codes = {0xb7f1be63 <repeats 42 times>, 0xb7f1d8ee, 
  0xb7f1dc86, 0xb7f1be63, 0xb7f1be63, 0xb7f1be63, 0xb7f1be63, 0xb7f1be63, 
  0xb7f1be63, 0xb7f1be63, 0xb7f1be63, 0xb7f1be2f, 
  0xb7f1be63 <repeats 203 times>}
        false_byte_codes = {0xb7f1be11 <repeats 42 times>, 0xb7f1dc86, 
  0xb7f1d8ee, 0xb7f1be11, 0xb7f1be11, 0xb7f1be11, 0xb7f1be11, 0xb7f1be11, 
  0xb7f1be11, 0xb7f1be11, 0xb7f1be11, 0xb7f1bddd, 
  0xb7f1be11 <repeats 203 times>}
        normal_byte_codes = {0xb7f1bd7b, 0xb7f1e980, 0xb7f1e688, 0xb7f1e62c, 
  0xb7f1e5d0, 0xb7f1ea45, 0xb7f1e766, 0xb7f1e9e9, 0xb7f1e4b3, 0xb7f1e370, 
  0xb7f1e298, 0xb7f1e313, 0xb7f1e207, 0xb7f1e18c, 0xb7f1e12f, 0xb7f1e0d2, 
  0xb7f1e010, 0xb7f1e8a2, 0xb7f1e850, 0xb7f1e7f8, 0xb7f1e7c2, 0xb7f1ed49, 
  0xb7f1e920, 0xb7f1eb16, 0xb7f1e596, 0xb7f1ebd3, 0xb7f1eb7d, 0xb7f1ec25, 
  0xb7f1e6e1, 0xb7f1ecd3, 0xb7f1eaa1, 0xb7f1ec6e, 0xb7f1d712, 0xb7f1dbe5, 
  0xb7f1db20, 0xb7f1dadd, 0xb7f1daaf, 0xb7f1da60, 0xb7f1d972, 0xb7f1d92d, 
  0xb7f1d90d, 0xb7f1d8ee, 0xb7f1d8b0, 0xb7f1d872, 0xb7f1d837, 0xb7f1d7f3, 
  0xb7f1d7b3, 0xb7f1d776, 0xb7f1d752, 0xb7f1ddcd, 0xb7f1dd84, 0xb7f1dd43, 
  0xb7f1dd18, 0xb7f1dca7, 0xb7f1dc86, 0xb7f1dc69, 0xb7f1dc2c, 0xb7f1deec, 
  0xb7f1dea3, 0xb7f1de5a, 0xb7f1de11, 0xb7f1df7e, 0xb7f1df35, 0xb7f1dfc7, 
  0xb7f1d6a9, 0xb7f1d276, 0xb7f1d231, 0xb7f1d1f4, 0xb7f1d1d0, 0xb7f1d394, 
  0xb7f1d2de, 0xb7f1d32b, 0xb7f1cf7b, 0xb7f1cef8, 0xb7f1ce86, 0xb7f1ce3e, 
  0xb7f1cdbd, 0xb7f1cd7e, 0xb7f1cd41, 0xb7f1ccf5, 0xb7f1cca0, 0xb7f1d562, 
  0xb7f1d49e, 0xb7f1d188, 0xb7f1d0f0, 0xb7f1d0b0, 0xb7f1d06c, 0xb7f1d00a, 
  0xb7f1cfa1, 0xb7f1d632, 0xb7f1d5c1, 0xb7f1d456, 0xb7f1d3d3, 0xb7f1cc46, 
  0xb7f1cbce, 0xb7f1cba5, 0xb7f1cb7f, 0xb7f1c897, 0xb7f1c813, 0xb7f1c7aa, 
  0xb7f1c749, 0xb7f1cb3d, 0xb7f1c90c, 0xb7f1cabf, 0xb7f1c68b, 0xb7f1c9b2, 
  0xb7f1c972, 0xb7f1ca47, 0xb7f1c6f2, 0xb7f1c930, 0xb7f1c608, 0xb7f1c5a5, 
  0xb7f1c563, 0xb7f1c515, 0xb7f1c4d3, 0xb7f1c3d3, 0xb7f1c34c, 0xb7f1c450, 
  0xb7f1c30f, 0xb7f1c294, 0xb7f1c25c, 0xb7f1c212, 0xb7f1c194, 0xb7f1c161, 
  0xb7f1c116, 0xb7f1c0ea, 0xb7f1c0bf, 0xb7f1c077, 0xb7f1c02f, 0xb7f203c4, 
  0xb7f20380, 0xb7f202f2, 0xb7f2022d, 0xb7f201d0, 0xb7f201b0, 0xb7f200de, 
  0xb7f200a4, 0xb7f205a7, 0xb7f20567, 0xb7f2061f, 0xb7f20482, 0xb7f20658, 
  0xb7f204fa, 0xb7f1fcad, 0xb7f1fc35, 0xb7f20c44, 0xb7f20c07, 0xb7f20c82, 
  0xb7f207d7, 0xb7f20d70, 0xb7f20b63, 0xb7f20cdd, 0xb7f1ff5b, 0xb7f20a64, 
  0xb7f209e1, 0xb7f20ae3, 0xb7f206b3, 0xb7f2096e, 0xb7f208fa, 0xb7f2099a, 
  0xb7f1f42f, 0xb7f1f9d4, 0xb7f1f98c, 0xb7f1f94a, 0xb7f1f8cf, 0xb7f1faa1, 
  0xb7f1fa75, 0xb7f1f892, 0xb7f1f80b, 0xb7f1f7a2, 0xb7f1f734, 0xb7f1f6bc, 
  0xb7f1f680, 0xb7f1f655, 0xb7f1f5c9, 0xb7f1f593, 0xb7f1f513, 0xb7f1fb70, 
  0xb7f1fae5, 0xb7f1f2c9, 0xb7f1f241, 0xb7f1f208, 0xb7f1f189, 0xb7f1f0ef, 
  0xb7f1f068, 0xb7f1f023, 0xb7f1efe1, 0xb7f1ef56, 0xb7f1eedc, 0xb7f1ee99, 
  0xb7f1ee1f, 0xb7f1edcb, 0xb7f1ed7f, 0xb7f21b5c, 0xb7f21990, 0xb7f21538, 
  0xb7f214b6, 0xb7f2188c, 0xb7f2180e, 0xb7f21af0...}
#6  0xb7f26592 in _gst_nvmsg_send (receiver=0x403121f0, 
    sendSelector=0x40319058, args=0xbfa698c4, sendArgs=1) at interp.c:2211
        processOOP = (OOP) 0x40317760
        currentProcessOOP = (OOP) 0x403542f0
        result = <value optimized out>
        i = <value optimized out>
#7  0xb7f01080 in _gst_va_msg_sendf (resultPtr=0x0, 
    fmt=0xb7f429da "%v %o changed: %S", ap=0xbfa69ad0 "") at callin.c:293
        selector = (OOP) 0x40319058
        args = (OOP *) 0xbfa698c0
        result = <value optimized out>
        i = 1
        numArgs = <value optimized out>
        fp = 0xb7f429eb ""
        s = <value optimized out>
        selectorBuf = "changed:", '\0' <repeats 68 times>...
#8  0xb7f018e3 in _gst_msg_sendf (resultPtr=0x0, 
    fmt=0xb7f429da "%v %o changed: %S") at callin.c:368
        ap = 0xbfa69ac8 "�!1@�)�"
#9  0xb7ed824b in _gst_invoke_hook (hook=GST_RETURN_FROM_SNAPSHOT)
    at comp.c:539
        save_execution = 0
#10 0xb7ec911a in _gst_initialize (kernel_dir=0x0, 
    image_file=0xbfa6baf2 "/home/tgg/src/smalltalk-2.95g/gst.im", 
    flags=<value optimized out>) at files.c:516
        willRegressTest = 3215375120
        result = <value optimized out>
        currentDirectory = 0x804c008 "/home/tgg/src/smalltalk-2.95g"
        home = <value optimized out>
        str = 0x804e300 "/home/tgg/src/smalltalk-2.95g/gst.im"
        abortOnFailure = true
        rebuild_image_flags = 0
#11 0x08049102 in main (argc=54595653, argv=0x33) at main.c:372
        result = <value optimized out>
        file = <value optimized out>


Please let me know if you need something else.

#8 submitted by Paolo Bonzini on Fri, 12/14/2007 - 09:53

> I've updated the diff, there's nothing interesting in it.

... so you're saying that the build is not deterministic and from the same
source code (except for some missing ChangeLogs :-) ) you got different
results.

Can you build 2.95g and patch-551 in two different directories, and see which
.o/.a/.la differ? This is quite disconcerting, if you weren't the reporter
I would have already closed it as invalid or worksforme...

#9 submitted by Thomas Girard on Fri, 12/14/2007 - 14:43

It seems the gst.im gets corrupted after the first invocation.


How to reproduce it on my i386 box:

$ ./configure --with-tcl=/usr/lib/tcl8.4 --with-tk=/usr/lib/tk8.4 && make
....
$ ls -l gst.im
-rw-r--r-- 1 tgg tgg 1630820 Dec 14 16:06 gst.im
$ cp gst.im gst.im-
$ ./gst -S --image=./gst.im tests/AnsiLoad.st
"Global garbage collection... done"
Loading package SUnit
Loading Ansi.st
Loading AnsiDB.st
Loading AnsiInit.st
"Global garbage collection... done"
$ ls -l gst.im*
-rw-r--r-- 1 tgg tgg 2827244 Dec 14 16:14 gst.im
-rw-r--r-- 1 tgg tgg 1630820 Dec 14 16:06 gst.im-
$ ./gst -S --image=./gst.im tests/AnsiLoad.st
gst: Aborted
(ip 4)[] in ProcessorScheduler>>#initialize
(ip 22)Array(SequenceableCollection)>>#do:
(ip 6)[] in ProcessorScheduler>>#initialize
(ip 4)[] in BlockClosure>>#newProcessWith:
(ip 52)[] in Process>>#onBlock:at:suspend:
(ip 10) BlockClosure>>#on:do:
(ip 14)[] in Process>>#onBlock:at:suspend:
(ip 2) BlockClosure>>#ensure:
(ip 10)[] in Process>>#onBlock:at:suspend:
(ip 46)[] in BlockClosure>>#asContext:
(ip 14)BlockContext class>>#fromClosure:parent:
Aborted

Replacing gst.im with gst.im- does work, but modifies the image again.
Also notice "Global garbage collection... done" is displayed twice. (I'm using embedded libsigsegv0)


Both images are available from http://thomas.g.girard.free.fr/gst/

#10 submitted by Paolo Bonzini on Fri, 12/14/2007 - 17:43

Yes, but where do the two executables differ? Can you try compiling them in exactly the same environment, including the directory in which you build?

Also, embedded libsigsegv0 = not the Debian package?

Thanks,

#11 submitted by Thomas Girard on Sat, 12/15/2007 - 13:53

I've computed md5sums for both and diffed the output at http://thomas.g.girard.free.fr/gst/tarball_to_patch-551.diff

Yes, I'm not using Debian libsigsegv0.
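
The commands used to produce that checksum diff are not shown in this thread. As a rough sketch of the same idea (in Python; the function names `md5_tree` and `differing_files` are invented for this illustration and are not part of gst), one could hash every build product in two trees and list the paths whose checksums differ:

```python
import hashlib
import os
import sys

# File suffixes to compare; .o/.a/.la are the ones asked about in comment #8.
SUFFIXES = (".o", ".a", ".la")

def md5_tree(root, suffixes=SUFFIXES):
    """Map each matching file's path (relative to root) to its MD5 hex digest."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                full = os.path.join(dirpath, name)
                with open(full, "rb") as handle:
                    digest = hashlib.md5(handle.read()).hexdigest()
                sums[os.path.relpath(full, root)] = digest
    return sums

def differing_files(tree_a, tree_b, suffixes=SUFFIXES):
    """Return sorted relative paths that are missing from one tree or differ."""
    sums_a = md5_tree(tree_a, suffixes)
    sums_b = md5_tree(tree_b, suffixes)
    return sorted(path for path in set(sums_a) | set(sums_b)
                  if sums_a.get(path) != sums_b.get(path))

if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical invocation: python md5diff.py <tarball-build> <patch-551-build>
    for path in differing_files(sys.argv[1], sys.argv[2]):
        print(path)
```

Static archives can embed member timestamps and debug information can embed build paths, so some files may differ for reasons unrelated to the bug. That is one reason to build both trees in the same directory, as requested in comment #10.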

#12 submitted by Paolo Bonzini on Sat, 12/15/2007 - 14:06

Can you redo the compilation with "-save-temps -O2 -g", and diff the interp.i and interp.s files? Thanks!

#13 submitted by Thomas Girard on Sat, 12/15/2007 - 15:01

There you go:

diff -u ./smalltalk-2.95g--tarball/libgst/interp.i ./smalltalk-2.95g--patch-551/libgst/interp.i
--- ./smalltalk-2.95g--tarball/libgst/interp.i 2007-12-15 16:10:01.000000000 +0000
+++ ./smalltalk-2.95g--patch-551/libgst/interp.i 2007-12-15 16:31:01.000000000 +0000
@@ -23890,7 +23890,7 @@
   return (true);
 }
 
-int _gst_primitives_md5[4] = { 0xf7b28bad, 0xf51eaf5f, 0x2c72fbc5, 0xc7d17f62 };
+int _gst_primitives_md5[4] = { 0xad8bb2f7, 0x5faf1ef5, 0xc5fb722c, 0x627fd1c7 };
 
 void
 _gst_init_primitives()

and

diff -u ./smalltalk-2.95g--tarball/libgst/interp.s ./smalltalk-2.95g--patch-551/libgst/interp.s
--- ./smalltalk-2.95g--tarball/libgst/interp.s 2007-12-15 16:10:22.000000000 +0000
+++ ./smalltalk-2.95g--patch-551/libgst/interp.s 2007-12-15 16:31:22.000000000 +0000
@@ -67740,10 +67740,10 @@
        .type   _gst_primitives_md5, @object
        .size   _gst_primitives_md5, 16
 _gst_primitives_md5:
-       .long   -139293779
-       .long   -182538401
-       .long   745733061
-       .long   -942571678
+       .long   -1383353609
+       .long   1605312245
+       .long   -973376980
+       .long   1652543943
        .align 4
        .type   free_lifo_context, @object
        .size   free_lifo_context, 4

#14 submitted by Thomas Girard on Sat, 12/15/2007 - 15:04
Attachment:interp.patch (923 bytes)

Diff attached.

#15 submitted by Thomas Girard on Sat, 12/15/2007 - 15:21

I've also copied compilation typescripts in http://thomas.g.girard.free.fr/gst

#16 submitted by Thomas Girard on Sat, 12/15/2007 - 15:33

Hmmm... { 0xf7b28bad, 0xf51eaf5f, 0x2c72fbc5, 0xc7d17f62 } looks a lot like: { 0xad8bb2f7, 0x5faf1ef5, 0xc5fb722c, 0x627fd1c7 } read backwards...
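
That observation can be checked mechanically. More precisely, the order of the four words is the same in both arrays and it is the bytes inside each 32-bit word that are reversed. A minimal sketch in Python (the helper names `bswap32` and `as_signed32` are made up for this illustration; the constants are copied from the interp.i diff in comment #13):

```python
import struct

# Values copied from the interp.i diff in comment #13.
TARBALL = [0xf7b28bad, 0xf51eaf5f, 0x2c72fbc5, 0xc7d17f62]
PATCH_551 = [0xad8bb2f7, 0x5faf1ef5, 0xc5fb722c, 0x627fd1c7]

def bswap32(word):
    """Reverse the byte order of an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack(">I", word))[0]

def as_signed32(word):
    """Reinterpret an unsigned 32-bit value as signed, the way .long prints it."""
    return struct.unpack("<i", struct.pack("<I", word))[0]

swapped = [bswap32(w) for w in TARBALL]
print(swapped == PATCH_551)                # is every word byte-swapped?
print([as_signed32(w) for w in TARBALL])   # compare with the tarball's interp.s
```

The first line printed is True, and the signed values match the `.long` lines removed in the interp.s diff, so both diffs show the same single difference.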

#17 submitted by Paolo Bonzini on Mon, 12/17/2007 - 08:06
Attachment:gst-the-elusive-bug-139.patch (1.35 KB)

Ok, so it's an obvious endianness issue. Less obvious is why it worked for me, and why it worked on x86-64.

Anyway here's the patch to fix this.
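
The attached patch is not reproduced in this thread, and the thread does not show how `_gst_primitives_md5` is generated. Purely as an illustration of this class of bug (Python; the byte string below is reconstructed from the values in comment #13, and the function name `digest_words` is hypothetical), the same 16 digest bytes turn into different 32-bit words depending on the byte order used to read them:

```python
import struct

# The 16 bytes whose big-endian reading gives the tarball's array (comment #13).
DIGEST = bytes.fromhex("f7b28badf51eaf5f2c72fbc5c7d17f62")

def digest_words(digest, byte_order):
    """Split a 16-byte digest into four unsigned 32-bit words.

    byte_order is a struct prefix: ">" for big-endian, "<" for little-endian.
    """
    return list(struct.unpack(byte_order + "4I", digest))

big = digest_words(DIGEST, ">")
little = digest_words(DIGEST, "<")
print([hex(w) for w in big])      # matches the tarball's values
print([hex(w) for w in little])   # matches patch-551's values
```

Which reading the VM expects is not stated here. The general point is only that a digest converted to integers with an explicit byte order gives the same words on every host, while a conversion that follows the host's byte order does not.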

#18 submitted by Paolo Bonzini on Mon, 12/17/2007 - 08:27
Status:active» fixed
#19 submitted by Paolo Bonzini on Mon, 12/17/2007 - 12:35

Funnily enough, with the patch applied, a `make distcheck' now lets me reproduce the bug on my machine too.

It is a GC problem caused by weak objects. I'll soon commit the attached patch to fix it.
