bad argument to \PUTBASE.UFN out of \MAKEFREEBLOCK while doing RELSTK cleanup

Open nbriggs opened this issue 4 years ago • 7 comments

Did (HARDRESET) in a break window to clean up after generating a bunch of interrupts (middle-button scroll without the modern scroll functions loaded) and got the following (full.sysout of 1-sep-2021 from medley-release):

hemlock% ./run-medley
running: lde -g 1440x900 -sc 1440x900 -m 256  /export/home/briggs/lisp.virtualmem
greet: /export/home/briggs/medley/greetfiles/SIMPLE-INIT

*Error* URAID Called:
Enter the URaid
"Error in uninterruptable system code -- ^N to continue into error handler"

< l
  0 :    0x137e2 : IL:RAID
  1 :    0x137cc : IL:\MP.ERROR
  2 :    0x137b0 : IL:\LISPERROR
  3 :    0x1379a : IL:\ILLEGAL.ARG
  4 :    0x13782 : IL:\PUTBASE.UFN
  5 :    0x1376c : IL:\MAKEFREEBLOCK
  6 :    0x1374e : IL:\DECUSECOUNT
  7 :    0x13734 : IL:RELSTK
  8 :    0x1371e : DEBUGGER:RELEASE-DEBUGGER-WINDOW
  9 :    0x13706 : IL:APPLY
 10 :    0x136e6 : SI:RESETUNWIND
 11 :    0x136d0 : "Clean-up forms"
 12 :    0x136ba : IL:\HARDRESET-CLEANUP-RUN
 13 :    0x1367c : IL:\HARDRESET-CLEANUP1
 14 :    0x1328c : IL:\HARDRESET-CLEANUP
 15 :    0x11e42 : IL:\MAKE.PROCESS0
 16 :    0x11802 : CL:T

< f 7
IVAR -------
  13730 : 0x  6c  0xa9d2  *local* IL:POS  {IL:STACKP}0x6ca9d2

## STACK BF at 0x13732 ##
[cnt=0 ]
ivar : 0x3730
>> Bf's ivar says 0x13730 vs. IVar says 0x137da
Fname is IL:RELSTK
## STACK FX at 0x13734 ##
[cnt = 0 ]
 #alink           0x3728 
 fnhead   0x2faf64 
 nextblock        0x374a 
 pc               0x48 
 nametbl  0x739720 
 #blink           0x374c 
 #clink           0x8b 
  1373e : 0x   e  0x3352  *local* [pvar0]   13138
  13740 : 0x   0  0x   0  IL:\INTERRUPTABLE  CL:NIL
  13742 : 0x  73  0xbe28  
  13744 : 0x  6c  0xa9d2  
  13746 : 0xfffe  0x   0  
  13748 : 0xfffe  0x   2  

< f 6
IVAR -------
  1374a : 0x   e  0x3352  *local* IL:FRAME  13138

## STACK BF at 0x1374c ##
[cnt=0 ]
ivar : 0x374a
>> Bf's ivar says 0x1374a vs. IVar says 0x137da
Fname is IL:\DECUSECOUNT
## STACK FX at 0x1374e ##
[cnt = 0 ]
 #alink           0x373e 
 fnhead   0x2fa70c 
 nextblock        0x3766 
 pc               0xa5 
 nametbl  0x8000374c 
 #blink           0xc000 
 #clink           0x3742 
  13758 : 0x   e  0x12d2  *local* [pvar0]   4818
  1375a : 0x   e  0x   0  *local* [pvar1]   0
  1375c : 0x   f  0xccae  *local* [pvar2]   -13138
  1375e : 0x   f  0xfff6  *local* [pvar3]   -10
  13760 : 0xffff  0xffff  
  13762 : 0x   e  0x3352  
  13764 : 0xfffb  0x   6  

< f 5
IVAR -------
  13766 : 0x   e  0x3352  *local* IL:STK  13138
  13768 : 0x   f  0xccae  *local* IL:SIZE  -13138

## STACK BF at 0x1376a ##
[cnt=0 ]
ivar : 0x3766
>> Bf's ivar says 0x13766 vs. IVar says 0x137da
Fname is IL:\MAKEFREEBLOCK
## STACK FX at 0x1376c ##
[cnt = 0 ]
 #alink           0x3758 
 fnhead   0x2fb254 
 nextblock        0x377a 
 pc               0x36 
 nametbl  0x74f28c 
 #blink           0x61 
 #clink           0xd972 
  13776 : 0x  68  0x27da  
  13778 : 0x   f  0xccae  

< f 4
IVAR -------
  1377a : 0x   1  0x3352  *local* IL:X  {unknown}0x13352
  1377c : 0x   f  0xccae  *local* IL:V  -13138
  1377e : 0x   e  0x   1  *local* IL:D  1

## STACK BF at 0x13780 ##
[cnt=0 ]
ivar : 0x377a
>> Bf's ivar says 0x1377a vs. IVar says 0x137da
Fname is IL:\PUTBASE.UFN
## STACK FX at 0x13782 ##
[cnt = 0 ]
 #alink           0x3776 
 fnhead   0x2e741c 
 nextblock        0x3796 
 pc               0x4d 
 nametbl  0xe0078 
 #blink           0xe 
 #clink           0x190 
  1378c : 0xffff  0xffff  *local* [pvar0]   [variable not bound]
  1378e : 0xffff  0xffff  *local* [pvar1]   [variable not bound]
  13790 : 0x   e  0x  78  
  13792 : 0x   e  0x   1  
  13794 : 0x   1  0x3353  

< f 3
IVAR -------
  13796 : 0x   f  0xccae  *local* IL:X  -13138

## STACK BF at 0x13798 ##
[cnt=0 ]
ivar : 0x3796
>> Bf's ivar says 0x13796 vs. IVar says 0x137da
Fname is IL:\ILLEGAL.ARG
## STACK FX at 0x1379a ##
[cnt = 0 ]
 #alink           0x378c 
 fnhead   0x30cd3c 
 nextblock        0x37a8 
 pc               0x30 
 nametbl  0xe0208 
 #blink           0xe 
 #clink           0x78 
  137a4 : 0x   e  0x  78  
  137a6 : 0x   f  0xccae  

< 

nbriggs avatar Sep 02 '21 17:09 nbriggs

Just to confirm, I was able to generate a similar error by holding down Control-B until it got a stack overflow into RAID, then typing 'h' for Hard reset. [image]

That cleaned up all of the break windows but one. I wonder if running with stack check diagnostics would narrow the possible causes -- a missing UNINTERRUPTABLY? Bad refcount logic? Maybe the break window has a pointer to the stack, and when you close the window after a resetform it isn't releasing it right?

Try it on big-endian / 32-bit? Likely not a new problem. [image]

Doing another 'h' (Hard Reset) seemed to recover.

masinter avatar Sep 04 '21 21:09 masinter

stack.txt

masinter avatar Sep 04 '21 21:09 masinter

I was running on 32-bit little-endian. I've just tried again on a 32-bit big-endian system -- did a dozen ctrl-B, then in one of the windows did a (HARDRESET), and it recovered until the background GC died with:

*Error* creating 0 long FSP
Enter the URaid
CL:NIL

< l
  0 :    0x126d4 : IL:\MAIKO.DORECLAIM
  1 :    0x126c4 : IL:RECLAIM
  2 :    0x126b0 : IL:PERIODICALLYRECLAIM
  3 :    0x12698 : IL:\BACKGROUND.PROCESS
  4 :    0x12678 : IL:\EVALFORM
  5 :    0x1262a : IL:\MAKE.PROCESS0
  6 :    0x11802 : CL:T

nbriggs avatar Sep 04 '21 21:09 nbriggs

Running in lisp.venuesysout, I did two ctrl-B, then (IL:HARDRESET) in one of the break windows, and got:

hemlock% ./run-medley loadups/lisp.venuesysout
running: lde -g 1440x900 -sc 1440x900 -m 256  loadups/lisp.venuesysout
greet: /export/home/briggs/medley/greetfiles/SIMPLE-INIT

*Error* URAID Called:
Enter the URaid
"Error in uninterruptable system code -- ^N to continue into error handler"

< l
  0 :    0x136e0 : IL:RAID
  1 :    0x136ca : IL:\MP.ERROR
  2 :    0x136ae : IL:\LISPERROR
  3 :    0x13698 : IL:\ILLEGAL.ARG
  4 :    0x13680 : IL:\PUTBASE.UFN
  5 :    0x1366a : IL:\MAKEFREEBLOCK
  6 :    0x1364c : IL:\DECUSECOUNT
  7 :    0x13632 : IL:RELSTK
  8 :    0x1361a : IL:APPLY
  9 :    0x135fa : SI:RESETUNWIND
 10 :    0x135e4 : "Clean-up forms"
 11 :    0x135ce : IL:\HARDRESET-CLEANUP-RUN
 12 :    0x13590 : IL:\HARDRESET-CLEANUP1
 13 :    0x1326e : IL:\HARDRESET-CLEANUP
 14 :    0x13222 : IL:\MAKE.PROCESS0
 15 :    0x11802 : CL:T

< 
< f 5
IVAR -------
  13664 : 0x   e  0x   0  *local* IL:STK  0
  13666 : 0x   f  0xfff6  *local* IL:SIZE  -10

## STACK BF at 0x13668 ##
[cnt=0 ]
ivar : 0x3664
>> Bf's ivar says 0x13664 vs. IVar says 0x136d8
Fname is IL:\MAKEFREEBLOCK
## STACK FX at 0x1366a ##
[cnt = 0 ]
 #alink           0x3656 
 fnhead   0x2fd17c 
 nextblock        0x3678 
 pc               0x36 
 nametbl  0x0 
 #blink           0x0 
 #clink           0x0 
  13674 : 0xffff  0xffff  
  13676 : 0x   f  0xfff6  

It's not a good thing to be making a free block of stack at 0 and size -10!

This is reproducible in the lisp.venuesysout as well, in a 32-bit compiled little-endian version on my Mac.
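To make the arguments in the `f 5` dumps easier to read, here is a minimal Python sketch that decodes the two 16-bit words URaid prints for each cell. The tag values are an assumption inferred from the dumps in this thread (a high word of `0xe` next to `13138`, `0xf` next to `-13138` and `-10`), not taken from the Maiko source, and `plausible_free_block` is a hypothetical check added only for illustration.

```python
# Assumption (inferred from the URaid dumps, not from the Maiko source):
# high word 0xE tags a non-negative small integer, 0xF a negative one,
# and the low word carries the 16-bit two's-complement value.
SMALLPOS_HI = 0x000E
SMALLNEG_HI = 0x000F


def decode_smallp(hi, lo):
    """Return the integer held in a small-integer cell, or None for other cells."""
    if hi == SMALLPOS_HI:
        return lo
    if hi == SMALLNEG_HI:
        return lo - 0x10000
    return None


def plausible_free_block(stk, size):
    """Hypothetical sanity check: a free block needs a nonzero base and a positive size."""
    return stk is not None and size is not None and stk > 0 and size > 0


# Arguments to \MAKEFREEBLOCK as dumped in the two traces (STK, SIZE).
first_trace = (decode_smallp(0x000E, 0x3352), decode_smallp(0x000F, 0xCCAE))
venue_trace = (decode_smallp(0x000E, 0x0000), decode_smallp(0x000F, 0xFFF6))

print(first_trace, plausible_free_block(*first_trace))
print(venue_trace, plausible_free_block(*venue_trace))
```

Under that assumption both traces pass a negative SIZE, and the venue trace also passes a zero STK, so neither call describes a usable block.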

nbriggs avatar Sep 04 '21 22:09 nbriggs

HARDRESET throws away the entire stack for all processes. It should be accompanied by clearing all stack pointers (without regard to reference counting), as in CLEARSTK(CLEAR). Not sure why this isn't happening.
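The ordering concern can be illustrated with a toy model in Python. This is not Medley's implementation: `ToyStackSpace`, its methods, and the return strings are all hypothetical, and the frame offset is just borrowed from the first trace. It only shows why a pointer that survives a whole-stack reset is dangerous to release afterwards, while one that was cleared unconditionally is harmless.

```python
class ToyStackSpace:
    """Toy model only: tracks use counts per frame and every pointer handed out."""

    def __init__(self):
        self.use_count = {}   # frame offset -> number of live pointers
        self.pointers = []    # all stack pointers ever created

    def make_pointer(self, frame):
        self.use_count[frame] = self.use_count.get(frame, 0) + 1
        pointer = {"frame": frame}
        self.pointers.append(pointer)
        return pointer

    def release(self, pointer):
        """Rough analogue of RELSTK: drop the pointer's claim on its frame."""
        frame = pointer["frame"]
        if frame is None:
            return "no-op"            # pointer was already cleared
        pointer["frame"] = None
        if frame not in self.use_count:
            return "stale frame"      # the frame vanished underneath the pointer
        self.use_count[frame] -= 1
        if self.use_count[frame] == 0:
            del self.use_count[frame]
        return "released"

    def hard_reset(self, clear_pointers):
        """Throw away the whole stack; optionally clear pointers regardless of counts."""
        self.use_count.clear()
        if clear_pointers:
            for pointer in self.pointers:
                pointer["frame"] = None


def demo(clear_pointers):
    stack = ToyStackSpace()
    pointer = stack.make_pointer(0x3352)   # frame offset borrowed from the trace
    stack.hard_reset(clear_pointers)
    return stack.release(pointer)          # e.g. a clean-up form running later


print(demo(clear_pointers=False))
print(demo(clear_pointers=True))
```

In the toy, the release after an uncleared reset lands on a frame that no longer exists, which is loosely the shape of the RELSTK call in the backtraces; with pointers cleared first, the later release does nothing.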

masinter avatar Sep 15 '21 16:09 masinter

The other problem noticed is how you get into a situation where HARDRESET is needed, since it's normally a sledge-hammer kind of tool, invoked during stack overflow or other errors in RAID. Holding down control-B is one way: it creates a break inside a break for a screenful of frames. Then you get a break window, but there's not enough stack left to do anything. Two things would help: (a) don't let control-B break while a break is starting or running, using some kind of time delay since the last break plus a check for whether you're already at a break prompt, in which case don't interrupt (this includes interrupts from wheel-scroll); and (b) leave more stack reserve for debugging stack overflow.

masinter avatar Sep 15 '21 17:09 masinter

HARDRESET isn't easy when running online. I think something akin to UNINTERRUPTABLY but "softer" -- ignore control-B between when you type one and it pops up a break window -- would patch the immediate problem. The HARDRESET code requires reactivating some memory cells.
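A rough sketch of the "softer" guard, combining the time delay and break-prompt check suggested earlier with ignoring control-B while a break window is still pending. It is written in Python purely for illustration; `BreakInterruptGuard`, its method names, and the 0.5 second interval are assumptions, not existing Medley functions or settings.

```python
import time


class BreakInterruptGuard:
    """Hypothetical filter deciding whether a control-B should start a new break."""

    def __init__(self, min_interval=0.5, clock=time.monotonic):
        self.min_interval = min_interval   # seconds required between accepted breaks
        self.clock = clock                 # injectable so the guard can be tested
        self.pending = False               # a break was accepted, window not yet up
        self.last_break = None             # time the last break was accepted

    def request_break(self, at_break_prompt=False):
        """Return True if the interrupt should be honoured, False if ignored."""
        now = self.clock()
        if self.pending or at_break_prompt:
            return False
        if self.last_break is not None and now - self.last_break < self.min_interval:
            return False
        self.pending = True
        self.last_break = now
        return True

    def break_window_opened(self):
        """Call once the break window has appeared; interrupts may be accepted again."""
        self.pending = False


guard = BreakInterruptGuard()
print(guard.request_break())   # first control-B is accepted
print(guard.request_break())   # a repeat while the window is pending is ignored
```

With a guard like this, a held-down control-B (or a burst of wheel-scroll interrupts) would produce one break rather than a screenful of nested ones, which addresses how the stack fills up but not the HARDRESET clean-up failure itself.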

masinter avatar Aug 05 '22 00:08 masinter