can floating point stack overflow during call/ret?

Justin L. Kennedy

2004-11-19 22:59:11 UTC

I am preparing to write a simple JIT compiler in ANSI C89 (compiled by
GCC) for simple floating point arithmetic expressions. I want to add
support so that users can ask for operations that aren't done directly in
assembly such as C's arcsin and pow functions. This means that the
function the JIT writes will be inserting calls to functions in libc which
I have no control over.

Since the x86 floating point stack is only 8 operands high, how do I make
sure that the operands I put on the stack along with the operands these
other functions may put on the stack will not be more than 8? Does the
processor spill those registers onto the call stack automatically when I
use the CALL instruction, or is it controlled by call conventions specific
to processors? I haven't seen this issue addressed in the tutorials I
have looked at. All the examples just add onto the stack when their
functions enter and do nothing when they call other functions, which
leads me to believe that either the processor handles this automatically
or they are just really bad examples.

--
Justin L. Kennedy
Georgia Institute of Technology, Atlanta Georgia, 30332
Email: ***@prism.gatech.edu

David Lindauer

2004-11-19 23:59:02 UTC

The floating-point unit can raise exceptions if you unmask them, and I
believe one of them covers stack overflow. But you would probably have to
handle it yourself; I really don't know...

David

Matt

2004-11-20 02:27:40 UTC

[...]

You should look at the document describing the System V ABI for the i386
(the C calling convention used on Unix):
http://www.caldera.com/developers/devspecs/abi386-4.pdf

Post by Justin L. Kennedy
Since the x86 floating point stack is only 8 operands high, how do I make
sure that the operands I put on the stack along with the operands these
other functions may put on the stack will not be more than 8? Does the
processor spill those registers onto the call stack automatically when I
use the CALL instruction, or is it controlled by call conventions specific
to processors? I haven't seen this issue addressed in the tutorials I
have looked at. All the examples just add onto the stack when their
functions enter and do nothing when they call other functions, which
leads me to believe that either the processor handles this automatically
or they are just really bad examples.

As memory serves, you are supposed to call these functions with FP arguments
on the program stack, and the FP stack is supposed to be empty. This allows
each function to assume that the FP stack is completely empty when it begins
executing, and the compiler then just has to be careful not to push more
than 8 items on at a time. On return the FP stack is empty again, except
that a function returning a floating-point value leaves it in st(0).

Don't take my word for it, of course; read the spec.

-Matt

Ed Beroset

2004-11-20 16:56:50 UTC

The obvious answer is "keep track of the count," but of course there's
more to it than that because your code isn't the only thing putting
things on the FPU stack.
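
For the part that is your own code, the count can be worked out at compile
time. A rough sketch, assuming a hypothetical Expr node type of my own
invention (leaf or binary operator) and a code generator that always
evaluates the left operand first and leaves it on the FPU stack while it
evaluates the right one:

```c
#include <assert.h>
#include <stddef.h>

#define X87_STACK_SLOTS 8   /* the x87 register stack holds 8 values */

/* Hypothetical expression node: a leaf has both children NULL,
   a binary operator has both children set. */
typedef struct Expr {
    const struct Expr *left;
    const struct Expr *right;
} Expr;

/* Peak number of x87 registers in use when the left operand is
   evaluated first and stays on the stack during the right operand. */
static int x87_depth_needed(const Expr *e)
{
    int dl, dr;

    assert(e != NULL);
    if (e->left == NULL && e->right == NULL)
        return 1;                           /* one load, e.g. FLD */
    assert(e->left != NULL && e->right != NULL);
    dl = x87_depth_needed(e->left);
    dr = x87_depth_needed(e->right) + 1;    /* left result is still live */
    return dl > dr ? dl : dr;
}

/* Nonzero if the expression can be evaluated without spilling. */
static int fits_on_x87_stack(const Expr *e)
{
    return x87_depth_needed(e) <= X87_STACK_SLOTS;
}
```

With this evaluation order a right-leaning chain such as a+(b+(c+...)) with
nine leaves needs nine slots, so the JIT would have to reorder it or spill a
temporary to memory, while the left-leaning form of the same sum needs only
two.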

Post by Justin L. Kennedy
Does the
processor spill those registers onto the call stack automatically when I
use the CALL instruction, or is it controlled by call conventions specific
to processors?

The processor does not spill those registers, but I suppose a compiler
might. The only way to get things onto the FPU stack is from memory
(e.g. FILD) or by explicit instruction (e.g. FLDPI), so there is no
processor supported way to load directly from CPU registers to FPU stack
space (really just a set of registers).

Post by Justin L. Kennedy
I haven't seen this issue addressed in the tutorials I
have looked at. All the examples just add onto the stack when their
functions enter and do nothing when they call other functions, which
leads me to believe that either the processor handles this automatically
or they are just really bad examples.

I don't know what exactly you're looking at, but the examples may be
valid even if they don't pay any attention to the FPU stack. Looking
into the Linux source code, FPU stack overflow or underflow just
generates a SIGFPE which the user is responsible for handling, but if
you look into it (I'm looking at arch/i386/kernel/traps.c) you'll see
that the math_error() function includes a warning that multiple
exceptions are not handled reliably.

Generically, the way it could be handled is to unmask the stack
exception and write an exception handler. Since the exception is
triggered *before* any change is actually made to the FPU stack, one way
to deal with it would be to create a memory pool which would act as an
extension to the FPU stack (in fact Intel suggests this mechanism). On
an overflow condition (trying to write to a non-empty FPU register) you
could write the value to memory instead, and on underflow (trying to
read from an empty FPU register) you could read from memory.
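
To make the bookkeeping concrete, here is a plain C model of that idea. It
is only a software simulation, not an exception handler: the VStack type,
the pool size and the policy of spilling the oldest value are all my own
assumptions for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define X87_SLOTS  8    /* size of the real x87 register stack */
#define POOL_SLOTS 64   /* arbitrary pool size chosen for this sketch */

typedef struct {
    double regs[X87_SLOTS];   /* models the FPU stack; regs[nregs-1] is the top */
    int nregs;
    double pool[POOL_SLOTS];  /* memory extension; pool[npool-1] was spilled last */
    int npool;
} VStack;

static void vstack_init(VStack *s)
{
    assert(s != NULL);
    s->nregs = 0;
    s->npool = 0;
}

/* Push v. If all 8 slots are full, move the oldest value to the pool
   first. Returns 0 on success, -1 if the pool is full as well. */
static int vstack_push(VStack *s, double v)
{
    assert(s != NULL);
    if (s->nregs == X87_SLOTS) {
        if (s->npool == POOL_SLOTS)
            return -1;
        s->pool[s->npool++] = s->regs[0];
        memmove(&s->regs[0], &s->regs[1],
                (X87_SLOTS - 1) * sizeof s->regs[0]);
        s->nregs--;
    }
    s->regs[s->nregs++] = v;
    return 0;
}

/* Pop into *out. If the register part is empty, read the most recently
   spilled value back from the pool. Returns -1 if both are empty. */
static int vstack_pop(VStack *s, double *out)
{
    assert(s != NULL && out != NULL);
    if (s->nregs > 0) {
        *out = s->regs[--s->nregs];
        return 0;
    }
    if (s->npool > 0) {
        *out = s->pool[--s->npool];
        return 0;
    }
    return -1;
}
```

Pushing ten values leaves eight in the register part and two in the pool,
and popping returns all ten in last-in, first-out order. A real handler
would have to do the equivalent on the saved FPU state and then restart the
faulting instruction, which is considerably more work than this model
suggests.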

This should all be possible from within a user process under Linux
(assuming that's what you're using since you refer to libc and gcc) if
you write a signal handler for SIGFPE. As a starter:

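#define _GNU_SOURCE   /* feenableexcept() is a glibc extension */
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <math.h>
#include <signal.h>
#include <fenv.h>

void sig_pfe(int sig);

int main(void)
{
    int i;

    signal(SIGFPE, sig_pfe);
    /* Unmask the invalid-operation exception, which covers stack faults. */
    feenableexcept(FE_INVALID);

    printf("This program attempts to overload the x86 FPU stack\n");
    printf("by repeatedly putting the constant pi on that stack.\n");
    for (i = 1; i < 30; i++)
    {
        printf("pushing pi #%d\n", i);
        /* Deliberately unbalanced push: the stack holds only 8 values. */
        asm volatile ("fldpi");
    }
    return 0;
}

/* printf() and exit() are not async-signal-safe; tolerable in a demo only. */
void sig_pfe(int sig)
{
    (void)sig;
    printf("FPU exception detected - ending program\n");
    exit(1);
}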

Obviously, this code is x86 only and intended for Linux (and tested on
such a platform), but it may be applicable elsewhere, e.g. BSD.

Ed

Wendy E. McCaughrin

2004-11-20 23:52:33 UTC

Ed Beroset <***@crayne.org> wrote:
: Justin L. Kennedy wrote:

: > Since the x86 floating point stack is only 8 operands high, how do I make
: > sure that the operands I put on the stack along with the operands these
: > other functions may put on the stack will not be more than 8?

: The obvious answer is "keep track of the count," but of course there's
: more to it than that because your code isn't the only thing putting
: things on the FPU stack.

: > Does the
: > processor spill those registers onto the call stack automatically when I
: > use the CALL instruction, or is it controlled by call conventions specific
: > to processors?

: The processor does not spill those registers, but I suppose a compiler
: might. The only way to get things onto the FPU stack is from memory
: (e.g. FILD) or by explicit instruction (e.g. FLDPI), so there is no
: processor supported way to load directly from CPU registers to FPU stack
: space (really just a set of registers).

: > I haven't seen this issue addressed in the tutorials I
: > have looked at. All the examples just add onto the stack when their
: > functions enter and do nothing when they call other functions, which
: > leads me to believe that either the processor handles this automatically
: > or they are just really bad examples.

: I don't know what exactly you're looking at, but the examples may be
: valid even if they don't pay any attention to the FPU stack. Looking
: into the Linux source code, FPU stack overflow or underflow just
: generates a SIGFPE which the user is responsible for handling, but if
: you look into it (I'm looking at arch/i386/kernel/traps.c) you'll see
: that the math_error() function includes a warning that multiple
: exceptions are not handled reliably.

: Generically, the way it could be handled is to unmask the stack
: exception and write an exception handler. Since the exception is
: triggered *before* any change is actually made to the FPU stack, one way
: to deal with it would be to create a memory pool which would act as an
: extension to the FPU stack (in fact Intel suggests this mechanism). On
: a overflow condition (trying to write to a non-empty FPU register) you
: could write the value to memory instead, and on underflow (trying to
: read from an empty FPU register) you could read from memory.

: This should all be possible from within a user process under Linux
: (assuming that's what you're using since you refer to libc and gcc) if
: you write a signal handler for SIGFPE. As a starter:

: #include <stdio.h>
: #include <math.h>
: #include <signal.h>
: #include <fenv.h>

: void sig_pfe();

: int main()
: {
: int i;

: signal(SIGFPE,sig_pfe);
: feenableexcept(FE_INVALID);

It should be noted that of the 6 possible exceptions the x87 can incur,
"invalid operation" includes a number of other possibilities besides
stack overflow -- such as: operation on a signalling NaN, and indeterminate
form (e.g., 0/0 with FDIV). So you would need to distinguish the case for
stack overflow, which is done by examining the SF (bit 6) of the x87's
status word. That is, get a copy of the status word in your handler and
isolate this Stack Fault flag (say, AND'ing with 0x0040): if it is set, the
invalid operation is a stack fault, and condition code C1 (bit 9, mask
0x0200) then tells overflow (C1 = 1) from underflow (C1 = 0).
This is from my Intel x87 documentation, Order #290376-001 (though you
will most likely want a more up-to-date version).
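
As a sketch of that test, assuming the handler has already obtained a copy
of the 16-bit status word somehow (FNSTSW is the usual instruction). The
enum and function names below are made up for illustration; only the bit
positions come from the Intel documentation:

```c
#include <assert.h>

#define X87_SW_IE 0x0001u   /* bit 0: invalid operation */
#define X87_SW_SF 0x0040u   /* bit 6: stack fault */
#define X87_SW_C1 0x0200u   /* bit 9: condition code C1 */

typedef enum {
    X87_NO_INVALID,         /* IE clear: no invalid-operation exception */
    X87_INVALID_ARITH,      /* IE set, SF clear: SNaN, 0/0 and the like */
    X87_STACK_OVERFLOW,     /* IE and SF set, C1 = 1 */
    X87_STACK_UNDERFLOW     /* IE and SF set, C1 = 0 */
} X87Invalid;

/* Classify an invalid-operation exception from a copy of the status word. */
static X87Invalid classify_x87_invalid(unsigned int status_word)
{
    assert(status_word <= 0xFFFFu);   /* the status word is 16 bits wide */
    if ((status_word & X87_SW_IE) == 0)
        return X87_NO_INVALID;
    if ((status_word & X87_SW_SF) == 0)
        return X87_INVALID_ARITH;
    return (status_word & X87_SW_C1) ? X87_STACK_OVERFLOW
                                     : X87_STACK_UNDERFLOW;
}
```

For example, a status word of 0x0241 (IE, SF and C1 all set) classifies as
a stack overflow, while 0x0041 is an underflow and 0x0001 is an ordinary
invalid arithmetic operand.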

-- Scott