X86 Assembly/Floating Point

The ALU is only capable of dealing with integer values. While integers are sufficient for some applications, it is often necessary to work with numbers that have fractional parts. A highly specialized coprocessor, the floating-point unit (FPU), allows you to manipulate such numbers.

x87 Coprocessor
The original x86 family members had a separate math coprocessor that handled floating point arithmetic. The original coprocessor was the 8087, and all FPUs since have been dubbed “x87” chips. Later variants integrated the FPU into the microprocessor itself. Having the capability to manage floating point numbers means a few things:
 * 1) The microprocessor must have space to store floating point numbers.
 * 2) The microprocessor must have instructions to manipulate floating point numbers.

The FPU, even when it is integrated into an x86 chip, is still called the “x87” section. For instance, literature on the subject will frequently call the FPU Register Stack the “x87 Stack”, and the FPU operations will frequently be called the “x87 instruction set”.

The presence of an integrated x87 FPU can be checked using the CPUID instruction.
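As a minimal sketch of such a check, the program below queries CPUID leaf 1 and tests bit 0 of EDX, which is the on-chip FPU feature flag. It assumes a 32-bit Linux target (exit through `int 0x80`) and a processor that supports CPUID at all; neither assumption comes from the text above.

```nasm
; Sketch: exit status 0 if an x87 FPU is reported, 1 otherwise.
; Assumed build: nasm -f elf32 fpucheck.asm && ld -m elf_i386 fpucheck.o
global _start

section .text
_start:
    mov     eax, 1          ; CPUID leaf 1: feature information
    cpuid                   ; overwrites EAX, EBX, ECX and EDX
    xor     ebx, ebx        ; exit status 0: FPU present
    test    edx, 1          ; EDX bit 0 = on-chip x87 FPU
    jnz     .exit
    mov     ebx, 1          ; exit status 1: no FPU reported
.exit:
    mov     eax, 1          ; sys_exit (32-bit Linux, assumed)
    int     0x80
```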

FPU Register Stack
The FPU has an array of eight registers that can be accessed as a stack. There is one top index indicating the current top of the stack. Pushing or popping items to or from the stack will only change the top index and store or wipe data respectively.

ST(0) or simply ST refers to the register that is currently at the top of the stack. If eight values were stored on the stack, ST(7) refers to the last element on the stack (i.e. the bottom).
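The renumbering of the stack registers can be followed in a short sketch. NASM writes the registers as st0 through st7; the comments track what each name refers to after every instruction. The 32-bit Linux exit sequence is an assumption made only to keep the example self-contained.

```nasm
; Sketch: how st0, st1, ... shift as values are pushed and popped.
global _start

section .text
_start:
    finit                   ; reset the FPU: the stack is empty
    fld1                    ; push 1.0 -> st0 = 1.0
    fldz                    ; push 0.0 -> st0 = 0.0, st1 = 1.0
    fldpi                   ; push pi  -> st0 = pi,  st1 = 0.0, st2 = 1.0
    fxch    st2             ; swap st0 and st2 -> st0 = 1.0, st2 = pi
    fstp    st0             ; pop -> st0 = 0.0, st1 = pi
    fstp    st0             ; pop -> st0 = pi
    fstp    st0             ; pop -> stack empty again

    mov     eax, 1          ; sys_exit (32-bit Linux, assumed)
    xor     ebx, ebx
    int     0x80
```

Storing st0 to itself with FSTP is a common way to discard the top of the stack without writing to memory.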

Numbers are pushed onto the stack from memory, and are popped off the stack back to memory. There is no instruction that transfers values directly between the x87 stack and the ALU registers. The x87 stack can only be accessed by FPU instructions: an x87 register cannot be used as an operand of an ordinary instruction such as MOV. It is therefore necessary to store values to memory if you want to print them, for example.
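A minimal sketch of the round trip through memory follows. The labels `val` and `tmp` are illustrative, and the program assumes 32-bit Linux and the default rounding mode (round to nearest), under which 41.7 is stored as the integer 42.

```nasm
; Sketch: moving an x87 value into a general-purpose register via memory.
global _start

section .data
    val:    dq 41.7         ; 64-bit floating-point input (illustrative)

section .bss
    tmp:    resd 1          ; scratch memory for the converted integer

section .text
_start:
    fld     qword [val]     ; st0 = 41.7
    fistp   dword [tmp]     ; convert to integer, store to memory, pop
    mov     ebx, [tmp]      ; only now can an ALU register see the value

    mov     eax, 1          ; sys_exit; the value doubles as exit status
    int     0x80
```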

Arithmetic FPU instructions generally operate on the top one or two items of the stack. The popping forms take the top two items, act on them, and leave the answer on the top of the stack in place of both operands.

Floating point numbers may generally be either 32 bits long, the float data type in the programming language C, or 64 bits long, double in C. However, in order to reduce round-off errors, the FPU stack registers are all 80 bits wide.
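The three widths can be seen side by side in NASM, where the size keywords dword, qword and tword select 32-, 64- and 80-bit memory operands. This is a sketch with illustrative labels, assuming 32-bit Linux for the exit sequence.

```nasm
; Sketch: loading and storing 32-, 64- and 80-bit floating-point values.
global _start

section .data
    f32:    dd 1.5          ; 32 bits (C float)
    f64:    dq 1.5          ; 64 bits (C double)
    f80:    dt 1.5          ; 80 bits (native width of an x87 register)

section .bss
    out32:  resd 1
    out64:  resq 1
    out80:  rest 1          ; reserve ten bytes

section .text
_start:
    fld     dword [f32]     ; widened to 80 bits on load
    fld     qword [f64]     ; widened to 80 bits on load
    fld     tword [f80]     ; loaded as is
    fstp    tword [out80]   ; stored without narrowing
    fstp    qword [out64]   ; rounded to 64 bits on store
    fstp    dword [out32]   ; rounded to 32 bits on store

    mov     eax, 1          ; sys_exit (32-bit Linux, assumed)
    xor     ebx, ebx
    int     0x80
```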

Most calling conventions return floating point values in the ST(0) register.
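As an illustration, here is a sketch of a function that a C caller could declare as `double half(double x);`. The function name and the constant label are hypothetical, and the code assumes the 32-bit cdecl convention, where arguments are passed on the stack and the result is left in st0.

```nasm
; Sketch: double half(double x), 32-bit cdecl (assumed convention).
global half

section .data
    two:    dq 2.0

section .text
half:
    fld     qword [esp+4]   ; load the argument x from the caller's stack
    fdiv    qword [two]     ; st0 = x / 2.0
    ret                     ; the result stays in st0 for the caller
```

The x87 stack holds exactly one value on return, which is what the caller expects for a floating-point result.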

Examples
The following program (using NASM syntax) calculates the square root of 123.45.
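A minimal sketch of such a program is shown here. It assumes 32-bit Linux (exit through `int 0x80`); the labels `val` and `res` are illustrative, and the result is only stored to memory, not printed.

```nasm
; Sketch: res = sqrt(123.45)
; Assumed build: nasm -f elf32 sqrt.asm && ld -m elf_i386 sqrt.o
global _start

section .data
    val:    dq 123.45       ; 64-bit input value

section .bss
    res:    resq 1          ; room for the 64-bit result

section .text
_start:
    fld     qword [val]     ; push the value onto the x87 stack
    fsqrt                   ; st0 = sqrt(st0)
    fstp    qword [res]     ; store the result and pop the stack

    mov     eax, 1          ; sys_exit (32-bit Linux, assumed)
    xor     ebx, ebx
    int     0x80
```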

Essentially, programs that use the FPU load values onto the stack with FLD and its variants, perform operations on these values, then store them into memory with one of the forms of FST. FSTP is the most common choice when you are done with x87, because it also pops the value and so cleans up the x87 stack as most calling conventions require.

Here is a more complex example that evaluates the Law of Cosines:
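The sketch below computes $$c = \sqrt{a^2 + b^2 - 2ab\cos C}$$ with the angle given in radians. The input values and labels are illustrative, the FCOS instruction requires an 80387 or later, and the exit sequence assumes 32-bit Linux.

```nasm
; Sketch: c = sqrt(a^2 + b^2 - 2*a*b*cos(C))
global _start

section .data
    a:      dq 4.56         ; side a (illustrative value)
    b:      dq 7.89         ; side b (illustrative value)
    angle:  dq 1.5          ; angle C between a and b, in radians

section .bss
    c:      resq 1          ; resulting side c

section .text
_start:
    fld     qword [a]       ; st0 = a
    fmul    st0, st0        ; st0 = a^2
    fld     qword [b]       ; st0 = b,   st1 = a^2
    fmul    st0, st0        ; st0 = b^2, st1 = a^2
    faddp   st1, st0        ; st0 = a^2 + b^2

    fld     qword [angle]   ; st0 = C, st1 = a^2 + b^2
    fcos                    ; st0 = cos(C)
    fmul    qword [a]       ; st0 = a*cos(C)
    fmul    qword [b]       ; st0 = a*b*cos(C)
    fadd    st0, st0        ; st0 = 2*a*b*cos(C)

    fsubp   st1, st0        ; st0 = a^2 + b^2 - 2*a*b*cos(C)
    fsqrt                   ; st0 = c
    fstp    qword [c]       ; store c and leave the x87 stack empty

    mov     eax, 1          ; sys_exit (32-bit Linux, assumed)
    xor     ebx, ebx
    int     0x80
```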

Floating-Point Instruction Set
You may notice that some of the instructions below differ from one another in name by just one letter: a P appended to the end. This suffix signifies that, in addition to performing the normal operation, the instruction also pops the x87 stack after execution is complete.
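The difference can be seen by running FADD and FADDP on the same stack. This is a sketch assuming 32-bit Linux; the label `out` is illustrative.

```nasm
; Sketch: FADD keeps both operands on the stack, FADDP pops one.
global _start

section .bss
    out:    resd 1

section .text
_start:
    fld1                    ; st0 = 1.0
    fld1                    ; st0 = 1.0, st1 = 1.0  (two values on the stack)
    fadd    st1, st0        ; st1 = 2.0, st0 = 1.0  (still two values)
    faddp   st1, st0        ; st1 = 3.0, then pop -> st0 = 3.0 (one value)
    fistp   dword [out]     ; store 3 as an integer and pop: stack empty

    mov     ebx, [out]      ; use the result as the exit status
    mov     eax, 1          ; sys_exit (32-bit Linux, assumed)
    int     0x80
```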

Original 8087 instructions
FDISI, FENI, FLDENVW, FLDPI, FNCLEX, FNDISI, FNENI, FNINIT, FNSAVEW, FNSTENVW, FRSTORW, FSAVEW, FSTENVW

Data Transfer Instructions

 * FLD: load floating-point value
 * FILD: load integer
 * load a constant on top of the stack
   * FLD1: $$+1$$
   * FLDL2E: $$\log_2 e$$
   * FLDL2T: $$\log_2 10$$
   * FLDLG2: $$\log_{10} 2$$
   * FLDLN2: $$\ln 2$$
   * FLDZ: “positive” $$0$$


 * FIST, FISTP: store integer
 * FXCH: exchange
 * FISTTP: store a truncated integer

Arithmetic Instructions

 * FABS: absolute value
 * FCHS: change sign
 * FXTRACT: split exponent and significand


 * FADD, FADDP, FIADD: addition
 * FSUB, FSUBP, FISUB: subtraction
 * FSUBR, FSUBRP, FISUBR: reverse subtraction


 * FSQRT: square root
 * FDIV, FDIVP, FIDIV: division (see also the FDIV bug on Wikipedia)
 * FPREM: partial remainder
 * FRNDINT: round to integer
 * FSCALE: multiply/divide by integral powers of 2
 * F2XM1: $$2^x - 1$$
 * FYL2X: $$y \log_2 x$$
 * FYL2XP1: $$y \log_2\left(x + 1\right)$$
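The logarithm instructions combine naturally with the constant loads. As a sketch, a natural logarithm can be obtained from $$\ln x = \ln 2 \cdot \log_2 x$$ by letting $$y = \ln 2$$ sit in st1 and $$x$$ in st0 before the instruction that computes $$y \log_2 x$$. The labels and the input value are illustrative, and the exit sequence assumes 32-bit Linux.

```nasm
; Sketch: res = ln(x), computed as ln(2) * log2(x)
global _start

section .data
    x:      dq 10.0         ; illustrative input, must be positive

section .bss
    res:    resq 1

section .text
_start:
    fldln2                  ; st0 = ln 2
    fld     qword [x]       ; st0 = x, st1 = ln 2
    fyl2x                   ; st1 = st1 * log2(st0), pop -> st0 = ln x
    fstp    qword [res]     ; store the result and empty the stack

    mov     eax, 1          ; sys_exit (32-bit Linux, assumed)
    xor     ebx, ebx
    int     0x80
```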

FPU Internal and Other Instructions

 * FINIT: initialize FPU


 * FINCSTP and FDECSTP: increment or decrement top
 * FFREE: tag a register as free


 * FTST: test
 * FCOM, FCOMP, FCOMPP: compare floating-point values
 * FICOM, FICOMP: compare with an integer
 * FXAM: examine a register


 * FCLEX: clear exceptions
 * does the same as.
 * does the same as.

Added with 80287
FSETPM

Added with 80387
FCOS, FLDENVD, FNSAVED, FNSTENVD, FPREM1, FRSTORD, FSAVED, FSIN, FSINCOS, FSTENVD, FUCOM, FUCOMP, FUCOMPP

Added with Pentium Pro
FCMOVB, FCMOVBE, FCMOVE, FCMOVNB, FCMOVNBE, FCMOVNE, FCMOVNU, FCMOVU, FCOMI, FCOMIP, FUCOMI, FUCOMIP

Added with SSE
FXRSTOR, FXSAVE

These are also supported on later Pentium II processors, which do not have SSE support.

Added with SSE3
FISTTP (x87 to integer conversion with truncation regardless of status word)

Undocumented instructions

 * FFREEP: performs FFREE ST(i) and pops the stack