X86 Disassembly/Calling Convention Examples

Microsoft C Compiler
Here is a simple function in C:

Using cl.exe, we are going to generate 3 separate listings for MyFunction, one with CDECL, one with FASTCALL, and one with STDCALL calling conventions. On the commandline, there are several switches that you can use to force the compiler to change the default:


 * : The default calling convention is CDECL
 * : The default calling convention is FASTCALL
 * : The default calling convention is STDCALL

Using these commandline options, here are the listings:

CDECL
becomes: On entry of a function, ESP points to the return address pushed on the stack by the call instruction (that is, previous contents of EIP). Any argument in stack of higher address than entry ESP is pushed by caller before the call is made; in this example, the first argument is at offset +4 from ESP (EIP is 4 bytes wide), plus 4 more bytes once the EBP is pushed on the stack. Thus, at line 5, ESP points to the saved frame pointer EBP, and arguments are located at addresses ESP+8 (x) and ESP+12 (y).

For CDECL, caller pushes arguments into stack in a right to left order. Because ret 0 is used, it must be the caller who cleans up the stack.

As a point of interest, notice how lea is used in this function to simultaneously perform the multiplication (ecx * 2), and the addition of that quantity to eax. Unintuitive instructions like this will be explored further in the chapter on unintuitive instructions.

FASTCALL
becomes:

This function was compiled with optimizations turned off. Here we see arguments are first saved in stack then fetched from stack, rather than be used directly. This is because the compiler wants a consistent way to use all arguments via stack access, not only one compiler does like that.

There is no argument is accessed with positive offset to entry SP, it seems caller doesn’t pushed in them, thus it can use ret 0. Let’s do further investigation:

and the corresponding listing:

Now we have 6 arguments, four are pushed in by caller from right to left, and last two are passed again in cx/dx, and processed the same way as previous example. Stack cleanup is done by ret 16, which corresponding to 4 arguments pushed before call executed.

For FASTCALL, compiler will try to pass arguments in registers, if not enough caller will pushed them into stack still in an order from right to left. Stack cleanup is done by callee. It is called FASTCALL because if arguments can be passed in registers (for 64bit CPU the maximum number is 6), no stack push/clean up is needed.

The name-decoration scheme of the function: @MyFunction@n, here n is stack size needed for all arguments.

STDCALL
becomes:

The STDCALL listing has only one difference than the CDECL listing that it uses "ret 8" for self clean up of stack. Lets do an example with more parameters:

Let's take a look at how this function gets translated into assembly by cl.exe:

Yes the only difference between STDCALL and CDECL is that the former does stack clean up in callee, the later in caller. This saves a little bit in X86 due to its "ret n".

GNU C Compiler
We will be using 2 example C functions to demonstrate how GCC implements calling conventions:

and

GCC does not have commandline arguments to force the default calling convention to change from CDECL (for C), so they will be manually defined in the text with the directives: __cdecl, __fastcall, and __stdcall.

CDECL
The first function (MyFunction1) provides the following assembly listing:

First of all, we can see the name-decoration is the same as in cl.exe. We can also see that the ret instruction doesn't have an argument, so the calling function is cleaning the stack. However, since GCC doesn't provide us with the variable names in the listing, we have to deduce which parameters are which. After the stack frame is set up, the first instruction of the function is "movl 8(%ebp), %eax". One we remember (or learn for the first time) that GAS instructions have the general form:

instruction src, dest

We realize that the value at offset +8 from ebp (the last parameter pushed on the stack) is moved into eax. The leal instruction is a little more difficult to decipher, especially if we don't have any experience with GAS instructions. The form "leal(reg1,reg2), dest" adds the values in the parenthesis together, and stores the value in dest. Translated into Intel syntax, we get the instruction:

Which is clearly the same as a multiplication by 2. The first value accessed must then have been the last value passed, which would seem to indicate that values are passed right-to-left here. To prove this, we will look at the next section of the listing:

the value at offset +12 from ebp is moved into edx. edx is then moved into eax. eax is then added to itselt (eax * 2), and then is added back to edx (edx + eax). remember though that eax = 2 * edx, so the result is edx * 3. This then is clearly the y parameter, which is furthest on the stack, and was therefore the first pushed. CDECL then on GCC is implemented by passing arguments on the stack in right-to-left order, same as cl.exe.

FASTCALL
Notice first that the same name decoration is used as in cl.exe. The astute observer will already have realized that GCC uses the same trick as cl.exe, of moving the fastcall arguments from their registers (ecx and edx again) onto a negative offset on the stack. Again, optimizations are turned off. ecx is moved into the first position (-4) and edx is moved into the second position (-8). Like the CDECL example above, the value at -4 is doubled, and the value at -8 is tripled. Therefore, -4 (ecx) is x, and -8 (edx) is y. It would seem from this listing then that values are passed left-to-right, although we will need to take a look at the larger, MyFunction2 example:

By following the fact that in MyFunction2, successive parameters are added to increasing constants, we can deduce the positions of each parameter. -4 is still x, and -8 is still y. +8 gets incremented by 1 (z), +12 gets increased by 2 (a). +16 gets increased by 3 (b), and +20 gets increased by 4 (c). Let's list these values then:

z = [ebp + 8] a = [ebp + 12] b = [ebp + 16] c = [ebp + 20]

c is the furthest down, and therefore was the first pushed. z is the highest to the top, and was therefore the last pushed. Arguments are therefore pushed in right-to-left order, just like cl.exe.

STDCALL
Let's compare then the implementation of MyFunction1 in GCC:

The name decoration is the same as in cl.exe, so STDCALL functions (and CDECL and FASTCALL for that matter) can be assembled with either compiler, and linked with either linker, it seems. The stack frame is set up, then the value at [ebp + 8] is doubled. After that, the value at [ebp + 12] is tripled. Therefore, +8 is x, and +12 is y. Again, these values are pushed in right-to-left order. This function also cleans its own stack with the "ret 8" instruction.

Looking at a bigger example:

We can see here that values at +8 and +12 from ebp are still x and y, respectively. The value at +16 is incremented by 1, the value at +20 is incremented by 2, etc all the way to the value at +28. We can therefore create the following table:

x = [ebp + 8] y = [ebp + 12] z = [ebp + 16] a = [ebp + 20] b = [ebp + 24] c = [ebp + 28]

With c being pushed first, and x being pushed last. Therefore, these parameters are also pushed in right-to-left order. This function then also cleans 24 bytes off the stack with the "ret 24" instruction.