In this lab, you will learn a little more about the x86 instruction set by writing and examining programs written in x86 assembly. We will use a few tools that may be somewhat unfamiliar, including the GNU debugger, the GNU assembler, and makefiles.
The third reference is a complete list of all x86 instructions, which you probably will not need. If you are looking for an instruction to perform a particular operation, you can search on the page for matches in the description column. There may be many instructions with the same description, but these will differ in their operand types. Pay close attention to operands; x86 has quite a few funny quirks, and these show up mainly in operands.
There are many tricky details in the x86 calling convention used on Linux, but for this lab you just need to know three things: which registers parameters go into, where return values go, and how to call functions like
printf that take a variable number of parameters.
Parameters to functions are placed in registers in this order: %rdi, %rsi, %rdx, %rcx, %r8, %r9.
Return values go into the %rax register.
When you call a function like
printf, which takes a variable number of parameters, you use the same calling convention with one added detail.
You have to set the
%al register to zero, which indicates that you are not passing any parameters in vector registers (a special kind of register for performing operations in parallel). You can see an example of this in the second hint for part C.
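The three rules above can be seen together in a small sketch. It assumes the same build setup as the rest of this lab; the function name sum and the label fmt are made up for illustration.

```asm
# A minimal sketch of the calling convention. The names "sum" and "fmt" are hypothetical.
.data
fmt: .asciz "sum(%u, %u) is %u\n"

.text
# sum(a, b): a arrives in %rdi, b arrives in %rsi, and the result is returned in %rax
sum:
  mov %rdi, %rax         # Copy the first parameter into the return value register
  add %rsi, %rax         # Add the second parameter, leaving the result in %rax
  ret

.global main
main:
  push %rbp              # Save the frame pointer
  mov %rsp, %rbp         # Use the current stack pointer as the new frame pointer
  mov $3, %rdi           # First parameter to sum
  mov $4, %rsi           # Second parameter to sum
  call sum               # The return value is now in %rax
  leaq fmt(%rip), %rdi   # First parameter to printf: the format string
  mov $3, %rsi           # Second parameter to printf
  mov $4, %rdx           # Third parameter to printf
  mov %rax, %rcx         # Fourth parameter to printf: the value sum returned
  movb $0, %al           # No vector registers are used for this variable-argument call
  call printf
  mov $0, %rax           # Set main to return zero
  leave
  ret
```

Note that the return value is copied out of %rax before %al is cleared, since %al is the low byte of %rax. If this builds the same way as the other programs in this lab, it should print sum(3, 4) is 7.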
Many parts of x86 assembly language are similar to MIPS, but there are some differences that will show up quickly. Here are a few important ones:
Most x86 arithmetic instructions take two operands, and the second operand is also the destination. The add %rax, %rbx instruction adds the values in %rax and %rbx and stores the result in %rbx.
x86 has dedicated instructions for working with the stack: push and pop. The instruction push %rcx moves the stack down and copies the value in %rcx to the bottom of the stack. The instruction pop %rcx pulls the value off the bottom of the stack and stores it into %rcx, then moves the stack up.
Functions set up a frame pointer when they start. First, you save the caller's frame pointer on the stack (push %rbp). Next, you take the current stack pointer and make this the frame pointer (mov %rsp, %rbp). At the end of the function, you need to undo this process. Luckily, x86 has a handy leave instruction that undoes this; then you execute a ret instruction to return.
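As a rough illustration of these differences, here is a small sketch that uses a two-operand add, a push and pop pair, and the standard prologue and epilogue. The label fmt and the specific constants are invented for the example.

```asm
# A minimal sketch of two-operand arithmetic and push/pop. The label "fmt" is hypothetical.
.data
fmt: .asciz "%u\n"

.text
.global main
main:
  push %rbp              # Prologue: save the caller's frame pointer
  mov %rsp, %rbp         # Prologue: the current stack pointer becomes the frame pointer
  mov $40, %rcx
  mov $2, %rdx
  add %rdx, %rcx         # %rcx = %rcx + %rdx, so %rcx now holds 42
  push %rcx              # Save the value of %rcx on the stack
  mov $0, %rcx           # Overwrite %rcx to show that the saved copy survives
  pop %rcx               # Restore the saved value; %rcx holds 42 again
  leaq fmt(%rip), %rdi   # First parameter to printf: the format string
  mov %rcx, %rsi         # Second parameter to printf: the restored value
  movb $0, %al           # No vector registers are used for this variable-argument call
  call printf
  mov $0, %rax           # Set main to return zero
  leave                  # Epilogue: undo the prologue
  ret
```

The push and pop are balanced before the call to printf, so the stack pointer is back where the prologue left it. Built like the other programs in this lab, this should print 42.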
We’ll start out by constructing a simple “Hello World” program with x86 assembly to make sure you have all the tools up and running.
First, create a directory for your lab work today.
Next, we’ll write our first program’s source code in that directory, in a file named hello.s:
# Begin the data section, where we put constants. This contains our string message.
.data
msg: .asciz "Hello World!\n"

# Begin the text section, which is where we put instructions
.text
.global main             # The main function should be visible outside this file
main:
  push %rbp              # Save the frame pointer
  mov %rsp, %rbp         # Use the current stack pointer as the new frame pointer
  leaq msg(%rip), %rdi   # Compute the location of our printf message and put it in the first parameter register.
                         # The (%rip) part tells the assembler to use PC-relative addressing.
  call puts              # Call the puts function to display the message
  mov $0, %rax           # Set main to return zero
  leave                  # Restore the frame pointer
  ret                    # Return
You can use your favorite text editor to create this file, as long as MPLAB is not your favorite text editor. Eclipse is probably not a great choice either.
Now that we have our source code, we can compile it into an executable with the following command:
$ gcc -o hello hello.s -lc
This tells the GNU compiler collection to translate
hello.s from x86 assembly to an executable file named
hello. This file uses the C standard library (for the function
puts) so we pass
-lc to link this library in. The
gcc command will invoke the GNU Assembler (
gas) and the GNU Linker (
ld) to produce this file.
While this works fine, we can use a Makefile to quickly rebuild the program when we edit its source code. Create a new file called
Makefile and add this to the file using your favorite text editor:
all: hello

clean:
	rm -f hello

hello: hello.s
	gcc -o hello hello.s -lc
This file declares three targets: all, clean, and hello. The all and clean targets are called “phony” targets because they do not name files to build; they are just shorthand for other targets or commands. After the name of a target, we write a colon and then the list of files or targets it depends on. By writing all: hello we are telling make that in order to build all, it must build hello. The make tool will build the first target by default, so we usually set it up to build everything in the first target. You can build a specific target by typing
make clean or
make hello. Following our target name and its dependencies, we write the shell commands that
make must execute to transform the dependencies into the target. These lines must be indented with tabs, not spaces.
If you run
make in your shell, you’ll probably receive the message
make: Nothing to be done for 'all'. That’s because we already built
hello by hand. If you edit your
hello.s file and then run
make, it should re-execute the steps to transform hello.s into
hello because the dependencies have changed. If you want to force
make to rebuild
hello you can type
make clean then make.
Once you’ve built your
hello program using
make, run it! Type the command
./hello into your shell and hit enter.
The last step in this process is to use
gdb, the GNU Debugger, to step through our program’s execution. Run the following command to start our program with
gdb. You’ll get a fairly long message, then a prompt from inside of gdb.
$ gdb ./hello
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./hello...(no debugging symbols found)...done.
(gdb)
Now that we’re inside of
gdb, we can send commands to
gdb to control how the program executes.
Begin by running the start command as below:
(gdb) start
Temporary breakpoint 1 at 0x40050a
Starting program: /home/curtsinger/x86/hello

Temporary breakpoint 1, 0x000000000040050a in main ()
(gdb)
The start command brings the program to the start of our main function. It stops at that point by placing a temporary breakpoint at the beginning of main.
Now that we’re at the beginning of the main function, we can tell
gdb to display the current instruction with this command:
(gdb) display/i $pc
1: x/i $pc
=> 0x40050a <main+4>:   lea    0x2003c7(%rip),%rdi        # 0x6008d8
This command tells gdb to display the contents of memory at the program counter as an instruction each time the program stops.
You can see that
gdb stopped our program 4 bytes past the beginning of
main, which skips over the first two instructions of the function, which are often called the “prologue.”
Now we can walk through one instruction at a time with the stepi command:
(gdb) stepi
0x0000000000400511 in main ()
1: x/i $pc
=> 0x400511 <main+11>:  callq  0x4003e0 <puts@plt>
At any point while the program is stopped, you can run
info registers to see the value in each of the processor’s registers.
Keep going until you end up inside the
puts function. You can finish running
puts until you get back to
main using the finish command.
Okay, good work! If anything did not run as expected, now is the time to get some help. If things looked okay, continue on to the next step.
Using our message program as a starting point, you will write a program that uses a loop to print a message ten times.
Write your new program in the file hellohello.s.
You may not simply copy-paste the code ten times!
You will need to add a target for
hellohello, and add
hellohello to the dependencies for the
all target and the files removed by the clean target.
Hint: The x86 architecture performs conditional jumps in two stages, sort of like the
slt instruction in MIPS.
You can use the
cmp instruction to compare two registers, which records the result of that comparison in a special set of “flag registers.”
Conditional jumps run after the comparison use these flags to decide whether to take a jump or not.
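To make the two stages concrete, here is a small sketch that compares two values and jumps over one instruction depending on the result. It is not a loop, and the labels fmt and show_result are invented for this example. In the AT&T syntax used here, cmp %rdx, %rsi compares %rsi against %rdx, and jge takes the jump when %rsi is greater than or equal to %rdx as a signed value.

```asm
# A minimal sketch of cmp followed by a conditional jump. Label names are hypothetical.
.data
fmt: .asciz "the larger value is %u\n"

.text
.global main
main:
  push %rbp              # Save the frame pointer
  mov %rsp, %rbp         # Use the current stack pointer as the new frame pointer
  mov $7, %rsi           # First value to compare
  mov $3, %rdx           # Second value to compare
  cmp %rdx, %rsi         # Stage one: compare %rsi with %rdx and set the flags
  jge show_result        # Stage two: jump if %rsi >= %rdx, based only on the flags
  mov %rdx, %rsi         # Only runs when %rsi < %rdx: keep the larger value in %rsi
show_result:
  leaq fmt(%rip), %rdi   # First parameter to printf: the format string
  movb $0, %al           # No vector registers are used for this variable-argument call
  call printf            # %rsi already holds the second parameter
  mov $0, %rax           # Set main to return zero
  leave
  ret
```

With the constants above the jump is taken, so this should print the larger value is 7. Swapping the two constants exercises the other path.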
Once you have a working program, run it with
gdb and stop after your loop bounds check.
Print the register values with
info registers at this point. Which flags are consistently set or un-set by the comparison when the loop is finished? What about when it isn’t finished?
Call over a mentor to demonstrate your program and report your answer to the flags question.
In this part of the lab, you will implement a recursive factorial function, and a
main function to test your factorial implementation.
Your main function should count from zero up to ten, call
fact on each of these values, and then print out a message showing the result.
Here is how your completed
fact program should work:
$ ./fact
fact(0) is 1
fact(1) is 1
fact(2) is 2
fact(3) is 6
fact(4) is 24
fact(5) is 120
fact(6) is 720
fact(7) is 5040
fact(8) is 40320
fact(9) is 362880
fact(10) is 3628800
Write your code in a file
fact.s, and update your
Makefile to build a program called
fact from your code.
Once you have a working implementation, show it to me or one of the mentors.
Hint: Recursive functions are quite a bit simpler in x86 than on MIPS.
Dealing with the stack is pretty simple because we have push and pop.
You’ll still have to save at least one value on the stack or you will lose it after the recursive call.
Registers used to pass parameters are not saved across procedure calls!
Hint: You will need to call
printf to show your message. Here’s a sample program that calls printf:
# Begin the data section, where we put constants. This contains our string message.
.data
msg: .asciz "fact(%u) is %u\n"

# Begin the text section, which is where we put instructions
.text
.global main             # The main function should be visible outside this file
main:
  push %rbp              # Save the frame pointer
  mov %rsp, %rbp         # Use the current stack pointer as the new frame pointer
  leaq msg(%rip), %rdi   # Get the location of our printf message and put it in the first parameter register
  mov $5, %rsi           # Put the constant 5 into the second parameter register
  mov $120, %rdx         # Put the constant 120 into the third parameter register
  movb $0, %al           # Put the constant 0 into the %al register, as required when calling functions that take variable arguments
  call printf            # Call the printf function to display the message
  mov $0, %rax           # Set main to return zero
  leave                  # Restore the frame pointer
  ret                    # Return