Assembly 101

Assembly can be defined as the human-readable text representation of machine code, where machine code is the byte-level program that a processor executes. Assembly is only a thin layer on top of machine code, so it is very low level and gives a lot of control to the programmers who write it.

Very few people regularly write assembly. However, a lot of the concepts introduced when learning assembly are very useful for developing a deep understanding of computer systems.

Some of the benefits of understanding assembly are that you will be aware of how data is represented in memory, and how instructions given to the processor use data. This post is specifically about x86-64 assembly.


When writing assembly, some of the program state is visible to the programmer. This makes it possible for programmers to manipulate and access data and do conditional branching, for example. The state that is exposed is:

  • The program counter - this is the address of the next instruction, so it marks the point the processor has reached in executing the program. On x86-64 it lives in the %rip register.
  • Register file - this is a collection of processor registers, which is where data is held while it is being manipulated.
  • Condition codes - these codes store any status information about the most recent arithmetic or logical operation that was executed. These can be used for conditional branching by programmers.
  • Memory - the memory is exposed to programmers as a byte-addressable array. This means programmers can access very specific parts of memory and manipulate it as desired.
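To make the condition codes a bit more concrete, here is a rough sketch in C of the four flags a 64-bit addition sets. The struct and function names are my own invention for illustration, and this only models addition, not every instruction that touches the flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four main condition codes. */
struct flags {
    bool zf; /* zero flag: the result was 0 */
    bool sf; /* sign flag: the top bit of the result is set */
    bool cf; /* carry flag: the unsigned addition wrapped around */
    bool of; /* overflow flag: the signed addition overflowed */
};

/* Flags that a 64-bit add of a and b would set (a sketch, not an emulator). */
struct flags flags_after_add(uint64_t a, uint64_t b) {
    uint64_t r = a + b;          /* unsigned addition wraps modulo 2^64 */
    uint64_t sign = 1ULL << 63;  /* mask for the top (sign) bit */
    struct flags f;
    f.zf = (r == 0);
    f.sf = (r & sign) != 0;
    f.cf = (r < a);              /* the sum wrapped past 2^64 */
    /* Signed overflow: both operands share a sign that the result lacks. */
    f.of = ((~(a ^ b)) & (a ^ r) & sign) != 0;
    return f;
}
```

A conditional jump after the addition would then look at one or more of these flags to decide whether to branch.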

Data Types

Unlike in higher-level programming languages such as Ruby or JavaScript, assembly has a lot of very specific rules and information about data types, and programmers need to be aware of the actual size of the memory representation of different data types when programming in Assembly.

  • There are no aggregate types in Assembly, such as arrays or structs. There are only contiguously allocated bytes in memory.
  • "Integer" data can be 1, 2, 4 or 8 bytes.
  • Code is stored as a series of bytes encoding sequences of instructions.
  • Floats can be 4, 8 or 10 bytes.
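As a small illustration of "only contiguously allocated bytes", here is a sketch in C that reads a struct field using nothing but a base address and a byte offset, which is roughly how assembly sees it. The struct pair type and the function name are hypothetical, made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical aggregate: two 4-byte integers laid out back to back. */
struct pair {
    int32_t first;
    int32_t second;
};

/* Read the second field using only a base address and a byte offset. */
int32_t second_via_bytes(const struct pair *p) {
    const unsigned char *base = (const unsigned char *)p;
    int32_t value;
    /* Copy the 4 bytes that start offsetof(struct pair, second) bytes in. */
    memcpy(&value, base + offsetof(struct pair, second), sizeof value);
    return value;
}
```

The field names disappear entirely at this level. All that is left is "4 bytes, starting 4 bytes after the base address".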

There are also different suffixes for operations in assembly if you are referring to different sized types (I will explain this later!).


As said before, a register is where you can store information used in a program. It's essentially like being able to create temporary variables to use in computation in assembly. You could store this data in memory and access it every time you need it, but this is very expensive in terms of time and will slow down the processor, and therefore your program!

This is why the processor has a small set of fixed-size storage locations called registers, usually 64 bits wide nowadays, which can hold this data without having to access memory. There are only a limited number of these registers. All of the available integer registers for machines with a 64-bit architecture (i.e. the memory address size is 64 bits) are shown in the diagram below:

Diagram of registers

You can access a particular register from assembly using the 'name' shown on the diagram, e.g. %rax. If you only need the lower 32 bits of the %rax register, you can use the name %eax instead. These registers are backwards compatible with the older 32-bit architecture, as the 32-bit and smaller sections of the registers have kept the same names.
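One way to picture the smaller register names is as narrower views of the same 64 bits. The sketch below models that in C with plain integer casts. The function names are hypothetical, and %ax and %al (the 16-bit and 8-bit names for %rax) are included for completeness.

```c
#include <stdint.h>

/* What %eax shows: the low 32 bits of the value held in %rax. */
uint32_t low32(uint64_t rax) { return (uint32_t)rax; }

/* What %ax shows: the low 16 bits. */
uint16_t low16(uint64_t rax) { return (uint16_t)rax; }

/* What %al shows: the low 8 bits. */
uint8_t low8(uint64_t rax) { return (uint8_t)rax; }
```

For example, if %rax holds 0x1122334455667788, then %eax reads as 0x55667788.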

A few of the registers we have access to when writing assembly actually have specific usages.

  • The %rbp (formerly %ebp) register is conventionally used as the base pointer, which means it keeps track of the base of the current function's stack frame.
  • %rsp is used for the stack pointer, which means it holds a pointer to the top of the stack.
  • By convention, %rax is used to store a return value for a function.
  • %rdi, %rsi, %rdx, %rcx, %r8, and %r9 are used, in that order, for the first 6 parameters of a function call. (If you need more than 6 parameters, the rest are passed on the stack.)
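To tie the register conventions to something familiar, here is a C function with seven integer parameters, annotated with where each one would travel. This assumes the System V AMD64 calling convention used on Linux and macOS (Windows uses different registers), and sum7 is just a made-up name.

```c
/*
 * Under the System V AMD64 convention (an assumption, see above):
 *   a -> %rdi, b -> %rsi, c -> %rdx, d -> %rcx, e -> %r8, f -> %r9
 *   g -> passed on the stack, because the six registers are used up
 * The result comes back to the caller in %rax.
 */
long sum7(long a, long b, long c, long d, long e, long f, long g) {
    return a + b + c + d + e + f + g;
}
```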


Assembly is written using 'operations'. These are just instructions that the programmer can write, though each one may actually encompass multiple smaller actions by the processor. For example, operations include moving data (from memory to a register, or vice versa; you cannot move from memory to memory in a single instruction), performing arithmetic on data, or transferring control via jumps and conditional branches.

Moving Data

The syntax for moving data is:

movq source, dest

(Warning: if you are reading the Intel documentation, the operands are in the opposite order and there is no size suffix, so they will actually write this as mov dest, source.)

When you write this operation, source and dest in this example are placeholders for operands. Operands in the movq operation can come in 3 forms:

  • Immediate - Constant integer data, like a hardcoded value, written with a $ prefix, for example $0x402 or $-535. An immediate can only be a source, never a destination.
  • Register - You can pass one of the integer registers, for example movq %rax, %r13.
  • Memory - This refers to the x consecutive bytes of memory (where x is given by the suffix on mov, in this case q meaning 8) starting at the address held in a register, e.g. movq (%rax), %r12.

Source      Destination   Assembly Operation    Example of same thing in C
Immediate   Register      movq $0x8, %r12       temp = 0x8;
Immediate   Memory        movq $-156, (%r12)    *p = -156;
Register    Register      movq %rax, %rdx       temp2 = temp1;
Register    Memory        movq %rax, (%rdx)     *p = temp;
Memory      Register      movq (%rax), %rdx     temp = *p;

(Credit to these slides for this table)

The suffixes you can use with operations like mov are shown below:

Type Specifier            Bytes addressed   Suffix
Byte                      1                 b
Word                      2                 w
Double Word (Long Word)   4                 l
Quad Word                 8                 q

Usually, the size of the item stored at a given memory address can be inferred from the assembly instruction where it is referenced, e.g. movq would imply the data being moved is 8 bytes long.
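Here is a rough model in C of what the suffix means when the destination is memory: the suffix decides how many bytes get written. The store function and its parameters are hypothetical, and memory is modelled as a plain byte array. The bytes are written least significant first, because x86-64 is little-endian.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of movb/movw/movl/movq storing to memory.
 * Writes the low `width` bytes (1, 2, 4 or 8) of `value` starting at
 * mem[addr], least significant byte first.
 */
void store(uint8_t *mem, size_t addr, uint64_t value, size_t width) {
    for (size_t i = 0; i < width; i++) {
        mem[addr + i] = (uint8_t)(value >> (8 * i));
    }
}
```

Calling store with a width of 4 behaves like movl: only 4 bytes change, and the bytes after them are left alone.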

Memory Addressing

Memory addressing in assembly is slightly more complicated than in more abstracted programming languages. For example, if you were to write myarray[5] in a language like Python or Ruby, you are technically handling memory addressing, but you don't see any of the intricacies.

In Assembly, there are 2 main types of memory addressing: normal addressing and displacement addressing.

Normal Memory Addressing

This is the 'simple' case for memory addressing in assembly. We have actually encountered this already. When you use an operand like (%rax) you are actually doing memory addressing. For example, writing movq (%rax), %rcx will take a memory address stored in %rax and move the item stored there into %rcx.

This means that the memory address being accessed is as follows:

Memory[Registers[R]]

where R is the register you have the memory address stored in.

Displacement Memory Addressing

Displacement addressing is how you can handle things similar to arrays in Assembly. Let's say you know that register %rdx holds the address of two 32-bit values stored next to each other in memory. You can treat the address in %rdx as the start of the memory you want to access, and then skip ahead 4 bytes to reach the second value.

In this case, the memory address being accessed is:

Memory[Registers[R] + D]

where R is the register you are accessing, and D is a constant displacement from the location given by R.

Writing movq 8(%rdx), %rax is an example of displacement addressing, where you will access the memory at the address stored in %rdx and then go 8 bytes forward and move the item stored there to %rax.

There is also 'scaled indexing', which allows you to access memory in more complex ways. This is when you use an operand like (%rdx, %rcx, 4), which takes the address stored in %rdx and adds 4 times the number stored in %rcx. Using this method of memory addressing, you can index into contiguous memory, which is essentially the same as indexing into an array or similar. In this case, the pattern is:

Imm(Rb, Ri, S)

where Imm is a constant displacement (either 1, 2 or 4 bytes), Rb is the 'base register' (the place to start), Ri is the 'index register' (the number of elements to skip forward), and S is the scale (either 1, 2, 4 or 8, representing the 'size' of each element in your array). The memory address being accessed is:

Memory[Registers[Rb] + S * Registers[Ri] + Imm]

You are also able to omit some of these elements in the pattern if you don't need them, as shown in the examples below.

A few examples are run through below to help clarify this. In these examples, the registers used contain the following values:

Register   Contents
%rdx       0xf000
%rcx       0x100

Operand           Address computation   Result
(%rdx, %rcx, 4)   0xf000 + 4 * 0x100    0xf400
(%rdx, %rcx)      0xf000 + 0x100        0xf100
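The address arithmetic in these examples can be written as one small C function. This is only a sketch of the calculation, with a made-up function name. An omitted displacement or index is passed as 0, and an omitted scale as 1.

```c
#include <stdint.h>

/*
 * Address computed by an operand of the form Imm(Rb, Ri, S):
 *   base + scale * index + disp
 * Pass 0 for a missing displacement or index, and 1 for a missing scale.
 */
uint64_t effective_address(int64_t disp, uint64_t base,
                           uint64_t index, uint64_t scale) {
    return base + scale * index + (uint64_t)disp;
}
```

With %rdx = 0xf000 and %rcx = 0x100 as above, effective_address(0, 0xf000, 0x100, 4) gives 0xf400, and the displacement form 8(%rdx) corresponds to effective_address(8, 0xf000, 0, 1), which gives 0xf008.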

Arithmetic Operations

There are many built-in arithmetic operations in Assembly. They vary in complexity and number of operands. Examples are shown below, with credit to this cheatsheet for the tables:

Unary Operations

These operations only take a single argument. They will load the value, complete the operation, and then put the result back wherever it was stored:

Instruction   Description
inc D         Increment by 1
dec D         Decrement by 1
neg D         Arithmetic negation
not D         Bitwise complement
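For comparison, here are rough C equivalents of the four unary operations on a 64-bit value. The function names are hypothetical. Unsigned types are used so that wrap-around is well defined in C, which matches how the processor treats the bits.

```c
#include <stdint.h>

uint64_t op_inc(uint64_t d) { return d + 1; }  /* inc D: wraps from all ones to 0 */
uint64_t op_dec(uint64_t d) { return d - 1; }  /* dec D */
uint64_t op_neg(uint64_t d) { return 0 - d; }  /* neg D: two's complement negation */
uint64_t op_not(uint64_t d) { return ~d; }     /* not D: flip every bit */
```

One relationship worth noticing is that neg is the same as not followed by inc, which is exactly how two's complement negation works.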

Binary Operations

These operations take two arguments:

Instruction   Description
leaq S, D     Load effective address of source into destination
add S, D      Add source to destination
sub S, D      Subtract source from destination
imul S, D     Multiply destination by source
xor S, D      Bitwise XOR destination by source
or S, D       Bitwise OR destination by source
and S, D      Bitwise AND destination by source
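The thing that most often trips people up with these is the operand order: the destination is both an input and the place the result ends up. Here is a sketch in C, with hypothetical function names, where the destination is passed as a pointer to make that explicit.

```c
#include <stdint.h>

/* Each function updates *d in place, mirroring "op S, D". */
void op_add(uint64_t s, uint64_t *d)  { *d += s; }  /* D = D + S */
void op_sub(uint64_t s, uint64_t *d)  { *d -= s; }  /* D = D - S, not S - D */
void op_imul(uint64_t s, uint64_t *d) { *d *= s; }  /* D = D * S */
void op_xor(uint64_t s, uint64_t *d)  { *d ^= s; }  /* D = D ^ S */
void op_or(uint64_t s, uint64_t *d)   { *d |= s; }  /* D = D | S */
void op_and(uint64_t s, uint64_t *d)  { *d &= s; }  /* D = D & S */
```

So subq $3, %rax with 10 in %rax leaves 7 in %rax, and the old value of %rax is gone.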

The lea instruction referenced above is called load effective address. This operation is particularly useful, as it computes the address described by the first operand and puts it into the register given as the second operand, without reading the memory at that address. Because it is just address arithmetic, compilers also use it as a cheap way to compute expressions like x + 4*y in a single instruction.
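The difference between mov and lea with the same memory operand is roughly the difference between a[i] and &a[i] in C. The sketch below uses hypothetical function names and assumes long is 8 bytes, as it is on 64-bit Linux and macOS.

```c
/* Roughly movq (%rdi,%rsi,8), %rax: compute the address, then read memory. */
long load_element(const long *a, long i) {
    return a[i];
}

/* Roughly leaq (%rdi,%rsi,8), %rax: compute the address and stop there. */
const long *element_address(const long *a, long i) {
    return &a[i];
}
```

The second function never touches the array's contents, which is why lea is safe to use for arithmetic on values that are not addresses at all.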

Shift Operations

These operations shift a value by the number of bits given by the first argument:

Instruction      Description
sal / shl k, D   Left shift destination by k bits
sar k, D         Arithmetic right shift destination by k bits
shr k, D         Logical right shift destination by k bits
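The two right shifts only differ in what they put into the vacated high bits: shr fills them with zeros, while sar copies the sign bit. Here is a sketch in C with hypothetical function names, assuming k is between 0 and 63. The arithmetic shift is spelled out by hand, because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* sal / shl k, D: shift left, filling the low bits with zeros. */
uint64_t op_shl(unsigned k, uint64_t d) { return d << k; }

/* shr k, D: logical right shift, filling the high bits with zeros. */
uint64_t op_shr(unsigned k, uint64_t d) { return d >> k; }

/* sar k, D: arithmetic right shift, filling the high bits with the sign bit. */
int64_t op_sar(unsigned k, int64_t d) {
    uint64_t bits = (uint64_t)d >> k;   /* start from a logical shift */
    if (d < 0) {
        bits |= ~(UINT64_MAX >> k);     /* set the top k bits to one */
    }
    /* Converting back relies on two's complement, as on x86-64. */
    return (int64_t)bits;
}
```

For a negative number like -16, sar by 2 gives -4 (a division by 4), whereas shr by 2 gives a huge positive number.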


Let's take the following C code:

long arith(long x, long y, long z) {
  long t1 = x+y;
  long t2 = z+t1;
  long t3 = x+4;
  long t4 = y * 48;
  long t5 = t3 + t4;
  long rval = t2 * t5;
  return rval;
}

In x86-64 Assembly, this will translate to:

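  leaq (%rdi,%rsi), %rax       # t1 = x + y
  addq %rdx, %rax              # t2 = z + t1
  leaq (%rsi,%rsi,2), %rdx     # %rdx = y + 2*y = 3*y
  salq $4, %rdx                # t4 = 3*y * 16 = 48*y
  leaq 4(%rdi,%rdx), %rcx      # t5 = x + 4 + t4
  imulq %rcx, %rax             # rval = t2 * t5
  ret                          # the return value is in %rax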

Below is a walkthrough of how this code was translated, explaining how the steps in Assembly produce the same outcome:

Walkthrough of translation from C to Assembly
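If it helps, the same walkthrough can be sketched in C by writing one statement per assembly instruction and naming the variables after the registers. The function arith_by_registers is my own illustration, not compiler output, and the shift by 4 is written as a multiplication by 16 to keep the C well defined for negative values.

```c
/* The original function, for comparison. */
long arith(long x, long y, long z) {
    long t1 = x + y;
    long t2 = z + t1;
    long t3 = x + 4;
    long t4 = y * 48;
    long t5 = t3 + t4;
    long rval = t2 * t5;
    return rval;
}

/* One C statement per instruction: x is in %rdi, y in %rsi, z in %rdx. */
long arith_by_registers(long rdi, long rsi, long rdx) {
    long rax, rcx;
    rax = rdi + rsi;        /* leaq (%rdi,%rsi), %rax     t1 = x + y         */
    rax = rax + rdx;        /* addq %rdx, %rax            t2 = z + t1        */
    rdx = rsi + rsi * 2;    /* leaq (%rsi,%rsi,2), %rdx   3 * y              */
    rdx = rdx * 16;         /* salq $4, %rdx              t4 = 48 * y        */
    rcx = rdi + rdx + 4;    /* leaq 4(%rdi,%rdx), %rcx    t5 = x + 4 + t4    */
    rax = rax * rcx;        /* imulq %rcx, %rax           rval = t2 * t5     */
    return rax;             /* the result is returned in %rax                */
}
```

Notice that t3 never gets its own register: the + 4 is folded into the displacement of the last leaq, and %rdx is reused for t4 once z is no longer needed.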