How do C++ pointers work on a machine level?
I’m excited to answer this question because, as it so happens, I just looked into this a few days ago when someone was kind enough to correct me. I’ll walk you through my process so you can reproduce it yourself. But first, let’s establish some baseline expectations for the kids following along at home.
Let’s lead with an analogy:
Imagine a deck of playing cards. You search for the King of Hearts and find it’s the 32nd card from the top. You write down the number “32” on a sheet of paper. The literal value of the pointer would be “32” because that’s where the King of Hearts is located, but the actual value you care about is “King of Hearts”.
Whenever you need to find the King of Hearts, you look at your paper, see the number 32, and grab the 32nd card in the deck.
Remember the metaphor. I’ll be using it a lot, so buckle in, because we’re going deep for this article.
Defining our Terms
We need to define our terms here because “pointer” can mean a lot of different things to a lot of different people. To a great degree, this seems to depend on which version of C++ you work in.
- Smart Pointer: A neat memory-management data-structure that wraps a C-style pointer to make it easier to avoid memory leaks in large programs. If you have the option, you should be using these.
- C-Style Pointer: A high-level abstraction that technically references a memory address, but practically lets us indirectly reference values that would otherwise be out of our scope.
- C++ Reference: A C++ exclusive high-level abstraction that behaves kind of like a C-Style Pointer, but also quite a bit like a run-of-the-mill value.
For our purposes, I’m not going to talk about Smart Pointers. They’re neat, but in terms of “how high is high-level?” these things are in space, so they’re quite a bit out of the “machine code” level of discussion.
We’re also going to need to know a couple of other terms, here:
- The Stack: For our purposes, we can envision the stack as “local memory”. Technically speaking, this is space in memory that your program reserves ahead of time because it’s already clear that you’re going to need this space. It’s fast to read from, fast to write to, and pretty much cleans up after itself.
  For more info, see Call stack - Wikipedia
- The Heap: For our purposes, we can envision the heap as “remote memory”. Technically speaking, this is space in memory that your program doesn’t reserve ahead of time. Any time you use malloc or the new keyword, you are requesting space from the heap, which ultimately comes from the Operating System. I’ll talk more about this later.
  For more info, see Memory management - Wikipedia
- Virtual Memory: As a quirk of modern computing, the Heap and the Stack don’t need to be in contiguous memory. As far as your application knows, it is in contiguous memory, because that makes it easier to work with. However, behind the scenes (and obscured even from your machine-level code), the memory can be scattered all around your RAM.
  For more info, see Virtual memory - Wikipedia
Great. So now that we’ve defined all of our important terms, let’s get to the meat of this.
What operations do we care about?
I’m going to bound the discussion a bit by defining what it is we actually care to explore. Pointers and references, yes, but there are a lot of things we can do with those. We need to scope the work to make the discussion useful.
- Pointers
  - What happens when I create a pointer?
  - What happens when I assign a value to a pointer?
  - What happens when I dereference a pointer?
  - What happens when I destroy a pointer?
- References
  - What happens when I create a reference?
  - What happens when I assign a value to a reference?
We don’t need to worry about dereferencing references or destroying references. C++ doesn’t allow us to do either of those. I’m also not going to look at any of the special pointers, like nullptr. It’s interesting, but not an “operation”; it’s just a special value.
Define our Environment
The next step here is to define our environment. Why? Because “machine code” is largely dependent on the CPU that your machine code is supposed to run on. Chances are, you’re talking about an x86 CPU or x86_64 CPU. Just as important as the target architecture, however, is the compiler that is converting our high-level code into low-level machine code.
I am writing this from Debian Linux, so you may need to adapt these instructions to your machine. I’m going to be using the g++ compiler, but if you’re using Windows you could probably use MinGW to get similar results. Also, I’m not going to be using any optimization flags, because those can affect the resulting machine code in ways that make it harder to understand what I’m trying to tell the computer to do.
One major thing to consider is that C++ does a lot to obfuscate your code during compilation. Most visibly, it mangles function names into tags that are useful for C++ but not useful to us. To get around this, we’re going to tell the compiler that it needs to treat the output as if it needed to interface with C programs, using extern "C".
Finally, we aren’t going to be trying to directly read the machine code. We’re going to use assembly because it’s easier to read. That’s what it’s for, after all.
Understanding Assembly
There are a few things we’re going to need to understand to really “get” what’s going on here. Assembly is a kind of “make your eyes roll back in your head if you aren’t intimately aware of what’s going on at the machine level” kind of language, and it’s got more acronyms and abbreviations than the US Military. Here’s the primer you need to understand what’s going on:
- Your CPU generally takes commands in the form of “here’s value 1, here’s value 2, now do something with them.”
- The slots where we stick the commands and values are called registers. Your computer probably has between 8 and 32 of them, but you may have more or fewer. The actual number doesn’t matter for this answer.
  - General-purpose registers can be used to hold values.
  - Special-purpose registers are usually used to hold runtime information.
- In my assembly output, registers are referred to by a percent sign and a three-letter name.
  - %rbp is the special-purpose register that says “this is the base of the stack.”
  - %rsp is the special-purpose register that says “this is the top of the stack.”
  - %rax is the general-purpose register where the results of calculations are usually deposited. We call this “the Accumulator.”
- In Assembly, the stack grows in the opposite direction it intuitively makes sense to grow: toward lower addresses. That’ll become relevant later.
  - Referencing a spot in memory can be done by using the offset from a stack pointer. So -12(%rsp) means “12 bytes from the top of the stack”. Yes, bytes. Yikes.
- Registers can have different sizes. The accumulator %rax is 64 bits (8 bytes), %eax is 32 bits (4 bytes), %ax is 16 bits (2 bytes), and %ah & %al are 8 bits (1 byte) each.
That about covers it. Now let’s get to it!
POINTER: Create and Assign a C-Style Pointer!
So now we’re finally at the point where we are writing code. The first step is to define our program. We want something simple that’s easy to quickly understand and generates really short output. We also want to pick a value that the computer is unlikely to care about, but is obvious to us. I vote for the number 123:
extern "C" {
    int main() {
        int value = 123;        // an ordinary int in local memory
        int* pointer = &value;  // a pointer holding the address of value
        return 0;
    }
}
Impressive, I know. I named this file “pointer.cpp” and ran this at the command line:
$ g++ pointer.cpp -S -o ./pointer.s
If you’re not familiar, g++ is the compiler, pointer.cpp is the file, -o specifies where I want the output to go, and -S says “build this in assembly.” So let’s take a look at the contents of pointer.s. The important part is here:
movl $123, -12(%rbp)
leaq -12(%rbp), %rax
movq %rax, -8(%rbp)
Okay, so what’s going on here?
- In the first instruction, movl $123, -12(%rbp), we are placing the integer value 123 into memory 12 bytes from the base of the stack (%rbp). This corresponds to our C++ variable named int value.
- In the second instruction, leaq -12(%rbp), %rax, we are placing the numerical value for the memory address that int value represents into the general-purpose register %rax.
  Remember the “the King of Hearts is the 32nd card” analogy? We just found the card, but we haven’t written the “32” down yet.
- In the third instruction, movq %rax, -8(%rbp), we are recording the value held in register %rax into memory 8 bytes from the base of the stack (%rbp).
  In our King of Hearts analogy, we just wrote down the value “32”.
So now we know how a pointer is created and referenced.
POINTER: Dereference and destroy a pointer!
We need to write a slightly different program here to see what’s going on when we dereference or destroy a pointer. To do that, we’re going to write the same program but we’re going to swap the roles of int value and int* pointer.
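extern "C" {
    int main() {
        int* pointer = new int(123);  // allocate an int on the heap and store 123 in it
        int value = *pointer;         // dereference: copy the heap value into a local
        delete pointer;               // release the heap memory
        return 0;
    }
}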
Look at that beautiful pointer. Simply marvelous. Run it through the compiler again. This is the important bit:
subq $16, %rsp
movl $4, %edi
call _Znwm@PLT
movl $123, (%rax)
movq %rax, -8(%rbp)
movq -8(%rbp), %rax
movl (%rax), %eax
movl %eax, -12(%rbp)
movq -8(%rbp), %rax
movl $4, %esi
movq %rax, %rdi
call _ZdlPvm@PLT
Yikes. That’s quite a bit more code. Fear not, it’s not as crazy as it looks.
Earlier, I mentioned that the CPU basically works by slapping values onto registers and then telling it to run a command. That’s about to become relevant.
- The first instruction, subq $16, %rsp, tells the CPU to subtract 16 from the stack pointer %rsp and store the result back in %rsp. Because the stack grows toward lower addresses, this reserves 16 bytes of stack space for our local variables. Why 16? The pointer at -8(%rbp) is 8 bytes wide and the integer at -12(%rbp) is 4 bytes wide, so we need 12 bytes, and the compiler rounds that up to 16 to keep the stack aligned.
- The second instruction, movl $4, %edi, places the number 4 into the register %edi. This is preparing the argument for the allocation call: we need to request 4 bytes of memory.
- The third instruction, call _Znwm@PLT, is executing a function with the tag _Znwm@PLT. Some sleuthing around will reveal that this corresponds to the “new” keyword. The result is placed on, you guessed it, the accumulator %rax.
Steps 1–3, as a unit, say “I want to request enough memory from the heap to store one integer value.” What’s that look like in C++?
int* pointer = new int
That’s just us requesting the memory. Now it’s time to use it.
- The fourth instruction, movl $123, (%rax), probably looks pretty familiar. We are putting the integer value 123 into the memory address held in the accumulator. That’s the value in the heap: the King of Hearts.
- The fifth instruction, movq %rax, -8(%rbp), copies the address from the accumulator into the stack slot 8 bytes from the base of the stack. That’s the value on the stack: our “King of Hearts is at 32”.
int* pointer = new int(123);
Great! Now let’s dereference that pointer.
- The sixth instruction, movq -8(%rbp), %rax, loads the numerical address held within our pointer (the slot 8 bytes from the base pointer) into the accumulator.
- The seventh instruction, movl (%rax), %eax, takes the memory address in the accumulator (“32nd card”), retrieves the value stored there (“King of Hearts”), and deposits the result into the lower 32 bits of the accumulator.
- The eighth instruction, movl %eax, -12(%rbp), takes the value that is currently held in the 32-bit accumulator (“King of Hearts”) and stores it at the memory address that is 12 bytes from the base of the stack.
int value = *pointer;
Whew! Okay. Now we’re ready to delete the value our pointer refers to.
- The ninth instruction, movq -8(%rbp), %rax, is placing the numerical value representing the memory address (“32nd card”) into the accumulator.
- The tenth instruction, movl $4, %esi, is queuing up the integer 4, which corresponds to the width of the memory we are getting ready to delete.
- The eleventh instruction, movq %rax, %rdi, is moving the value from the accumulator (“32nd card”) into another register to correspond to the numerical address we want to delete.
- The twelfth instruction, call _ZdlPvm@PLT, is us calling the function behind the delete keyword to deallocate memory based on our previous specifications.
As a whole, these four steps say “I want to free 4 bytes of memory from the heap, corresponding to the address in pointer.”
So now we understand how pointers are created, assigned, dereferenced, and deleted.
REFERENCES: It’s Complicated
References are weird because the C++ standard doesn’t say how they have to be implemented. That’s up to your compiler to figure out. G++ interprets references as pointers with special compile-time rules. Let’s take a look at the create and assign steps. Our source code:
extern "C" {
    int main() {
        int value = 123;         // an ordinary int in local memory
        int& reference = value;  // a reference bound to value
        return 0;
    }
}
I never cease to amaze. So we run it through the compiler again and…
movl $123, -12(%rbp)
leaq -12(%rbp), %rax
movq %rax, -8(%rbp)
Now go take a look at the first pointer code we wrote. It’s exactly the same. That’s because there’s no difference between pointers and references at run-time according to the g++ compiler. It’s just a specialized pointer with additional compile-time checking.
Summary
So in closing, the machine-code level is complicated, but mostly because we have to do everything one step at a time. As far as the computer is concerned, pointers and references are just regular integer values. What makes them special is that the value they hold is a reference to a position in memory (“32nd card”), and that position in memory holds the actual value you care about (“King of Hearts”).