Stream-of-consciousness note on disassembly

Compilers have options that allow you to retain and inspect a disassembly of the generated machine code.

On macOS, passing the gcc compiler the options -fverbose-asm -save-temps tells it to keep the intermediate assembly file (-save-temps) and to annotate that assembly with comments that refer back to the original source code (-fverbose-asm).

As an example, the file

int add(int a, int b)
{
  return a + b;
}

compiled with c++ -c -fverbose-asm -save-temps add.cpp -oadd produces a file add.s that contains

__Z3addii:
LFB0:
	pushq	%rbp	#
LCFI0:
	movq	%rsp, %rbp	#,
LCFI1:
	movl	%edi, -4(%rbp)	# a, a
	movl	%esi, -8(%rbp)	# b, b
# add.cpp:5:   return a + b;
	movl	-4(%rbp), %edx	# a, tmp84
	movl	-8(%rbp), %eax	# b, tmp85
	addl	%edx, %eax	# tmp84, _3
# add.cpp:6: }
	popq	%rbp	#
LCFI2:
	ret

This is the generated assembly code intermixed with the original source as comments.

We notice some fun things, like the fact that the return value is left in the eax register, which is a 32-bit register. This is worth keeping in mind when you run into the maddening fact that a C++ function that is supposed to return a value will compile fine with no return statement and, in some cases, will appear to work fine too.

Take a look at this code, for example:

#include <iostream>

int add(int a, int b)
{
  int y = a + b;
}

int main(int argc, char *argv[])
{
  std::cout << add(2, 3);
}

This code will compile fine and, most likely, work fine. At most, a good compiler will give you a warning, like:

main.cpp: In function 'int add(int, int)':
main.cpp:8:1: warning: no return statement in function returning non-void [-Wreturn-type]
    8 | }
      | ^

What’s up with that?? Well, from our earlier observation about the eax register you can guess what’s happening. A glance at the disassembly confirms this:

__Z3addii:
LFB1567:
	pushq	%rbp	#
LCFI0:
	movq	%rsp, %rbp	#,
LCFI1:
	movl	%edi, -20(%rbp)	# a, a
	movl	%esi, -24(%rbp)	# b, b
# main.cpp:7:   int y = a + b;
	movl	-20(%rbp), %edx	# a, tmp87
	movl	-24(%rbp), %eax	# b, tmp88
	addl	%edx, %eax	# tmp87, tmp86
	movl	%eax, -4(%rbp)	# tmp86, y
# main.cpp:8: }
	nop	
	popq	%rbp	#
LCFI2:
	ret	

The result of the add is stored in eax and remains there when we return (the movl instruction copies the value into y and does not reset eax). When we return to main, we are expecting to find the result in eax, and indeed it is there, so things work out quite accidentally.
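For contrast, here is a minimal sketch of the same function with the return statement put back. Flowing off the end of a non-void function is undefined behavior in C++, so the accidental eax trick is not something to rely on; a different compiler or optimization level is free to do something else. With gcc or clang you can also pass -Werror=return-type to turn the warning shown earlier into a hard error. The check_add helper is my own addition for illustration and is not part of the original example.

```cpp
#include <cassert>

// Same function as in the post, with the missing return statement restored,
// so the result no longer depends on what happens to be left in eax.
int add(int a, int b)
{
  int y = a + b;
  return y;
}

// Hypothetical self-check; call it from main() if you want to try it out.
void check_add()
{
  assert(add(2, 3) == 5);
  assert(add(-1, 1) == 0);
}
```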

Why doesn’t a C++ compiler like gcc flag this as an error? Apparently it’s quite a hard problem to figure out how things are being returned from a function, especially since one can embed assembly in the function itself. For example, I can do

#include <iostream>

int add(int a, int b)
{
  __asm__(
      "movl	%0, %%edx;"
      "movl	%1, %%eax;"
      "addl	%%edx, %%eax"
      :
      : "g"(a), "g"(b));
}

int main(int argc, char *argv[])
{
  std::cout << add(2, 3);
}

This is, understandably, hard to analyze for return values.
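If you do want to mix assembly into a function, a safer pattern is to hand the result back through an output operand and return it normally, so the compiler knows where the value lives. The following is only a sketch under a few assumptions: it uses the GCC/Clang extended asm syntax, it only takes the assembly path on x86 targets, and the names add_asm and check_add_asm are made up for this example.

```cpp
#include <cassert>

// Hypothetical variant of the inline-assembly add(): the sum is written to an
// output operand and returned explicitly instead of being left in eax.
int add_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  int result;
  __asm__("addl %2, %0"     // %0 = %0 + %2 (AT&T order: source, destination)
          : "=r"(result)    // %0: output, any general-purpose register
          : "0"(a), "g"(b)  // %1: a, placed in the same register as %0; %2: b
          : "cc");          // addl modifies the condition flags
  return result;
#else
  // Assumption: on non-x86 targets we simply fall back to plain C++.
  return a + b;
#endif
}

// Hypothetical self-check; call it from main() if you want to try it out.
void check_add_asm()
{
  assert(add_asm(2, 3) == 5);
  assert(add_asm(-4, 1) == -3);
}
```

Because the function now has a real return statement, the -Wreturn-type warning goes away and the compiler is free to pick whatever registers it likes.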

As a side note, what happens if we change the types of the data involved?

long long int add(long long int a, long long int b)
{
  return a + b;
}

__Z3addxx:
LFB0:
	pushq	%rbp	#
LCFI0:
	movq	%rsp, %rbp	#,
LCFI1:
	movq	%rdi, -8(%rbp)	# a, a
	movq	%rsi, -16(%rbp)	# b, b
# add.cpp:5:   return a + b;
	movq	-8(%rbp), %rdx	# a, tmp84
	movq	-16(%rbp), %rax	# b, tmp85
	addq	%rdx, %rax	# tmp84, _3
# add.cpp:6: }
	popq	%rbp	#
LCFI2:
	ret	

How cool is that! Look, now that we really need 64 bits, the result is being returned in rax, which is a 64-bit register! Sorry, I really should stop, this stuff is addictive.
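The link between the C++ types and the register widths can be checked from the source side too. This is a small sketch, assuming a typical 64-bit macOS or Linux target where int is 32 bits and long long is 64 bits; the check_widths helper is hypothetical and only there to exercise the two overloads.

```cpp
#include <cassert>
#include <climits>

// Overloads mirroring the two versions in the post. The mangled names
// __Z3addii and __Z3addxx encode the parameter types: i = int, x = long long.
int add(int a, int b) { return a + b; }
long long add(long long a, long long b) { return a + b; }

// Assumption: int is 32 bits wide (fits in eax) and long long is 64 bits
// wide (needs all of rax). Compilation fails if the target disagrees.
static_assert(sizeof(int) * CHAR_BIT == 32, "int is expected to be 32 bits");
static_assert(sizeof(long long) * CHAR_BIT == 64, "long long is expected to be 64 bits");

// Hypothetical self-check; call it from main() if you want to try it out.
void check_widths()
{
  assert(add(2, 3) == 5);
  // 6000000000 does not fit in 32 bits, so this sum needs the wider type.
  assert(add(3000000000LL, 3000000000LL) == 6000000000LL);
}
```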

Good tooling would allow you to view the source code and disassembly side-by-side and allow you to easily navigate from the source to the disassembly. Sadly, VS Code, my favorite editor/IDE by far, steadfastly refuses to add a disassembly view to its C++ extension.

There is a separate extension that theoretically supplies this tooling, but it is very buggy. When I went to use it, it would cache the disassembly file and the only way I could get it to reload the disassembly was to reload the entire editor. That’s no good.

This may say something about the economics of open source: these days, people who both write C++ and need to see the disassembly are usually doing so for work, and have access to many powerful proprietary tools for performance analysis (which is usually the main reason people need to peek at the disassembly). Therefore there is no strong market for an open source IDE that allows you to see disassembly.

I am always surprised that Apple gives away the powerful Instruments tool suite for free. I understand that Eclipse has a nice disassembly view during debugging.

As a complete non sequitur, doing searches for anything related to “disassembly” or “assembly” gets you very different hits from what you expect. Try doing a search for “emacs disassembly” for example.

If you just want to do small experiments, the online godbolt compiler is awesome: you can see the source code and disassembly side-by-side. It is not, however, a tool to use for a large project sitting on your hard disk.

I had a lot of fun during my undergraduate days as an electrical engineering student. I loved pretty much everything there, but my favorites were probably engineering drawing and microprocessor programming. I still have memories of writing an assembly program to do long division. It certainly helped that our 8085 kits looked like the DSKY.

Yes, folks. When I say stream of consciousness, I mean, stream of consciousness, and no, I did not write this blog post in a dream. Which is the dreamer, and which is the dream?
