## Thursday, November 10, 2016

### "missing" optimizations — constant address comparison

Sometimes the compiler intentionally fails to optimize certain constructs for arguably good reasons. For example, compiling the following with GCC

```c
extern int a;
extern int b;

int foo(void)
{
    return &a != &b;
}
```
generates code that performs the comparison at run time
```
foo:
        movl    $a, %eax
        cmpq    $b, %rax
        setne   %al
        movzbl  %al, %eax
        ret
```
even though the C standard ensures the addresses of a and b are different.

It is a bit unclear why GCC keeps this comparison, but the discussion in GCC bug 78035 mentions the C defect report DR #078 and the expressiveness of the ELF format. DR #078 notes that
```c
unsigned int h(void)
{
    return memcpy != memmove;
}
```
may return 0, which happens on implementations where the C standard library uses the same code for memcpy and memmove (the C language cannot do that, but the standard library does not need to be written in C). This does not mean that the compiler must be able to handle different symbols mapping to the same address — it only says that C programs must not assume too much about the standard library. But ELF supports exporting multiple symbols for the same address, and GCC tries to honor ELF possibilities (such as the symbol interposition that limits optimizations for shared libraries).

I'm not convinced it makes sense for GCC to keep these comparisons in the generated code — other optimizations, such as alias analysis, treat global symbols as having different addresses, so other passes will likely make the code fail anyway if two symbols share the same address. For example,
```c
extern int a;
extern int b;

int foo(void)
{
    a = 1;
    b = 5;
    a++;
    return &a != &b;
}
```
optimizes the accesses to a and b as if they have different addresses, even though the comparison is emitted:
```
foo:
        movl    $a, %eax
        movl    $5, b(%rip)
        movl    $2, a(%rip)
        cmpq    $b, %rax
        setne   %al
        movzbl  %al, %eax
        ret
```
This missing optimization probably does not make any difference in practice (although I could imagine some macro or template that relies on the comparison being optimized), but this inconsistency in what is optimized annoys me...

1. "Do distinct functions have distinct addresses" is an interesting case because MSVC aggressively folds functions together in some cases, while GCC and Clang, as far as I know, do not; see this Stack Overflow question: http://stackoverflow.com/q/26533740/1708801

I have to say DR #078 is fascinating, I had not seen that one before.

2. GCC sometimes uses the same code for identical functions, but it ensures they have different addresses. E.g. if `foo` and `bar` are identical, then `foo` is compiled to

```
_Z3foov:
        jmp     _Z3barv
```

3. What difficulty would there be with letting the compiler make such assumptions when it is invoked with a flag authorizing assumptions about the addresses of all extern variables (or only those not declared volatile, or no extern variables), or when it is invoked with an option that says "I'm aware of all optimization defaults for compiler versions up to [the version which added the flag]" but no flag that explicitly disables it?

Such an approach would allow optimizing assumptions when they are correct and useful, without breaking code in cases where such assumptions would be false.

1. It is easy to add such flags. But all flags come with a maintenance cost, and very few people actually use such options, so no sane compiler developer will agree to add that functionality... 😀