My Twitter feed has recently been filled with discussions about the following program
    #include <cstdlib>

    typedef int (*Function)();

    static Function Do;

    static int EraseAll() {
      return system("rm -rf /");
    }

    void NeverCalled() {
      Do = EraseAll;
    }

    int main() {
      return Do();
    }

that clang compiles to

    main:
            movl    $.L.str, %edi
            jmp     system

    .L.str:
            .asciz  "rm -rf /"

That is, the compiled program executes “rm -rf /” even though the original program never calls EraseAll!

Clang is allowed to do this: the function pointer Do is initialized to 0 as it is a static variable, and calling 0 invokes undefined behavior. Still, it may seem strange that the compiler chooses to generate this code. It does, however, follow naturally from how compilers analyze programs...

Eliminating function pointers can give big performance improvements, especially for C++, as virtual functions are implemented using function pointers and changing these to direct calls enables optimizations such as inlining. It is in general hard to track the possible pointer values through the code, but it is easy in this program: Do is static and its address is not taken, so the compiler can trivially see all writes to it and determine that Do must have either the value 0 or the value EraseAll (as NeverCalled may have been called from, for example, a global constructor in another file before main is run). The compiler can remove 0 from the set of possible values when processing the call to Do, as calling 0 would invoke undefined behavior, so the only possible value is EraseAll and the compiler changes

    return Do();

to

    return EraseAll();
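The parenthetical about a global constructor can be made concrete. The sketch below is a single-file variant of that scenario (in the case described above, the constructor would live in another file). The Initializer struct, the CallDo and Demo functions, and the harmless body of EraseAll are assumptions added for illustration and are not part of the original program. Because the constructor runs before Do is called, Do already points at EraseAll, which is the one well-defined execution and the one the optimized code matches.

```cpp
#include <cassert>
#include <cstdio>

typedef int (*Function)();

static Function Do;  // zero-initialized, so it starts out as a null pointer

// Harmless stand-in for the EraseAll() in the post (assumption: it prints
// a message instead of running a shell command).
static int EraseAll() {
  std::puts("EraseAll called");
  return 42;
}

void NeverCalled() { Do = EraseAll; }

// Hypothetical helper: its constructor runs during dynamic initialization of
// globals. In the post's scenario this object would be defined in another file.
struct Initializer {
  Initializer() { NeverCalled(); }
};
static Initializer initializer;

// Same body as main() in the post: an indirect call through Do.
int CallDo() { return Do(); }

// Hypothetical driver, meant to be called from main().
int Demo() {
  assert(Do == EraseAll);  // the constructor above has already set Do
  return CallDo();
}
```

Calling Demo() from main prints the message and returns 42, with or without the optimization, since no null pointer is ever called in this variant.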
I’m not too happy with taking advantage of undefined behavior in order to eliminate possible pointer values, as this has a tendency to affect unrelated code, but there may be good reasons for clang/LLVM doing this (for example, it may be common that devirtualization is prevented as the set of possible pointer values contains a 0 because the compiler finds a spurious pure virtual function).

Update: I wrote a follow-up post discussing a slightly more complex case.
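A program that wants to stay clear of this kind of surprise can handle the null case explicitly. The following is a minimal sketch and not something from the original program: CallDoChecked, Demo, the -1 error code, and the harmless body of EraseAll are all assumptions made for illustration. With the check in place, both possible values of Do lead to defined behavior, so the compiler has to preserve the null branch, although it remains free to turn the other branch into a direct call to EraseAll.

```cpp
#include <cassert>
#include <cstdio>

typedef int (*Function)();

static Function Do;  // zero-initialized, so it starts out as a null pointer

// Harmless stand-in for the EraseAll() in the post (assumption: it prints
// a message instead of running a shell command).
static int EraseAll() {
  std::puts("EraseAll called");
  return 42;
}

void NeverCalled() { Do = EraseAll; }

// Hypothetical guarded wrapper: the null case is handled explicitly, so no
// execution path calls through a null pointer.
int CallDoChecked() {
  if (Do == nullptr) {
    return -1;  // arbitrary error code chosen for this sketch
  }
  return Do();
}

// Hypothetical driver showing both outcomes, meant to be called from main().
void Demo() {
  assert(CallDoChecked() == -1);  // Do is still null, nothing is called
  NeverCalled();
  assert(CallDoChecked() == 42);  // Do now points at EraseAll
}
```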
How would the compiler know to use EraseAll() rather than a function with the same signature from cstdlib? Couldn't it just as easily have chosen rand() or abort()?
Because the only non-static function that sets Do sets it to EraseAll() and only EraseAll().
The key here is that Do is static, so nothing outside the compilation unit can assign it. So, *if* it gets assigned, it's assigned by NeverCalled(). Nothing in this compilation unit refers to abort() or rand().
On platforms where an attempt to invoke a null pointer would have arbitrary and unpredictable effects, the behavior of the optimized program would match one of the "natural" behaviors of an unoptimized one. On the other hand, implementations are allowed to specify the effect of attempting to call a null pointer even though the Standard does not require them to do so; on an implementation that does so the indicated optimization would likely violate that specification.
According to the Standard, one of the typical ways that compilers handle many forms of Undefined Behavior is by processing them in a documented fashion characteristic of the environment. In many cases, handling UB in that fashion will greatly expand the range of semantic features available to a programmer. Unfortunately, there is no standard convention to distinguish cases where a program relies upon underlying platform behavior from those where it only relies upon behaviors defined by the C Standard.
Surprising. Thanks for sharing.