My previous blog post had a minimal description of
GCC tracks what the pointers may point to using the general ideas from the paper “Efficient Field-sensitive pointer analysis for C”. I will not describe the details – the first few pages of the paper do it better than I can do here – but the principle is that each pointer is represented by a set of locations it may point to, the compiler is generating set constraints representing each statement in the program, and then solving those constraints to get the actual set of locations the pointer may point to.
But this process is expensive, so GCC is normally doing this one function at a time and assumes that called functions may access any memory visible to them.
A more relevant example is the “slow” code from the “Integer division is slow” blog post
-fipa-pta
and I have received several questions about what it actually do. This blog post will try to give some more details...Points-to analysis
Many optimizations need to know if two operations may access the same memory address. For example, the if-statement ini = 5; *p = -1; if (i < 0) do_something();can be optimized away if
*p
cannot modify i
.GCC tracks what the pointers may point to using the general ideas from the paper “Efficient Field-sensitive pointer analysis for C”. I will not describe the details – the first few pages of the paper do it better than I can do here – but the principle is that each pointer is represented by a set of locations it may point to, the compiler is generating set constraints representing each statement in the program, and then solving those constraints to get the actual set of locations the pointer may point to.
But this process is expensive, so GCC is normally doing this one function at a time and assumes that called functions may access any memory visible to them.
-fipa-pta
The-fipa-pta
optimization takes the bodies of the called functions into account when doing the analysis, so compilingvoid __attribute__((noinline)) bar(int *x, int *y) { *x = *y; } int foo(void) { int a, b = 5; bar(&a, &b); return b + 10; }with
-fipa-pta
makes the compiler see that bar
does not modify b
, and the compiler optimizes foo
by changing b+10
to 15
int foo(void) { int a, b = 5; bar(&a, &b); return 15; }
A more relevant example is the “slow” code from the “Integer division is slow” blog post
std::random_device entropySource; std::mt19937 randGenerator(entropySource()); std::uniform_int_distribution<int> theIntDist(0, 99); for (int i = 0; i < 1000000000; i++) { volatile auto r = theIntDist(randGenerator); }Compiling this with
-fipa-pta
makes the compiler see that theIntDist
is not modified within the loop, and the inlined code can thus be constant-folded in the same way as the “fast” version – with the result that it runs four times faster.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.