Monday, October 17, 2016

Inlining — main is special

In my previous post I assumed that Jason had compiled the example in his CppCon 2016 talk with the -O2 optimization level, because the example optimized fine for me at -O3. I was wrong: Jason also used -O3, but his example was slightly different from mine. I used the function
int bar()
{
    return std::string("a").size() + std::string("b").size();
}
that optimizes to just returning a constant, while Jason used
int main()
{
    return std::string("a").size() + std::string("b").size();
}
which is not optimized to a constant. The only difference is the name of the function...

The reason for the difference is that GCC knows that main is special: it is called only once. GCC therefore treats it as a cold function and inlines into it less aggressively.
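
GCC also lets you state this kind of hint by hand, through its hot and cold function attributes. The following is a minimal sketch, assuming GCC or Clang attribute syntax. The names bar_hot and bar_cold are hypothetical and do not appear in Jason's example, and whether the string code is actually inlined in either function has to be checked in the generated assembly.

```cpp
#include <string>

// Assumption: GCC/Clang attribute syntax. `cold` tells the compiler that the
// function is unlikely to be executed, so it favors size over speed.
__attribute__((cold)) int bar_cold()
{
    return std::string("a").size() + std::string("b").size();
}

// `hot` marks the function as a hot spot that deserves more aggressive
// optimization.
__attribute__((hot)) int bar_hot()
{
    return std::string("a").size() + std::string("b").size();
}
```

Both functions return 2 regardless of the attribute; only the generated code may differ. Comparing the output of g++ -O3 -S for the two is the way to see what the hint changes.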

GCC propagates the "called only once" information where possible. Functions for which the compiler can see all callers (such as static functions, functions in an anonymous namespace, or any function when compiling with -fwhole-program) are also marked as "called only once" if they are called exactly once from a function that is itself "called only once". For example, the compiler can see that bar is called only once in
static int bar()
{
    return std::string("a").size() + std::string("b").size();
}

int main()
{
    return bar();
}
and the string code is not inlined. Removing the static prevents GCC from marking bar as called only once (as it could be called from outside the translation unit), and the string code will now be inlined.

On the other hand, if the "called only once" function contains loops, then GCC can infer that it makes sense to optimize the content of the loop as it is executed multiple times. So for example
int main()
{
    int k = 0;
    for (int i = 0; i < 10000; i++)
        k += std::string("a").size() + std::string("b").size();
    return k;
}
gets the strings inlined, and it optimizes to
main:
    mov     eax, 20000
    ret
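
The constant 20000 is simply 10000 iterations times a combined string length of 2. Here is a small sketch that checks this at run time. The function sum_sizes is a hypothetical stand-in for main so that it can be called from other code, which means it does not reproduce the "called only once" behavior itself:

```cpp
#include <string>

// Hypothetical helper with the same loop body as the main above.
int sum_sizes(int iterations)
{
    int k = 0;
    for (int i = 0; i < iterations; i++)
        k += std::string("a").size() + std::string("b").size();  // adds 1 + 1
    return k;
}
```

Calling sum_sizes(10000) returns 20000, matching the immediate value in the assembly above.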

3 comments:

  1. Interesting! Does that mean that manually marking functions as hot/cold (as, I suppose, the compiler doesn't do a great job at detecting such functions, as it usually can't see all callers) may significantly improve the compiler's inlining strategy?

    1. I wouldn't be too surprised if, in general, the compiler has a better idea than the human about what is the hot path. Knowing which functions are hot indeed does influence how well the optimiser can do its job. That's what's behind profile-guided optimisation. https://en.wikipedia.org/wiki/Profile-guided_optimization
      Instead of the compiler or human guessing, the code is profiled to know exactly what's hot.

    2. Yes, you should probably use profile-driven optimization if your application benefits much from hot/cold decisions.

      My experience is that manually marking functions as hot/cold is not worth the effort...
