Sunday, November 6, 2016

Inlining — shared libraries are special

The way shared libraries work affects how the code can be optimized, so GCC must be more conservative with inlining when building shared libraries (i.e. when compiling with -fpic or -fPIC).

Consider the functions
int foo(void)
{
    return 23;
}

int bar(void)
{
    return 19 + foo();
}
Compiling this with "gcc -O3" inlines foo into bar:
foo:
    movl    $23, %eax
    ret
bar:
    movl    $42, %eax
    ret
but that is not the case when compiling with "gcc -O3 -fPIC":
foo:
    movl    $23, %eax
    ret
bar:
    subq    $8, %rsp
    call    foo@PLT
    addq    $8, %rsp
    addl    $19, %eax
    ret
The reason is that ELF permits symbols in shared libraries to be overridden by the dynamic linker; a typical use case is using LD_PRELOAD to load a debug library that contains logging versions of some functions. As a result, GCC cannot know that the foo above is the one that bar really calls, and thus cannot inline it. Only exported symbols can be overridden, so static functions and functions in anonymous namespaces are optimized as usual, as are functions defined as "extern inline" (the compiler is told to inline, so it may assume the function will not be overridden).
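As a minimal sketch of the static case, here is the earlier example with foo given internal linkage. Because foo is no longer an exported symbol, nothing outside this file can replace it, so one would expect "gcc -O3 -fPIC" to fold bar into a constant load again. The function names are the ones from the example above, and the exact assembly may vary between GCC versions and targets.

```c
/* Internal linkage: foo is not exported from the shared library,
   so the dynamic linker cannot substitute another definition. */
static int foo(void)
{
    return 23;
}

/* bar is still exported and may itself be interposed by others,
   but its call to foo can now be inlined even under -fPIC. */
int bar(void)
{
    return 19 + foo();
}
```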

The missed optimizations are especially noticeable when doing link-time optimization: the benefit of LTO is that the compiler can see the whole library and inline between files, but this is not possible if those functions may be replaced. The problem is not limited to inlining; it hampers all interprocedural optimizations (such as devirtualization) that involve functions which may be overridden.

There are two ways to get GCC to optimize shared libraries in the same way as normal code:
  • Use the command line option -fno-semantic-interposition
  • Avoid exporting unnecessary symbols by passing -fvisibility=hidden to the compiler, and manually export the needed symbols using a linker export map or by decorating the functions with
    __attribute__((__visibility__("default")))
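The second approach could look roughly like the sketch below, assuming the file is compiled with something like "gcc -O3 -fPIC -fvisibility=hidden -shared". MYLIB_EXPORT is a hypothetical macro name introduced here for illustration; it is not something GCC provides.

```c
/* MYLIB_EXPORT is a made-up convenience macro for this example. */
#define MYLIB_EXPORT __attribute__((__visibility__("default")))

/* No decoration: with -fvisibility=hidden this symbol is hidden,
   so it cannot be overridden from outside the library. */
int foo(void)
{
    return 23;
}

/* Explicitly exported: this is part of the library's public API. */
MYLIB_EXPORT int bar(void)
{
    return 19 + foo();
}
```

With this setup foo is not visible outside the library, so the compiler is free to inline it into bar, while bar remains callable by users of the library.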
    

1 comment:

  1. There is this other interposition-type issue I ran into (in both gcc and clang) that you might find interesting: http://www.playingwithpointers.com/ipo-and-derefinement.html
