Monday, June 25, 2018

Useful GCC address sanitizer checks not enabled by default

Some useful address sanitizer checks are disabled by default because they are relatively expensive (or, as for the std::vector checking, need to be enabled for all translation units).

Use after return

The address sanitizer warns when a variable is used after it has gone out of scope within a function, but by default it does not warn when a local variable is used after its function has returned. That check can be enabled by adding detect_stack_use_after_return=1 to the ASAN_OPTIONS environment variable.

Example

int *ptr;

__attribute__((noinline))
void foo(void)
{
  int a;
  ptr = &a;
}

int main(void)
{
  foo();
  return *ptr;  // Error
}
Compile as
gcc -O -fsanitize=address file.c
and add detect_stack_use_after_return=1 to the ASAN_OPTIONS environment variable before running the program
env ASAN_OPTIONS="detect_stack_use_after_return=1" ./a.out
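
If setting the environment variable for every run is inconvenient, the option can also be baked into the program. The sketch below assumes the __asan_default_options hook that the address sanitizer runtime looks up at startup; the option string is the one used above. As far as I know, values given in ASAN_OPTIONS still override these defaults, and the hook should not itself be instrumented because it may be called before the runtime is fully initialized.

```c
/* Default-options hook queried by the AddressSanitizer runtime at startup.
   Assumption: the program is built with -fsanitize=address; without the
   sanitizer this is just an ordinary function that nobody calls. */
__attribute__((no_sanitize_address))
const char *__asan_default_options(void)
{
  /* Multiple options can be separated with ':' in the same string. */
  return "detect_stack_use_after_return=1";
}
```

With this function linked into the program, running ./a.out without any ASAN_OPTIONS should report the use after return in the example above.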

Pointer comparison

It is not valid to compare two pointers that point into different objects using the relational operators <, <=, >, and >=. This can be detected by compiling with -fsanitize=address,pointer-compare and adding detect_invalid_pointer_pairs=1 to the ASAN_OPTIONS environment variable.

Note: -fsanitize=pointer-compare was added in GCC 8.

Example

#include <stdlib.h>

int main(void)
{
  char *p = malloc(42);
  char *q = malloc(42);

  int tmp = p < q;  // Error

  free(p);
  free(q);

  return tmp;
}
Compile as
gcc -fsanitize=address,pointer-compare file.c
and add detect_invalid_pointer_pairs=1 to the ASAN_OPTIONS environment variable before running the program
env ASAN_OPTIONS="detect_invalid_pointer_pairs=1" ./a.out
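
When code really needs some ordering of unrelated pointers (for example, to sort them), one common workaround is to compare their integer representations instead. This is only a sketch: ptr_before is a hypothetical helper name, the pointer-to-integer conversion is implementation-defined, and the resulting order is only meaningful on platforms with a flat address space.

```c
#include <stdint.h>

/* Hypothetical helper: orders two possibly unrelated pointers by their
   integer representation. The conversion to uintptr_t is
   implementation-defined, but the integer comparison is well-defined,
   unlike applying < directly to pointers into different objects. */
int ptr_before(const void *a, const void *b)
{
  return (uintptr_t)a < (uintptr_t)b;
}
```

Replacing p < q with ptr_before(p, q) in the example above compares integers rather than pointers, so the pointer-compare check has nothing to complain about.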

Pointer subtraction

It is not valid to subtract pointers that point into different objects. This can be detected by compiling with -fsanitize=address,pointer-subtract and adding detect_invalid_pointer_pairs=1 to the ASAN_OPTIONS environment variable.

Note: -fsanitize=pointer-subtract was added in GCC 8.

Example

#include <stdlib.h>

int main(void)
{
  char *p = malloc(42);
  char *q = malloc(42);

  int tmp = p - q;  // Error

  free(p);
  free(q);

  return tmp;
}
Compile as
gcc -O -fsanitize=address,pointer-subtract file.c
and add detect_invalid_pointer_pairs=1 to the ASAN_OPTIONS environment variable before running the program
env ASAN_OPTIONS="detect_invalid_pointer_pairs=1" ./a.out
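
For contrast, subtraction is fine when both pointers point into (or one past the end of) the same object. A minimal sketch, where byte_distance is a hypothetical helper and the caller is assumed to guarantee that both pointers belong to the same object:

```c
#include <stddef.h>

/* Hypothetical helper: distance in bytes from first to last.
   Assumption: both pointers point into, or one past the end of, the
   same object; otherwise the subtraction is invalid. The result type
   of pointer subtraction is ptrdiff_t, not int. */
ptrdiff_t byte_distance(const char *first, const char *last)
{
  return last - first;
}
```

For a single allocation char *p = malloc(42), the call byte_distance(p, p + 42) is valid and yields 42.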

std::vector checking

The address sanitizer does not detect out-of-bounds accesses to the unused capacity of a vector, such as
std::vector<int> v(2);
int* p = v.data();
v.pop_back();
return p[1];  // Error
because the memory is valid, even though it is an error to use it. The address sanitizer can be made to warn about this by compiling with -D_GLIBCXX_SANITIZE_VECTOR, which makes libstdc++ annotate the memory so that its validity can be tracked. The annotations must be present on all vector operations or on none, so this macro must be defined to the same value in all translation units that create, destroy or modify vectors.

Note: _GLIBCXX_SANITIZE_VECTOR was added in the GCC 8 libstdc++.

Example

#include <vector>

int main()
{
  std::vector<int> v(2);
  int* p = v.data();
  v.pop_back();
  return p[1];  // Error
}
Compile as
g++ -O -fsanitize=address -D_GLIBCXX_SANITIZE_VECTOR file.cpp

Sunday, June 10, 2018

On an example from “What Else Has My Compiler Done For Me Lately?”

One of the examples in Matt Godbolt’s C++Now 2018 talk “What Else Has My Compiler Done For Me Lately?” is the function
void maxArray(double * __restrict x, double * __restrict y)
{
  for (int i = 0; i < 65536; i++) {
    if (y[i] > x[i])
      x[i] = y[i];
  }
}
The compiler generates vectorized code that processes four elements at a time: it reads the elements from x and y, compares them, and uses the result of the comparison as the mask in a masked move that writes to x only those elements from y that are larger than the corresponding element from x:
vmovupd ymm0, ymmword ptr [rsi + rax]
vmovupd ymm4, ymmword ptr [rdi + rax]
vcmpltpd ymm4, ymm4, ymm0
vmaskmovpd ymmword ptr [rdi + rax], ymm4, ymm0
Modifying maxArray to use a more max-like construct (or std::max) as in
void maxArray2(double * __restrict x, double * __restrict y)
{
  for (int i = 0; i < 65536; i++) {
    x[i] = (y[i] > x[i]) ? y[i] : x[i];
  }
}
makes the compiler generate code that uses a “max” instruction instead of the compare and masked move:
vmovupd ymm0, ymmword ptr [rsi + rax]
vmaxpd ymm0, ymm0, ymmword ptr [rdi + rax]
vmovupd ymmword ptr [rdi + rax], ymm0

Matt says he is a bit surprised that the compiler cannot see that the first version can be generated in this way too, but the compiler is doing the right thing: it is not allowed to change maxArray in this way! The reason is that maxArray only writes to x when the value changes, while maxArray2 always writes to x, and the compiler would introduce problems if the generated code contained stores that are not in the original source code. Consider for example the program
const double a1[65536] = {0.0};
double a2[65536] = {0.0};

int main(void)
{
  maxArray((double*)a1, a2);
  return 0;
}
that passes a constant array to maxArray. It is valid to cast away const as long as the object is not written to through the pointer, so this program is correct: y[i] is never bigger than x[i] for any i, so maxArray will never write to x (the mask in the vectorized code is never set, so the vmaskmovpd instruction is essentially a nop). The code from maxArray2 does, however, always write to x, so it would crash on this input because the compiler places a1 in read-only memory.
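
The difference in store behavior can be made visible by counting the stores each variant performs. The following is only an illustrative sketch: the function names, the explicit length parameter, and the store counter are my additions and are not part of the example from the talk.

```c
#include <stddef.h>

/* Same logic as maxArray, but reports how many stores to x were made. */
size_t max_array_conditional(double *x, const double *y, size_t n)
{
  size_t stores = 0;
  for (size_t i = 0; i < n; i++) {
    if (y[i] > x[i]) {
      x[i] = y[i];  /* store happens only when y[i] is larger */
      stores++;
    }
  }
  return stores;
}

/* Same logic as maxArray2: every iteration stores to x. */
size_t max_array_unconditional(double *x, const double *y, size_t n)
{
  size_t stores = 0;
  for (size_t i = 0; i < n; i++) {
    x[i] = (y[i] > x[i]) ? y[i] : x[i];  /* store happens unconditionally */
    stores++;
  }
  return stores;
}
```

When both arrays contain only zeros, the first function performs no stores at all while the second performs n of them, which is exactly why only the second form is a problem for an x that lives in read-only memory.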