Signed integers are not allowed to overflow in C and C++, and this helps compilers generate better code. I was interested in how GCC is taking advantage of this, and here are my findings.1
Signed integer expression simplification
The nice property of overflow being undefined is that signed integer operations works as in normal mathematics — you can cancel out values so that
(x*10)/5
simplifies to x*2
, or (x+1)<(y+3)
simplifies to x<(y+2)
. Increasing a value always makes it larger, so x<(x+1)
is always true.
GCC iterates over the IR (the compiler's Internal Representation of the program), and does the following transformations (
x
, and y
are signed integers, c
, c1
, and c2
are positive constants, and cmp
is a comparison operator. I have only listed the transformations for positive constants, but GCC handles negative constants too in the obvious way)- Eliminate multiplication in comparison with 0
(x * c) cmp 0 -> x cmp 0
- Eliminate division after multiplication
(x * c1) / c2 -> x * (c1 / c2) if c1 is divisible by c2
- Eliminate negation
(-x) / (-y) -> x / y
- Simplify comparisons that are always true or false
x + c < x -> false x + c <= x -> false x + c > x -> true x + c >= x -> true
- Eliminate negation in comparisons
(-x) cmp (-y) -> y cmp x
- Reduce magnitude of constants
x + c > y -> x + (c - 1) >= y x + c <= y -> x + (c - 1) < y
- Eliminate constants in comparisons
(x + c1) cmp c2 -> x cmp (c2 - c1) (x + c1) cmp (y + c2) -> x cmp (y + (c2 - c1)) if c1 <= c2
The second transformation is only valid ifc1 <= c2
, as it would otherwise introduce an overflow wheny
has the valueINT_MIN
.
Pointer arithmetic and type promotion
If an operation does not overflow, then we will get the same result if we do the operation in a wider type. This is often useful when doing things like array indexing on 64-bit architectures — the index calculations are typically done using 32-bit
One other aspect of this is that undefined overflow ensures that
int
, but the pointers are 64-bit, and the compiler may generate more efficient code when signed overflow is undefined by promoting the 32-bit integers to 64-bit operations instead of generating type extensions. One other aspect of this is that undefined overflow ensures that
a[i]
and a[i+1]
are adjacent. This improves analysis of memory accesses for vectorization etc.Value range calculations
The compiler keeps track of the variables' range of possible values at each point in the program, i.e. for code such as
int x = foo(); if (x > 0) { int y = x + 5; int z = y / 4;
it determines that
x
has the range [1, INT_MAX]
after the if-statement, and can thus determine that y
has the range [6, INT_MAX]
as overflow is not allowed. And the next line can be optimized to
int z = y >> 2;as the compiler knows that
y
is non-negative.The undefined overflow helps optimizations that need to compare two values (as the wrapping case would give possible values of the form
[INT_MIN, (INT_MIN+4)]
or [6, INT_MAX]
that prevents all useful comparisons with <
or >
), such as- Changing comparisons
x<y
to true or false if the ranges forx
andy
does not overlap - Changing
min(x,y)
ormax(x,y)
tox
ory
if the ranges do not overlap - Changing
abs(x)
tox
or-x
if the range does not cross 0 - Changing
x/c
tox>>log2(c)
ifx>0
and the constantc
is a power of 2 - Changing
x%c
tox&(c-1)
ifx>0
and the constantc
is a power of 2
Loop analysis and optimization
The canonical example of why undefined signed overflow helps loop optimizations is that loops like
But undefined signed overflow helps many more loop optimizations. All analysis such as determining number of iteration, transforming induction variables, and keeping track of memory accesses are using everything in the previous sections in order to do its work. In particular, the set of loops that can be vectorized are severely reduced when signed overflow is allowed.
for (int i = 0; i <= m; i++)are guaranteed to terminate for undefined overflow. This helps architectures that have specific loop instructions, as they do in general not handle infinite loops.
But undefined signed overflow helps many more loop optimizations. All analysis such as determining number of iteration, transforming induction variables, and keeping track of memory accesses are using everything in the previous sections in order to do its work. In particular, the set of loops that can be vectorized are severely reduced when signed overflow is allowed.
1. Using GCC trunk r233186, "gcc version 6.0.0 20160205 (experimental)"
This blog post was updated 2016-02-23:
This blog post was updated 2016-02-23:
- Corrected first sentence to say "overflow"instead of "wrap".
- Removed one incorrect transformation from "Eliminate Negation" where I had misread the GCC source code.
- Corrected the ranges in the "Value range calculations" example.