ESSL permits optimizations that may change the value of floating point expressions (lowp and mediump precision change, reassociation of addition/multiplication, etc.), which means that identical expressions may give different results in different shaders. This may cause problems with e.g. alignment of geometry in multi-pass algorithms, so output variables may be decorated with the invariant qualifier to force the compiler to be consistent in how it generates code for them. The compiler is still allowed to do value-changing optimizations for invariant expressions, but it needs to do them in the same way for all shaders.
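To make "value-changing" concrete, here is a minimal Python sketch (not ESSL) that models two of the effects mentioned above. It assumes that full-precision evaluation uses 32-bit floats and that mediump may be evaluated as 16-bit floats on some implementations. The helper names to_f32, to_f16 and add are made up for this illustration.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16(x):
    """Round a Python float to the nearest 16-bit float (one possible mediump)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def add(x, y, rnd):
    """Model one addition whose result is rounded by `rnd`."""
    return rnd(x + y)

# Reassociation: the same three terms, summed in a different order.
a, b, c = 1.0e8, -1.0e8, 1.0
left = add(add(a, b, to_f32), c, to_f32)    # (a + b) + c
right = add(a, add(b, c, to_f32), to_f32)   # a + (b + c)
print(left, right)    # prints "1.0 0.0"

# Precision change: the same sum evaluated at two precisions.
p, q = 2048.0, 0.5
sum16 = add(p, q, to_f16)
sum32 = add(p, q, to_f32)
print(sum16, sum32)   # prints "2048.0 2048.5"
```

Both pairs are mathematically identical expressions, which is why a compiler that picks one form in one shader and the other form in a second shader can produce mismatching outputs.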
This may give us interesting problems if optimizations and code generation are done without knowledge of each other...
Example 1
As an example of the problems we may get with invariant, consider an application that is generating optimized SPIR-V using an offline ESSL compiler, and uses the IR with a Vulkan driver having a simple backend. The backend works on one basic block at a time, and is generating FMA (Fused Multiply-Add) instructions when multiplication is followed by addition. This is fine for invariant, even though FMA changes the precision, as the backend is consistent and always generates FMA when possible (i.e. identical expressions in different shaders will generate identical instructions).
The application has a shader
    #version 310 es
    in float a, b, c;
    out invariant float result;
    void main() {
        float tmp = a * b;
        if (c < 0.0) {
            result = tmp - 1.0;
        } else {
            result = tmp + 1.0;
        }
    }
This is generated exactly as written if no optimization is done; first a multiplication, followed by a compare and branch, and we have two basic blocks doing one addition each. But the offline compiler optimizes this with if-conversion, so it generates SPIR-V as if main was written as
    void main() {
        float tmp = a * b;
        result = (c < 0.0) ? (tmp - 1.0) : (tmp + 1.0);
    }
The optimization has eliminated the branches, and the backend will now
see that it can use FMA instructions as everything is in the same basic block.
But the application has one additional shader where main looks like
    void main() {
        float tmp = a * b;
        if (c < 0.0) {
            foo();
            result = tmp - 1.0;
        } else {
            result = tmp + 1.0;
        }
    }
The optimization cannot transform the if-statement
here, as the basic blocks are too complex. So this will not use FMA, and
will therefore break the invariance guarantee.
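The broken guarantee can also be modelled numerically. The following Python sketch (not shader code) assumes 32-bit floats and a backend that fuses only within one basic block, as described above. The function names and the input value are chosen for illustration only. The fused version rounds once, after the exact a * b plus or minus 1.0, while the branching version rounds tmp before the addition.

```python
import struct
from fractions import Fraction

def to_f32(x):
    """Round a number to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

def fma(a, b, c):
    """Model of a fused multiply-add: a*b + c is formed exactly, then rounded.

    The exact value passes through a 64-bit float before the final rounding,
    which is sufficient for the inputs used in this sketch.
    """
    return to_f32(Fraction(a) * Fraction(b) + Fraction(c))

def shader_if_converted(a, b, c):
    """After if-conversion: one basic block, so the backend emits FMA."""
    return fma(a, b, -1.0) if c < 0.0 else fma(a, b, 1.0)

def shader_with_branches(a, b, c):
    """Branches kept: tmp is rounded in its own block, then added to."""
    tmp = to_f32(a * b)
    return to_f32(tmp - 1.0) if c < 0.0 else to_f32(tmp + 1.0)

# 1 + 2^-13 is exactly representable as a 32-bit float, but its square is not.
a = to_f32(1.0 + 2.0 ** -13)
fused = shader_if_converted(a, a, -1.0)
unfused = shader_with_branches(a, a, -1.0)
print(fused == unfused)   # prints "False"
print(fused - unfused)    # the low bits that the separate multiply rounded away
```

With these inputs the two shaders agree when c >= 0.0 and disagree when c < 0.0, so the mismatch shows up only for some inputs, which is what makes this kind of bug hard to spot.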
Example 2
It is not only invariant expressions that are problematic; you may get surprising results from normal code too when optimizations done offline and in the backend interact in interesting ways. For example, you can get different precision in different threads from "redundant computation elimination" optimizations. This happens for cases such as
    mediump float tmp = a + b;
    if (x == 0) {
        /* Code not using tmp */
        ...
    } else if (x == 1) {
        /* Code using tmp */
        ...
    } else {
        /* Code using tmp */
        ...
    }
where tmp is calculated, but not used, for the case "x == 0". The optimization moves the tmp calculation into the two basic blocks where it is used
    if (x == 0) {
        /* Code not using tmp */
        ...
    } else if (x == 1) {
        mediump float tmp = a + b;
        /* Code using tmp */
        ...
    } else {
        mediump float tmp = a + b;
        /* Code using tmp */
        ...
    }
and the backend may now choose to use different precisions for the two mediump tmp calculations.
Offline optimization with SPIR-V
The examples above are of course silly: higher level optimizations should not be allowed to change control flow for invariant statements, and the "redundant computation elimination" does not make sense for warp-based architectures. But the first optimization would have been fine if used with a better backend that could combine instructions from different basic blocks. And not all GPUs are warp-based. That is, it is reasonable to do these kinds of optimizations, but they need to be done in the driver where you have full knowledge about the backend and architecture.
My impression is that many developers believe that SPIR-V and Vulkan imply that the driver will just do simple code generation, and that all optimizations are done offline. But that will prevent some optimizations. It may work for a game engine generating IR for a known GPU, but I'm not sure that the GPU vendors will provide enough information on their architectures/backends for this to be viable either.
So my guess is that the drivers will continue to do all the current optimizations on SPIR-V too, and that offline optimizations will not matter...