Sunday, October 4, 2015

spirv-tools — status, problems, and plans

I have done some work on spirv-tools since the posts about the human-friendly SPIR-V representation and Python API, so it makes sense to do a status update now.

Most of the development has been to actually make the code work. And I think it does now. For example, one of my tests is to disassemble all the shaders from the Mesa shader-db (containing shaders from some old games), assemble the result, and verify that the binaries are identical.
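The round-trip test described above can be sketched roughly as follows. This is only an illustration: `disassemble` and `assemble` are hypothetical callables standing in for the real tools (they are not the spirv-tools API), and the toy versions below just convert to and from a hex dump so that the sketch runs on its own.

```python
def round_trip_failures(binaries, disassemble, assemble):
    """Return the names of the binaries that do not survive a round trip.

    `binaries` maps a shader name to its SPIR-V binary (bytes).
    `disassemble` (bytes -> text) and `assemble` (text -> bytes) are
    hypothetical stand-ins for the real tools.
    """
    failures = []
    for name, binary in sorted(binaries.items()):
        text = disassemble(binary)
        if assemble(text) != binary:
            failures.append(name)
    return failures


# Toy stand-ins (assumptions for this sketch): "disassembly" is a hex dump.
def toy_disassemble(binary):
    return binary.hex()


def toy_assemble(text):
    return bytes.fromhex(text)


shaders = {"a.frag": b"\x03\x02\x23\x07", "b.vert": b"\x00\x01"}
print(round_trip_failures(shaders, toy_disassemble, toy_assemble))
```

Comparing the bytes exactly is a strict check: it only passes if the textual form preserves everything the binary contains, including instruction order and ID numbering.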

API

IR improvements

The major user-visible change in the API is that the IR has been modified so that
• An ID is represented by an ID object. The ir.Id class contains a reference to the instruction defining the ID, so there is no need to use module.id_to_inst[id] each time you want to get the instruction (which you usually want each time you have an ID). The instruction is now accessed as id.inst.
• A literal number is represented as an integer.
• A literal string is represented as a string.
• An enumerated value is represented as a string of the enumeration name. I chose this representation in order to make it easy to see what the code does. For example
if inst.operands[0] == 'FPFastMathMode':

checks whether a decoration specifies the floating-point fast math flags.
• A mask is represented as a list of strings of the enumeration names, and the empty list is used when no value is set in the mask. Checking if a bit is set is done as
if 'NotNaN' in inst.operands[1]:
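To make the representation above concrete, here is a small self-contained model of it. The `Id` and `Instruction` classes below are simplified stand-ins written for this example, not the real `ir` classes; only the `id.inst` and `inst.operands` attribute names and the operand positions are taken from the snippets above.

```python
class Id:
    """Stand-in for ir.Id: knows which instruction defines it."""
    def __init__(self, value):
        self.value = value
        self.inst = None


class Instruction:
    """Stand-in instruction: an opcode name, a result ID, and operands."""
    def __init__(self, op_name, result_id, operands):
        self.op_name = op_name
        self.result_id = result_id
        self.operands = operands
        if result_id is not None:
            result_id.inst = self  # the ID points back at its definition


def has_fast_math_flag(decoration, flag):
    """Check one bit of an FPFastMathMode mask (a list of enum names)."""
    return (decoration.operands[0] == 'FPFastMathMode'
            and flag in decoration.operands[1])


sub_id = Id(52)
sub = Instruction('OpFSub', sub_id, [Id(49), Id(50)])
# Operand layout assumed to mirror the snippets above: name, then mask.
decoration = Instruction('OpDecorate', None,
                         ['FPFastMathMode', ['NotNaN', 'NotInf']])

print(sub_id.inst.op_name)                        # no id_to_inst lookup needed
print(has_fast_math_flag(decoration, 'NotNaN'))
```

An empty mask is just `[]`, so the same `in` test works without a special case for "no bits set".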


Optimizations

I have also added optimization passes corresponding to the LLVM instcombine, constprop, die (Dead Instruction Elimination), and simplifycfg passes. And a mem2reg pass will be available soon.

I'm mostly working on the optimizations just to verify that the API makes sense, and some of the passes (constprop and instcombine) are essentially placeholders right now, but I will finish up the code when the final SPIR-V specification is released.
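As an illustration of what a pass like die does, here is a minimal dead instruction elimination sketch over a toy instruction list. The `Inst` tuple and the `SIDE_EFFECTS` set are assumptions made for this example and do not reflect how the passes in spirv-tools are implemented.

```python
from collections import namedtuple

# Hypothetical minimal instruction: result name (or None), opcode, and the
# names of the values it reads.
Inst = namedtuple('Inst', ['result', 'op', 'operands'])

# Opcodes kept even when their result is unused (assumed set for this sketch).
SIDE_EFFECTS = {'OpStore', 'OpReturn'}


def eliminate_dead_instructions(insts):
    """Repeatedly drop instructions whose result is never used."""
    insts = list(insts)
    changed = True
    while changed:
        changed = False
        used = {name for inst in insts for name in inst.operands}
        kept = []
        for inst in insts:
            if inst.op in SIDE_EFFECTS or inst.result in used:
                kept.append(inst)
            else:
                changed = True  # removing it may make its operands dead too
        insts = kept
    return insts


body = [
    Inst('%1', 'OpLoad', ['%in']),
    Inst('%2', 'OpFAdd', ['%1', '%1']),   # only used by %3
    Inst('%3', 'OpFMul', ['%2', '%2']),   # never used
    Inst(None, 'OpStore', ['%out', '%1']),
    Inst(None, 'OpReturn', []),
]
for inst in eliminate_dead_instructions(body):
    print(inst.result, inst.op, inst.operands)
```

The loop runs until nothing changes because removing `%3` is what makes `%2` dead; a single sweep would leave `%2` behind.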

Plans

My short-term plan for the API:
1. Make this a real Python package, installable with pip. And make it work for Python 3.x too.
2. Add a mem2reg pass (including infrastructure for simple data flow analysis).
3. Implement a better API for handling constants, types, and global instructions.
4. Clean up the API. There are some things that need to be renamed and tweaked a little bit (for example, some functions having "uses" in their name treat decorations as a usage, and some do not).
5. Document the API. Add examples/tutorials.

Assembler / disassembler

The biggest user-visible change in the assembler/disassembler is that global symbols now use normal ID tokens (such as %main) instead of prefixing the name with @ (such as @main). The original implementation used @ in order to simplify parsing of a more convenient syntax for declaring global variables
@gl_FragColor = Output <4 x f32> BuiltIn(FragColor)

but decorations are appended to normal instructions, so this is not much more convenient than using an OpVariable instruction
%gl_FragColor = OpVariable %44 BuiltIn(FragColor) Output

The only real difference is that the type must be specified as a pointer type for OpVariable, so it is not pretty-printed. The reason is that the pointer type contains a storage class, and I have not found a good way to include it in a pretty-printed type. The storage class is an operand to OpVariable too, so this could be written as
%gl_FragColor = OpVariable *<4 x f32> BuiltIn(FragColor) Output

if the assembler is updated to infer the storage class from the instruction. But I'm not sure whether that is a good idea...

The assembler/disassembler are mostly done, but two things need to be implemented:
1. Floating-point literals
2. Correct name handling
And there are lots of minor things that could be improved...

Floating-point literals

The assembler is supposed to allow
%52 = OpFSub <4 x f32> %49, (0.5, 0.5, 0.5, 0.5)

as shorthand for
%50 = OpConstant f32 0x3f000000
%51 = OpConstantComposite <4 x f32> %50, %50, %50, %50
%52 = OpFSub <4 x f32> %49, %51

but the current implementation handles only integer and Boolean literals.
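The expansion above can be sketched in a few lines of Python. This is not the assembler's code; `f32_bits` and `expand_vector_literal` are hypothetical helpers written for this example, and the output format simply follows the listing above. The one hard fact used is that the IEEE 754 single-precision bit pattern of 0.5 is 0x3f000000.

```python
import struct


def f32_bits(value):
    """Return the IEEE 754 single-precision bit pattern of `value`."""
    return struct.unpack('<I', struct.pack('<f', value))[0]


def expand_vector_literal(values, first_id):
    """Expand a literal such as (0.5, 0.5, 0.5, 0.5) into constants.

    One OpConstant is emitted per distinct bit pattern, followed by an
    OpConstantComposite. Returns (lines, id_of_composite).
    """
    lines = []
    ids_by_bits = {}
    next_id = first_id
    for value in values:
        bits = f32_bits(value)
        if bits not in ids_by_bits:
            ids_by_bits[bits] = '%{}'.format(next_id)
            lines.append('{} = OpConstant f32 0x{:08x}'.format(
                ids_by_bits[bits], bits))
            next_id += 1
    members = ', '.join(ids_by_bits[f32_bits(v)] for v in values)
    composite = '%{}'.format(next_id)
    lines.append('{} = OpConstantComposite <{} x f32> {}'.format(
        composite, len(values), members))
    return lines, composite


lines, composite = expand_vector_literal((0.5, 0.5, 0.5, 0.5), 50)
print('\n'.join(lines))
print('%52 = OpFSub <4 x f32> %49, ' + composite)
```

Keying the constants on the bit pattern rather than on the Python float keeps values such as 0.0 and -0.0 apart, which matters if the goal is a binary that round-trips exactly.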

Name handling

The assembler accepts named IDs (such as %main) and the disassembler uses the names from OpName debug instructions to create named IDs. But there are some problems with the implementation:
• Name mangling needs to be implemented in order to handle polymorphism (the disassembler currently uses the numerical value for IDs if it finds several identical names in the binary, and the assembler returns errors for re-declared names).
• ID names declared within a function should be local to the function.
• How should the tools handle multiple names for the same ID? This happens, for example, if the name in OpEntryPoint differs from the name in an OpName debug instruction for the function, or if one instruction has multiple OpName names.
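The current disassembler behavior for duplicate names (fall back to the numerical ID) can be sketched as below. `choose_id_tokens` and its input format are hypothetical and only illustrate the rule; they are not taken from the spirv-tools source.

```python
from collections import Counter


def choose_id_tokens(names_by_id):
    """Map each numeric ID to the token a disassembler would print.

    `names_by_id` maps an ID number to its OpName string, or to None when
    the ID has no name. A name is used only if exactly one ID carries it.
    """
    counts = Counter(name for name in names_by_id.values() if name)
    tokens = {}
    for id_number, name in names_by_id.items():
        if name and counts[name] == 1:
            tokens[id_number] = '%' + name
        else:
            tokens[id_number] = '%{}'.format(id_number)
    return tokens


print(choose_id_tokens({4: 'main', 9: 'foo', 12: 'foo', 20: None}))
```

This keeps the output unambiguous but throws away the names of overloaded functions, which is the case that name mangling would address.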

Minor tweaks

SPIR-V spreads out some information over the file (e.g. decorations are placed in the beginning of the file, far from the instruction they refer to), and the goal of the textual representation is to collect it in a way that makes it easy to see all relevant details affecting each instruction, as well as suppressing irrelevant details. But it is a bit unclear how to make this as useful as possible...

Some examples of things to consider:
• Is it better to be consistent or compact? Decorations are written after the instruction name, and a FPFastMathMode decoration is currently written as
%52 = OpFSub <4 x f32> FPFastMathMode(NotNaN | NotInf) %49, %50

The values are unique, so it could be written more compactly, without the FPFastMathMode keyword
%52 = OpFSub <4 x f32> NotNaN | NotInf %49, %50

But there may exist better ways of improving this...
• Pointers are heavily used in Kernels, so it makes sense to pretty-print them. But how should the storage class be handled?
• Should structures be pretty-printed?
• Are OpLoopMerge and OpSelectionMerge instructions necessary, or should the assembler insert them automatically when needed?
I need to look at lots of real-world shaders in order to get an idea of what makes sense, but that needs to wait for the SPIR-V specification to be released and shader compilers to become available. And I need to find relevant modern shaders to look at...
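To make the consistent-versus-compact question for mask decorations concrete, here is a tiny formatter that produces both spellings from the list-of-strings mask representation used in the API. `format_mask_decoration` is a hypothetical helper for this example, not part of the disassembler.

```python
def format_mask_decoration(name, mask, compact=False):
    """Format a mask decoration the consistent way or the compact way.

    `mask` is a list of enumeration names, e.g. ['NotNaN', 'NotInf'].
    """
    flags = ' | '.join(mask)
    if compact:
        return flags  # relies on the flag names being unique
    return '{}({})'.format(name, flags)


mask = ['NotNaN', 'NotInf']
for compact in (False, True):
    decoration = format_mask_decoration('FPFastMathMode', mask, compact)
    print('%52 = OpFSub <4 x f32> {} %49, %50'.format(decoration))
```

The compact form leaves open what to print for an empty mask, which is one argument for keeping the keyword.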

Plans

My short-term plan for the assembler/disassembler:
1. Implement floating-point literals
2. Document the assembler syntax