ISASPEC - XML Based ISA Specification
=====================================

isaspec provides a mechanism to describe an instruction set in XML, and
generate a disassembler and assembler from that description.  The intention
is to describe the instruction set more formally than a hand-coded assembler
and disassembler, and to better decouple the shader compiler from the
underlying instruction encoding, which simplifies dealing with instruction
encoding differences between generations of GPU.

Benefits of a formal ISA description, compared to hand-coded assemblers and
disassemblers, include easier detection of new bit combinations that were not
seen in previous generations, thanks to a more rigorous description of bits
that are expected to be '0' or '1' or 'x' (dontcare), and verification that
different encodings don't have conflicting bits (i.e. that the specification
cannot result in more than one valid interpretation of any bit pattern).

The isaspec tool and XML schema are intended to be generic (not specific to
ir3), although there are currently a couple of limitations due to shortcuts
taken to get things up and running (which are mostly not inherent to the XML
schema, and should not be too difficult to remove from the py and
decode/disasm utility):

* Maximum "field" size is 64b
* Fixed instruction size

Oftentimes, especially when new functionality is added in later gens while
retaining (or at least mostly retaining) backwards compatibility with
encodings used in earlier generations, the actual encoding can be rather messy
to describe.  To support this, isaspec provides many flexible mechanisms, such
as conditional overrides and derived fields.  This not only allows for
describing an irregular instruction encoding, but also allows matching an
existing disasm syntax (which might not have been designed around the idea of
disassembly based on a formal ISA description).
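
The conflict check mentioned earlier (no bit pattern may have more than one
valid interpretation) can be illustrated with a minimal C sketch.  It is not
taken from isaspec: the helper name ``encodings_overlap`` and the
representation of each encoding as a string of '0', '1' and 'x' characters are
assumptions made purely for illustration.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <string.h>

   /*
    * Hypothetical helper, not part of isaspec.  Each encoding is written as
    * a string of '0', '1' and 'x' characters, most significant bit first.
    * Two encodings are ambiguous if at least one bit pattern matches both,
    * which is the case when no position has '0' in one and '1' in the other.
    */
   static bool
   encodings_overlap(const char *a, const char *b)
   {
      size_t n = strlen(a);

      /* Fixed instruction size: both descriptions must have the same width. */
      assert(n == strlen(b));

      for (size_t i = 0; i < n; i++) {
         bool a_fixed = (a[i] == '0' || a[i] == '1');
         bool b_fixed = (b[i] == '0' || b[i] == '1');

         /* A disagreement on a fixed bit rules out any common pattern. */
         if (a_fixed && b_fixed && a[i] != b[i])
            return false;
      }

      return true;
   }

Under these assumptions, ``encodings_overlap("01x", "0x1")`` returns true
because the pattern 011 satisfies both descriptions, while
``encodings_overlap("010", "011")`` returns false because the last bit differs.
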

Bitsets
-------

The fundamental concept of matching a bit-pattern to an instruction
decoding/encoding is the concept of a hierarchical tree of bitsets.  This is
intended to match how the HW decodes instructions, where certain bits describe
the instruction (and sub-encoding, and so on), and other bits describe various
operands to the instruction.

Bitsets can also be used recursively as the type of a field described in
another bitset.

The leaves of the tree of instruction bitsets represent every possible
instruction.  Deciding which instruction a bitpattern is amounts to:

.. code-block:: c

   /* keep only the bits covered by a pattern, ignoring the dontcare bits */
   m = (val & bitsets[n]->mask) & ~bitsets[n]->dontcare;
   if (m == bitsets[n]->match) {
      /* we've found the instruction description */
   }

For example, the starting point to decode an ir3 instruction is a 64b bitset:

.. code-block:: xml

   Encoding of an ir3 instruction.  All instructions are 64b.

In the first level of instruction encoding hierarchy, the high three bits
group things into instruction "categories":

.. code-block:: xml

   Destination register is opposite precision as source, i.e.
   if {FULL} is true then destination is half precision, and
   vice versa.

   Full precision source registers

   010

The ``<pattern>`` elements are the part(s) that determine which leaf-node
bitset matches against a given bit pattern.  The leaf node's
match/mask/dontcare bitmasks are a combination of those defined at the leaf
node and recursively each parent bitclass.

For example, cat2 instructions (ALU instructions with up to two src registers)
can have either one or two source registers:

.. code-block:: xml

   {SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}

   {SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}

   xxxxxxxxxxxxxxxx
   xxx

   000110

In this example, ``absneg.f`` is a concrete cat2 instruction (leaf node of the
bitset inheritance tree) which has a single src register.
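
To make the combination of match/mask/dontcare values across the hierarchy
more concrete, the following is a minimal sketch in C.  The
``struct bitset_node`` layout, the helper names, the toy 8-bit patterns, and
the assumption that each level's values are simply OR-ed together are
illustrative only and are not taken from the code isaspec generates.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical, simplified node of the bitset tree (not isaspec's type). */
   struct bitset_node {
      const struct bitset_node *parent; /* NULL at the root */
      uint64_t match;    /* bits that must be '1' */
      uint64_t dontcare; /* bits described as 'x' */
      uint64_t mask;     /* all bits covered by a pattern: '0', '1' or 'x' */
   };

   /* Assumption: a leaf's effective values are the OR of every level's. */
   static struct bitset_node
   combined_pattern(const struct bitset_node *leaf)
   {
      struct bitset_node p = { .parent = NULL };

      for (const struct bitset_node *n = leaf; n != NULL; n = n->parent) {
         p.match |= n->match;
         p.dontcare |= n->dontcare;
         p.mask |= n->mask;
      }

      /* Sanity check: required and dontcare bits must lie inside the mask. */
      assert(((p.match | p.dontcare) & ~p.mask) == 0);

      return p;
   }

   /* Same comparison as the decode snippet shown earlier in this section,
    * applied to the combined values. */
   static bool
   bitset_matches(const struct bitset_node *leaf, uint64_t val)
   {
      struct bitset_node p = combined_pattern(leaf);
      uint64_t m = (val & p.mask) & ~p.dontcare;

      return m == p.match;
   }

   /* Toy 8-bit hierarchy: the category fixes bits 7..5 to "010",
    * the leaf adds the pattern "x1" on bits 1..0. */
   static const struct bitset_node toy_root = { .parent = NULL };
   static const struct bitset_node toy_cat = {
      .parent = &toy_root, .match = 0x40, .mask = 0xe0,
   };
   static const struct bitset_node toy_leaf = {
      .parent = &toy_cat, .match = 0x01, .dontcare = 0x02, .mask = 0x03,
   };

With these toy values, ``0x5d`` and ``0x5f`` both match ``toy_leaf`` (they
differ only in the dontcare bit), whereas ``0x3d`` fails on the category bits.
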
At the ``#instruction-cat2-1src`` level, bits that are used for the 2nd src arg
and condition code (for cat2 instructions which use a condition code) are
defined as 'x' (dontcare), which matches our understanding of the hardware
(but also lets the disassembler flag cases where '1' bits show up in places
we don't expect, which may signal a new instruction (sub)encoding).

You'll notice that ``SRC1`` refers back to a different bitset hierarchy that
describes various different src register encodings (used for cat2 and cat4
instructions), i.e. GPR vs CONST vs relative GPR/CONST.  For fields which have
bitset types, parameters can be "passed" in via ``<param>`` elements, which
can be referred to by the display template string, and/or expressions.  For
example, this helps to deal with cases where other fields outside of that
bitset control the encoding/decoding, such as in the ``#multisrc`` example:

.. code-block:: xml

   Encoding for instruction source which can be GPR/CONST/IMMED
   or relative GPR/CONST.

   ...

   {ABSNEG}{SRC_R}{HALF}{SRC}

   000000

At some level in the bitset inheritance hierarchy, there is expected to be a
``<display>`` element specifying a template string used during bitset
decoding.  The display template consists of references to fields (which may
be derived fields) specified as ``{FIELDNAME}`` and other characters
which are just echoed through to the resulting decoded bitset.

The special field reference ``{NAME}`` prints the name of the bitset.  This is
often useful when the ``<display>`` element is at a higher level than the
leaves of the hierarchy, for example a whole class of similar instructions that
only differ in opcode.

Sometimes there may be multiple variants of an instruction that must be
different bitsets, for example because they are so different that they must
derive from different bitsets, but they have the same name.
Because bitset names must be unique in the encoder, this can be a problem,
but it can be worked around with the ``displayname`` attribute on the
``bitset``, which changes how ``{NAME}`` is displayed but not the name used in
the encoder.  ``displayname`` is only useful for leaf bitsets.

It is possible to define a line column alignment value per field to influence
the visual output.  It needs to be specified as ``{FIELDNAME:align=xx}``.

The ``<override>`` element will be described in the next section, but it
provides for both different decoded instruction syntax/mnemonics (when
simply providing a different display template string) as well as instruction
encoding where different ranges of bits have a different meaning based on some
other bitfield (or combination of bitfields).  In this example it is used to
cover the cases where ``SRCn_R`` has a different meaning and a different
disassembly syntax depending on whether ``REPEAT`` equals zero.

The ``