Age | Commit message (Collapse) | Author | Files | Lines |
|
This isn't as complete as I would like (can't merge interpolation because
of the implicit r5 dependency, doesn't work with control flow), but this
was cheap and easy.
Improves 3DMMES Taiji performance by 1.15353% +/- 0.299896% (n=29, 16)
total instructions in shared programs: 99810 -> 99059 (-0.75%)
instructions in affected programs: 10705 -> 9954 (-7.02%)
|
|
Every caller was dereffing the qinst, and this will let us make the number
of sources vary depending on the destination of the qinst so that we can
have general ALU ops that store to tex_[strb] and get an implicit uniform.
|
|
If a block might be entered from multiple locations, then the uniform
stream will (probably) be at different points, and we need to make sure
that it's pointing where we expect it to be. The kernel also enforces
that any block reading a uniform resets uniforms, to prevent reading
outside of the uniform stream by using looping.
|