Age | Commit message (Collapse) | Author | Files | Lines |
|
A few thoughts:
- Some of that LegalizeSSA logic should really live much earlier and be
subject to the likes of DCE and other useful passes
- Some of the "lowering" done in from_tgsi should be done later so that
proper optimization might be done.
However this all works and the above can be improved upon later.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
|
|
PFETCH, actually ISBERD on GM107+ ISA only accepts a GPR for src0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
If we have a loop, instructions before the tex might be added as tex
uses, and those may in fact dominate all other uses of the tex results.
This however doesn't mean that we don't need a texbar after the tex.
Only check if uses dominate each other they are dominated by the tex.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96565
Fixes: 7752bbc44 (gk104/ir: simplify and fool-proof texbar algorithm)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
The SAMPLEMASK semantic should only return the bits set covered by the
current invocation. However we were always retrieving the covmask, which
returns the covered samples of the whole pixel.
When not doing per-sample invocation, this is precisely what we want.
However when doing per-sample invocation, we have to select the
sampleid'th bit and only return that. Furthermore, this means that we
have to have a 1:1 correlation for invocations and samples.
This fixes most
dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.*
tests. A few failures remain due to disagreements about nr_samples==1
logic as well as what happens with MSAA x2 RTs when the shading fraction
is 0.5.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
This will allow to convert surface formats without adding an extra
call to our lib.
[hakzsam: make use of this for GK104]
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
This implements RESQ for surfaces which comes from imageSize() GLSL
bultin. As the dimensions are sticked into the driver constant buffer,
this only has to be lowered with loads.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v2)
|
|
TGSI RESQ allows both images and buffers but we have to make a
distinction between these two type of resources in our lowering pass.
Introducing OP_BUFQ which is a fake operand will allow to implement
OP_SUQ for surfaces.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Make sure to avoid out of bounds access in presence of indirect
array indexing by loading the size from the driver constant buffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Having all this code in a big switch is not really a good pratice.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
To not overwrite buffers and surfaces information, we need to use
a different offset in the driver constant buffer. Currently, OP_SUQ
is only supported for buffers but this will be slightly updated for
images support.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Changes from v3:
- move the previous OP_SELP change to the previous commit
Changes from v2:
- make sure the op is OP_SELP when emitting the predicate and add one
assert
- use bld.getSSA() for mkOp2()
- add cross edge between tryLockAndSetBB and joinBB
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
This largely leaves the existing image logic alone. When image support
is added this will have to be harmonized somehow.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
With the current algorithm, we only look at tex uses. However there's a
write-after-write hazard where we might decide to, on some path, not use
a texture's output at all, but instead to write a different value to
that register. However without the barrier, the texture might complete
later and overwrite that value.
This fixes Unreal Elemental demo on GK110/GK208, flightgear on GK10x,
and likely other random-looking failures.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
|
|
If build with C++11 standard, use std::unordered_set.
Otherwise if build on old Android version with stlport,
use std::tr1::unordered_set with a wrapper class.
Otherwise use std::tr1::unordered_set.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
There's a lot of functionality duplicated in the gm107 lowering pass
from the nvc0 pass. As that one gets updated, the gm107 one falls
behind. Avoid this by sharing the code.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
This will set the FTZ flag (flush denorms to zero) on all opcodes that
can take it.
This resolves issues in Unigine Heaven 4.0 where there were solid-filled
boxes popping up.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89455
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
In certain circumstances, findFirstUses could end up doubling back on
instructions it had already processed, resulting in an infinite
recursion. Avoid this by keeping track of already-visited instructions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
|
|
The big missing part here is proper sched data calculations, but
hopefully the chosen placeholder will be sufficient for now.
Passes piglit as well as GK107 does.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|