Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
|
|
Fixes: 7a30517cce8f ("broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT.")
|
|
I had a bit of it for V3D 3.x, but didn't update it for 4.x.
|
|
|
|
For shader image load/store, we want most of this logic to be shared.
|
|
Having "v3dx_pack() {" under each #if branch would confuse emacs's
indenter.
|
|
I think this bug predated adding v3d_layer_offset(). Noticed during an
unrelated refactor.
|
|
It's supposed to be the dispatched sample mask for this pixel, not the GL
state's sample mask.
|
|
If someone did TF into a UBO, we might have left the TF job un-flushed at
the point of reading.
|
|
This simplifies a bunch of our texture handling, while introducing the
slots necessary for adding new shader stages.
|
|
The default attributes are long-lived (the state struct is cached), and
only 256 bytes each.
|
|
Shaders are usually quite short, and are private to the context. We can
save memory and reduce the work the kernel needs to do at exec time by
packing them together in a stream uploader for long-lived state.
|
|
We were missing the invalidate between bin and render (possibly relevant
for SSBOs), and still trying to flush the nonexistent L2C on 3.3+.
|
|
This is a separate, dedicated hardware unit for texture layout conversions
and mipmap generation.
|
|
The TFU lets us format raster and SAND images into formats that can be
read by the texture engine, and do mipmap generation.
The UAPI comes from drm-next e69aa5f9b97f ("Merge tag
'drm-misc-next-2018-12-06' of git://anongit.freedesktop.org/drm/drm-misc
into drm-next")
|
|
The HW apparently has some issues (or at least a much more complicated VCM
calculation) with non-combined segments, and the closed source driver also
uses combined I/O. Until I get the last CTS failure resolved (which does
look plausibly like some VPM stomping), let's use combined I/O too.
|
|
We were exposing ARB_texture_float, but apparently not the OES subset
flag. Fixes regression from GLES3 support to GLES2.
Fixes: fcf9fcee3c8a ("mesa/main: do not require float-texture filtering
for es3")
|
|
This is the actual native format for the hardware, without swizzling.
Noticed while debugging why GLES3 disappeared.
|
|
I've been using this with the kmsro series to test v3d on VKMS without my
old KMS hack in the v3d kernel driver. KMSRO still needs some cleanup,
but v3d RO support seems reasonable.
|
|
Fixes: 4018eb04e8a5 ("v3d: Use the TLB R/B swapping instead of recompiles when available.")
|
|
Now that it doesn't need to find the struct v3d_bos, it can just take the
normal v3d_ioctl() path.
|
|
This way we don't need to reach back into the gallium driver code to get
the mapping.
|
|
The recompile reduction is nice, but this also makes it so that a straight
texture copy could get optimized some day to not unpack/repack the f16
values.
|
|
|
|
If the caller has passed in a stride for (linear) BO import, we should use
that stride when rendering to the BO (or, if we some day support texturing
from linear-imported BOs, when doing the linear-to-UIF shadow copy). This
lets us remove the extra stride-changing relayout in the simulator.
|
|
This came from vc4, where we had a file format for GPU hangs. I don't
have one of those for V3D, and I probably won't ever have the simulator
side produce dumps even if I do.
|
|
|
|
|
|
this is needed by u_debug
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
We were doing this late after nir_lower_io, but we can just reuse the core
code. By doing it at this stage, we won't even set up the VS attributes
as inputs, reducing our VPM size.
|
|
This lets us trim unused trailing components in the vertex attributes,
reducing the size of our VPM allocations.
|
|
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
|
|
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
|
|
Cuts the formerly 7-minute simulation time of fs-packHalf2x16.shader_test
in half.
|
|
This means that TTN shaders more closely resemble GTN shaders: they have
inputs and outputs as variable derefs, with the variables having their
.driver_location already set up for you.
This will be useful for v3d to do input variable DCE in NIR, which we
can't do when the TTN shaders never have a pre-nir_lower_io stage.
Acked-by: Rob Clark <robdclark@gmail.com>
|
|
u_transfer_helper already had code to handle treating packed Z32_S8
as separate Z32_FLOAT and S8_UINT resources, since some drivers can't
handle that interleaved format natively.
Other hardware needs depth and stencil as separate resources for all
formats. For example, V3D3 needs this for 24-bit depth as well.
This patch adds a new flag to lower all depth/stencils formats, and
implements support for Z24_UNORM_S8_UINT. (S8_UINT_Z24_UNORM is left
as an exercise to the reader, preferably someone who has access to a
machine that uses that format.)
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
The HW for FLUSH_ALL_STATE isn't validated, since the closed driver only
uses FLUSH. Now that we don't have any new state at the end of our bin
CLs, follow their lead.
|
|
Ever since we added OQ support, we've been clearing OQ state at the start
of the job anyway. We're intentionally breaking old-and-new-driver-mix
systems, because we need to stop using the unvalidated FLUSH_ALL_STATE.
|
|
The HW's FLUSH_ALL_STATE is not validated, so we probably shouldn't use
it, meaning that we need to reset state at the start. By doing this, we
also make ourselves more resilient to another client leaving the TF state
enabled at the end of their batch (as we now do, ourselves).
However, we still need to emit a single TF disable at the end of the
frame, for SWVC5-718.
|
|
There were two bugs working together to make things mostly work: I wasn't
dividing the VPM output size available by the size of a batch (vertex),
but I also had the size of the VPM reduced by a factor of 8.
Fixes dEQP-GLES3.functional.vertex_array_objects.all_attributes and it
seems also my intermittent varying failures.
Fixes: 1561e4984eb0 ("v3d: Emit the VCM_CACHE_SIZE packet.")
|
|
Fixes
dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.dst.src_alpha_saturate_src_alpha_saturate
and friends with --deqp-egl-config-name=rgb565d0s0
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
|
|
Now that we have the util function for the default values, we can get rid
of the boilerplate.
v2: Rebase on new gallium caps
|
|
One of the pains of implementing a gallium driver is filling in a million
pipe caps you don't know about yet when you're just starting out. One of
the pains of working on gallium is copy-and-pasting your new PIPE_CAP into
each driver. We can fix both of these by having each driver call into the
default helper from their default case, so that both sides can ignore each
other until they need to.
v2: fix i915g build, revert swr change to avoid breaking scons build
(https://travis-ci.org/anholt/mesa/jobs/419739857)
v3: Rebase on 3 new gallium caps.
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
|
|
Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not
PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER.
Drivers for such hardware would like to advertise support for
ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp.
This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit,
changes the extension enable to be based on that, and enables it
in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP
(so they continue supporting this mode).
|
|
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
|
|
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
|
|
This is needed to ensure that we don't get blocked waiting for VPM space
with bin/render overlapping.
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
|
|
VC5 isn't a useful name any more, just stick to v3d.
|
|
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
|