path: root/src/gallium/drivers/v3d/v3d_program.c
Age | Commit message | Author | Files | Lines
2020-11-03 | broadcom/compiler: remove v3d_fs_key depth_enabled field. | Alejandro Piñeiro | 1 | -2/+0
It is not used right now, so keeping it adds some noise/confusion. So far, configuring the Z test is done through the CFG_BITS. See v3dX(emit_state) at v3dx_emit.c for v3d, and pack_cfg_bits at v3dv_pipeline.c for v3dv. There, flags like z_updates_enable and others are filled in. That key field seems like a leftover from using vc4 as a reference, as that driver defines and uses a field with the same name. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7421>
2020-10-13 | v3d/compiler: num_tex_used on v3d_key | Alejandro Piñeiro | 1 | -0/+2
On OpenGL we would need to update values for all the textures used. On OpenGL that value can always be taken from the context or the NIR shader, but on Vulkan there are cases where that is not possible, or where it would force us to recompute it. Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-07-29 | nir: Add nir_foreach_shader_in/out_variable helpers | Jason Ekstrand | 1 | -3/+3
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
2020-07-06 | v3d: Add a lowering pass for line smoothing | Neil Roberts | 1 | -0/+2
When line smoothing is enabled, the driver now increases the width of the line so that it can add some semi-transparent pixels to either side of the line. A lowering pass is added which modifies the alpha component of every write to fragment output 0 so that if the fragment is outside the width of the line then the alpha is reduced. It additionally discards fragments that are completely invisible. It might seem bad to use discard on a tiled renderer but the assumption is that any bad effects from using discard will also happen anyway because of enabling alpha blending.

v2: Disable the line smoothing pass entirely when the framebuffer contains an integer colour output or one with no alpha channel. Calculate the coverage once upfront and store in a global variable instead of calculating each time an output write is modified. Also do the conditional discard once upfront.

v3: Don’t check whether the output buffer has an alpha channel. Only look at output 0. Use aa_line_width intrinsic instead of calculating the real line width in the shader. Clamp the coverage as part of the global variable, not per output write.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>
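The per-fragment effect described in this commit message can be sketched in plain C. This is only an illustration, not the NIR pass itself: the names are hypothetical, the coverage value is taken as an input because the log does not say how it is derived from the line geometry, and multiplying alpha by the clamped coverage is an assumption about what "reduced" means.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of smoothing one fragment of a line. */
struct sketch_frag {
        float alpha;     /* alpha written to fragment output 0 */
        bool discarded;  /* true if the fragment is completely invisible */
};

static float
sketch_clampf(float v, float lo, float hi)
{
        return v < lo ? lo : (v > hi ? hi : v);
}

/* The coverage is clamped once, then used both for the discard decision
 * and to scale alpha (the scaling rule is an assumption).
 */
static struct sketch_frag
sketch_smooth_line(float alpha, float coverage)
{
        /* The sketch assumes a normalized alpha. */
        assert(alpha >= 0.0f && alpha <= 1.0f);

        struct sketch_frag f;
        float c = sketch_clampf(coverage, 0.0f, 1.0f);

        f.discarded = (c <= 0.0f);
        f.alpha = alpha * c;
        return f;
}
```

For example, a fragment with coverage 0.5 keeps half of its alpha, while one with coverage at or below zero is flagged for discard.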
2020-07-06 | v3d: Only call nir_lower_io on shader_in/out | Jason Ekstrand | 1 | -8/+6
Gallium drivers should never see nir_var_uniform because gallium lowers regular uniforms to a UBO. No GL driver should ever see nir_var_mem_shared either, because that's lowered in GLSL IR. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
2020-06-03 | nir: add callback to nir_remove_dead_variables() | Timothy Arceri | 1 | -1/+1
This allows us to do API specific checks before removing a variable without filling nir_remove_dead_variables() with API specific code. In the following patches we will use this to support the removal of dead uniforms in GLSL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>
2020-05-13 | ttn: Add new allow_disk_cache parameter | Axel Davy | 1 | -1/+1
For now this parameter doesn't do anything. It means the implementation is allowed to use a cache on disk. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4993>
2019-12-16 | v3d: support precompiling geometry shaders | Iago Toral Quiroga | 1 | -16/+48
At present, this is only relevant for shader-db. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 | v3d: add initial compiler plumbing for geometry shaders | Iago Toral Quiroga | 1 | -19/+150
Most of the relevant work happens in v3d_nir_lower_io. Since geometry shaders can write any number of output vertices, this pass injects a few variables into the shader code to keep track of things like the number of vertices emitted or the offsets into the VPM of the current vertex output, etc. This is also where we handle EmitVertex() and EndPrimitive() intrinsics.

The geometry shader VPM output layout has a specific structure with a 32-bit general header, then another 32-bit header slot for each output vertex, and finally the actual vertex data.

When vertex shaders are paired with geometry shaders we also need to consider the following:
- Only geometry shaders emit fixed function outputs.
- The coordinate shader used for the vertex stage during binning must not drop varyings other than those used by transform feedback, since these may be read by the binning GS.

v2:
- Use MAX3 instead of a chain of MAX2 (Alejandro).
- Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro)
- Update comment in IO lowering so it includes the GS stage (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
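The VPM output layout described above (one general header slot, one header slot per output vertex, then the vertex data) can be illustrated with a small offset calculation. This is a sketch under stated assumptions: all names are hypothetical, offsets are counted in 32-bit slots, and each vertex is assumed to occupy a fixed number of contiguous data slots, which the log does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Slot 0 holds the 32-bit general header (hypothetical names throughout). */
#define SKETCH_GS_GENERAL_HEADER_SLOT 0u

/* Slot of the 32-bit header for output vertex `vertex`. */
static uint32_t
sketch_gs_vertex_header_slot(uint32_t max_vertices, uint32_t vertex)
{
        assert(vertex < max_vertices);
        return SKETCH_GS_GENERAL_HEADER_SLOT + 1 + vertex;
}

/* First data slot of output vertex `vertex`. The data region starts after
 * the general header and all per-vertex headers. A fixed slots_per_vertex
 * is an assumption made for this sketch.
 */
static uint32_t
sketch_gs_vertex_data_slot(uint32_t max_vertices, uint32_t slots_per_vertex,
                           uint32_t vertex)
{
        assert(vertex < max_vertices);
        uint32_t data_start = SKETCH_GS_GENERAL_HEADER_SLOT + 1 + max_vertices;
        return data_start + vertex * slots_per_vertex;
}
```

With 4 output vertices of 8 slots each, the per-vertex headers would sit in slots 1 to 4 and the data for vertex 2 would begin at slot 21.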
2019-11-14 | util: Move gallium's PIPE_FORMAT utils to /util/format/ | Eric Anholt | 1 | -1/+1
To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to move their helpers out of gallium. Since u_format used util_copy_rect(), I moved that in there, too. I've put it in a separate directory in util/ because it's a big chunk of related code, and it's not clear to me whether we might want it as a separate library from libmesa_util at some point. Closes: #1905 Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-31 | v3d: rename vertex shader key (num)_fs_inputs fields | Iago Toral Quiroga | 1 | -13/+13
Until now this made sense because we always paired vertex shaders with fragment shaders, but as soon as we implement geometry and tessellation shaders that will no longer be the case, so rename this to (num_)used_outputs. v2: Use 'used_outputs' instead of ns_outputs, which is more explicit (Eric). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-12 | v3d: flag dirty state when binding compute states | Jose Maria Casanova Crespo | 1 | -3/+3
As introduced in "v3d: flag dirty state when binding new sampler states" we need to add support for compute states. New flags VC5_DIRTY_COMPTEX and VC5_DIRTY_UNCOMPILED_CS are introduced. Reaching 33 flags in the dirty field forces us to change the type to uint64_t. Flags are reordered and empty contiguous bits are available for future pipeline stages. v2: Update flag conditions to compile cs shader. (Eric Anholt) Now dirty flags use uint64_t and flags are reordered. Added VC5_DIRTY_UNCOMPILED_CS flag. Reviewed-by: Eric Anholt <eric@anholt.net>
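The reason a 33rd flag forces a wider type can be shown with a minimal sketch. The flag names and bit positions below are hypothetical stand-ins, since the real VC5_DIRTY_* values are not shown in this log; the point is only that bit 32 does not fit in a 32-bit field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag helpers; built with a 64-bit constant so that
 * shifting by 32 or more is well defined.
 */
#define SKETCH_DIRTY_BIT(n)      (UINT64_C(1) << (n))
#define SKETCH_DIRTY_FIRST_FLAG  SKETCH_DIRTY_BIT(0)
#define SKETCH_DIRTY_33RD_FLAG   SKETCH_DIRTY_BIT(32)

struct sketch_context {
        uint64_t dirty; /* must be 64-bit once there are more than 32 flags */
};

static void
sketch_mark_dirty(struct sketch_context *ctx, uint64_t flags)
{
        assert(flags != 0);
        ctx->dirty |= flags;
}

static bool
sketch_is_dirty(const struct sketch_context *ctx, uint64_t flags)
{
        return (ctx->dirty & flags) != 0;
}
```

Truncating the field to 32 bits would silently drop the 33rd flag, which is why the field type had to change together with the new flags.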
2019-07-22 | v3d: fill logicop_func in the fragment shader key when precompiling shaders | Iago Toral Quiroga | 1 | -0/+2
Since logicop_func 0 is PIPE_LOGICOP_CLEAR, we were triggering lowering of logic ops on precompiled shaders, which we don't want to do. Also, this had the side effect of making shader-db crash, as during this lowering we would try to read the color format swizzle information from the fragment shader key that we don't populate in precompiled shaders because right now we only need it when logic operations are enabled. Reviewed-by: Eric Anholt <eric@anholt.net>
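The pitfall here is that a zero-initialized key does not mean "no logic op". A minimal sketch follows, with a hypothetical enum and key struct standing in for the real ones; it assumes that "copy" is the pass-through function and that lowering is triggered for any other value, which the log implies but does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the logic op enum. Only "CLEAR is 0" comes
 * from the commit message; the value of COPY here is illustrative.
 */
enum sketch_logicop {
        SKETCH_LOGICOP_CLEAR = 0,
        SKETCH_LOGICOP_COPY,
};

struct sketch_fs_key {
        enum sketch_logicop logicop_func;
};

/* Assumption: anything other than COPY needs the lowering pass. */
static bool
sketch_needs_logicop_lowering(const struct sketch_fs_key *key)
{
        assert(key != NULL);
        return key->logicop_func != SKETCH_LOGICOP_COPY;
}

/* Precompile keys start zeroed, so the pass-through op is set explicitly. */
static void
sketch_init_precompile_key(struct sketch_fs_key *key)
{
        assert(key != NULL);
        memset(key, 0, sizeof(*key));
        key->logicop_func = SKETCH_LOGICOP_COPY;
}
```

A key that is only memset to zero reports that lowering is needed, while one prepared with sketch_init_precompile_key() does not.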
2019-07-12 | v3d: add color formats and swizzles to the fragment shader key | Iago Toral Quiroga | 1 | -0/+11
We are going to need these very soon to emit correct reads from the tlb to implement logic operations. Reviewed-by: Eric Anholt <eric@anholt.net>
2019-04-12 | v3d: Detect the correct number of QPUs and use it to fix the spill size. | Eric Anholt | 1 | -4/+6
We were missing a * 4 even if the particular hardware matched our assumption.
2019-04-12 | v3d: Add Compute Shader compilation support. | Eric Anholt | 1 | -70/+133
While waiting for the CSD UABI to get reviewed, I keep having to rebase the CS patch. Just land the compiler side for now to keep it from diverging. For now this covers just GLES 3.1 compute shaders, not CL kernels.
2019-04-12 | nir/i965/freedreno/vc4: add a bindless bool to type size functions | Timothy Arceri | 1 | -1/+1
This is required to calculate sizes correctly when we have bindless samplers/images. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-10 | st: Lower uniforms in st in the !PIPE_CAP_PACKED_UNIFORMS case as well. | Eric Anholt | 1 | -11/+0
PIPE_CAP_PACKED_UNIFORMS conflates several things: Lowering uniforms i/o at the st level instead of the backend, packing uniforms with no padding at all, and lowering to UBOs. Requiring backends to lower uniforms i/o for !PIPE_CAP_PACKED_UNIFORMS leads to the driver needing to either link against the type size function in mesa/st, or duplicating it in the backend. Given that all backends want this lower-io as far as I can tell, just move it to mesa/st to resolve the link issue and avoid the driver author needing to understand st's uniforms layout.

Incidentally, fixes uniform layout failures in nouveau in:
dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment
dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex
dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_fragment
dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_vertex
and I think in Lima as well.

v2: fix indents

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-09 | nir: Get rid of global registers | Jason Ekstrand | 1 | -1/+0
We have a pass to lower global registers to locals and many drivers dutifully call it. However, no one ever creates a global register ever so it's all dead code. It's time we bury it. Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-05 | tgsi_to_nir: Produce optimized NIR for a given pipe_screen. | Timur Kristóf | 1 | -1/+1
With this patch, tgsi_to_nir will output NIR that is tailored to the given pipe, by reading its capabilities and adjusting the NIR code to those capabilities similarly to how glsl_to_nir works. It also adds an optimization loop that brings the output NIR in line with what glsl_to_nir outputs. This is necessary for the same reason why glsl_to_nir has its own optimization loop: currently not every driver does these optimizations yet. For uses which cannot pass a pipe_screen we also keep a variant called tgsi_to_nir_noscreen which keeps the old behavior. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Acked-By: Eric Anholt <eric@anholt.net>
2019-02-18 | v3d: Stop tracking num_inputs for VPM loads. | Eric Anholt | 1 | -1/+1
It's unused in the VS (since we need vattr_sizes[] anyway), so move it to FS prog data.
2019-02-05 | v3d: Store the actual mask of color buffers present in the key. | Eric Anholt | 1 | -9/+10
If you only bound rt 1+, we'd still emit a write to the rt0 that isn't present (noticed while debugging an ext_framebuffer_multisample-alpha-to-coverage-no-draw-buffer-zero regression in another change).
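The idea of keying on the color buffers that are actually present, rather than just a count, can be sketched as follows. The function name, the limit of 4 render targets, and the use of NULL to mark an unbound target are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit on simultaneously bound render targets. */
#define SKETCH_MAX_DRAW_BUFFERS 4

/* Build a bitmask with bit i set only when render target i is bound.
 * An unbound target is represented by a NULL pointer in this sketch.
 */
static uint8_t
sketch_color_buffer_mask(void *const cbufs[], unsigned nr_cbufs)
{
        assert(nr_cbufs <= SKETCH_MAX_DRAW_BUFFERS);

        uint8_t mask = 0;
        for (unsigned i = 0; i < nr_cbufs; i++) {
                if (cbufs[i] != NULL)
                        mask |= (uint8_t)(1u << i);
        }
        return mask;
}
```

With only rt 1 bound, the mask is 0x2, so a compiler consulting the mask would not emit a write to the missing rt0.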
2019-02-05 | v3d: Fix precompile of FRAG_RESULT_DATA1 and higher outputs. | Eric Anholt | 1 | -1/+1
I was just leaving the other MRT targets than DATA0 out, by accident.
2019-02-05 | nir: Move V3D's "the shader was TGSI, ignore FS output types" flag to NIR. | Eric Anholt | 1 | -3/+2
Ken's rework of mesa/st builtins to NIR means that we'll have more NIR shaders with color output types that are mismatched with the render target types. Since this is behavior that GLSL doesn't require, add it as a shader_info option so the driver can know that it needs to ignore the FS output's base type in favor of the actual render target's. This prevents needing additional variants in several mesa/st paths (clear, pbo upload, pbo download), given that the driver already has to handle the variants for any TGSI being passed to it (from u_blitter, for example). Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-27 | v3d: Rename gallium-local limits defines from VC5 to V3D. | Eric Anholt | 1 | -1/+1
The compiler has its limits under V3D_* (like most V3D stuff), so sync up with that.
2019-01-19 | nir: rename nir_var_function to nir_var_function_temp | Karol Herbst | 1 | -1/+1
Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-08 | nir: rename global/local to private/function memory | Karol Herbst | 1 | -1/+1
The naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all threads of a work group, the solution everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup. v2: rename local to function as well v3: rename vtn_variable_mode_local as well Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-04 | v3d: Fix up VS output setup during precompiles. | Eric Anholt | 1 | -6/+10
I noticed that a VS I was debugging was missing all of its output stores -- outputs_written was for POS, VAR0, VAR3, while the shader's variables were POS, VAR9, and VAR12. I'm not sure what outputs_written is supposed to be doing here, but we can just walk the declared variables and avoid both this bug and the emission of extra stvpms for less-than-vec4 varyings.
2019-01-02 | v3d: Refactor compiler entrypoints. | Eric Anholt | 1 | -26/+6
Before, I had per-stage entrypoints with some helpers shared between them. As I extended for compute shaders and shader-db, it turned out that the other common code in the middle wanted to be shared too.
2019-01-02 | v3d: Don't forget to include RT writes in precompiles. | Eric Anholt | 1 | -0/+10
Looking at some assembly dumps for an optimization, we were clearly missing important parts of the shader!
2019-01-02 | v3d: Fix segfault when failing to compile a program. | Eric Anholt | 1 | -2/+4
We'll still fail at draw time, but this avoids a regression in shader-db execution once I enable TLB writes in precompiles. Fixes: b38e4d313fc2 ("v3d: Create a state uploader for packing our shaders together.")
2018-12-30 | v3d: Hook up some shader-db output to GL_ARB_debug_output. | Eric Anholt | 1 | -0/+12
This allows the original shader-db project's run.c runner to parse things easily, and is probably a good thing to have for GL_ARB_debug_output in general. I formatted it more like Intel's so I can mostly reuse their report script.
2018-12-29 | v3d: Add a "precompile" debug flag for shader-db. | Eric Anholt | 1 | -0/+76
I've been using my apitrace-based shader-db so far, but it's slow (apitrace decompression), intrusive (apitrace windows spamming the screen), and doesn't have much coverage. The original shader-db provides a lot more coverage and compiles faster, at the expense of not having the actual runtime variant key. As v3d has a lot less runtime variation than vc4 did, this tradeoff makes more sense.
2018-12-20 | v3d: Drop shadow comparison state from shader variant key. | Eric Anholt | 1 | -2/+0
The shadow state is now in the sampler.
2018-12-07 | v3d: Make an array for frag/vert texture state in the context. | Eric Anholt | 1 | -2/+2
This simplifies a bunch of our texture handling, while introducing the slots necessary for adding new shader stages.
2018-12-07 | v3d: Create a state uploader for packing our shaders together. | Eric Anholt | 1 | -9/+13
Shaders are usually quite short, and are private to the context. We can save memory and reduce the work the kernel needs to do at exec time by packing them together in a stream uploader for long-lived state.
2018-10-30 | v3d: Use nir_remove_unused_io_vars to handle binner shader output DCE | Eric Anholt | 1 | -1/+1
We were doing this late after nir_lower_io, but we can just reuse the core code. By doing it at this stage, we won't even set up the VS attributes as inputs, reducing our VPM size.
2018-10-30 | v3d: Use nir_lower_io_to_scalar_early to DCE unused VS input components. | Eric Anholt | 1 | -1/+4
This lets us trim unused trailing components in the vertex attributes, reducing the size of our VPM allocations.
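Trimming only the trailing unused components means the size of an attribute is determined by its highest component that is still read. A rough sketch of that rule is below; the function name and the 4-bit xyzw mask encoding are hypothetical, not taken from the driver.

```c
#include <assert.h>

/* Given a mask of the components a shader still reads (bit 0 = x,
 * bit 3 = w), return how many components must be fetched. Only trailing
 * components are trimmed, so an unused component below a used one still
 * counts towards the size.
 */
static unsigned
sketch_vattr_size(unsigned used_components_mask)
{
        assert(used_components_mask <= 0xf);

        unsigned size = 0;
        while (used_components_mask) {
                size++;
                used_components_mask >>= 1;
        }
        return size;
}
```

A shader reading only x and z (mask 0x5) would still need three components, while one reading nothing needs none.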
2018-10-25 | util: use C99 declaration in the for-loop hash_table_foreach() macro | Eric Engestrom | 1 | -2/+0
Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-15 | gallium/ttn: Convert inputs and outputs to derefs of variables. | Eric Anholt | 1 | -3/+4
This means that TTN shaders more closely resemble GTN shaders: they have inputs and outputs as variable derefs, with the variables having their .driver_location already set up for you. This will be useful for v3d to do input variable DCE in NIR, which we can't do when the TTN shaders never have a pre-nir_lower_io stage. Acked-by: Rob Clark <robdclark@gmail.com>
2018-07-09 | v3d: Implement noperspective varyings on V3D 4.x. | Eric Anholt | 1 | -0/+5
Fixes a bunch of piglit interpolation tests, and reduces my concern about some MSAA blit shaders with noperspective varyings.
2018-07-05 | v3d: Fix leak of the spill BO on context destruction. | Eric Anholt | 1 | -0/+2
2018-07-05 | v3d: Add proper support for GL_EXT_draw_buffers2's blending enables. | Eric Anholt | 1 | -4/+4
I had flagged it as enabled on V3D 4.x, but not actually implemented the per-RT enables. Fixes piglit fbo_drawbuffers2-blend.
2018-06-27 | v3d: Convert a bunch of our "minus one" fields over to the new XML attr. | Eric Anholt | 1 | -1/+1
This fixes up their formatting for CLIF files and makes the code more legible.
2018-06-22 | st,ir3,radeonsi: push lower_deref_instrs back into driver | Rob Clark | 1 | -1/+0
vc4+vc5 is not really affected by the deref chain to deref instr conversion, so it no longer needs this pass. For others, now that all the passes mesa/st uses are using deref instructions, push the lowering to deref chains back into driver. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-22 | anv,i965,radv,st,ir3: Call nir_lower_deref_instrs | Jason Ekstrand | 1 | -0/+1
This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-18 | v3d: Set the SO offsets correctly if we have to re-emit. | Eric Anholt | 1 | -0/+2
This should fix TF across a glFlush() or TF pause/restart. Fixes dEQP-GLES3.functional.transform_feedback.array.interleaved.lines.highp_float and many, many others.
2018-05-17 | v3d: Add support for glSampleMask / glSampleCoverage. | Eric Anholt | 1 | -1/+1
2018-05-16 | v3d: Rename driver functions from vc5 to v3d. | Eric Anholt | 1 | -132/+132
This is the final step of the driver rename.
2018-05-16 | v3d: Rename the driver files from "vc5" to "v3d". | Eric Anholt | 1 | -0/+682