path: root/src/gallium/drivers/vc4
Age | Commit message | Author | Files | Lines
2018-07-27v3d: Pass the whole clif_dump structure to v3d_print_group().Eric Anholt1-1/+6
To generate CLIF files that the v3dv3 simulator can parse, we're going to need to decode addresses, and for that we'll need the vaddr lookup function from the clif structure from within v3d_decoder.
2018-07-13vc4: Tell NIR to lower fdiv instructionsJason Ekstrand1-0/+1
This should allow us to use them in nir_lower_tex Reviewed-by: Eric Anholt <>
2018-07-13vc4: Switch to using u_transfer_helper for MSAA maps.Eric Anholt2-100/+16
No requirement, just reduces code duplication.
2018-07-12vc4: Don't automatically reallocate a PERSISTENT-mapped buffer.Eric Anholt1-1/+1
I had mistakenly used the COHERENT flag, which can only be set when PERSISTENT is mapped, but isn't always. Fixes: a2014c2eb9e0 ("vc4: Simplify the DISCARD_RANGE handling")
2018-06-29gallium/util: remove dummy function util_format_is_supportedMarek Olšák1-2/+1
Reviewed-by: Eric Engestrom <>
2018-06-22broadcom/vc4: Remove deref chain support from nir_lower_txf_ms.Eric Anholt1-1/+0
Acked-by: Rob Clark <> Acked-by: Bas Nieuwenhuizen <> Acked-by: Dave Airlie <> Reviewed-by: Kenneth Graunke <>
2018-06-22st,ir3,radeonsi: push lower_deref_instrs back into driverRob Clark1-1/+0
vc4+vc5 is not really affected by the deref chain to deref instr conversion, so it no longer needs this pass. For others, now that all the passes mesa/st uses are using deref instructions, push the lowering to deref chains back into the driver. Signed-off-by: Rob Clark <> Acked-by: Rob Clark <> Acked-by: Bas Nieuwenhuizen <> Acked-by: Dave Airlie <> Reviewed-by: Kenneth Graunke <>
2018-06-22anv,i965,radv,st,ir3: Call nir_lower_deref_instrsJason Ekstrand1-0/+1
This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir. Reviewed-by: Caio Marcelo de Oliveira Filho <> Acked-by: Rob Clark <> Acked-by: Bas Nieuwenhuizen <> Acked-by: Dave Airlie <> Reviewed-by: Kenneth Graunke <>
2018-06-20gallium: add scalar isa shader capChristian Gmeiner1-0/+2
v1 -> v2: - nv30 is _NOT_ scalar as suggested by Ilia Mirkin. - Change from a screen cap to a shader cap as suggested by Eric Anholt. - radeonsi is scalar as suggested by Marek Olšák. - Change missing ones to be scalar. v2 -> v3: - r600 prefers vec4 as suggested by Marek Olšák. Signed-off-by: Christian Gmeiner <> Reviewed-by: Eric Anholt <> Reviewed-by: Marek Olšák <>
2018-06-14gallium: add support for programmable sample locationsRhys Perry1-0/+1
Signed-off-by: Rhys Perry <> Reviewed-by: Brian Paul <> (v2) Reviewed-by: Marek Olšák <> (v2)
2018-06-05v3d: Be more explicit about include directory from our generated code.Eric Anholt1-1/+2
You'd need src/broadcom/cle/ in the -I previously, for srcdir != builddir. nir was fine at that, but automake didn't have it. Bugzilla:
2018-05-29gallium: add PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITYMarek Olšák1-0/+1
Reviewed-by: Nicolai Hähnle <> Reviewed-by: Timothy Arceri <>
2018-05-30gallium/winsys: rename DRM_API_HANDLE_* to WINSYS_HANDLE_*Dave Airlie1-5/+5
This just renames this as we want to add an shm handle which isn't really drm related. Originally by: Marc-André Lureau <> (airlied: I used this sed script instead) This was generated with: git grep -l 'DRM_API_' | xargs sed -i 's/DRM_API_/WINSYS_/g' Reviewed-by: Marek Olšák <>
2018-05-17broadcom/vc4: Native fence fd supportStefan Schake6-11/+107
With the syncobj support in place, let's use it to implement the EGL_ANDROID_native_fence_sync extension. This mostly follows previous implementations in freedreno and etnaviv. v2: Drop the flags (Eric) Handle in_fence_fd already in job_submit (Eric) Drop extra vc4_fence_context_init (Eric) Dup fds with CLOEXEC (Eric) Mention exact extension name (Eric) Signed-off-by: Stefan Schake <> Reviewed-by: Eric Anholt <>
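The "Dup fds with CLOEXEC" item in the v2 notes above can be illustrated with plain POSIX calls. This is a minimal sketch, not the driver's code: the helper names dup_fd_cloexec and fd_has_cloexec are hypothetical, and the minimum descriptor number 3 is an arbitrary choice that simply skips stdin, stdout and stderr.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd so that the copy is closed
 * automatically across exec(). Returns the new descriptor, or -1 on
 * error with errno set by fcntl(). */
static int dup_fd_cloexec(int fd)
{
    /* F_DUPFD_CLOEXEC picks the lowest free descriptor >= the third
     * argument and sets FD_CLOEXEC on it in the same call, so there is
     * no window in which the copy could leak into a child process. */
    return fcntl(fd, F_DUPFD_CLOEXEC, 3);
}

/* Hypothetical helper: returns 1 if FD_CLOEXEC is set on fd, 0 if it
 * is not, and -1 if the flags could not be read. */
static int fd_has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Doing the duplication and the flag change in one fcntl() call matters in multi-threaded programs, where a separate dup() followed by F_SETFD could race with a fork() and exec() on another thread.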
2018-05-17broadcom/vc4: Store job fence in syncobjStefan Schake3-4/+35
This gives us access to the fence created for the render job. v2: Drop flag (Eric) Signed-off-by: Stefan Schake <> Reviewed-by: Eric Anholt <>
2018-05-17broadcom/vc4: Detect syncobj supportStefan Schake2-0/+7
We need to know if the kernel supports syncobj submission since otherwise all the DRM syncobj calls fail. v2: Use drmGetCap to detect syncobj support (Eric) Signed-off-by: Stefan Schake <> Reviewed-by: Eric Anholt <>
2018-05-15vc4: use util_copy_framebuffer_stateRob Clark1-12/+2
Signed-off-by: Rob Clark <> Reviewed-by: Eric Anholt <>
2018-04-30gallium: add initial support for conservative rasterizationRhys Perry1-1/+12
Signed-off-by: Rhys Perry <> Reviewed-by: Brian Paul <> Reviewed-by: Marek Olšák <>
2018-03-29util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zeroIan Romanick1-2/+2
The new name makes the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <> Suggested-by: Matt Turner <> Reviewed-by: Alejandro Piñeiro <>
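The zero-input distinction can be sketched with the usual bit trick. The two functions below are illustrative stand-ins written for this note, not copies of the Mesa helpers; in particular, the second one only shows what "different zero-input behavior" could look like and is an assumption about the follow-up patch.

```c
#include <stdbool.h>

/* Illustrative: true when v has at most one bit set. Clearing the
 * lowest set bit with v & (v - 1) leaves zero for powers of two, and
 * also for v == 0, which is why the name says "or zero". */
static bool is_power_of_two_or_zero(unsigned v)
{
    return (v & (v - 1)) == 0;
}

/* Illustrative variant with the other zero-input behavior: zero is
 * rejected, so only exact powers of two return true. */
static bool is_power_of_two_nonzero(unsigned v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
```

Callers that use the result to decide whether a mask such as v - 1 is meaningful need the second form, because for v == 0 the first form returns true while v - 1 wraps around to all ones.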
2018-03-22broadcom/vc4: add path to nir_builder.hJuan A. Suarez Romero1-1/+1
As the other VC4 files do. Otherwise, it won't find nir_builder.h v2: add path in source code rather than changing autotools (Emil) Reviewed-by: Emil Velikov <>
2018-03-20gallium: add packed uniform CAPTimothy Arceri1-0/+1
Reviewed-by: Marek Olšák <>
2018-03-13broadcom/vc4: Fix simulator since the perfmon change.Eric Anholt1-0/+1
It would be nice to support perfmon with simulator, and might be a useful tool for regression testing performance (since the simulator would be deterministic).
2018-03-09broadcom/vc4: Add an accelerated path to turn raster R8/RG88 into tiled.Eric Anholt3-0/+211
Drawing a 1080p YV12 video stream generated by MMAL goes from 10.5 FPS to 36.
2018-03-09broadcom/vc4: Allow binding non-zero constant buffers.Eric Anholt5-5/+53
We're going to use UBO loads for implementing YUV linear-to-T-format blits.
2018-03-09broadcom: Remove our defines of DRM_FORMAT_MOD_INVALID.Eric Anholt1-4/+0
The imported drm_fourcc.h handles it now.
2018-03-09broadcom: Suppress compiler warnings about enum pipe_tex_filter.Eric Anholt1-0/+1
2018-03-05broadcom/vc4: Add support for HW perfmonBoris Brezillon5-12/+249
The V3D engine provides several perf counters. Implement ->get_driver_query_[group_]info() so that these counters are exposed through the GL_AMD_performance_monitor extension. Signed-off-by: Boris Brezillon <> Signed-off-by: Eric Anholt <>
2018-02-28nir: add lower_ldexp to nir compiler optionsTimothy Arceri1-0/+1
Reviewed-by: Marek Olšák <>
2018-02-23broadcom/vc4: Remove the retval==usage check in is_format_supported().Eric Anholt1-26/+13
This got us into trouble recently, so just remove it entirely.
2018-02-23broadcom/vc4: Add support for YUV textures using unaccelerated blits.Eric Anholt3-3/+35
Previously we would assertion fail about having no hardware format. This is enough to get kmscube -M nv12-2img working.
2018-02-23broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows.Eric Anholt1-6/+11
When we set up the shadow resource we were copying the original resource as the template, including its prsc->next field. When we shadowed the first YUV plane's resource for linear-to-tiled conversion, we would end up unbalancing the refcount on the shadow resource's destruction.
2018-02-23broadcom/vc4: Add pipe_reference debugging for vc4_bos.Eric Anholt2-5/+24
Trying to track down the YUV EGLImage use-after-free, it helps to see what the mystery objects are that are being refcounted.
2018-02-23broadcom/vc4: Remove dead vc4_bo_set_reference().Eric Anholt1-8/+0
It would be broken if NULL was passed to it anyway, since it wouldn't participate in screen->bo_handles management.
2018-02-23broadcom/vc4: Use pipe_resource_reference in sampler views.Eric Anholt1-2/+2
Improves u_debug_refcount output.
2018-02-23broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride.Eric Anholt1-8/+25
This is part of supporting YUV textures -- MMAL will be handing us a single GEM BO with the planes at offsets within it, and MMAL-decided stride.
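How planes at arbitrary offsets and strides inside one linear buffer are addressed can be sketched as follows. The struct and function names are hypothetical, and the layout values in the usage note are made up for illustration; the commit does not state MMAL's actual layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one linear plane inside a shared buffer. */
struct plane_layout {
    size_t offset;   /* byte offset of the plane's first texel in the BO */
    size_t stride;   /* bytes from one row to the next, chosen by the exporter */
    unsigned cpp;    /* bytes per texel in this plane */
};

/* Byte offset of texel (x, y) from the start of the buffer. With a
 * linear layout this is just the plane offset plus whole rows plus
 * the position within the row. */
static size_t texel_byte_offset(const struct plane_layout *p,
                                unsigned x, unsigned y)
{
    return p->offset + (size_t)y * p->stride + (size_t)x * p->cpp;
}
```

For example, assuming an NV12-style 1920x1080 image with a 1920-byte stride, the luma plane would be {0, 1920, 1} and the interleaved chroma plane {1920 * 1080, 1920, 2}, both living in the same buffer.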
2018-02-23broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported().Eric Anholt1-0/+2
We were failing the retval == usage check at the end. Fixes: f7604d8af521 ("st/dri: only expose config formats that are display targets")
2018-02-17gallium: allow drivers to impose BO flags restrictions on constant buffer 0Marek Olšák1-0/+1
Required by radeonsi for optimal behavior.
2018-02-14gallium: drop all the guard band float caps.Dave Airlie1-5/+0
Nobody queries these and nobody sets them to anything useful, the docs say TODO. Drop them until a use appears. Reviewed-by: Roland Scheidegger <> Signed-off-by: Dave Airlie <>
2018-01-31nir: add lower_all_io_to_temps flagTimothy Arceri1-0/+1
This will be used for freedreno and vc4 which require all inputs and outputs to be copied to temps. Reviewed-by: Marek Olšák <>
2018-01-30gallium: introduce PIPE_CAP_FENCE_SIGNAL v2Andres Rodriguez1-0/+1
Protects semaphore signaling functionality required by GL_EXT_semaphore. v2: s/semaphore/fence Signed-off-by: Andres Rodriguez <> Reviewed-by: Marek Olšák <>
2018-01-19autotools: include meson build files in tarballDylan Baker1-1/+1
This adds the, meson_options.txt, and a few scripts that are used exclusively by the meson build. v2: - Remove accidentally included changes needed to test make dist with LLVM > 3.9 Signed-off-by: Dylan Baker <> Acked-by: Eric Engestrom <> Reviewed-by: Emil Velikov <>
2018-01-17gallium: remove PIPE_CAP_USER_CONSTANT_BUFFERSMarek Olšák1-1/+0
Reviewed-by: Roland Scheidegger <> Tested-by: Dieter Nützel <>
2018-01-17gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAPMarek Olšák1-1/+0
Reviewed-by: Roland Scheidegger <> Tested-by: Dieter Nützel <>
2018-01-17gallium: remove PIPE_CAP_TWO_SIDED_STENCILMarek Olšák1-1/+0
Reviewed-by: Roland Scheidegger <> Tested-by: Dieter Nützel <>
2018-01-11meson: Use dependencies for nirDylan Baker1-3/+4
This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is built. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and its headers is quite high. Acked-by: Eric Engestrom <> Signed-off-by: Dylan Baker <>
2017-12-19gallium: plumb context priority through to driverRob Clark1-0/+1
Signed-off-by: Rob Clark <> Reviewed-by: Roland Scheidegger <> Reviewed-by: Marek Olšák <> Reviewed-by: Andres Rodriguez <> Reviewed-by: Wladimir J. van der Laan <>
2017-12-04meson: define driver dependenciesDylan Baker1-0/+5
This allows us to encapsulate the compiler and linkage requirements of each driver in a reusable way. The result will be that each target that needs a specific driver can simply add `driver_<name>` to its dependencies line and the necessary libraries and compiler args will be added. This will allow for a lot of code de-duplication between gallium targets. Signed-off-by: Dylan Baker <> Reviewed-by: Eric Engestrom <>
2017-12-01broadcom/vc4: Use a single-entry cached last_hindex value.Eric Anholt2-2/+20
Since almost all BOs will be in one CL at a time, this cache will almost always hit except for the first usage of the BO in each CL. This didn't show up as statistically significant on the minetest trace (n=340), but if I lop off the throttled lobe of the bimodal distribution, it very clearly does (0.74731% +/- 0.162093%, n=269).
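The single-entry cache idea can be sketched like this. It is a simplified model under stated assumptions, not the driver's data structures: struct bo, struct cl_handles, get_hindex and the slow_lookups counter are hypothetical names, and the fixed-size array stands in for whatever growable list the command list really uses.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_HANDLES 64  /* illustrative fixed capacity */

struct bo {
    uint32_t handle;       /* kernel handle (unused here, for flavor) */
    uint32_t last_hindex;  /* index at which this BO was last found */
};

struct cl_handles {
    struct bo *bos[MAX_HANDLES];
    uint32_t count;
    uint32_t slow_lookups;  /* number of times the cache missed */
};

/* Return the index of bo in the command list's handle table, adding
 * it if needed. Returns UINT32_MAX if the table is full. */
static uint32_t get_hindex(struct cl_handles *cl, struct bo *bo)
{
    uint32_t i = bo->last_hindex;

    /* Fast path: the cached index is still valid for this list. */
    if (i < cl->count && cl->bos[i] == bo)
        return i;

    /* Slow path: linear scan, then remember where we found it. */
    cl->slow_lookups++;
    for (i = 0; i < cl->count; i++) {
        if (cl->bos[i] == bo) {
            bo->last_hindex = i;
            return i;
        }
    }

    /* Not referenced yet: append and cache the new index. */
    if (cl->count == MAX_HANDLES)
        return UINT32_MAX;
    i = cl->count++;
    cl->bos[i] = bo;
    bo->last_hindex = i;
    return i;
}
```

The cached value is only a hint. It is validated against the table on every use, so a stale index left over from an earlier command list costs one extra scan rather than producing a wrong result.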
2017-12-01broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN.Eric Anholt1-5/+14
No significant difference in the minetest replay, but it should reduce overhead by not requiring that we write quad indices to index buffers that we repeatedly re-upload (and making the draw packet smaller, as well). Over the course of the series the actual game seems to be up by 1-2 fps.
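Why a single quad needs no index buffer when drawn as a fan can be sketched by expanding the fan by hand. The function name is hypothetical, and the assumption here is that the quad would otherwise have been drawn with the common 0,1,2, 0,2,3 index pattern; the commit does not spell out the indices it replaces.

```c
#include <stdint.h>

/* Hypothetical helper: expand a triangle fan of n vertices starting at
 * vertex `first` into explicit triangle indices. `out` must have room
 * for 3 * (n - 2) entries. Returns the number of indices written. */
static unsigned fan_to_triangles(unsigned first, unsigned n, uint16_t *out)
{
    unsigned written = 0;

    /* Every triangle shares the first vertex and uses two consecutive
     * vertices from the rest of the fan. */
    for (unsigned i = 1; i + 1 < n; i++) {
        out[written++] = (uint16_t)first;
        out[written++] = (uint16_t)(first + i);
        out[written++] = (uint16_t)(first + i + 1);
    }
    return written;
}
```

For four vertices the expansion is exactly the two triangles of the quad, so the hardware's fan primitive can consume the four vertices directly and the six indices never have to be written or uploaded.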
2017-12-01broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES.Eric Anholt3-3/+12
Now that there's only one user of it, it's pretty obvious how to avoid emitting redundant ones. This should save a bunch of kernel validation overhead. No statistically significant difference on the minetest trace I was looking at (n=169), but the maximum FPS is up by .3%
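Skipping a redundant state packet usually comes down to remembering what was last emitted. The sketch below assumes the packet carries a pair of handle indices; the struct, the function name and the packets_emitted counter are hypothetical and only model the bookkeeping, not the actual command-list encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-job record of the last handle pair emitted. */
struct handle_state {
    uint32_t last[2];
    bool valid;                /* false until the first packet is emitted */
    unsigned packets_emitted;  /* stands in for writing to the command list */
};

/* Emit a handles packet only if the pair differs from the previous
 * one. Returns true if a packet was (notionally) written. */
static bool emit_gem_handles(struct handle_state *s, uint32_t h0, uint32_t h1)
{
    if (s->valid && s->last[0] == h0 && s->last[1] == h1)
        return false;  /* same pair as last time, nothing to do */

    s->last[0] = h0;
    s->last[1] = h1;
    s->valid = true;
    s->packets_emitted++;
    return true;
}
```

The valid flag would need to be reset whenever a new command list starts, since the first draw in a list must always emit the packet regardless of what an earlier list contained.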