path: root/src/gallium/drivers/vc4
Age | Commit message | Author | Files | Lines
2017-06-20 | vc4: Clean up release build warnings using MAYBE_UNUSED. | Eric Anholt | 2 | -6/+5
These variables are all used in an assert(), so release builds see no usages.
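A minimal sketch of the pattern this commit addresses, assuming a GCC/Clang-style `unused` attribute. The `MAYBE_UNUSED` definition below is a local stand-in rather than Mesa's actual header, and `sketch_first_element()` is a hypothetical function:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: local stand-in for the MAYBE_UNUSED macro named in the commit. */
#if defined(__GNUC__)
#define MAYBE_UNUSED __attribute__((unused))
#else
#define MAYBE_UNUSED
#endif

/* Hypothetical helper: returns the first element of a non-empty array. */
static int
sketch_first_element(const int *values, size_t count)
{
        /* 'end' is only read inside assert(). With NDEBUG defined the
         * assert() expands to nothing, so without the annotation a release
         * build would warn about an unused variable. */
        MAYBE_UNUSED const int *end = values + count;

        assert(values < end);
        return values[0];
}
```

The annotation keeps the declaration next to the assertion that needs it, instead of wrapping both in `#ifndef NDEBUG`.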
2017-06-20 | vc4: Allow VBOs to be mapped during execution. | Eric Anholt | 1 | -1/+1
There's no reason we can't -- the mappings we expose are basically equivalent to persistent/coherent, already. Improves mesa-demos drawoverhead (no state change) performance by 5.21362% +/- 1.25078% (n=11).
2017-06-15 | gallium: Add renderonly-based support for pl111+vc4. | Eric Anholt | 4 | -2/+54
This follows the model of imx (display) and etnaviv (render): pl111 is a display-only device, so when asked to do GL for it, we see if we have a vc4 renderer, make the vc4 screen, and have vc4 call back to pl111 to do scanout allocations. The difference from etnaviv is that we share the same BO between vc4 and pl111, rather than having a vc4 bo and a pl111 bo and copies between the two. The only mismatch between their requirements is that vc4 requires 4-pixel (at 32bpp) stride alignment, while pl111 requires that stride match width. The kernel will reject any modesets to an incorrect stride, so the 3D driver doesn't need to worry about that. v2: Rebase on Android rework, drop unused include. v3: Fix another Android bug, from Rob Herring's build-testing. Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
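The stride mismatch described above can be illustrated with a small sketch. It assumes 32bpp only (so 4 pixels of alignment means 16 bytes); the `sketch_*` names are hypothetical and do not come from either driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 32bpp, i.e. 4 bytes per pixel. */
#define SKETCH_CPP 4u
/* 4 pixels at 32bpp = 16 bytes of stride alignment. */
#define SKETCH_VC4_STRIDE_ALIGN (4u * SKETCH_CPP)

/* Row pitch in bytes after rounding up to the 4-pixel alignment. */
static uint32_t
sketch_vc4_stride(uint32_t width)
{
        assert(width > 0);
        uint32_t stride = width * SKETCH_CPP;
        return (stride + SKETCH_VC4_STRIDE_ALIGN - 1) &
               ~(SKETCH_VC4_STRIDE_ALIGN - 1);
}

/* A display that needs stride == width * cpp can only scan out the buffer
 * when the alignment added no padding. */
static bool
sketch_stride_matches_width(uint32_t width)
{
        return sketch_vc4_stride(width) == width * SKETCH_CPP;
}
```

Widths that are a multiple of 4 pixels satisfy both constraints; for any other width the padded stride differs from the width, which is the case the commit message says the kernel rejects at modeset time.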
2017-06-14 | gallium: add PIPE_CAP_BINDLESS_TEXTURE | Samuel Pitoiset | 1 | -0/+1
Whether bindless texture operations are supported by the underlying driver. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-02 | gallium: Add a cap to check if the driver supports ARB_post_depth_coverage | Lyude | 1 | -0/+1
Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-22 | vc4: Remove dead code in vc4_dump_surface_msaa() | Rhys Kidd | 1 | -6/+0
Coverity caught the use of dead code copy-paste for found_colors[] and num_found_colors. CID: 1341850 Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-17 | vc4: Don't allocate new BOs to avoid synchronization when they're shared. | Eric Anholt | 1 | -1/+2
If X11 did a software fallback to the entire screen, we would throw out the BO the screen is scanning out from and allocate a new one. Cc: mesa-stable@lists.freedesktop.org
2017-05-17 | vc4: Drop pointless indirections around BO import/export. | Eric Anholt | 3 | -69/+49
I've since found them to be more confusing by adding indirections than clarifying by screening off resources from the handle/fd import/export process.
2017-05-17 | vc4: Drop the u_resource_vtbl no-op layer. | Eric Anholt | 4 | -33/+27
We only ever attached one vtbl, so it was a waste of space and indirections.
2017-05-17 | gallium: add PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION | Marek Olšák | 1 | -0/+1
for skipping mapped-buffer checking in every GL draw call Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-11 | Android: push driver build details to driver makefiles | Rob Herring | 1 | -0/+4
src/gallium/targets/dri/Android.mk contains lots of conditionals for individual drivers. Let's move these details into the individual driver makefiles. In the process, align the make driver conditionals with automake (i.e. HAVE_GALLIUM_*). Signed-off-by: Rob Herring <robh@kernel.org> [Emil Velikov: add the radeon winsys for radeonsi] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-10 | gallium: add PIPE_CAP_CAN_BIND_CONST_BUFFER_AS_VERTEX | Marek Olšák | 1 | -0/+1
The next patch will use it. This is really for svga and GL2-level drivers. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>
2017-05-10 | gallium: remove pipe_index_buffer and set_index_buffer | Marek Olšák | 5 | -38/+20
pipe_draw_info::indexed is replaced with index_size. index_size == 0 means non-indexed. Instead of pipe_index_buffer::offset, pipe_draw_info::start is used. For indexed indirect draws, pipe_draw_info::start is added to the indirect start. This is the only case when "start" affects indirect draws. pipe_draw_info::index is a union. Use either index::resource or index::user depending on the value of pipe_draw_info::has_user_indices. v2: fixes for nine, svga
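The new layout can be sketched with a simplified stand-in struct. It mirrors only the fields named in the commit message; `struct sketch_draw_info` and the helpers are hypothetical, and the byte-offset calculation is an assumption about how a driver would consume `start`, not code taken from Mesa:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_resource; /* opaque stand-in for a driver buffer object */

/* Simplified stand-in for the draw info described in the commit message. */
struct sketch_draw_info {
        uint8_t index_size;     /* 0 means non-indexed, else bytes per index */
        bool has_user_indices;  /* selects which union member is valid */
        unsigned start;
        unsigned count;
        union {
                struct sketch_resource *resource; /* !has_user_indices */
                const void *user;                 /* has_user_indices */
        } index;
};

/* Replaces the old 'indexed' boolean. */
static bool
sketch_is_indexed(const struct sketch_draw_info *info)
{
        return info->index_size != 0;
}

/* Assumption: with no separate index-buffer offset, the first index to read
 * sits 'start' elements into the index buffer. */
static size_t
sketch_first_index_offset(const struct sketch_draw_info *info)
{
        assert(sketch_is_indexed(info));
        return (size_t)info->start * info->index_size;
}
```

Folding the index buffer into the draw call means one fewer piece of bound state for each driver to track between draws.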
2017-05-10 | gallium: decrease the size of pipe_vertex_buffer - 24 -> 16 bytes | Marek Olšák | 1 | -1/+1
2017-05-09 | nir: Embed the shader_info in the nir_shader again | Jason Ekstrand | 2 | -4/+4
Commit e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b changed the shader_info from being embedded into being just a pointer. The idea was that sharing the shader_info between NIR and GLSL would be easier if it were a pointer pointing to the same shader_info struct. This, however, has caused a few problems: 1) There are many things which generate NIR without GLSL. This means we have to support both NIR shaders which come from GLSL and ones that don't and need to have an info elsewhere. 2) The solution to (1) raises all sorts of ownership issues which have to be resolved with ralloc_parent checks. 3) Ever since 00620782c92100d77c660f9783504c6d80fa1d58, we've been using nir_gather_info to fill out the final shader_info. Thanks to cloning and the above ownership issues, the nir_shader::info may not point back to the gl_shader anymore and so we have to do a copy of the shader_info from NIR back to GLSL anyway. All of these issues go away if we just embed the shader_info in the nir_shader. There's a little downside of having to copy it back after calling nir_gather_info but, as explained above, we have to do that anyway. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-02 | vc4: Use runtime CPU detection for whether NEON is available. | Eric Anholt | 2 | -14/+16
This will allow Raspbian's ARMv6 builds to take advantage of the new NEON code, and could prevent problems if vc4 ends up getting used on a v7 CPU without NEON. v2: Drop dead NEON_SUFFIX (noted by Erik Faye-Lund)
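Runtime selection between a NEON and a non-NEON code path can be sketched as below. This log does not show how the commit itself detects NEON; the `getauxval()` query is an assumption that only applies to 32-bit ARM Linux, and every `sketch_*` name is hypothetical. Both copy routines are plain `memcpy()` placeholders:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif

/* Assumption: ask the kernel's auxiliary vector on arm32 Linux; report
 * "no NEON" everywhere else. */
static bool
sketch_has_neon(void)
{
#if defined(__linux__) && defined(__arm__) && defined(HWCAP_NEON)
        return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
        return false;
#endif
}

typedef void (*sketch_copy_fn)(void *dst, const void *src, size_t size);

/* Placeholder for the portable path. */
static void
sketch_copy_generic(void *dst, const void *src, size_t size)
{
        memcpy(dst, src, size);
}

/* Placeholder for a path that would be built with NEON enabled. */
static void
sketch_copy_neon(void *dst, const void *src, size_t size)
{
        memcpy(dst, src, size);
}

/* Pick the implementation once, based on what the running CPU reports. */
static sketch_copy_fn
sketch_pick_copy(bool has_neon)
{
        return has_neon ? sketch_copy_neon : sketch_copy_generic;
}
```

A caller would resolve the pointer once, for example `sketch_copy_fn copy = sketch_pick_copy(sketch_has_neon());`, so the same binary works on CPUs with and without NEON.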
2017-05-02 | vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS. | Eric Anholt | 4 | -8/+31
Android.mk was setting the flag across the entire driver, so we didn't have non-NEON versions getting built. This was going to be a problem with the next commit, when I start auto-detecting NEON support and use the non-NEON version when appropriate. Reviewed-by: Rob Herring <robh@kernel.org>
2017-05-01 | vc4: Only build the NEON code on arm32. | Eric Anholt | 1 | -2/+2
NEON is sufficiently different on arm64 that we can't just reuse this code. Disable it on arm64 for now. v2: Use PIPE_ARCH_ARM instead, as __ARM_ARCH may be 8 for a 32-bit build for a v8 CPU. Signed-off-by: Eric Anholt <eric@anholt.net> Cc: <mesa-stable@lists.freedesktop.org>
2017-04-26 | gallium: add PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERS | Samuel Pitoiset | 1 | -0/+1
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-20 | gallium: fold u_trim_pipe_prim call from st/mesa to drivers | Marek Olšák | 1 | -0/+5
Most drivers don't need it and shouldn't need it because it can't be used in some cases (indirect draws, primitive restart, count from streamout). Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-18 | vc4: Enable V3D 2.6. | Eric Anholt | 1 | -1/+1
This version of the chip is present on the Cygnus-based 911360 enterprise phone platform. It appears to be completely backwards compatible.
2017-04-14 | gallium: add PIPE_CAP_TGSI_TES_LAYER_VIEWPORT | Nicolai Hähnle | 1 | -0/+1
Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-05 | gallium: add PIPE_CAP_TGSI_BALLOT | Nicolai Hähnle | 1 | -0/+1
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 | gallium: add sparse buffer interface and capability | Nicolai Hähnle | 1 | -0/+1
v2: - explain the resource_commit interface in more detail Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 | gallium: Add a cap to check if the driver supports fill_rectangle | Lyude | 1 | -0/+1
Changes since v1: - Add pipe caps for etnaviv, freedreno, swr and virgl Signed-off-by: Lyude <lyude@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-01 | gallium: remove support for predicates from TGSI (v2) | Marek Olšák | 1 | -2/+0
Never used. v2: gallivm: rename "pred" -> "exec_mask" etnaviv: remove the cap gallium: fix tgsi_instruction::Padding Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-03-31 | gallium: add PIPE_CAP_TGSI_CLOCK | Nicolai Hähnle | 1 | -0/+1
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-30 | vc4: Fix indenting in vc4_screen_get_param() | Lyude | 1 | -3/+3
Signed-off-by: Lyude <lyude@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-15 | gallium: add PIPE_CAP_TGSI_TEX_TXF_LZ | Marek Olšák | 1 | -0/+1
2017-03-14 | nir: Rework conversion opcodes | Jason Ekstrand | 3 | -13/+13
The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort-of random. This commit re-organizes things and makes them all consistent: - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>. - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when up-converting. - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>. Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and auto-generated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes. Reviewed-by: Eric Anholt <eric@anholt.net>
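The naming rules above can be expressed as a small name builder. This is only an illustration of the scheme described in the commit message; `sketch_conversion_name()` is hypothetical and is not part of NIR, which generates its opcode tables differently:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Builds a conversion opcode name from single-letter base types:
 * 'f' float, 'i' signed int, 'u' unsigned int, 'b' bool.
 * Returns false for combinations the commit message says were removed. */
static bool
sketch_conversion_name(char *buf, size_t len, char src, char dst,
                       unsigned dst_bits)
{
        bool src_int = (src == 'i' || src == 'u');
        bool dst_int = (dst == 'i' || dst == 'u');

        /* Integer <-> integer conversions only come in i2i and u2u forms. */
        if (src_int && dst_int && src != dst)
                return false;

        if (src == 'b' || dst == 'b') {
                /* Boolean conversions: <src_type>2<dst_type>, no size. */
                snprintf(buf, len, "%c2%c", src, dst);
        } else {
                /* Everything else: <src_type>2<dst_type><size>. */
                snprintf(buf, len, "%c2%c%u", src, dst, dst_bits);
        }
        return true;
}
```

For example, a float to 32-bit signed integer conversion yields `f2i32`, a bool to float conversion yields `b2f`, and a request for `i2u` is rejected.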
2017-03-08 | vc4: Fix math with a condition flag set. | Eric Anholt | 2 | -3/+18
Math results land in r4, regardless of the condition. To implement them, we just need to ensure that the results are moved out of r4 (as often happens anyway, the value is live across another math instruction), so that we can attach the condition to the MOV. Fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.93 and a couple others, that were assertion failing that their conditions hadn't been handled during the QIR->QPU stage.
2017-03-08 | vc4: Fix register pressure cost estimates when a src appears twice. | Eric Anholt | 1 | -3/+13
This ended up confusing the scheduler for things like fabs (implemented as fmaxabs x, x) or squaring a number, and it would try to avoid scheduling them because it appeared more expensive than other instructions. Fixes failure to register allocate in dEQP-GLES2.functional.uniform_api.random.3 with almost no shader-db effects (+.35% max temps)
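The actual fix is not shown in this log. One way to avoid counting a repeated source twice, sketched here under that caveat, is to count distinct temporaries rather than source slots. `sketch_count_unique_srcs()` and its integer temp indices are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how many distinct temporaries an instruction reads, so that an
 * instruction such as "fmaxabs x, x" is charged for one temp, not two. */
static int
sketch_count_unique_srcs(const int *src_temps, int num_srcs)
{
        int unique = 0;

        for (int i = 0; i < num_srcs; i++) {
                bool seen = false;

                /* Skip this slot if an earlier slot named the same temp. */
                for (int j = 0; j < i; j++) {
                        if (src_temps[j] == src_temps[i]) {
                                seen = true;
                                break;
                        }
                }
                if (!seen)
                        unique++;
        }
        return unique;
}
```

With only a handful of source slots per instruction, the quadratic scan costs less than maintaining a separate set.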
2017-03-08 | vc4: Report to shader-db how many threads a fragment shader has. | Eric Anholt | 1 | -0/+7
Doing instruction count analysis when we emit the thread switches that will save us from tons of stalls is kind of missing the point.
2017-03-08 | Revert "vc4: Lazily emit our FS/VS input loads." | Eric Anholt | 4 | -93/+75
This reverts commit 292c24ddac5acc35676424f05291c101fcd47b3e. It broke a lot of GLES2 deqp, and I see at least one problem that will require some serious rework to fix.
2017-03-08 | gallium: s/uint/enum pipe_shader_type/ for set_constant_buffer() | Brian Paul | 1 | -1/+2
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 | gallium: s/unsigned/enum pipe_shader_type/ for get_compiler_options() | Brian Paul | 2 | -2/+4
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 | gallium: s/unsigned/enum pipe_shader_type/ for pipe_screen::get_shader_param() | Brian Paul | 1 | -2/+3
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-07 | gallium/util: replace pipe_mutex_unlock() with mtx_unlock() | Timothy Arceri | 2 | -7/+7
pipe_mutex_unlock() was made unnecessary with fd33a6bcd7f12. Replaced using: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_unlock(\([^)]*\)):mtx_unlock(\&\1):g' {} \; Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 | gallium/util: replace pipe_mutex_lock() with mtx_lock() | Timothy Arceri | 2 | -6/+6
pipe_mutex_lock() was made unnecessary with fd33a6bcd7f12. Replaced using: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_lock(\([^)]*\)):mtx_lock(\&\1):g' {} \; Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 | gallium/util: replace pipe_mutex_init() with mtx_init() | Timothy Arceri | 1 | -1/+1
pipe_mutex_init() was made unnecessary with fd33a6bcd7f12. Replace was done using: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_init(\([^)]*\)):(void) mtx_init(\&\1, mtx_plain):g' {} \; Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 | gallium/util: replace pipe_mutex with mtx_t | Timothy Arceri | 1 | -2/+2
pipe_mutex was made unnecessary with fd33a6bcd7f12. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-24 | vc4: Lazily emit our FS/VS input loads. | Eric Anholt | 4 | -75/+93
This reduces register pressure in both types of shaders, by reordering the input loads from the var->data.driver_location order to whatever order they appear first in the NIR shader. These instructions aren't reorderable at our QIR scheduling level because the FS takes two in lockstep to do an interpolation, and the VS takes multiple read instructions in a row to get a whole vec4-level attribute read.
shader-db impact:
total instructions in shared programs: 76666 -> 76590 (-0.10%)
instructions in affected programs: 42945 -> 42869 (-0.18%)
total max temps in shared programs: 9395 -> 9208 (-1.99%)
max temps in affected programs: 2951 -> 2764 (-6.34%)
Some programs get their max temps hurt, depending on the order that the load_input intrinsics appear, because we end up being unable to copy propagate an older VPM read into its only use.
2017-02-24 | vc4: Refactor the load_input code out of the intrinsic code. | Eric Anholt | 1 | -25/+42
It's going to gain most of ntq_setup_inputs(), so simplify it first.
2017-02-24 | vc4: Track the last block we emitted at the top level. | Eric Anholt | 3 | -5/+10
This will be used for delaying our VPM reads (which must be unconditional) until just before they're used.
2017-02-24 | vc4: Emit max number of temps in the shader-db output. | Eric Anholt | 1 | -0/+23
We need to be paying attention to optimization's impact on this -- even if we reduce instruction count, increasing max temps in general is likely to cause us to fail to register allocate on some shaders, which means that those won't run at all.
2017-02-25 | gallium: remove PIPE_CAP_USER_INDEX_BUFFERS | Marek Olšák | 1 | -1/+0
all drivers support it Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> (VMware driver only)
2017-02-24 | vc4: automake: add the kernel/README to the tarball | Emil Velikov | 1 | -0/+2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2017-02-14 | gallium: set pipe_context uploaders in drivers (v3) | Marek Olšák | 1 | -5/+6
Notes: - make sure the default size is large enough to handle all state trackers - pipe wrappers don't receive transfer calls from stream_uploader, because pipe_context::stream_uploader points directly to the underlying driver's stream_uploader (to keep it simple for now) v2: add error handling to nv50, nvc0, noop v3: set const_uploader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-10 | vc4: Enable glSampleMask() even when !rasterizer->multisample. | Eric Anholt | 1 | -2/+1
gallium's blitter expects that it can set the sample mask even when the rasterizer doesn't have the flag on. Between this and the previous test, 10 new ext_framebuffer_multisample tests start passing.
2017-02-10 | vc4: Respect glSampleMask() even when we're not writing color. | Eric Anholt | 1 | -3/+13
gallium's quad-based blitter for copying MSAA depth textures expects to be able to do 4 passes updating a sample at a time using glSampleMask, and there's no color buffer bound when it's doing that.