Age  Commit message  Author  Files  Lines
2016-07-25build: Remove unused AX_CHECK_COMPILE_FLAG macroAndreas Boll1-72/+0
Unused since 1a6ae840413d7fb6d2e83f6a83081d5246c7ac9e Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-25main: memcpy larger chunks in _mesa_propagate_uniforms_to_driver_storageNils Wallménius1-6/+23
When possible, do the memcpy on larger blocks. This reduces cycles spent in _mesa_propagate_uniforms_to_driver_storage from 1.51% to 0.62% according to perf during the Unigine Heaven benchmark. It did not affect the framerate of the benchmark. The system used for testing was an i5 6600K with a Radeon R9 380. Piglit hangs randomly on this system both with and without the patch, so I could not make a comparison. v2: fixed whitespace Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
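The idea of copying larger blocks can be illustrated with a minimal sketch. This is not the code from the patch: `copy_strided` and its parameters are hypothetical, and it assumes that "when possible" means both source and destination are tightly packed, so a loop of small copies can collapse into a single memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Mesa's implementation: copy `count` elements of
 * `elem_size` bytes each, where consecutive elements are `src_stride` and
 * `dst_stride` bytes apart in their respective buffers. */
static void copy_strided(void *dst, size_t dst_stride,
                         const void *src, size_t src_stride,
                         size_t elem_size, size_t count)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Elements must not overlap inside either buffer. */
    assert(dst_stride >= elem_size && src_stride >= elem_size);

    if (dst_stride == elem_size && src_stride == elem_size) {
        /* Both sides are tightly packed: one large copy replaces the loop. */
        memcpy(d, s, elem_size * count);
        return;
    }

    /* Fallback: strides differ, so copy one element at a time. */
    for (size_t i = 0; i < count; i++)
        memcpy(d + i * dst_stride, s + i * src_stride, elem_size);
}
```

Both paths produce the same bytes; the fast path only saves per-call overhead, which matches the commit's observation that profile cycles dropped while the framerate stayed the same.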
2016-07-25st/va: enable h264 VAAPI encodeBoyuan Zhang1-5/+1
Enable H.264 VAAPI encoding through config. Currently only H.264 baseline is supported. Encode entrypoint is not accepted by driver. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
2016-07-25st/va: add function to handle misc param type frame rateBoyuan Zhang1-5/+19
Frame rate can be passed to the driver either through VAEncSequenceParameterBufferType or VAEncMiscParameterTypeFrameRate. The previous code only implemented the former, which is used by Gstreamer-Vaapi. Now add an implementation for VAEncMiscParameterTypeFrameRate. Also set the default frame rate to 30 in case the application never provides frame rate information to the driver. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
2016-07-25st/va: add environment variable to disable interlaceBoyuan Zhang1-0/+4
Add an environment variable to disable interlace mode. At the VAAPI decoding stage, the driver cannot distinguish between the pure decoding case and the transcoding case. Since interlaced encoding is not supported, we have to disable interlace for the transcoding case. The temporary solution is to use an environment variable to disable interlace mode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
2016-07-25st/va: add preset values for VAAPI encodeBoyuan Zhang1-0/+27
Add some hardcoded values hardware needs mainly for rate control purpose. With previously hardcoded values for OMX, the rate control result is not correct. This change fixed the rate control result by setting correct values for Vaapi. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
2016-07-25st/va: add functions for VAAPI encodeBoyuan Zhang3-2/+178
Add necessary functions/changes for VAAPI encoding to buffer and picture. These changes will allow driver to handle all Vaapi encode related operations. This patch doesn't change the Vaapi decode behaviour. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
2016-07-25st/va: get rate control method from configattrib v2Boyuan Zhang3-0/+15
Rate control method is passed from app to driver through config attrib list. That is why we need to store this rate control method to config. And later on, we will pass this value to context->desc.h264enc.rate_ctrl.rate_ctrl_method. v2 (chk): fix broken build and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2016-07-25st/va: add conversion for yv12 to nv12 in putimage v2Boyuan Zhang1-7/+27
For putimage call, if image format is yv12 (or IYUV with U V field swap) and surface format is nv12, then we need to convert yv12 to nv12 and then copy the converted data from image to surface. We can't use the existing logic where surface is destroyed and re-created with yv12 format. v2 (chk): fix some compiler warnings and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2016-07-25vl/util: add copy func for yv12image to nv12surface v2Boyuan Zhang1-0/+37
Add a function to copy from a yv12 image to an nv12 surface for the VAAPI putimage call. We need this function in the VaPutImage call when copying from a yv12 image to an nv12 surface for encoding. The existing function can't be used because it only works for copying from a yv12 surface to an nv12 image in Vaapi. v2: cleanup variable types and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
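The chroma part of a yv12 to nv12 copy can be sketched as follows. This is not the function added by the patch: `yv12_to_nv12_chroma` and its parameters are hypothetical. It relies only on the layout of the two formats: yv12 stores separate V and U planes, while nv12 stores one plane of interleaved U, V byte pairs. The luma plane is identical in both formats and can be copied row by row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the vl/util code: interleave the planar U and V
 * samples of a yv12 image into the single UV plane of an nv12 surface.
 * Pitches are in bytes; chroma_width/chroma_height are in chroma samples. */
static void yv12_to_nv12_chroma(uint8_t *uv_dst, size_t dst_pitch,
                                const uint8_t *u_src, const uint8_t *v_src,
                                size_t src_pitch,
                                size_t chroma_width, size_t chroma_height)
{
    /* Each destination row holds two bytes per chroma sample. */
    assert(dst_pitch >= 2 * chroma_width);
    assert(src_pitch >= chroma_width);

    for (size_t y = 0; y < chroma_height; y++) {
        uint8_t *dst_row = uv_dst + y * dst_pitch;
        const uint8_t *u_row = u_src + y * src_pitch;
        const uint8_t *v_row = v_src + y * src_pitch;

        for (size_t x = 0; x < chroma_width; x++) {
            dst_row[2 * x]     = u_row[x];  /* nv12 stores U first */
            dst_row[2 * x + 1] = v_row[x];  /* then V */
        }
    }
}
```

For IYUV input with the U and V fields swapped, as the putimage commit mentions, the caller would simply pass the two plane pointers in the other order.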
2016-07-25st/va: add encode entrypoint v2Boyuan Zhang4-39/+150
VAAPI passes PIPE_VIDEO_ENTRYPOINT_ENCODE as entry point for encoding case. We will save this encode entry point in config. config_id was used as profile previously. Now, config has both profile and entrypoint field, and config_id is used to get the config object. Later on, we pass this entrypoint to context->templat.entrypoint instead of always hardcoded to PIPE_VIDEO_ENTRYPOINT_BITSTREAM for decoding case previously. Encode entrypoint is not accepted by driver until we enable Vaapi encode in later patch. v2 (chk): fix commit message to match 80 chars, use switch instead of ifs, fix memory leaks in the error path, implement vlVaQueryConfigEntrypoints as well, drop VAEntrypointEncPicture (only used for JPEG). Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2016-07-24nvc0: upload sample locations on GM20xSamuel Pitoiset3-5/+31
This fixes a bunch of multisample piglit tests on GM206, like bin/arb_texture_multisample-texelfetch 2 -auto -fbo Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-24freedreno/a4xx: time-elapsed query should be active for clearsRob Clark1-1/+1
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-24nvc0/ir: fix up an assertion in emitUADD()Samuel Pitoiset1-4/+3
It's illegal to have neg modifiers on both sources for OP_ADD, and it's illegal to have OP_SUB with just src0 neg. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-23nvc0: fix wrong indentation in nvc0_validate_fb()Samuel Pitoiset1-141/+141
Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-23glsl: reuse main extension table to appropriately restrict extensionsIlia Mirkin14-354/+268
Previously we were only restricting based on ES/non-ES-ness and whether the overall enable bit had been flipped on. However we have been adding more fine-grained restrictions, such as based on compat profiles, as well as specific ES versions. Most of the time this doesn't matter, but it can create awkward situations and duplication of logic. Here we separate the main extension table into a separate object file, linked to the glsl compiler, which makes use of it with a custom function which takes the ES-ness of the shader into account (thus allowing desktop shaders to properly use ES extensions that would otherwise have been disallowed.) We can also now use this logic to generate #define's for all supported extensions automatically, removing the duplicate (and often inaccurate) list in glcpp. The effect of this change should be nil in most cases. However in some situations, extensions like GL_ARB_gpu_shader5 which were formerly available in compat contexts on the GLSL side of things will now become inaccessible. This regresses two ES CTS tests: ES3-CTS.shaders.shader_integer_mix.define ES31-CTS.shader_integer_mix.define however that is due to them using #version 100 instead of 300 es. As the extension is only defined for ES3, I believe this is the correct behavior. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v2) v2 -> v3: integrate glcpp defines into the same mechanism
2016-07-23freedreno/a4xx: timestamp queriesRob Clark3-1/+34
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-23freedreno: hw timestamp supportRob Clark3-3/+16
If the kernel supports it, use hw counter for timestamps. Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-23freedreno: prep work for timestamp queriesRob Clark3-6/+10
We need "NULL" state to be a valid bit in the bitmask, because timestamp queries are not restricted to draw/etc stages (i.e. the only commands to submit may just be to read the timestamp). The absence of draws is not a reason to skip the flush and return zero. Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-23radeonsi: ensure sample locations are set for line and polygon smoothingNicolai Hähnle1-2/+1
Since commit d938b8c, the sample locations are no longer set unconditionally, so we need to set the atom to dirty on all chips, not just Polaris. Cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-07-23radeonsi: fix Polaris MSAA regressionNicolai Hähnle2-15/+20
The regression was introduced by commit d938b8c. The problem here is that in order to use the small primitive filter, we need to explicitly set the sample locations to 0. But the DB doesn't properly process the change of sample locations without a flush, and so we can end up with incorrect Z values. Instead of doing a flush, just disable the small primitive filter when MSAA is force-disabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96908 Cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-07-23freedreno/ir3: Add missing braces in initializerfrancians@gmail.com1-1/+1
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-23freedreno/a2xx: silence missing case 'SHADER_COMPUTE' warning (v2)francians@gmail.com1-0/+2
v2: no need for break after an unreachable (Matt Turner) Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-23radeonsi: implement buffer_subdata without indirect callsMarek Olšák5-5/+41
There is less noise in CPU profile data now. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-23gallium/util: don't modify usage in pipe_buffer_writeMarek Olšák2-9/+7
All drivers were already doing it except virgl. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-23gallium: split transfer_inline_write into buffer and texture callbacksMarek Olšák57-389/+383
to reduce the call indirections with u_resource_vtbl.
The worst call tree you could get was:
- u_transfer_inline_write_vtbl
- u_default_transfer_inline_write
- u_transfer_map_vtbl
- driver_transfer_map
- u_transfer_unmap_vtbl
- driver_transfer_unmap
That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites.
The new interface is:
pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride)
v2: fix whitespace, correct ilo's behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Roland Scheidegger <sroland@vmware.com>
2016-07-22nir: Lower interp_var_at_* like a normal load_var for flat inputs.Kenneth Graunke1-0/+4
"flat centroid" and "flat sample" both just mean "flat", so we should ignore interpolateAtCentroid/Sample and just return the flat value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97032 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-07-22mesa: Don't call GenerateMipmap if Width or Height == 0.Kenneth Graunke1-0/+5
One of the WebGL 2.0 conformance tests is trying to call glGenerateMipmap with a width and height of 0. With the meta implementation, this generates a "framebuffer attachment incomplete" status, and falls back to the CPU path, calling MapTextureImage. Except that there's no actual texture to map, and we assert fail. There's no work to do in this case. The test expects it to succeed, so just return early with no error and avoid hassling the driver. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96911 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-07-22anv/pipeline: Set up point coord enablesJason Ekstrand1-0/+5
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-07-22spirv/nir: Add support for ImageQuerySamplesJason Ekstrand1-0/+3
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22spirv/nir: Handle texture projectorsJason Ekstrand1-0/+15
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22nir/spirv: Refactor coordinate handling in handle_textureJason Ekstrand1-29/+28
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22spirv/nir: Refactor type handling in handle_textureJason Ekstrand1-5/+8
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22spirv/nir: Move opcode selection higher up in handle_textureJason Ekstrand1-48/+48
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22anv/image: Assert that the image format is actually supportedJason Ekstrand1-2/+5
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22spirv/nir: Don't increment coord_components for array lod queriesJason Ekstrand1-1/+1
For lod query instructions, we really don't care whether or not the sampler is an array type because that doesn't factor into the LOD. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22i965: Get rid of the do_lower_unnormalized_offsets passJason Ekstrand4-109/+0
We can do this in NIR now. No need to keep a GLSL pass lying around for it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22i965/nir: Enable NIR lowering of txf and rect offsetsJason Ekstrand1-0/+2
This fixes the following piglit tests on gen6+: tex-miplevel-selection textureProjGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRectShadow tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4 tex-miplevel-selection textureProjGradOffset 2DRectShadow Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22nir/lower_tex: Add support for lowering coordinate offsetsJason Ekstrand2-0/+64
On i965, we can't support coordinate offsets for texelFetch or rectangle textures. Previously, we were doing this with a GLSL pass but we need to do it in NIR if we want those workarounds for SPIR-V. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
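The lowering described here can be pictured with a toy model. It is not NIR's data structure or the actual pass: `struct toy_tex` and `toy_lower_offset` are hypothetical, and integer texelFetch coordinates are stored as floats purely to keep the sketch short. The point it shows is that texelFetch and rectangle textures both use coordinates measured in texels, so a texel offset can be folded into the coordinate with a plain addition.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_COORDS 3

/* Hypothetical, simplified texture instruction; not a NIR type. */
struct toy_tex {
    bool is_texel_fetch;           /* texelFetch: coordinates are texel indices */
    bool is_rect;                  /* rectangle texture: unnormalized coordinates */
    int num_coords;
    float coord[TOY_MAX_COORDS];
    int offset[TOY_MAX_COORDS];    /* offset in texels */
    bool has_offset;
};

/* Fold the texel offset into the coordinate and drop the offset source.
 * Returns true if the instruction was changed. */
static bool toy_lower_offset(struct toy_tex *tex)
{
    if (!tex->has_offset)
        return false;

    /* Normalized coordinates would need the offset divided by the texture
     * size, so this sketch leaves those instructions alone. */
    if (!tex->is_texel_fetch && !tex->is_rect)
        return false;

    assert(tex->num_coords <= TOY_MAX_COORDS);
    for (int i = 0; i < tex->num_coords; i++) {
        tex->coord[i] += (float)tex->offset[i];
        tex->offset[i] = 0;
    }
    tex->has_offset = false;
    return true;
}
```

After the pass runs, the backend never sees an offset on these instruction kinds, which is the property the commit wants for hardware that cannot apply one.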
2016-07-22nir/lower_tex: Add some helpers for working with tex sourcesJason Ekstrand1-16/+30
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22nir: Add a helper for determining the type of a texture sourceJason Ekstrand1-0/+44
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22anv/pipeline: Set binding_table.gather_texture_startJason Ekstrand1-0/+1
This should get texture gather working on gen8+ and mostly working on gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22spirv/nir: Properly handle gather componentsJason Ekstrand1-1/+11
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22spirv/nir: Add support for shadow samplers that return vec4Jason Ekstrand1-1/+2
While SPIR-V technically doesn't support "old style" shadow, the shadow-compare gather instruction does return a vec4 so we need to be able to set the old_style_shadow bit in NIR. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22spirv/nir: Fix some texture opcode assertsJason Ekstrand1-2/+2
We can't get an lod with txf_ms and SPIR-V considers textureGrad to be an explicit-LOD texturing instruction. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22nv50/ir: allow to swap sources for OP_SUBSamuel Pitoiset1-1/+6
This allows the load-propagation pass to swap the sources in presence of immediate values.
Maxwell (GM107):
total instructions in shared programs :1928187 -> 1927634 (-0.03%)
total gprs used in shared programs :330741 -> 330154 (-0.18%)
total local used in shared programs :28032 -> 28032 (0.00%)
        local  gpr  inst  bytes
helped  0      271  425   425
hurt    0      0    194   194
Fermi (GF114):
total instructions in shared programs :2334474 -> 2333829 (-0.03%)
total gprs used in shared programs :380934 -> 380215 (-0.19%)
total local used in shared programs :33304 -> 33264 (-0.12%)
        local  gpr  inst  bytes
helped  5      314  521   521
hurt    0      4    195   195
No regressions on GM107 and GF114 with full piglit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-22gallium/radeon: make deferred flushes asynchronousMarek Olšák1-0/+2
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-22gallium: add PIPE_FLUSH_DEFERREDMarek Olšák3-2/+13
There are 2 uses:
- Asynchronous flushing for multithreaded drivers.
- Return a fence without flushing (mid-command-buffer fence). The driver can defer flushing until fence_finish is called.
This is required to make Bioshock Infinite faster, which creates 1000 fences (flushes) per frame. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Rob Clark <robdclark@gmail.com>
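The second use, returning a fence without flushing, can be modelled with a toy sketch. None of the names below are Gallium's API: `toy_context`, `toy_fence`, `toy_flush` and `toy_fence_finish` are hypothetical stand-ins meant only to show why deferring the flush until a fence is actually waited on reduces the number of submissions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model; every name here is hypothetical and not part of Gallium. */
struct toy_context {
    unsigned pending_cmds;  /* commands recorded but not yet submitted */
    unsigned num_flushes;   /* number of real submissions so far */
};

struct toy_fence {
    struct toy_context *ctx;
    bool needs_flush;       /* true while the flush is still deferred */
};

/* Submit everything recorded so far, if there is anything to submit. */
static void toy_submit(struct toy_context *ctx)
{
    if (ctx->pending_cmds > 0) {
        ctx->pending_cmds = 0;
        ctx->num_flushes++;
    }
}

/* Create a fence. With deferred == true nothing is submitted yet. */
static struct toy_fence toy_flush(struct toy_context *ctx, bool deferred)
{
    struct toy_fence fence = { ctx, deferred };
    if (!deferred)
        toy_submit(ctx);
    return fence;
}

/* Waiting on a fence forces the deferred flush to happen first. */
static void toy_fence_finish(struct toy_fence *fence)
{
    assert(fence->ctx != NULL);
    if (fence->needs_flush) {
        toy_submit(fence->ctx);
        fence->needs_flush = false;
    }
    /* A real driver would now wait for the GPU to signal the fence. */
}
```

In this model an application that creates many fences per frame but waits on only one of them pays for a single submission instead of one per fence.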
2016-07-22gallium/os: use CLOCK_MONOTONIC for sleeps (v2)Marek Olšák2-6/+14
v2: handle EINTR, remove backslashes Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
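A sleep on CLOCK_MONOTONIC that handles EINTR might look like the sketch below. It is not the code from the patch: `sleep_monotonic_usec` is a hypothetical name, and the use of an absolute deadline is an assumption chosen so that a retry after a signal does not lengthen the total sleep. It relies on the POSIX `clock_gettime` and `clock_nanosleep` calls, so it assumes a platform such as Linux that provides both.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose clock_gettime and clock_nanosleep */
#endif
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: sleep for `usecs` microseconds measured on the
 * monotonic clock. Returns 0 on success or an errno-style error code. */
static int sleep_monotonic_usec(int64_t usecs)
{
    struct timespec deadline;
    int ret;

    if (clock_gettime(CLOCK_MONOTONIC, &deadline) != 0)
        return errno;

    /* Turn the relative duration into an absolute deadline. */
    deadline.tv_sec += (time_t)(usecs / 1000000);
    deadline.tv_nsec += (long)(usecs % 1000000) * 1000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    /* clock_nanosleep returns the error number directly instead of setting
     * errno. If a signal interrupts the sleep, retry with the same deadline. */
    do {
        ret = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
    } while (ret == EINTR);

    return ret;
}
```

Unlike a sleep based on the wall clock, the monotonic clock is not affected when the system time is adjusted while the thread is sleeping.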
2016-07-22mapi: fix typo in macro nameEric Engestrom3-3/+3
Fixes: 5ec140c17b54c2592009 ("mapi: Massage code to allow clang to compile.") Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>