Age | Commit message (Collapse) | Author | Files | Lines |
|
Unused since 1a6ae840413d7fb6d2e83f6a83081d5246c7ac9e
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
|
|
When possible, do the memcpy on larger blocks. This reduces cycles
spent in _mesa_propagate_uniforms_to_driver_storage from
1.51 % to 0.62% according to perf during the Unigine Heaven benchmark.
It did not affect the framerate of the benchmark. The system used for
testing was an i5 6600K with a Radeon R9 380.
Piglit hangs randomly on this system both with and without the patch
so i could not make a comparison.
v2: fixed whitespace
Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
|
|
Enable H.264 VAAPI encoding through config. Currently only H.264 baseline is supported. Encode entrypoint is not accepted by driver.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
|
|
Frame rate can be passed to driver either through VAEncSequenceParameterBufferType or VAEncMiscParameterTypeFrameRate. Previous code only implement the former one, which is used by Gstreamer-Vaapi. Now adding implementation for VAEncMiscParameterTypeFrameRate. Also adding default frame rate as 30 just in case application never provides frame rate information to driver.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
|
|
Add environmental variable to disable interlace mode. At VAAPI decoding stage, driver can not distinguish b/w pure decoding case and transcoding case. And since interlace encoding is not supported, we have to disable interlace for transcoding case. The temporary solution is to use enviromental variable to disable interlace mode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
|
|
Add some hardcoded values hardware needs mainly for rate control purpose. With previously hardcoded values for OMX, the rate control result is not correct. This change fixed the rate control result by setting correct values for Vaapi.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
|
|
Add necessary functions/changes for VAAPI encoding to buffer and picture. These changes will allow driver to handle all Vaapi encode related operations. This patch doesn't change the Vaapi decode behaviour.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
|
|
Rate control method is passed from app to driver through config attrib list.
That is why we need to store this rate control method to config. And later
on, we will pass this value to context->desc.h264enc.rate_ctrl.rate_ctrl_method.
v2 (chk): fix broken build and commit message
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
For putimage call, if image format is yv12 (or IYUV with U V field swap) and
surface format is nv12, then we need to convert yv12 to nv12 and then copy
the converted data from image to surface. We can't use the existing logic
where surface is destroyed and re-created with yv12 format.
v2 (chk): fix some compiler warnings and commit message
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Add function to copy from yv12 image to nv12 surface for VAAPI putimage call.
We need this function in VaPutImage call where copying from yv12 image to nv12
surface for encoding. Existing function can't be used because it only work for
copying from yv12 surface to nv12 image in Vaapi.
v2: cleanup variable types and commit message
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
VAAPI passes PIPE_VIDEO_ENTRYPOINT_ENCODE as entry point for encoding case. We
will save this encode entry point in config. config_id was used as profile
previously. Now, config has both profile and entrypoint field, and config_id is
used to get the config object. Later on, we pass this entrypoint to
context->templat.entrypoint instead of always hardcoded to
PIPE_VIDEO_ENTRYPOINT_BITSTREAM for decoding case previously. Encode entrypoint
is not accepted by driver until we enable Vaapi encode in later patch.
v2 (chk): fix commit message to match 80 chars, use switch instead of ifs,
fix memory leaks in the error path, implement vlVaQueryConfigEntrypoints
as well, drop VAEntrypointEncPicture (only used for JPEG).
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
This fixes a bunch of multisample piglit tests on GM206, like
bin/arb_texture_multisample-texelfetch 2 -auto -fbo
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
It's illegal to have neg modifiers on both sources for OP_ADD,
and it's illegal to have OP_SUB with just src0 neg.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Trivial.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
|
|
Previously we were only restricting based on ES/non-ES-ness and whether
the overall enable bit had been flipped on. However we have been adding
more fine-grained restrictions, such as based on compat profiles, as
well as specific ES versions. Most of the time this doesn't matter, but
it can create awkward situations and duplication of logic.
Here we separate the main extension table into a separate object file,
linked to the glsl compiler, which makes use of it with a custom
function which takes the ES-ness of the shader into account (thus
allowing desktop shaders to properly use ES extensions that would
otherwise have been disallowed.) We can also now use this logic to
generate #define's for all supported extensions automatically, removing
the duplicate (and often inaccurate) list in glcpp.
The effect of this change should be nil in most cases. However in some
situations, extensions like GL_ARB_gpu_shader5 which were formerly
available in compat contexts on the GLSL side of things will now become
inaccessible.
This regresses two ES CTS tests:
ES3-CTS.shaders.shader_integer_mix.define
ES31-CTS.shader_integer_mix.define
however that is due to them using #version 100 instead of 300 es. As the
extension is only defined for ES3, I believe this is the correct
behavior.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v2)
v2 -> v3: integrate glcpp defines into the same mechanism
|
|
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
If the kernel supports it, use hw counter for timestamps.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
We need "NULL" state to be a valid bit in the bitmask, because timestamp
queries are not restricted to draw/etc stages (ie. the only commands to
submit may just be to read the timestamp). And just because there are
no draws, isn't a reason to skip the flush and return zero.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Since commit d938b8c, the sample locations are no longer set unconditionally,
so we need to set the atom to dirty on all chips, not just Polaris.
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
|
|
The regression was introduced by commit d938b8c. The problem here is that in
order to use the small primitive filter, we need to explicitly set the sample
locations to 0. But the DB doesn't properly process the change of sample
locations without a flush, and so we can end up with incorrect Z values.
Instead of doing a flush, just disable the small primitive filter when MSAA
is force-disabled.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96908
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
|
|
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
v2: no need for break after an unreachable (Matt Turner)
Signed-off-by: Francesco Ansanelli <francians@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
There is less noise in CPU profile data now.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
|
|
All drivers were already doing it except virgl.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
|
|
to reduce the call indirections with u_resource_vtbl.
The worst call tree you could get was:
- u_transfer_inline_write_vtbl
- u_default_transfer_inline_write
- u_transfer_map_vtbl
- driver_transfer_map
- u_transfer_unmap_vtbl
- driver_transfer_unmap
That's 6 indirect calls. Some drivers only had 5. The goal is to have
1 indirect call for drivers that care. The resource type can be determined
statically at most call sites.
The new interface is:
pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
pipe_context::texture_subdata(ctx, resource, level, usage, box, data,
stride, layer_stride)
v2: fix whitespace, correct ilo's behavior
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
|
|
"flat centroid" and "flat sample" both just mean "flat", so we should
ignore interpolateAtCentroid/Sample and just return the flat value.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97032
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
One of the WebGL 2.0 conformance tests is trying to call
glGenerateMipmaps with a width and height of 0. With the meta
implementation, this generates a "framebuffer attachment incomplete"
status, and falls back to the CPU path, calling MapTextureImage.
Except that there's no actual texture to map, and we assert fail.
There's no work to do in this case. The test expects it to succeed,
so just return early with no error and avoid hassling the driver.
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96911
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
For lod query instructions, we really don't care whether or not the sampler
is an array type because that doesn't factor into the LOD.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
We can do this in NIR now. No need to keep a GLSL pass lying around for
it.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
This fixes the following piglit tests on gen6+:
tex-miplevel-selection textureProjGradOffset 2DRect
tex-miplevel-selection textureGradOffset 2DRect
tex-miplevel-selection textureGradOffset 2DRectShadow
tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4
tex-miplevel-selection textureProjGradOffset 2DRectShadow
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
On i965, we can't support coordinate offsets for texelFetch or rectangle
textures. Previously, we were doing this with a GLSL pass but we need to
do it in NIR if we want those workarounds for SPIR-V.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
This should get texture gather working on gen8+ and mostly working on gen7.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
While SPIR-V technically doesn't support "old style" shadow, the
shadow-compare gather instruction does return a vec4 so we need to be able
to set the old_style_shadow bit in NIR.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
We can't get an lod with txf_ms and SPIR-V considers textureGrad to be an
explicit-LOD texturing instruction.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
|
|
This allows the load-propagation pass to swap the sources in presence
of immediate values.
Maxwell (GM107):
total instructions in shared programs :1928187 -> 1927634 (-0.03%)
total gprs used in shared programs :330741 -> 330154 (-0.18%)
total local used in shared programs :28032 -> 28032 (0.00%)
local gpr inst bytes
helped 0 271 425 425
hurt 0 0 194 194
Fermi (GF114):
total instructions in shared programs :2334474 -> 2333829 (-0.03%)
total gprs used in shared programs :380934 -> 380215 (-0.19%)
total local used in shared programs :33304 -> 33264 (-0.12%)
local gpr inst bytes
helped 5 314 521 521
hurt 0 4 195 195
No regressions on GM107 and GF114 with full piglit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
|
|
There are 2 uses:
- Asynchronous flushing for multithreaded drivers.
- Return a fence without flushing (mid-command-buffer fence). The driver
can defer flushing until fence_finish is called.
This is required to make Bioshock Infinite faster, which creates
1000 fences (flushes) per frame.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
|
|
v2: handle EINTR, remove backslashes
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
|
|
Fixes: 5ec140c17b54c2592009 ("mapi: Massage code to allow clang to compile.")
Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
|