Age | Commit message | Author | Files | Lines
2015-01-10 | get the dispatch modes right [ilk-fast-clear] | Kenneth Graunke | 1 | -21/+34
Should get cleaned up. Appears to be working now (not yet run through piglit, but various apps work), and appears to actually be using the repclear shader. However, glxgears is about 9% slower, aquarium seems a bit slower if anything, and clearspd looks basically the same. Huh.
2015-01-10 | i965: Enable replicated color clears on Ironlake. | Kenneth Graunke | 1 | -1/+1
Now that this is done via Meta and not BLORP, we can trivially enable it on Ironlake as well. We could probably enable it on G45 as well, but we currently don't even try to compile SIMD16 programs on G45 since there's only one kernel start pointer, and you have to use jumps. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-10 | i965: Make emit_repclear_shader use message headers on Gen4-5. | Kenneth Graunke | 1 | -1/+1
Ironlake and earlier hardware don't support headerless FB write messages. Conveniently, we already have code to support using a message header since it's required for MRT on Gen6+; just use that path. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-10 | i965: Respect the viewport transformation enable flag on Gen4-5. | Kenneth Graunke | 1 | -1/+2
In commit ff7a2fc322a0ae0a36a976444b7506e9313ac630, Kristian made viewport transformation optional on Gen6+, so he could disable it for RECTLIST primitives. Respecting that flag will allow us to use RECTLISTs on Gen4-5 as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-10 | i965: Disable clipping for RECTLIST primitives on Gen4-5. | Kenneth Graunke | 2 | -2/+16
We've never used rectangle primitives on Gen4-5 up until now, and attempting to emit one will trigger unreachable assertions when switching on the GL primitive type. On Gen6+, rectangle primitives aren't clipped properly, and are only used for driver-internal blit-like functionality, where we don't need clipping anyway. So, when using RECTLISTs, simply disable clipping by setting it to "accept all" mode. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-10 | i965: Initialize grf_used in fs_visitor::emit_repclear_shader(). | Kenneth Graunke | 1 | -0/+4
fs_visitor::grf_used is used to calculate prog_data->reg_blocks_16, which was getting populated with an uninitialized value. Normally, grf_used is set by the register allocator, but emit_repclear_shader bypasses that; it must set grf_used directly.

This bug was well hidden: reg_blocks and reg_blocks_16 are unused on Gen6+. However, when uploading programs, we look for an existing entry in the cache. If we find a potential hit, we memcmp the two prog_data structures, including the uninitialized reg_blocks_16 field.

Of course, this only happens if we upload the repclear shader twice - which only happens if the precompile guesses the program key incorrectly. On Gen6+, precompile guesses the program key for the repclear shader correctly, so this bug ought to remain untriggered. However, on Gen5, we make more mistakes when guessing the key, which is how I spotted the bug. It is reproducible on Haswell by making precompile choose daft values for fields.

v2: Program a correct value. I originally programmed 0 based on a misreading of the trivial register allocator.

Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
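The cache interaction described in this message can be sketched in a few lines of C. This is only an illustration under stated assumptions: `demo_prog_data`, `demo_reg_blocks`, `demo_compile` and `demo_cache_hit` are hypothetical stand-ins, not the real i965 structures or functions, and the block formula is made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a prog_data structure; not the real layout. */
struct demo_prog_data {
    uint32_t dispatch_width;
    uint32_t reg_blocks_16;   /* derived from the GRF count */
};

/* Made-up derivation, for illustration only: 16 registers per block. */
static uint32_t demo_reg_blocks(uint32_t grf_used)
{
    return (grf_used + 15) / 16;
}

/* Builds prog_data from whatever value grf_used happens to hold.
 * If the caller never initialized grf_used, the result is garbage too. */
static struct demo_prog_data demo_compile(uint32_t grf_used)
{
    struct demo_prog_data d;
    memset(&d, 0, sizeof(d));
    d.dispatch_width = 16;
    d.reg_blocks_16 = demo_reg_blocks(grf_used);
    return d;
}

/* A cache hit requires the two structures to be byte-for-byte equal,
 * so a field that nothing reads still takes part in the comparison. */
static bool demo_cache_hit(const struct demo_prog_data *a,
                           const struct demo_prog_data *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

Two compiles of the same shader compare equal only when `grf_used` held the same value both times, which is why a field that is unused on Gen6+ can still matter to the cache lookup.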
2015-01-11 | vc4: Avoid the save/restore of r3 for raddr conflicts, just use ra31. | Eric Anholt | 2 | -38/+11
Turns out this was harmful in code quality:

total instructions in shared programs: 39487 -> 38845 (-1.63%)
instructions in affected programs: 22522 -> 21880 (-2.85%)

This costs us yet another register, which is painful (since it means more programs might fail to compile). However, the alternative was causing us trouble where we'd save/restore r3 while it contained a MIN-ed direct texture offset, causing the kernel to fail to validate our shaders (such as in GLB2.7).
2015-01-10 | vc4: Allow dead code elimination of VPM reads. | Eric Anholt | 2 | -1/+44
This gets a bunch of dead reads out of the CSes, which don't read most attributes generally.

total instructions in shared programs: 39753 -> 39487 (-0.67%)
instructions in affected programs: 4721 -> 4455 (-5.63%)
2015-01-10 | vc4: Cook up the draw-time VPM setup info during shader compile. | Eric Anholt | 4 | -11/+28
This will give the compiler the chance to dead-code eliminate unused VPM reads. This is particularly a big deal in the CS where a bunch of vattrs are just not going to be used.
2015-01-10 | vc4: Split two notions of instructions having side effects. | Eric Anholt | 5 | -4/+15
Some ops can't be DCEd, while some of the ops that are just important due to the args they have can be.
2015-01-10 | vc4: Redo VPM reads as a read file. | Eric Anholt | 5 | -16/+16
This will let us do copy propagation of the VPM reads.
2015-01-10 | vc4: Fix miscalculation of the VPM space. | Eric Anholt | 1 | -1/+1
We pass in a byte offset, not dword. I'm rather scared that this actually managed to pass piglit, but it does fix gears.
2015-01-10 | vc4: Pack VPM attr contents according to just the size of the attribute. | Eric Anholt | 3 | -11/+9
total instructions in shared programs: 40960 -> 39753 (-2.95%)
instructions in affected programs: 20871 -> 19664 (-5.78%)
2015-01-10 | vc4: Restructure color packing as a series of channel replacements. | Eric Anholt | 4 | -49/+60
I'm using this in some WIP commits for doing blending in 8888 instead of vec4. But it also gives us these results immediately, thanks to allowing more uniforms/immediates in the arguments:

total instructions in shared programs: 41027 -> 40960 (-0.16%)
instructions in affected programs: 4381 -> 4314 (-1.53%)
2015-01-10 | vc4: Fix the no-copy-propagating-from-TLB_COLOR_READ check. | Eric Anholt | 1 | -1/+1
Our MOV's dst obviously won't be the TLB_COLOR_READ's def, because we're ssa.
2015-01-10 | vc4: Move global seqno short-circuiting to vc4_wait_seqno(). | Eric Anholt | 2 | -6/+3
Any other caller would want it, too.
2015-01-10 | state_tracker: Fix assertion failures in conditional block movs. | Eric Anholt | 1 | -31/+26
If you had a conditional assignment of an array or struct (say, from the if-lowering pass), we'd try doing swizzle_for_size() on the aggregate type, and it would fail an assertion due to vector_elements==0. Instead, extend emit_block_mov() to handle emitting the conditional operations, which also means we'll have appropriate writemasks/swizzles on the CMPs within a struct containing various-sized members. Fixes 20 testcases in es3conform on vc4. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-08 | i965: Consider SEL.{GE,L} to be commutative operations. | Matt Turner | 2 | -10/+27
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-08 | i965/cfg: Fix end_ip of last basic block. | Matt Turner | 1 | -1/+1
start_ip and end_ip are inclusive. Increases instruction counts in 64 shaders in shader-db, likely indicative of them previously being misoptimized.
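As a reminder of what inclusive bounds imply, here is a minimal C sketch. It is not the actual one-line change from this commit; `demo_block`, `demo_block_length` and `demo_last_block` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical basic-block record; both bounds are inclusive. */
struct demo_block {
    int start_ip;
    int end_ip;
};

/* With inclusive bounds the instruction count needs the "+ 1". */
static int demo_block_length(const struct demo_block *b)
{
    assert(b->end_ip >= b->start_ip);
    return b->end_ip - b->start_ip + 1;
}

/* The last block of a program with num_instructions instructions
 * ends at index num_instructions - 1, not num_instructions. */
static struct demo_block demo_last_block(int start_ip, int num_instructions)
{
    struct demo_block b;
    b.start_ip = start_ip;
    b.end_ip = num_instructions - 1;
    return b;
}
```

An off-by-one in either direction changes how many instructions a pass believes the final block contains, which is consistent with the shader-db differences reported above.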
2015-01-08 | mesa: compute row stride outside of loop and fix MSVC compilation error | Brian Paul | 1 | -2/+4
Can't do void pointer arithmetic with MSVC. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-08 | mesa: fix MSVC compilation errors | Brian Paul | 1 | -5/+5
Move assertions after declarations and don't use void pointer arithmetic. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
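The MSVC-friendly pattern referred to in the two commits above can be sketched as follows. This is an illustrative example rather than the Mesa code itself; `demo_copy_rows` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `rows` rows of `row_bytes` bytes each from a strided source
 * into a tightly packed destination. */
static void demo_copy_rows(void *dst, const void *src,
                           size_t src_stride, size_t row_bytes, size_t rows)
{
    /* C89 style: every declaration precedes the first statement,
     * so the assertions come after them. */
    unsigned char *d = dst;          /* byte pointers: arithmetic on   */
    const unsigned char *s = src;    /* void * is a GNU extension      */
    const size_t dst_stride = row_bytes;  /* computed once, outside the loop */
    size_t r;

    assert(dst != NULL);
    assert(src != NULL);
    assert(row_bytes <= src_stride);

    for (r = 0; r < rows; r++) {
        memcpy(d + r * dst_stride, s + r * src_stride, row_bytes);
    }
}
```

Casting to `unsigned char *` makes the pointer arithmetic well defined in standard C, and keeping declarations ahead of statements avoids relying on C99 mixed declarations.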
2015-01-08 | main: Checking for cube completeness in TextureSubImage. | Laura Ekstrand | 1 | -13/+35
This is part of a potential solution to a spec bug. Cube completeness is a concept from glGenerateMipmap, but it seems reasonable to check for it in TextureSubImage when target=GL_TEXTURE_CUBE_MAP. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Checking for cube completeness in GetTextureImage. | Laura Ekstrand | 1 | -12/+35
This is part of a potential solution to a spec bug. Cube completeness is a concept from glGenerateMipmap, but it seems reasonable to check for it in GetTextureImage when the target is GL_TEXTURE_CUBE_MAP. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added _mesa_cube_level_complete to check for the completeness of an arbitrary cube map level. | Laura Ekstrand | 2 | -9/+18
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-08 | main: glDeleteTextures now throws GL_INVALID_VALUE if n is negative. | Laura Ekstrand | 1 | -0/+5
This is in conformance with the OpenGL spec. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
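A minimal sketch of the argument check this commit describes, assuming a simplified deletion routine. `demo_delete_textures` and the `demo_error` enum are hypothetical; only the numeric value of GL_INVALID_VALUE (0x0501) is taken from the OpenGL headers.

```c
#include <stddef.h>

enum demo_error {
    DEMO_NO_ERROR = 0,
    DEMO_INVALID_VALUE = 0x0501   /* same value as GL_INVALID_VALUE */
};

/* Rejects a negative count before touching the name array.
 * Name 0 is skipped, mirroring how glDeleteTextures silently ignores it.
 * *deleted receives the number of names that would be deleted. */
static enum demo_error demo_delete_textures(int n, const unsigned *names,
                                            int *deleted)
{
    int i;

    *deleted = 0;
    if (n < 0) {
        return DEMO_INVALID_VALUE;
    }
    for (i = 0; i < n; i++) {
        if (names[i] != 0) {
            (*deleted)++;
        }
    }
    return DEMO_NO_ERROR;
}
```

Checking the count first means a negative `n` can never be used as a loop bound or a size.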
2015-01-08 | main: Refactor in teximage.c to handle NULL from _mesa_get_current_tex_object. | Laura Ekstrand | 1 | -0/+22
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry point for glTextureBuffer. | Laura Ekstrand | 4 | -16/+92
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Fix texObj->Immutable flag update in _mesa_texture_image_multisample. | Laura Ekstrand | 1 | -1/+1
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry points for glTextureStorage[23]DMultisample. | Laura Ekstrand | 4 | -28/+137
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry point for glGenerateTextureMipmap. | Laura Ekstrand | 4 | -20/+62
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry points for glCompressedTextureSubImage*D. | Laura Ekstrand | 4 | -52/+256
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry point for glGetCompressedTextureImage. | Laura Ekstrand | 4 | -45/+141
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry point for glGetTextureImage. | Laura Ekstrand | 4 | -66/+249
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Nameless texture creation and deletion. Does not affect normal creation and deletion paths. | Laura Ekstrand | 2 | -0/+69
In implementing ARB_DIRECT_STATE_ACCESS functions, it is often necessary to abstract the functionality of a traditional GL API function into a backend that both the traditional and DSA API functions can share. For instance, glTexParameteri and glTextureParameteri both call _mesa_texture_parameteri, which takes a context object and a texture object as arguments. The existence of such backend functions provides the opportunity for driver internals (such as meta) to pass around the actual texture object rather than its ID or target, saving on texture object storage and look-up overhead. This patch provides nameless texture creation and deletion for meta. This will be used in an upcoming refactor of meta. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
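The shared-backend arrangement described in this message can be sketched roughly as below. All names here (`demo_context`, `demo_texture`, `demo_texture_parameteri`, and the two entry points) are hypothetical simplifications, not Mesa's real types or signatures.

```c
#include <stddef.h>

struct demo_texture {
    unsigned name;
    int min_filter;
};

struct demo_context {
    struct demo_texture *bound;   /* texture bound to the current target */
    struct demo_texture *table;   /* all known textures */
    size_t table_len;
};

/* Shared backend: works on the texture object itself, so internal
 * callers can skip both the binding point and the name lookup. */
static void demo_texture_parameteri(struct demo_context *ctx,
                                    struct demo_texture *tex, int value)
{
    (void)ctx;
    tex->min_filter = value;
}

/* Traditional entry point: resolves the object from the binding. */
static int demo_TexParameteri(struct demo_context *ctx, int value)
{
    if (ctx->bound == NULL) {
        return 0;
    }
    demo_texture_parameteri(ctx, ctx->bound, value);
    return 1;
}

/* DSA entry point: resolves the object from its name. */
static int demo_TextureParameteri(struct demo_context *ctx,
                                  unsigned name, int value)
{
    size_t i;

    for (i = 0; i < ctx->table_len; i++) {
        if (ctx->table[i].name == name) {
            demo_texture_parameteri(ctx, &ctx->table[i], value);
            return 1;
        }
    }
    return 0;   /* unknown name */
}
```

Code that already holds a `demo_texture *` can call the backend directly, which is the kind of shortcut the message says nameless textures make available to meta.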
2015-01-08 | main: Added entry points for CopyTextureSubImage*D. | Laura Ekstrand | 4 | -48/+183
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Fixed some comments in texparam.c | Laura Ekstrand | 1 | -2/+2
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv. | Laura Ekstrand | 4 | -34/+141
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry point for glGetTextureParameterfv. | Laura Ekstrand | 4 | -12/+52
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry points for glGetTextureLevelParameteriv, fv. | Laura Ekstrand | 4 | -32/+131
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: legal_get_tex_level_parameter_target now handles GL_TEXTURE_CUBE_MAP. | Laura Ekstrand | 1 | -2/+13
ARB_DIRECT_STATE_ACCESS functions allow an effective target of GL_TEXTURE_CUBE_MAP. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry points for glTextureParameteriv, Iiv, Iuiv. | Laura Ekstrand | 4 | -34/+156
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry point for glTextureParameteri. | Laura Ekstrand | 2 | -11/+52
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry point for glTextureParameterfv. | Laura Ekstrand | 4 | -13/+59
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry point for glTextureParameterf. | Laura Ekstrand | 4 | -10/+66
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added get_texobj_by_name in texparam.c. | Laura Ekstrand | 1 | -13/+51
This is a convenience function for *Texture*Parameter functions. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: set_tex_parameterf now handles errors according to the OpenGL 4.5 Specification. | Laura Ekstrand | 1 | -17/+20
Beginning in the OpenGL 4.3 core specification, certain error handling has changed. One example shown here is that INVALID_ENUM is thrown instead of INVALID_OPERATION when a user attempts to set sampler parameters for a multisample target. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: set_tex_parameteri now handles errors according to the OpenGL 4.5 Specification. | Laura Ekstrand | 1 | -28/+42
Beginning in the OpenGL 4.3 core specification, some error handling has changed (see OpenGL 4.5 core spec, 30.10.2014, Section 8.10 Texture Parameters, pages 228-29). As an example, changing sampler states with a multisample target throws INVALID_ENUM rather than INVALID_OPERATION. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
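The error-code change mentioned above might look roughly like the following check. This is a sketch under assumptions: the target and parameter enums are made-up stand-ins with arbitrary values, `demo_check_param` is hypothetical, and only the two error values (0x0500 for GL_INVALID_ENUM, 0x0502 for GL_INVALID_OPERATION) come from the OpenGL headers.

```c
#include <stdbool.h>

enum demo_error {
    DEMO_NO_ERROR = 0,
    DEMO_INVALID_ENUM = 0x0500,      /* same value as GL_INVALID_ENUM */
    DEMO_INVALID_OPERATION = 0x0502  /* same value as GL_INVALID_OPERATION */
};

/* Stand-in enums; the numeric values are arbitrary. */
enum demo_target {
    DEMO_TARGET_2D,
    DEMO_TARGET_2D_MULTISAMPLE,
    DEMO_TARGET_2D_MULTISAMPLE_ARRAY
};

enum demo_pname {
    DEMO_PNAME_MIN_FILTER,   /* sampler state */
    DEMO_PNAME_WRAP_S,       /* sampler state */
    DEMO_PNAME_BASE_LEVEL    /* texture state, not sampler state */
};

static bool demo_is_multisample(enum demo_target t)
{
    return t == DEMO_TARGET_2D_MULTISAMPLE ||
           t == DEMO_TARGET_2D_MULTISAMPLE_ARRAY;
}

static bool demo_is_sampler_state(enum demo_pname p)
{
    return p == DEMO_PNAME_MIN_FILTER || p == DEMO_PNAME_WRAP_S;
}

/* Sampler state on a multisample target is reported as INVALID_ENUM,
 * where older behaviour reported INVALID_OPERATION. DEMO_NO_ERROR only
 * means this particular check passed; further validation is not shown. */
static enum demo_error demo_check_param(enum demo_target t, enum demo_pname p)
{
    if (demo_is_multisample(t) && demo_is_sampler_state(p)) {
        return DEMO_INVALID_ENUM;
    }
    return DEMO_NO_ERROR;
}
```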
2015-01-08 | main: Added entry point for BindTextureUnit. | Laura Ekstrand | 6 | -5/+155
The following preparations were made in texstate.c and texstate.h to better facilitate the BindTextureUnit function:

Dylan Noblesmith:
mesa: add _mesa_get_tex_unit()
mesa: factor out _mesa_max_tex_unit()
This is about to appear in a lot more places, so reduce boilerplate copy paste.
add _mesa_get_tex_unit_err() checking getter function
Reduce boilerplate across files.

Laura Ekstrand:
Made note of why BindTextureUnit should throw GL_INVALID_OPERATION if the unit is out of range.
Added assert(unit > 0) to _mesa_get_tex_unit.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Corrected comment on _mesa_is_zero_size_texture. | Laura Ekstrand | 1 | -1/+1
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 | main: Added entry points for glTextureSubImage*D. | Laura Ekstrand | 4 | -81/+320
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>