Age Commit message Author Files Lines
2017-01-23radeonsi: preload PS inputs only if KILL is usedMarek Olšák1-2/+6
so that most shaders can get lower VGPR usage thanks to lazy input loading. I think this is a more accurate constraint that prevents the black transitions in Witcher 2. Affected shaders (7758): Max Waves: 57437 -> 58231 (1.38 %) Reviewed-by: Nicolai Hähnle <>
2017-01-23gallium/radeon: adjust the rule for using the LINEAR_ALIGNED layoutMarek Olšák1-1/+3
Reviewed-by: Nicolai Hähnle <>
2017-01-23winsys/amdgpu: drop all IBs if at least one was rejected within the contextMarek Olšák1-1/+7
The corruption is inevitable and hangs are possible too. Reviewed-by: Nicolai Hähnle <>
2017-01-23winsys/amdgpu: report a rejected IB as a lost contextMarek Olšák3-0/+14
Reviewed-by: Nicolai Hähnle <>
2017-01-24vulkan: import latest registry for 1.0.39 extensions.Dave Airlie1-42/+408
Acked-by: Jason Ekstrand <> Signed-off-by: Dave Airlie <>
2017-01-24vulkan: bump vulkan.h to 1.0.39 versionDave Airlie1-2/+365
This introduces a bunch of new extension defines. Acked-by: Jason Ekstrand <> Signed-off-by: Dave Airlie <>
2017-01-23radv: don't resubmit the same cs over and over while tracingGrazvydas Ignotas1-2/+1
Fixes: 97dfff54 ("radv: Dump command buffer on hang.") Signed-off-by: Grazvydas Ignotas <> Reviewed-by: Bas Nieuwenhuizen <> CC: <>
2017-01-23gallium/radeon: add HUD queries for monitoring some hw blocksSamuel Pitoiset4-1/+110
It's also possible to monitor them via performance counters but the hardware can only use two counters simultaneously. It seems easier to re-use the existing code which reads from MMIO instead of writing a multi-pass approach. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <> Reviewed-by: Marek Olšák <>
2017-01-23gallium/radeon: refactor the GRBM counters pathSamuel Pitoiset3-43/+47
This will allow exposing more queries in order to know which blocks are busy/idle. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <> Reviewed-by: Marek Olšák <>
2017-01-23swr: Align query results allocationGeorge Kyriazis2-4/+5
Some query results struct contents are declared as cache line aligned. Use aligned malloc, and align the whole struct, to be safe. Fixes crash when compiling with clang. CC: <> Reviewed-by: Bruce Cherniak <>
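A minimal sketch of the idea in the commit above, written in C11 rather than the driver's own allocation helper. The struct layout, the `CACHE_LINE` value and the function name are hypothetical and are not taken from the swr sources; the point is only that a struct with a cache-line-aligned member must itself come from an aligned allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE 64 /* assumed cache line size, not read from hardware */

/* Hypothetical query result: one member is declared cache-line aligned,
 * which raises the alignment requirement of the whole struct. */
struct query_result {
    _Alignas(CACHE_LINE) uint64_t counters[8];
    uint32_t flags;
};

/* Allocates a zeroed query_result on a cache-line boundary.
 * Returns NULL on allocation failure; release with free(). */
static struct query_result *query_result_create(void)
{
    struct query_result *r;

    /* aligned_alloc wants a size that is a multiple of the alignment;
     * struct padding already guarantees that here. */
    assert(sizeof(*r) % CACHE_LINE == 0);

    r = aligned_alloc(CACHE_LINE, sizeof(*r));
    if (r)
        memset(r, 0, sizeof(*r));
    return r;
}
```

A plain `malloc` only guarantees alignment suitable for fundamental types, so code generated on the assumption of a 64-byte aligned member may fault on such a pointer.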
2017-01-23swr: Prune empty nodes in CalculateProcessorTopology.Bruce Cherniak1-0/+9
CalculateProcessorTopology tries to figure out system topology by parsing /proc/cpuinfo to determine the number of threads, cores, and NUMA nodes. There are some architectures where the "physical id" begins with 1 rather than 0, which was creating an empty "0" node and causing a crash in CreateThreadPool. Bugzilla: Reviewed-By: George Kyriazis <> CC: <>
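The pruning step described above can be sketched as an in-place compaction. This is an illustration in C under simplifying assumptions, not the swr code: `struct node` and `prune_empty_nodes` are hypothetical names, and a node is reduced to a physical id plus a core count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified topology node. */
struct node {
    int physical_id;
    int num_cores;
};

/* Removes nodes that ended up with no cores (for example a node "0" that
 * was created only because physical ids start at 1). Compacts the array
 * in place, preserves order, and returns the new node count. */
static size_t prune_empty_nodes(struct node *nodes, size_t count)
{
    size_t kept = 0;

    assert(nodes != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (nodes[i].num_cores > 0)
            nodes[kept++] = nodes[i];
    }
    return kept;
}
```

Pruning before the thread pool is built means later code never has to special-case a node with zero cores.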
2017-01-23i965: Use UNUSED to silence unused variable (used in assert).Matt Turner1-1/+1
2017-01-23dri: allow 16bit R/GR images to be exported via drm buffersRainer Hochecker4-0/+24
This allows eglCreateImageKHR to access P010 surfaces created by vaapi Signed-off-by: Rainer Hochecker <> Acked-by: Ben Widawky <>
2017-01-23st/va: make sure that we call begin_frame() only once v2Christian König2-3/+9
This fixes "st/va: delay calling begin_frame until we have all parameters". v2: call begin frame after decoder (re)creation as well. Signed-off-by: Christian König <> Reviewed-by: Nayan Deshmukh <> Tested-by: Andy Furniss <>
2017-01-23drirc: remove spurious tabsEric Engestrom1-8/+8
Signed-off-by: Eric Engestrom <> Reviewed-by: Edward O'Callaghan <> Reviewed-by: Nicolai Hähnle <>
2017-01-23st/glsl_to_tgsi: use DDIV instead of DRCP + DMULNicolai Hähnle1-6/+3
Fixes GL45-CTS.gpu_shader_fp64.built_in_functions. v2: use DDIV unconditionally (Roland) Reviewed-by: Roland Scheidegger <> (v1) Reviewed-by: Marek Olšák <> (v1) Tested-by: Glenn Kennard <> Tested-by: James Harvey <> Cc: 17.0 <>
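Why a true divide matters for precision can be shown on the CPU. The sketch below is plain C with hypothetical helper names and says nothing about how the GPU instructions are implemented; it only contrasts a single correctly rounded division with a reciprocal followed by a multiply, which rounds twice.

```c
#include <assert.h>
#include <float.h>

/* One IEEE 754 division: a single rounding step. */
static double div_direct(double a, double b)
{
    assert(b != 0.0);
    return a / b;
}

/* Reciprocal followed by multiply: two rounding steps, so the result
 * can differ from a / b in the last bit. */
static double div_via_rcp(double a, double b)
{
    assert(b != 0.0);
    double rcp = 1.0 / b;
    return a * rcp;
}
```

With IEEE doubles, `div_direct(x, x)` is exactly 1.0 for any finite non-zero `x`, while `div_via_rcp(x, x)` may land one unit in the last place away from it (49.0 is a commonly cited input). A conformance test that checks double-precision built-ins tightly can fail on that difference.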
2017-01-23glsl: split DIV_TO_MUL_RCP into single- and double-precision flagsNicolai Hähnle2-9/+14
Reviewed-by: Marek Olšák <> Reviewed-by: Iago Toral Quiroga <> Tested-by: Glenn Kennard <> Tested-by: James Harvey <> Cc: 17.0 <>
2017-01-23r600: implement DDIVNicolai Hähnle1-0/+59
Tested-by: Glenn Kennard <> Tested-by: James Harvey <> Cc: 17.0 <>
2017-01-23r600: factor out cayman_emit_unary_double_rawNicolai Hähnle1-20/+42
We will use it for DDIV. Tested-by: Glenn Kennard <> Tested-by: James Harvey <> Cc: 17.0 <>
2017-01-23r600: double multiply can handle only one multiply at a timeNicolai Hähnle1-17/+19
It seems clear that trying to multiply two pairs of doubles would result in the temporary register getting overwritten by the second pair. So make the code more explicit. Tested-by: Glenn Kennard <> Tested-by: James Harvey <> Cc: 17.0 <>
2017-01-23glsl: fix tes linking regressionTimothy Arceri1-2/+2
Fixes regression caused by cbeba6bd48da2c. I accidentally pushed the wrong version of the patch.
2017-01-23mesa: remove unused gl_shader_info field from gl_linked_shaderTimothy Arceri1-2/+0
Reviewed-by: Nicolai Hähnle <>
2017-01-23mesa/glsl: set and get cs layouts to and from shader_infoTimothy Arceri4-36/+17
Reviewed-by: Nicolai Hähnle <>
2017-01-23mesa/glsl: set and get gs layouts directly to and from shader_infoTimothy Arceri2-41/+41
Reviewed-by: Nicolai Hähnle <>
2017-01-23mesa/glsl/i965: set and get tes layouts directly to and from shader_infoTimothy Arceri3-45/+40
Reviewed-by: Nicolai Hähnle <>
2017-01-23glsl: use last_vert_prog to get last {clip,cull}_distance_array_sizeTimothy Arceri3-23/+4
Reviewed-by: Nicolai Hähnle <>
2017-01-23mesa/glsl: set {clip,cull}_distance_array_size directly in gl_programTimothy Arceri6-73/+25
There are some line wrapping violations here but those lines will get deleted in the following patch. Reviewed-by: Nicolai Hähnle <>
2017-01-23st/mesa/glsl: change xfb_program field to last_vert_progTimothy Arceri7-32/+44
Now that the i965 backend doesn't depend on this field we can make it more generic and short circuit a bunch of code paths. The new field will be used in a following patch for another clean-up. Reviewed-by: Nicolai Hähnle <>
2017-01-23mesa: use gl_program for CurrentProgram rather than gl_shader_programTimothy Arceri27-396/+248
This makes much more sense and should be more performant in some critical paths such as SSO validation which is called at draw time. Previously the CurrentProgram array could have contained multiple pointers to the same struct which was confusing and we would often need to fish out the information we were really after from the gl_program anyway. Also it was error prone to depend on the _LinkedShader array for programs in current use because a failed linking attempt will lose the information about the current program in use which is still valid. V2: fix validate_io() to compare linked_stages rather than the consumer and producer to decide if we are looking at inward facing shader interfaces which don't need validation. Acked-by: Edward O'Callaghan <> To avoid build regressions the following 2 patches were squashed into this commit: mesa/meta: rewrite _mesa_shader_program_use() and _mesa_program_use() These are rewritten to do what the function name suggests, that is _mesa_shader_program_use() sets the use of all stages and _mesa_program_use() sets the use of a single stage. Reviewed-by: Lionel Landwerlin <> Acked-by: Edward O'Callaghan <> mesa: update active relinked program This likely fixes a subroutine bug where _mesa_shader_program_init_subroutine_defaults() would never have been called for the relinked program as we previously just set _NEW_PROGRAM as dirty and never called the _mesa_use* functions when linking. Acked-by: Edward O'Callaghan <>
2017-01-22freedreno/a5xx: set frag shader threadsizeRob Clark1-2/+7
Signed-off-by: Rob Clark <> Cc: "17.0" <>
2017-01-22freedreno/a5xx: set fragcoordxy properlyRob Clark1-1/+1
What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into bary.f. We were incorrectly setting both this and gl_FragCoord.xy to the same register resulting in all sorts of hilarity. Fixes stk, vdrift, 0ad, probably a bunch others. Signed-off-by: Rob Clark <> Cc: "17.0" <>
2017-01-22freedreno/ir3: setup var locations in standalone compilerRob Clark1-1/+69
Signed-off-by: Rob Clark <>
2017-01-22freedreno/a5xx: fix psizeRob Clark2-8/+5
Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on a5xx. Signed-off-by: Rob Clark <> Cc: "17.0" <>
2017-01-22freedreno/a5xx: srgb fixRob Clark1-1/+2
Signed-off-by: Rob Clark <> Cc: "17.0" <>
2017-01-22freedreno/a5xx: fix int vbosRob Clark1-1/+3
Signed-off-by: Rob Clark <> Cc: "17.0" <>
2017-01-22freedreno/a5xx: fix clear for uint/sint formatsRob Clark1-19/+28
Signed-off-by: Rob Clark <> Cc: "17.0" <>
2017-01-22freedreno/a5xx: fix cull stateRob Clark1-5/+5
Signed-off-by: Rob Clark <> Cc: "17.0" <>
2017-01-22freedreno: update generated headersRob Clark6-13/+36
Signed-off-by: Rob Clark <> Cc: "17.0" <>
2017-01-21anv: descriptors: don't update immutables samplers with anything but their immutable valueLionel Landwerlin1-12/+19
Signed-off-by: Lionel Landwerlin <> Reviewed-by: Jason Ekstrand <>
2017-01-21nir/search: Use the correct bit size for integer comparisonsJason Ekstrand1-32/+16
The previous code always compared integers as 64-bit. Due to variations in sign-extension in the code generated by, this meant that nir_search doesn't always do what you want. Instead, 32-bit values should be matched as 32-bit and 64-bit values should be matched as 64-bit. While we're here we unify the unsigned and signed paths. Now that we're using the right bit size, they should be the same since the only difference we had before was sign extension. This gets the UE4 bitfield_extract optimization working again. It had stopped working due to the constant 0xff00ff00 getting sign-extended when it shouldn't have. Reviewed-by: Iago Toral Quiroga <> Reviewed-by: Eric Anholt <> Cc: "17.0 13.0" <>
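The sign-extension problem described above can be reproduced in a few lines of C. The two matcher functions are hypothetical stand-ins and are not the nir_search code; they only show why comparing a 32-bit constant as a 64-bit value depends on how it was extended, while comparing at the value's own bit size does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Always compares the full 64 bits: a 32-bit constant that was
 * sign-extended on one side and zero-extended on the other will not
 * match. */
static bool match_as_64(uint64_t pattern, uint64_t value)
{
    return pattern == value;
}

/* Compares at the value's real bit size, so the upper bits of a 32-bit
 * constant are ignored. */
static bool match_at_bit_size(uint64_t pattern, uint64_t value,
                              unsigned bit_size)
{
    assert(bit_size == 32 || bit_size == 64);

    if (bit_size == 32)
        return (uint32_t)pattern == (uint32_t)value;
    return pattern == value;
}
```

For the constant from the commit message, 0xff00ff00 sign-extends to 0xffffffffff00ff00 but zero-extends to 0x00000000ff00ff00, so the 64-bit comparison rejects a pattern that should match.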
2017-01-21intel/blorp/copy: Properly handle clear colors for CCS_E imagesJason Ekstrand1-0/+82
In order to handle CCS_E, we stomp the image format to a UINT format and then do some bitcasting logic in the shader. This works fine since SKL render compression only considers the channel layout of the format and not the format itself. In order for this to work on images that have been fast-cleared, we need to also convert the clear color so that, when interpreted as UINT, it provides the same bit value as it would have in the original format. This fixes a bunch of OpenGL ES CTS tests for copy_image when we start using CCS more aggressively. Reviewed-by: Topi Pohjolainen <> Cc: "17.0" <>
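The clear-color conversion can be illustrated for the simplest case, a format with four 32-bit float channels. This is a sketch under that assumption only, with hypothetical function names; the real blorp code has to handle packed and normalized formats as well, which are not shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the raw bit pattern of a 32-bit float. memcpy is the
 * well-defined way to bitcast in C. */
static uint32_t float_bits_as_uint(float f)
{
    uint32_t u;

    assert(sizeof(u) == sizeof(f));
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Converts an RGBA float clear color so that, read back through a UINT
 * view of the same 32-bit-per-channel layout, every channel carries the
 * same bits as in the original float format. */
static void clear_color_to_uint(const float in[4], uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = float_bits_as_uint(in[i]);
}
```

A numeric conversion (1.0f becoming the integer 1) would be wrong here: the UINT view must see 0x3f800000, the bits the hardware would have stored for 1.0f in the original format.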
2017-01-20glsl: Rename [u]int64_t tokens.Kenneth Graunke2-5/+5
basetsd.h on Windows defines INT64 and UINT64 typedefs which conflict with these. Append "_TOK" to avoid conflicts. Should fix the Windows build. Signed-off-by: Kenneth Graunke <> Reviewed-by: Matt Turner <>
2017-01-20Revert "i965: Really don't emit Q or UQ moves on Gen < 8"Matt Turner1-8/+0
This reverts commit c95380c4044237d73fb537511667c3c8f658fcee. Acked-by: Kenneth Graunke <>
2017-01-20i965: Select DF type for 64-bit integers on Gen < 8.Matt Turner4-10/+12
Gen8 adds Q/UQ types. We attempted to change the types back to DF in the generator (commit c95380c40), but an assertion added in the FP64 series (commit e481dcc3) triggers before that code has a chance to execute. In fact, using Q/UQ in the IR and then changing to DF in the generator would not work in the presence of source modifiers, etc. Fixes: d6fcede6 ("i965: Return Q and UQ types for int64 and uint64") Reviewed-by: Kenneth Graunke <>
2017-01-20i965: Enable ARB_gpu_shader_int64 on Gen8+Ian Romanick2-0/+6
Signed-off-by: Ian Romanick <> Reviewed-by: Matt Turner <>
2017-01-20i965: Split SIMD16 CMP of Q and UQ instructionsIan Romanick1-14/+29
This is basically the same as happens for doubles. Signed-off-by: Ian Romanick <> Reviewed-by: Matt Turner <>
2017-01-20i965: Enable 64-bit integer support for almost all unary and binary operationsIan Romanick1-10/+0
Integer comparison functions (e.g., nir_op_ilt) are handled in the next commit. Signed-off-by: Ian Romanick <> Reviewed-by: Matt Turner <>
2017-01-20i965: Enable uploading 64-bit integer uniformsIan Romanick1-1/+3
Signed-off-by: Ian Romanick <> Reviewed-by: Matt Turner <>
2017-01-20i965: Add 64-bit integer support for conversions and bitcastsIan Romanick2-5/+35
v2 (idr): Make the "from" type in a cast unsized. This reduces the number of required cast operations at the expense of slightly more complex code. However, this will be a dramatic improvement when other sized integer types are added. Suggested by Connor. Signed-off-by: Ian Romanick <> Reviewed-by: Matt Turner <>
2017-01-20i965: Enable emitting Q and UQ instructions in the fs backendIan Romanick2-1/+12
v2: Fixup assertion in brw_reg_type_to_hw_type to allow BRW_REGISTER_TYPE_{UQ,Q} on Gen8+. Signed-off-by: Ian Romanick <> Reviewed-by: Matt Turner <>