path: root/src
Age | Commit message | Author | Files | Lines
2014-02-03 | build: move ARCH_LIBS definition outside of ASM definition | Paul Seidler | 1 | -6/+6
_mesa_streaming_load_memcpy is also needed even if assembling is disabled Cc: "10.0" <> Reviewed-by: Matt Turner <> (cherry picked from commit 1cdeeef6c400979a0497afde52bf351a623a934f)
2014-02-03 | mesa: Fix build to properly check for supported compiler flags | Lauri Kasanen | 1 | -1/+5
Bugzilla: Reviewed-by: Matt Turner <> Signed-off-by: Lauri Kasanen <> (cherry picked from commit fcefdc9a595c52ade2be15e0f3a2f301fee3599c)
2014-01-31 | i965: Ignore 'centroid' interpolation qualifier in case of persample shading | Anuj Phogat | 2 | -2/+3
This patch handles the use of the 'centroid' qualifier with 'in' variables in a fragment shader when per-sample shading is enabled. Per-sample shading for the whole fragment shader can be enabled with glEnable(GL_SAMPLE_SHADING) or by using the {gl_SamplePosition, gl_SampleID} builtin variables in the fragment shader. In more detail:

    /* Enable sample shading using OpenGL API */
    glEnable(GL_SAMPLE_SHADING);
    glMinSampleShading(1.0);

Example fragment shader:

    in vec4 a;
    centroid in vec4 b;
    main() { ... }

Variable 'a' will be interpolated at the sample location. But what interpolation should we use for variable 'b'? ARB_sample_shading recommends interpolation at sample position for all the variables. The GLSL 400 (and earlier) spec says: "When an interpolation qualifier is used, it overrides settings established through the OpenGL API." But this text was deleted in later versions of GLSL. NVIDIA's and AMD's proprietary Linux drivers (at OpenGL 4.3) interpolate at sample position. This convinces me to use a similar approach on Intel hardware.

Signed-off-by: Anuj Phogat <> Reviewed-by: Chris Forbes <> (cherry picked from commit f5cfb4ae21df8eebfc6b86c0ce858b1c0a9160dd)

and

i965: Ignore 'centroid' interpolation qualifier in case of persample shading

I missed this change in commit f5cfb4a. It fixes the incorrect rendering caused in Dolphin Emulator.

Bugzilla: Signed-off-by: Anuj Phogat <> Tested-by: Markus Wick <> Reviewed-by: Matt Turner <> (cherry picked from commit dc2f94bc786768329973403248820a2e5249f102)
2014-01-31 | i965: Use sample barycentric coordinates with per sample shading | Anuj Phogat | 4 | -6/+29
Current implementation of arb_sample_shading doesn't set 'Barycentric Interpolation Mode' correctly. We use pixel barycentric coordinates for per sample shading. Instead we should select perspective sample or non-perspective sample barycentric coordinates. It also enables using sample barycentric coordinates in case of a fragment shader variable declared with 'sample' qualifier. e.g. sample in vec4 pos; A piglit test to verify the implementation has been posted on piglit mailing list for review. V2: Do not interpolate all the 'in' variables at sample position if fragment shader uses 'sample' qualifier with one of them. For example we have a fragment shader: #version 330 #extension ARB_gpu_shader5: require sample in vec4 a; in vec4 b; main() { ... } Only 'a' should be sampled at sample location, not 'b'. Signed-off-by: Anuj Phogat <> Reviewed-by: Chris Forbes <> (cherry picked from commit a92e5f7cf63d496ad7830b5cea4bbab287c25b8e)
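The per-variable rule described in the two i965 interpolation entries above can be sketched as a small decision function. This is only an illustration under stated assumptions: the enum, the function name and its boolean parameters are hypothetical and do not come from the i965 driver.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum interp_location_sketch {
    INTERP_PIXEL,
    INTERP_CENTROID,
    INTERP_SAMPLE
};

/* Pick where one fragment shader input is interpolated.
 * - persample_shading: per-sample shading is on for the whole shader
 *   (GL_SAMPLE_SHADING enabled, or gl_SampleID/gl_SamplePosition used).
 * - has_sample_qualifier / has_centroid_qualifier: qualifiers on this
 *   particular input only. */
static enum interp_location_sketch
choose_interp_location(bool persample_shading,
                       bool has_sample_qualifier,
                       bool has_centroid_qualifier)
{
    /* Per-sample shading overrides 'centroid'; a 'sample' qualifier
     * affects only the input that carries it. */
    if (persample_shading || has_sample_qualifier)
        return INTERP_SAMPLE;
    if (has_centroid_qualifier)
        return INTERP_CENTROID;
    return INTERP_PIXEL;
}
```

Because the function is evaluated per input, a shader with `sample in vec4 a; in vec4 b;` and no API-level sample shading gets sample interpolation for `a` only.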
2014-01-31 | mesa: Use IROUND instead of roundf. | José Fonseca | 1 | -1/+1
roundf is not available on MSVC. (cherry picked from commit bba8f10598866776ae198b363b3752c2e3bbb126)
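A minimal sketch of a rounding helper that avoids roundf. The function `iround_sketch` is a hypothetical stand-in written for this note and is not a copy of Mesa's IROUND macro.

```c
/* Hypothetical stand-in for a portable round-to-nearest helper.
 * It adds or subtracts 0.5 and truncates, so halfway cases round away
 * from zero, and it needs nothing from <math.h>. The result is only
 * defined when the rounded value fits in an int. */
static int iround_sketch(float f)
{
    return (int)(f >= 0.0f ? f + 0.5f : f - 0.5f);
}
```

For example, `iround_sketch(2.5f)` yields 3 and `iround_sketch(-2.5f)` yields -3.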
2014-01-31 | i965/gen6/blorp: Emit more flushes to workaround hangs | Chad Versace | 2 | -15/+5
This is a squash of three related cherry-picks from master. [PATCH 1/3] i965/gen6/blorp: Set need_workaround_flush immediately after primitive This patch makes the workaround code in gen6 blorp follow the pattern established in the regular draw path. It shouldn't result in any behavioral change. On gen6, there are two places where we emit 3D_CMD_PRIM: brw_emit_prim() and gen6_blorp_emit_primitive(). brw_emit_prim() sets need_workaround_flush immediately after emitting the primitive, but blorp does not. Blorp sets need_workaround_flush at the bottom of brw_blorp_exec(). This patch moves the need_workaround_flush from brw_blorp_exec() to gen6_blorp_emit_primitive(). There is no need to set need_workaround_flush in gen7_blorp_emit_primitive() because the workaround applies only to gen6. Reviewed-by: Paul Berry <> Signed-off-by: Chad Versace <> (cherry picked from commit 5e0cd58de4261e9dca7a15037192e7e9426a0207) [PATCH 2/3] i965/gen6/blorp: Set need_workaround_flush at top of blorp Unconditionally set brw->need_workaround_flush at the top of gen6 blorp state emission. The art of emitting workaround flushes on Sandybridge is mysterious and not fully understood. Ken and I believe that intel_emit_post_sync_nonzero_flush() may be required when switching from regular drawing to blorp. This is an extra safety measure to prevent undiscovered difficult-to-diagnose gpu hangs. I verified that on ChromeOS, pre-patch, need_workaround_flush was not set at the top of blorp, as Paul expected. To verify, I inserted the following debug code at the top of gen6_blorp_exec(), restarted the ui, and inspected the logs in /var/log/ui. The abort gets triggered so early that the browser never appears on the display. static void gen6_blorp_exec(...) { if (!brw->need_workaround_flush) { fprintf(stderr, "chadv: %s:%d\n", __FILE__, __LINE__); abort(); } ... 
} CC: Kenneth Graunke <> CC: Stéphane Marchesin <> Reviewed-by: Paul Berry <> Signed-off-by: Chad Versace <> (cherry picked from commit 6a5c86f48675d2ca0975d69e0899e72afaab29e5) [PATCH 3/3] i965/gen6/blorp: Remove redundant HiZ workaround Commit 1a92881 added extra flushes to fix a HiZ hang in WebGL Google Maps. With the extra flushes emitted by the previous two patches, the flushes added by 1a92881 are redundant. Tested with the same criteria as in 1a92881: by zooming in and out continuously for 2 hours on Sandybridge Chrome OS (codename Stumpy) without a hang. CC: Kenneth Graunke <> CC: Stéphane Marchesin <> Reviewed-by: Paul Berry <> Signed-off-by: Chad Versace <> (cherry picked from commit 90368875e733171350c64c8dda52f81bd0705dd0) Conflicts: src/mesa/drivers/dri/i965/gen6_blorp.cpp
2014-01-28 | radeon / r200: Pass the API into _mesa_initialize_context | Ian Romanick | 4 | -3/+5
Otherwise an application that requested an OpenGL ES 1.x context would actually get a desktop OpenGL context. Signed-off-by: Ian Romanick <> Cc: "9.1 9.2 10.0" <> Reviewed-by: Alex Deucher <> Reviewed-by: Kenneth Graunke <> (cherry picked from commit 33214679bb632a80d4339ffa0f28f7620d510658)
2014-01-28 | r600g/compute: Emit DEALLOC_STATE on cayman after dispatching a compute shader. | Tom Stellard | 4 | -14/+10
This is necessary to prevent the next SURFACE_SYNC packet from hanging the GPU. Reviewed-by: Marek Olšák <> Reviewed-by: Alex Deucher <> CC: "9.2" "10.0" <> (cherry picked from commit d51dbe048afd2131eb3675e9cd868ce73325a61d)
2014-01-28 | gallium/rtasm: handle mmap failures appropriately | Emil Velikov | 1 | -3/+7
mmap can fail for a variety of reasons (selinux and pax, to name a few), and with the current code a failure results in a crash in the driver, if not worse. This has been the case since the inception of the gallium copy of rtasm. Cc: 9.1 9.2 10.0 <> Bugzilla: Signed-off-by: Emil Velikov <> Reviewed-by: Jakob Bornecrantz <> (cherry picked from commit 4dd445f1cf80292f10eda53665cefc2a674d838d)
2014-01-28 | glcpp: Define GL_EXT_shader_integer_mix in both GL and ES. | Matt Turner | 1 | -3/+5
Cc: Reviewed-by: Ian Romanick <> (cherry picked from commit 66ef8feb4df2780e06c92c43b6523623aaa1b2eb) Conflicts: src/glsl/glcpp/glcpp-parse.y
2014-01-27 | draw: fix incorrect vertex size computation in LLVM drawing code | Brian Paul | 2 | -11/+30
We were calling draw_total_vs_outputs() too early. The call to draw_pt_emit_prepare() could result in the vertex size changing. So call draw_total_vs_outputs() after draw_pt_emit_prepare(). This fix would seem to be needed for the non-LLVM code as well, but it's not obvious. Instead, I added an assertion there to try to catch this problem if it were to occur there. Bugzilla: Cc: 10.0 <> Reviewed-by: José Fonseca <> (cherry picked from commit ad814d04ca5d579538885a595331b5b27caefd2a) Conflicts: src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c
2014-01-25 | glsl: Fix chained assignments of vector channels. | Kenneth Graunke | 1 | -1/+19
Simple shaders such as:

    void splat(vec2 v, float f) { v[0] = v[1] = f; }

failed to compile with the following error:

    error: value of type vec2 cannot be assigned to variable of type float

First, we would process v[1] = f, and transform:

    LHS: (expression float vector_extract (var_ref v) (constant int (1)))
    RHS: (var_ref f)

into:

    LHS: (var_ref v)
    RHS: (expression vec2 vector_insert (var_ref v) (constant int (1)) (var_ref f))

Note that the LHS type is now vec2, not a float. This is surprising, but not the real problem. After emitting assignments, this ultimately becomes:

    (declare (temporary) vec2 assignment_tmp)
    (assign (xy) (var_ref assignment_tmp) (expression vec2 vector_insert (var_ref v) (constant int (1)) (var_ref f)))
    (assign (xy) (var_ref v) (var_ref assignment_tmp))

We would then return (var_ref assignment_tmp) as the rvalue, which has the wrong type: it should be float, but is instead a vec2. To fix this, we simply return (vector_extract (var_ref assignment_tmp) <the appropriate channel>) to pull out the desired float value.

Fixes Piglit's chained-assignment-with-vector-constant-index.vert and chained-assignment-with-vector-dynamic-index.vert tests. Cc: Bugzilla: Reported-by: Dan Ginsburg <> Reviewed-by: Ian Romanick <> Reviewed-by: Matt Turner <> Signed-off-by: Kenneth Graunke <> (cherry picked from commit 44a86e2b4fca7c7cab243dfa62dc17f4379fc8e3)
2014-01-25 | glsl: Rename "expr" to "lhs_expr" in vector_extract munging code. | Kenneth Graunke | 1 | -6/+6
When processing assignments, we have both an LHS and RHS. At a glance, "lhs_expr" clearly refers to the LHS, while a generic name like "expr" is ambiguous. Cc: Reviewed-by: Ian Romanick <> Reviewed-by: Matt Turner <> Signed-off-by: Kenneth Graunke <> (cherry picked from commit 6c158e110c0aec5371bea6fc1c14f28b045797b0)
2014-01-25 | glsl: Disable ARB_texture_rectangle in shader version 100. | Anuj Phogat | 1 | -0/+4
OpenGL with ARB_ES2_compatibility allows shaders that specify #version 100. This fixes the Khronos OpenGL test(Texture_Rectangle_Samplers_frag.test) failure. Cc: Reviewed-by: Matt Turner <> Reviewed-by: Ian Romanick <> Signed-off-by: Anuj Phogat <> (cherry picked from commit c907595ba77a0c74b18b6908f71fafc3c08e2886)
2014-01-25 | st/mesa: fix glReadBuffer(GL_NONE) segfault | Brian Paul | 1 | -1/+2
Bugzilla: Cc: 10.0 <> Tested-by: Ahmed Allam <> Reviewed-by: Marek Olšák <> (cherry picked from commit f7c118ffbfdafaccd4ec05d4a040d07e120c5090)
2014-01-25 | gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats | Marek Olšák | 1 | -0/+3
This fixes a serious regression introduced in 4e549ddb500cf677b6fa16d9ebdfa67cc23da097. Cc: 9.2 10.0 <> Reviewed-by: Brian Paul <> (cherry picked from commit d40532f260c15d56e5fa836147e02c031a999682)
2014-01-25 | st/vdpau: don't return a device if the screen doesn't support NPOT | Ilia Mirkin | 1 | -0/+5
NV3x cards don't support NPOT textures. Technically this restriction could be worked around, but since it also doesn't expose any video decoding hw, just turn it off entirely. Signed-off-by: Ilia Mirkin <> Cc: 10.0 <> Reviewed-by: Christian König <> (cherry picked from commit 00e4314f6d605e467b9a386cacab7eec48b9e429)
2014-01-25 | nv50: access only the available amount of constbuf | Emil Velikov | 1 | -1/+1
The constbuf array is defined as a number of NV50_MAX_PIPE_CONSTBUFS per shader stage. Currently the nv50 driver handles only 3 shader stages, thus we wreak havoc when accessing the array out of bounds. Cc: 9.1 9.2 10.0 <> Signed-off-by: Emil Velikov <> Reviewed-by: Ilia Mirkin <> (cherry picked from commit 12e744abbb9fd8cb07a12954aaa7127521d5af0a)
2014-01-25 | nv50: access only the available amount of textures | Emil Velikov | 1 | -1/+1
The textures array is defined as a number of PIPE_MAX_SAMPLERS per shader stage. Currently the nv50 driver handles only 3 shader stages, thus we wreak havoc when accessing the array out of bounds. Fixes a segfault in piglit/bin/arb_texture_buffer_object-data-sync -fbo -auto Cc: 9.1 9.2 10.0 <> Signed-off-by: Emil Velikov <> Reviewed-by: Ilia Mirkin <> (cherry picked from commit d606ca37eb20f18d8ac4727c68831fcecb2f7de4)
2014-01-25 | mesa: fix GL_COLOR_SUM enum for drivers without ARB_vertex_program | Ilia Mirkin | 2 | -3/+1
Commit c13970808 (mesa: GL_EXT_secondary_color is not optional) changed CHECK_EXTENSION2(EXT_secondary_color, ARB_vertex_program, cap) to CHECK_EXTENSION(ARB_vertex_program, cap). However, CHECK_EXTENSION2 checks that either extension is available, not both. Remove the extension check entirely since the intent was for it to always be enabled. v2: Fix glGet*(GL_COLOR_SUM) too. Suggested by Ian. Signed-off-by: Ilia Mirkin <> Reviewed-by: Ian Romanick <> Cc: 9.2 10.0 <> (cherry picked from commit 739dc95e676b31349525b7daf99453b987748248)
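The difference between the two checks can be sketched with plain functions. The struct and function names below are hypothetical simplifications that model only the either/or semantics the message describes. They are not the Mesa macros.

```c
#include <stdbool.h>

/* Hypothetical, simplified extension table; field names are illustrative. */
struct ext_table_sketch {
    bool EXT_secondary_color;
    bool ARB_vertex_program;
};

/* Two-extension check as the message describes it: passes if EITHER
 * extension is available. */
static bool check_either(const struct ext_table_sketch *e)
{
    return e->EXT_secondary_color || e->ARB_vertex_program;
}

/* Single-extension check left behind by the earlier commit: it rejects
 * drivers that lack ARB_vertex_program even though GL_COLOR_SUM is
 * meant to be always available. */
static bool check_vertex_program_only(const struct ext_table_sketch *e)
{
    return e->ARB_vertex_program;
}
```

For a driver that lacks ARB_vertex_program, the first check passes whenever EXT_secondary_color is set, while the second always fails. Per the message, the fix removes the check entirely.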
2014-01-25 | st/dri: prevent leak of dri option default values | Aaron Watry | 1 | -0/+6
v2: Change comment style CC: "10.0" <> Reviewed-by: Marek Olšák <> (cherry picked from commit ce3528896b37c7d8ef051780e29ea9588fada9da)
2014-01-25 | radeon: Move gfx/dma cs cleanup to r600_common_context_cleanup | Aaron Watry | 2 | -7/+7
The radeonsi code was not cleaning up either of these items leading to leaked memory. v2: Move cleanup to r600_common_context_cleanup instead of duplicating the logic for SI CC: "10.0" <> Reviewed-by: Marek Olšák <> (cherry picked from commit 5ac3229f76f02453ae7e971d515b01fb56ad3fa5) Conflicts: src/gallium/drivers/radeon/r600_pipe_common.c
The ES and desktop GL specs diverge here. Yay! In desktop OpenGL, the driver can perform online compression of uncompressed texture data. GL_NUM_COMPRESSED_TEXTURE_FORMATS and GL_COMPRESSED_TEXTURE_FORMATS give the application a list of formats that it could ask the driver to compress with some expectation of quality. The GL_ARB_texture_compression spec calls this "suitable for general-purpose usage." As noted above, this means GL_COMPRESSED_RGBA_S3TC_DXT1_EXT is not included in the list. In OpenGL ES, the driver never performs compression. GL_NUM_COMPRESSED_TEXTURE_FORMATS and GL_COMPRESSED_TEXTURE_FORMATS give the application a list of formats that the driver can receive from the application. It is the *complete* list of formats. The GL_EXT_texture_compression_s3tc spec says: "New State for OpenGL ES 2.0.25 and 3.0.2 Specifications The queries for NUM_COMPRESSED_TEXTURE_FORMATS and COMPRESSED_TEXTURE_FORMATS include COMPRESSED_RGB_S3TC_DXT1_EXT, COMPRESSED_RGBA_S3TC_DXT1_EXT, COMPRESSED_RGBA_S3TC_DXT3_EXT, and COMPRESSED_RGBA_S3TC_DXT5_EXT." Note that the addition is only to the OpenGL ES specification! Signed-off-by: Ian Romanick <> See-also: Reviewed-by: Marek Olšák <> Reviewed-by: Brian Paul <> Cc: "10.0" <> (cherry picked from commit 0a75909b3f554b20c9672fc72efbc4f6ec3ce4ea)
2014-01-25 | st/mesa: use signed temporary variable to store _ColorDrawBufferIndexes | Emil Velikov | 1 | -1/+1
The temporary variable used to store _ColorDrawBufferIndexes must be signed (GLint), otherwise the following conditional will be incorrectly evaluated, leading to crashes in the driver/mesa or to reads and writes at arbitrary memory locations. The bug dates back to 2009. Cc: 10.0 9.2 9.1 <> Reviewed-by: Marek Olšák <> Signed-off-by: Emil Velikov <> (cherry picked from commit bfcf78c1101a1cbcdd9a479722203047c8d6c26a)
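Why the signedness of the temporary matters can be shown in a few lines. This sketch assumes that -1 marks an unused slot in the index array. The function names are hypothetical.

```c
#include <limits.h>

/* Storing the index in an unsigned temporary turns -1 into UINT_MAX,
 * so a later "index >= 0" test on that temporary can never fail. */
static unsigned as_unsigned_index(int stored_index)
{
    return (unsigned)stored_index;
}

/* With a signed temporary the "unused slot" marker is still negative
 * and the validity test works as intended. */
static int is_bound(int stored_index)
{
    return stored_index >= 0;
}
```

With the unsigned temporary, the huge value can then be used as an array index, which matches the arbitrary memory access the message warns about.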
2014-01-25 | mesa: use signed temporary variable to store _ColorDrawBufferIndexes | Emil Velikov | 1 | -1/+1
_ColorDrawBufferIndexes is defined as GLint* and using a GLuint* will result in the first part of the conditional always evaluating to true. Unintentionally introduced by the following commit, this will result in a driver segfault if one is using an old version of the piglit test bin/clearbuffer-mixed-format -auto -fbo

    commit 03d848ea1003abefd8fe51a5b4a780527cd852af
    Author: Marek Olšák <>
    Date: Wed Dec 4 00:27:20 2013 +0100

        mesa: fix interpretation of glClearBuffer(drawbuffer)

The corresponding piglit tests supported this incorrect behavior instead of pointing at it. Cc: Marek Olšák <> Cc: 10.0 9.2 9.1 <> Reviewed-by: Marek Olšák <> Signed-off-by: Emil Velikov <> (cherry picked from commit 10368e1446e3b537c1dc4cb536994a4d01cfd2f0)
2014-01-25 | i965: Ensure that all necessary state is re-emitted if we run out of aperture. | Paul Berry | 3 | -0/+21
Prior to this patch, if we ran out of aperture space during brw_try_draw_prims(), we would rewind the batch buffer pointer (potentially throwing away some state that may have been emitted by brw_upload_state()), flush the batch, and then try again. However, we wouldn't reset the dirty bits to the state they had before the call to brw_upload_state(). As a result, when we tried again, there was a danger that we wouldn't re-emit all the necessary state. (Note: prior to the introduction of hardware contexts, this wasn't a problem because flushing the batch forced all state to be re-emitted). This patch fixes the problem by leaving the dirty bits set at the end of brw_upload_state(); we only clear them after we have determined that we don't need to rewind the batch buffer. Cc: 10.0 9.2 <> Reviewed-by: Eric Anholt <> Reviewed-by: Kenneth Graunke <> (cherry picked from commit fb6d9798a0c6eefd512f5b0f19eed34af8f4f257)
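The ordering the message describes (upload state, possibly rewind and retry, and only then clear the dirty bits) can be sketched with a toy context. Everything here is hypothetical: the struct, its fields and the function names only model the control flow and are not the i965 code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: 'dirty' holds state groups that must be emitted, 'emitted'
 * holds what the current batch actually contains. */
struct ctx_sketch {
    uint32_t dirty;
    uint32_t emitted;
};

/* Emit everything that is dirty, but leave the dirty bits set. */
static void upload_state_sketch(struct ctx_sketch *c)
{
    c->emitted |= c->dirty;
}

/* Rewinding and flushing discards what was emitted into the batch. */
static void rewind_and_flush_sketch(struct ctx_sketch *c)
{
    c->emitted = 0;
}

static void draw_sketch(struct ctx_sketch *c, bool fits_first_try)
{
    upload_state_sketch(c);
    if (!fits_first_try) {
        rewind_and_flush_sketch(c);
        /* Dirty bits are still set, so the retry re-emits everything. */
        upload_state_sketch(c);
    }
    /* Only clear once no rewind is needed any more. */
    c->dirty = 0;
}
```

If the dirty bits were cleared inside the upload step instead, the retry after a rewind would emit nothing, which is the failure mode the patch addresses.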
2014-01-25 | st/mesa: use sRGB formats for MSAA resolving if destination is sRGB | Marek Olšák | 1 | -0/+32
Copied from the i965 driver, including the big comment. Cc: 9.2 10.0 <> (cherry picked from commit 4e549ddb500cf677b6fa16d9ebdfa67cc23da097)
2014-01-09 | i965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps. | Eric Anholt | 1 | -1/+2
We definitely want to fall through to the unsynchronized map case, instead of wasting bandwidth on a copy. Prevents a -43.2407% +/- 1.06113% (n=49) performance regression on aa10perf when teaching glamor to provide the GL_INVALIDATE_RANGE_BIT information. This is a performance fix, which I usually wouldn't cherry-pick to stable. But this really was just a bug in the code, its presence would discourage developers from giving us the best information they can, and I think we've got fairly high confidence in the unsynchronized map path already. Cc: 10.0 9.2 <> Reviewed-by: Kenneth Graunke <> (cherry picked from commit f46563fe1c8a5560e4de0adf03e3d8770b7fc734)
2014-01-09 | i965: Fix handling of MESA_pack_invert in blit (PBO) readpixels. | Eric Anholt | 1 | -1/+3
Fixes piglit GL_MESA_pack_invert/readpixels and GPU hangs with glamor and cairo-gl. Cc: 10.0 9.2 <> Reviewed-by: Kenneth Graunke <> Reviewed-by: Ian Romanick <> Reviewed-by: Anuj Phogat <> (cherry picked from commit e186b927b8254ce62e0d47db90d16cd4253b3db5)
2014-01-09 | mesa: Namespace qualify fma to override ambiguity with fma from math.h | Thomas Sondergaard | 1 | -1/+1
MSVC 2013 version of math.h includes an fma() function. Cc: "10.0" <> Reviewed-by: Brian Paul <> (cherry picked from commit e8ff08edd823ddf6b0e07ef84d2ba8afc3abbc34)
2014-01-09 | mesa: Work around internal compiler error | Thomas Sondergaard | 1 | -2/+2
This small rearrangement avoids MSVC 2013 ICE. Also, this should be a better memory access order. Cc: "10.0" <> Reviewed-by: Brian Paul <> Reviewed-by: Ian Romanick <> (cherry picked from commit 8fcddd325ce3dc5dfdafc95767542590ae860c45)
2014-01-09 | mesa: Fix compile error with MSVC 2013 | Thomas Sondergaard | 1 | -1/+1
This fixes the following compile error: src\glsl\ir_constant_expression.cpp(1405) : error C2666: 'copysign' : 3 overloads have similar conversions Cc: "10.0" <> Reviewed-by: Brian Paul <> (cherry picked from commit 067ad6e53ec2545970b7698d06d2a537da194678)
2014-01-09 | i965: fold offset into coord for textureOffset(gsampler2DRect) | Chris Forbes | 1 | -1/+1
The hardware is broken with nonzero texel offsets and unnormalized coordinates; instead of doing correct offsetting, we get garbage. This just extends the existing workaround for ir_txf and ir_tg4+gsampler2DRect to also consider ir_tex+gsampler2DRect. Fixes broken rendering in 'tesseract' when 'mesa_texrectoffset_bug' is not enabled; also fixes the new piglit test 'tests/spec/glsl-1.30/execution/fs-textureOffset-Rect'. Has been broken ~forever; suggesting including this in only 10.0 because the lowering pass doesn't exist in 9.2 or earlier so would require quite a different patch. Signed-off-by: Chris Forbes <> Reviewed-by: Kenneth Graunke <> Cc: Lee Salzman <> Cc: "10.0" <> (cherry picked from commit 9e99735f301ebf85f8d0bfdce2bad441a5aac7f8)
2014-01-09 | swrast: fix delayed texel buffer allocation regression for OpenMP | Andreas Fänger | 1 | -0/+12
Commit 9119269ca14ed42b51c7d8e2e662500311b29fa3 moved the texel buffer allocation to _swrast_texture_span(), however, when compiled with OpenMP support this code already runs multi-threaded so a critical section is required to prevent multiple allocations and rendering errors. Cc: "10.0" <> Reviewed-by: Brian Paul <> (cherry picked from commit 2a0fb946e147f5482c93702fbf46ffdf5208f57c)
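A minimal sketch of guarding a lazy allocation with an OpenMP critical section, as the message describes. The buffer, the helper name and the critical-section name are hypothetical. This is not the swrast code.

```c
#include <stdlib.h>

/* Hypothetical shared buffer that is allocated on first use. */
static float *texel_buffer_sketch;

/* Return the shared buffer, allocating it once. When several OpenMP
 * threads arrive here together, the critical section ensures that only
 * one of them performs the allocation. Without OpenMP support the
 * pragma is ignored and this is plain serial code. */
static float *get_texel_buffer_sketch(size_t count)
{
    #pragma omp critical (texel_buffer_alloc_sketch)
    {
        if (!texel_buffer_sketch)
            texel_buffer_sketch = calloc(count, sizeof *texel_buffer_sketch);
    }
    return texel_buffer_sketch;
}
```

Repeated calls return the same pointer. Without the critical section, two threads could both see a null pointer and allocate twice, leaking one buffer and rendering into different ones.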
2014-01-09 | mesa: implement missing glGet(GL_RGBA_SIGNED_COMPONENTS_EXT) query | Brian Paul | 4 | -0/+71
This is part of the GL_EXT_packed_float extension. Reviewed-by: Kenneth Graunke <> Reviewed-by: Chris Forbes <> (cherry picked from commit 3486f6f31b8cdb01e480cfbd8814c1e4222d26b0)

Also squashed in a subsequent bug fix:

mesa: check for MESA_FORMAT_RGB9_E5_FLOAT in _mesa_is_format_signed()

This packed floating point format only stores positive values. Reviewed-by: Marek Olšák <> Reviewed-by: Kenneth Graunke <> Reviewed-by: Roland Scheidegger <> (cherry picked from commit 0fc8d7c66e08c295b701586afdc1f6d86eb8a514)

Also squashed in a second, subsequent bug fix:

mesa: check bits per channel for GL_RGBA_SIGNED_COMPONENTS_EXT query

If a channel has zero bits it's not signed. v2: also check for luminance and intensity format bits. Bruce Merry's proposed piglit test hits the luminance case. Reviewed-by: Matt Turner <> (cherry picked from commit d046fd731ab192dceee0916323dd718b78df5976)

Bugzilla: Cc: 10.0 <> Conflicts: src/mesa/main/get.c
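The "zero bits means not signed" rule from the second squashed fix can be sketched as below. The struct and function are hypothetical, and the sketch deliberately ignores the luminance and intensity cases that the v2 note mentions.

```c
#include <stdbool.h>

/* Hypothetical, simplified format description: one signedness flag and
 * the bit width of the R, G, B and A channels. */
struct format_sketch {
    bool is_signed;
    int bits[4];
};

/* Fill out[0..3] with 1 for channels that are present and signed, and
 * with 0 otherwise. A channel with zero bits is never reported as
 * signed, even if the format as a whole is signed. */
static void signed_components_sketch(const struct format_sketch *f, int out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (f->is_signed && f->bits[i] > 0) ? 1 : 0;
}
```

For a signed two-channel format such as `{ true, {8, 8, 0, 0} }`, the result is 1, 1, 0, 0.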
2014-01-02 | nv50: fix a small leak on context destroy | Ilia Mirkin | 1 | -0/+2
Signed-off-by: Ilia Mirkin <> (cherry picked from commit f50a45452a4fd4f7cece8fe37c394edac0808136)
2014-01-02 | glsl: Fix inconsistent assumptions about ir_loop::counter. | Paul Berry | 3 | -2/+9
The compiler back-ends (i965's fs_visitor and brw_visitor, ir_to_mesa_visitor, and glsl_to_tgsi_visitor) assume that when ir_loop::counter is non-null, it points to a fresh ir_variable that should be used as the loop counter (as opposed to an ir_variable that exists elsewhere in the instruction stream). However, previous to this patch: (1) loop_control_visitor did not create a new variable for ir_loop::counter; instead it re-used the existing ir_variable. This caused the loop counter to be double-incremented (once explicitly by the body of the loop, and once implicitly by ir_loop::increment). (2) ir_clone did not clone ir_loop::counter properly, resulting in the cloned ir_loop pointing to the source ir_loop's counter. (3) ir_hierarchical_visitor did not visit ir_loop::counter, resulting in the ir_variable being missed by reparenting. Additionally, most optimization passes (e.g. loop unrolling) assume that the variable mentioned by ir_loop::counter is not accessed in the body of the loop (an assumption which (1) violates). The combination of these factors caused a perfect storm in which the code worked properly nearly all of the time: for loops that got unrolled, (1) would introduce a double-increment, but loop unrolling would fail to notice it (since it assumes that ir_loop::counter is not accessed in the body of the loop), so it would unroll the loop the correct number of times. For loops that didn't get unrolled, (1) would introduce a double-increment, but then later when the IR was cloned for linking, (2) would prevent the loop counter from being cloned properly, so it would look to further analysis stages like an independent variable (and hence the double-increment would stop occurring). At the end of linking, (3) would prevent the loop counter from being reparented, so it would still belong to the shader object rather than the linked program object. 
Provided that the client program didn't delete the shader object, the memory would never get reclaimed, and so the shader would function properly. However, for loops that didn't get unrolled, if the client program did delete the shader object, and the memory belonging to the loop counter got re-used, this could cause a use-after-free bug, leading to a crash. This patch fixes loop_control_visitor, ir_clone, and ir_hierarchical_visitor to treat ir_loop::counter the same way the back-ends treat it: as a freshly allocated ir_variable that needs to be visited and cloned independently of other ir_variables. Bugzilla: Reviewed-by: Eric Anholt <> Reviewed-by: Kenneth Graunke <> (cherry picked from commit d6eb4321d0e62b6b391ad88ce390bd6e23d79747)
2014-01-02 | glsl: Teach ir_variable_refcount about ir_loop::counter variables. | Paul Berry | 2 | -0/+22
If an ir_loop has a non-null "counter" field, the variable referred to by this field is implicitly read and written by the loop. We need to account for this in ir_variable_refcount, otherwise there is a danger we will try to dead-code-eliminate the loop counter variable. Note: at the moment the dead code elimination bug doesn't occur due to a bug in ir_hierarchical_visitor: it doesn't visit the "counter" field, so dead code elimination doesn't treat it as a candidate for elimination. But the patch to follow will fix that bug, so we need to fix ir_variable_refcount first in order to avoid breaking dead code elimination. Reviewed-by: Eric Anholt <> Reviewed-by: Kenneth Graunke <> (cherry picked from commit 9d2951ea0acdcd219ad28831ac9e7112737d9ca3)
2014-01-02 | i965/gen6: Fix HiZ hang in WebGL Google Maps | Chad Versace | 1 | -0/+15
Emitting flushes before depth and hiz resolves at the top of blorp's state emission fixes the hang. Marchesin and I found the fix experimentally, as opposed to adhering to a documented hardware workaround. A more minimal fix likely exists, but this gets the job done. Fixes HiZ hangs in the new WebGL Google maps on Sandybridge Chrome OS. Tested by zooming in and out continuously for 2 hours. This patch is based on CC: Bugzilla: Signed-off-by: Stéphane Marchesin <> Signed-off-by: Chad Versace <> Reviewed-by: Paul Berry <> Reviewed-by: Kenneth Graunke <> (cherry picked from commit 1a928816a1b717201f3b3cc998a42731b280e6ba)
2014-01-02 | st/mesa: fix glClear with multiple colorbuffers and different formats | Marek Olšák | 1 | -24/+10
Cc: 10.0 9.2 9.1 <> (cherry picked from commit 0612005aa66f211753f44bb4ffdfdcc9316281ac)
2014-01-02 | glcpp: error on multiple #else/#elif directives | Erik Faye-Lund | 6 | -1/+51
The preprocessor currently accepts multiple else/elif-groups per if-section. The GLSL-preprocessor is defined by the C++ specification, which defines the following parse-rule: if-section: if-group elif-groups(opt) else-group(opt) endif-line This clearly only allows a single else-group, that has to come after any elif-groups. So let's modify the code to follow the specification. Add test to prevent regressions. Reviewed-by: Ian Romanick <> Reviewed-by: Kenneth Graunke <> Reviewed-by: Carl Worth <> Cc: 10.0 <> (cherry picked from commit eb212c5a302f0122a13b36dfdf07e91f951ae2e7)
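The parse rule quoted above amounts to a small amount of per-section state: once an else-group has been seen, neither another #else nor an #elif is allowed. A minimal sketch follows, with hypothetical names. The real glcpp is a generated parser and does not look like this.

```c
#include <stdbool.h>

/* Hypothetical directive kinds inside one if-section. */
enum directive_sketch { DIR_ELIF, DIR_ELSE, DIR_ENDIF };

/* State tracked for the innermost open if-section. */
struct if_section_sketch {
    bool seen_else;
};

/* Return true if the directive is allowed at this point, following
 *   if-section: if-group elif-groups(opt) else-group(opt) endif-line
 * and false if it should be reported as an error. */
static bool accept_directive_sketch(struct if_section_sketch *s,
                                    enum directive_sketch d)
{
    switch (d) {
    case DIR_ELIF:
        return !s->seen_else;      /* #elif after #else is an error */
    case DIR_ELSE:
        if (s->seen_else)
            return false;          /* a second #else is an error */
        s->seen_else = true;
        return true;
    case DIR_ENDIF:
        return true;
    }
    return false;
}
```

Feeding it #elif, #else, #else in that order accepts the first two and rejects the third.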
2014-01-02 | r600/pipe: Stop leaking context->start_compute_cs_cmd.buf on EG/CM | Aaron Watry | 1 | -0/+2
Found while tracking down memory leaks in VDPAU playback Reviewed-by: Tom Stellard <> CC: "10.0" <> (cherry picked from commit 3ddabe0d523416693f28e293d8d3d918bdb612ca)
2014-01-02 | st/vdpau: Destroy context when initialization fails | Aaron Watry | 1 | -0/+1
Prevents a potential memory leak found when tracking down something else. Reviewed-by: Christian König <> Reviewed-by: Tom Stellard <> CC: "10.0" <> (cherry picked from commit 20446d0e535c0735489c8944e8d767e0fc74fc6e)
2014-01-02 | radeon/llvm: Free target data at end of optimization | Aaron Watry | 1 | -0/+1
Reviewed-by: Tom Stellard <> CC: "10.0" <> (cherry picked from commit 767b0f82c37f0370c05335120e50f0a534549109)
2014-01-02 | r600/compute: Use the correct FREE macro when deleting compute state | Aaron Watry | 1 | -1/+1
Reviewed-by: Tom Stellard <> CC: "10.0" <> (cherry picked from commit 0bd858d7ff4a16228164e3157aca846edeb6c228)
2014-01-02 | r600/compute: Free compiled kernels when deleting compute state | Aaron Watry | 1 | -0/+2
v2: Remove unnecessary null pointer check CC: "10.0" <> (cherry picked from commit e19717d075bd26c16e12564ed578ff519a5ce57a)
2014-01-02 | radeon/compute: Stop leaking LLVMContexts in radeon_llvm_parse_bitcode | Aaron Watry | 5 | -18/+41
Previously we were creating a new LLVMContext every time that we called radeon_llvm_parse_bitcode, which caused us to leak the context every time that we compiled a CL program. Sadly, we can't dispose of the LLVMContext at the point that it was being created because evergreen_launch_grid (and possibly the SI equivalent) was assuming that the context used to compile the kernels was still available. Now, we'll create a new LLVMContext when creating EG/SI compute state, store it there, and pass it to all of the places that need it. The LLVM Context gets destroyed when we delete the EG/SI compute state. Reviewed-by: Tom Stellard <> CC: "10.0" <> (cherry picked from commit 8c9a9205d96b5ac0718218bfa952a5b4b6ad939c)
2014-01-02 | pipe_loader/sw: close dev->lib when initialization fails | Aaron Watry | 1 | -1/+4
Prevents a memory leak. Reviewed-by: Tom Stellard <> CC: "10.0" <> (cherry picked from commit a7653c19a3b1adae162864587a7ab1c17ab256e6)
2014-01-02 | clover: Remove unused variable | Aaron Watry | 1 | -1/+0
Reviewed-by: Tom Stellard <> CC: "10.0" <> (cherry picked from commit 862f55c29c50798942e58ea75c5294921c0489f8)
2014-01-02 | llvmpipe: use pipe_sampler_view_release() to avoid segfault | Jonathan Liu | 1 | -0/+6
This fixes another case of faulting when freeing a pipe_sampler_view that belongs to a previously destroyed context. Cc: "10.0" <> Signed-off-by: Jonathan Liu <> Reviewed-by: Brian Paul <> (cherry picked from commit 7990ab58fa01cbebcefd63dd25af5fd6fdddf019)