path: root/src/gallium/auxiliary
Age  Commit message  Author  Files  Lines
2014-01-28gallium/rtasm: handle mmap failures appropriatelyEmil Velikov1-3/+7
mmap can fail for a variety of reasons (selinux and pax to name a few), and with the current code this will result in a crash in the driver, if not worse. This has been the case since the inception of the gallium copy of rtasm. Cc: 9.1 9.2 10.0 <> Bugzilla: Signed-off-by: Emil Velikov <> Reviewed-by: Jakob Bornecrantz <> (cherry picked from commit 4dd445f1cf80292f10eda53665cefc2a674d838d)
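The failure mode described above can be sketched as follows. This is a minimal illustration, not the rtasm code itself: the helper names `exec_pool_create` and `exec_pool_destroy` are hypothetical, and the only point shown is that `mmap()` signals failure with `MAP_FAILED` (not `NULL`), so the caller needs an explicit check before using the mapping.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on strict glibc builds */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: returns an anonymous mapping of `size` bytes with
 * read, write and execute permission, or NULL if the request is refused
 * (for example by a security policy). */
static void *exec_pool_create(size_t size)
{
   void *mem = mmap(NULL, size, PROT_EXEC | PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

   /* mmap() reports failure with MAP_FAILED, which is not NULL. */
   if (mem == MAP_FAILED)
      return NULL;

   return mem;
}

/* Hypothetical helper: releases a pool obtained from exec_pool_create(). */
static void exec_pool_destroy(void *mem, size_t size)
{
   if (mem)
      munmap(mem, size);
}
```

A caller would treat a `NULL` return as "no executable memory available" and fall back to a non-generated code path instead of dereferencing the result.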
2014-01-27draw: fix incorrect vertex size computation in LLVM drawing codeBrian Paul2-11/+30
We were calling draw_total_vs_outputs() too early. The call to draw_pt_emit_prepare() could result in the vertex size changing. So call draw_total_vs_outputs() after draw_pt_emit_prepare(). This fix would seem to be needed for the non-LLVM code as well, but it's not obvious. Instead, I added an assertion there to try to catch this problem if it were to occur there. Bugzilla: Cc: 10.0 <> Reviewed-by: José Fonseca <> (cherry picked from commit ad814d04ca5d579538885a595331b5b27caefd2a) Conflicts: src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c
2014-01-25gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formatsMarek Olšák1-0/+3
This fixes a serious regression introduced in 4e549ddb500cf677b6fa16d9ebdfa67cc23da097. Cc: 9.2 10.0 <> Reviewed-by: Brian Paul <> (cherry picked from commit d40532f260c15d56e5fa836147e02c031a999682)
2014-01-02pipe_loader/sw: close dev->lib when initialization failsAaron Watry1-1/+4
Prevents a memory leak. Reviewed-by: Tom Stellard <> CC: "10.0" <> (cherry picked from commit a7653c19a3b1adae162864587a7ab1c17ab256e6)
2013-11-15gallium/pipe_loader: un-reference udev resources when we're done with them.Aaron Watry1-0/+3
Reviewed-by: Tom Stellard <> CC: "10.0" <> (cherry picked from commit 598f61ba28bcfd220104e18e89973768babeaac3)
2013-11-15gallium: fix build on GNU/Hurd due to missing PIPE_OS_HURD detectionCyril Brulebois1-6/+6
Thanks to Pino Toscano. Patch from Debian package. Cc: "10.0" <> Reviewed-by: Brian Paul <> (cherry picked from commit 2d77e4f922a8c34541d8b187e171738006bd6f4d)
2013-11-08gallivm: deduplicate some indirect register address codeRoland Scheidegger1-157/+96
There's only one minor functional change: for immediates, the pixel offsets are no longer added since the values are the same for all elements in any case (it might be better if those weren't stored as soa vectors in the first place). Reviewed-by: Zack Rusin <>
2013-11-07draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_floatMatthew McClure7-19/+80
With this patch, the llvmpipe and draw modules will calculate the depth bias according to floating point depth buffer semantics described in the arb_depth_buffer_float specification, when the driver has a z buffer bound with a format type of UTIL_FORMAT_TYPE_FLOAT. By default, the driver will use the existing UNORM calculation for depth bias. A new function, draw_set_zs_format, was added to calculate the Minimum Resolvable Depth value and floating point depth sense for the draw module. Reviewed-by: Jose Fonseca <> Reviewed-by: Roland Scheidegger <>
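As a rough sketch of the two depth-bias regimes mentioned above: for a UNORM buffer the minimum resolvable depth (MRD) is a constant that depends only on the bit depth, while for a float buffer the ARB_depth_buffer_float rule is r = 2^(e - n), with e the exponent of the largest z in the primitive and n the number of mantissa bits (23 for float32). The function names below are hypothetical and are not the ones added by the patch; denormal and zero inputs are simplified to return 0.

```c
#include <stdint.h>
#include <string.h>

/* MRD for an n-bit UNORM depth buffer: one step of the integer encoding. */
static float unorm_depth_mrd(unsigned depth_bits)
{
   return 1.0f / (float)((1ull << depth_bits) - 1);
}

/* MRD for a float32 depth buffer: 2^(e - 23), where e is the unbiased
 * exponent of max_z. Computed by editing the exponent field directly. */
static float float_depth_mrd(float max_z)
{
   uint32_t bits, r_bits, biased_exp;
   float r;

   memcpy(&bits, &max_z, sizeof bits);
   biased_exp = (bits >> 23) & 0xff;

   /* Sketch simplification: results that would be denormal become 0. */
   if (biased_exp <= 23)
      return 0.0f;

   r_bits = (biased_exp - 23) << 23;   /* same sign 0, mantissa 0 */
   memcpy(&r, &r_bits, sizeof r);
   return r;
}

/* Polygon offset: o = m * factor + r * units. */
static float depth_bias(float max_slope, float factor, float units, float mrd)
{
   return max_slope * factor + mrd * units;
}
```

For example, a primitive whose largest depth is 1.0 gets an MRD of 2^-23, while one confined to [0.5, 1.0) gets 2^-24.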
2013-11-06gallium: fix build on GNU/kFreeBSDFabio Pedretti1-1/+1
Patch from Debian package Reviewed-by: Brian Paul <> Reviewed-by: Andreas Boll <>
2013-11-06gallivm: fix indirect addressing of inputsRoland Scheidegger1-17/+28
We weren't adding the soa offsets when constructing the indices for the gather functions. That meant that we were always returning the data in the first element. (Copied straight from the same fix for temps.) While here fix up a couple of broken comments in the fetch functions, plus don't name a straight float type float4 which is just confusing. Reviewed-by: Jose Fonseca <> Reviewed-by: Zack Rusin <>
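To illustrate why the SoA offsets matter, here is a scalar sketch of a gather over a structure-of-arrays register file. The layout (register, then channel, then lane) and the names `VEC_LEN`, `NUM_CHANS` and `gather_input_chan` are assumptions made for the example, not taken from the gallivm code. Dropping the `+ lane` term would make every lane read lane 0 of the register it selected, which matches the symptom described above.

```c
#define VEC_LEN   4   /* hypothetical SIMD width */
#define NUM_CHANS 4   /* x, y, z, w */

/* Fetch channel `chan` of a per-lane register index from a flat SoA array
 * laid out as inputs[reg][chan][lane]. */
static void gather_input_chan(const float *inputs,
                              const int reg_index[VEC_LEN],
                              int chan, float out[VEC_LEN])
{
   for (int lane = 0; lane < VEC_LEN; lane++) {
      /* The trailing "+ lane" is the SoA offset for this element. */
      int scalar_index = (reg_index[lane] * NUM_CHANS + chan) * VEC_LEN + lane;
      out[lane] = inputs[scalar_index];
   }
}
```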
2013-11-05gallivm: optimize lp_build_minify for sseRoland Scheidegger3-13/+54
SSE can't handle true vector shifts (with variable shift count), so llvm is turning them into a mess of extracts, scalar shifts and inserts. It is however possible to emulate them in lp_build_minify with float muls, which should be way faster (saves over 20 instructions per 8-wide lp_build_minify). This wouldn't work for "generic" 32bit shifts though since we've got only 24bits of mantissa (actually for left shifts it would work by using sse41 int mul instead of float mul but not for right shifts). Note that this has very limited scope for now, since this is only used with per-pixel lod (otherwise we're avoiding the non-constant shift count by doing per-quad shifts manually), and only 1d textures even then (though the latter should change). Reviewed-by: Brian Paul <> Reviewed-by: Jose Fonseca <>
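The float-multiply trick can be sketched in scalar C. Minification is max(1, size >> level); for sizes below 2^24 the shift can be replaced by multiplying with 2^-level, which is built by writing the biased exponent directly. The function names are hypothetical, and the sketch assumes non-negative sizes under 2^24 and levels below 32, in line with the mantissa limit noted above.

```c
#include <stdint.h>
#include <string.h>

/* Reference version using an integer shift. */
static int32_t minify_shift(int32_t size, uint32_t level)
{
   int32_t s = size >> level;
   return s > 1 ? s : 1;
}

/* Same result via a float multiply by 2^-level. */
static int32_t minify_float_mul(int32_t size, uint32_t level)
{
   uint32_t bits = (127u - level) << 23;   /* float encoding of 2^-level */
   float scale;
   int32_t s;

   memcpy(&scale, &bits, sizeof scale);

   /* Multiplying by a power of two is exact here, and converting a
    * non-negative float to int truncates, which equals the shift. */
   s = (int32_t)((float)size * scale);
   return s > 1 ? s : 1;
}
```

In a vectorized form the multiply and the conversion each map to a single SSE instruction, whereas a per-element variable shift does not exist before AVX2.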
2013-11-05util/u_format: take normalized flag in consideration in util_format_is_rgba8_variantJosé Fonseca1-0/+3
Just happened to notice it was missing while looking at it.
2013-11-04gallivm: Remove llvm::DisablePrettyStackTrace for LLVM >= 3.4.Vinson Lee1-0/+2
LLVM 3.4 r193971 removed llvm::DisablePrettyStackTrace and made the pretty stack trace opt-in rather than opt-out. The default value of DisablePrettyStackTrace has changed to true in LLVM 3.4 and newer. Signed-off-by: Vinson Lee <> Bugzilla: Reviewed-by: Tom Stellard <> Reviewed-by: Brian Paul <>
2013-11-04tgsi/scan: set maximum index for each constant bufferMarek Olšák2-1/+13
2013-11-04draw: move type construction out of loopBrian Paul1-1/+3
We can create clip_ptr_type once instead of n times inside the loop. Reviewed-by: Roland Scheidegger <>
2013-10-29gallium/auxiliary/indices: add u_primconvertRob Clark3-0/+227
A convenient front end to indices generate/translate code, for emulating primitives which are not supported natively by the driver. This handles saving/restoring index buffer state, etc. Signed-off-by: Rob Clark <> Reviewed-by: Brian Paul <>
2013-10-29gallium/auxiliary/indices: add start paramRob Clark5-26/+54
Add 'start' parameter to generator/translator. Signed-off-by: Rob Clark <> Reviewed-by: Brian Paul <>
2013-10-29util,llvmpipe: correctly set the minimum representable depth valueMatthew McClure2-0/+52
Reviewed-by: Roland Scheidegger <> Reviewed-by: Jose Fonseca <>
2013-10-25gallivm: implement fully accurate corner filtering for seamless cube mapsRoland Scheidegger1-13/+151
d3d10 requires that cube corners are filtered with accurate weights (that is, the weight of the non-existing corner texel should be evenly distributed to the other 3 texels). OpenGL does not require this (but recommends it). This requires us to use different filtering code, since we need per-texel weights which our 2d lerp doesn't (and can't) do. And of course the (now per element) weights need to be adjusted too for it to work. Invoke the new filtering code whenever there's an edge to keep things simpler, as it will work for edges too not just corners but of course it's only needed with corners. More ugly code for not much gain but at least a hacked up cubemap demo shows very nice corners now... Not sure yet if and how this should be configurable... v2: incorporate feedback from Jose, only use special corner filtering code when there's a corner not when there's only an edge (as corner filtering code is slower, though a perf difference was only measurable when always forcing edge code). Plus some minor style fixes. Reviewed-by: Jose Fonseca <>
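The weight redistribution described above can be written out in a few lines. This is only a sketch of the arithmetic, with a hypothetical function name and a plain four-element weight array; the actual gallivm code operates on SIMD vectors.

```c
/* w[] holds the four bilinear weights of a 2x2 footprint; `missing` is the
 * index of the texel that does not exist at a cube corner. Its weight is
 * split evenly among the three remaining texels. */
static void redistribute_corner_weight(float w[4], int missing)
{
   float share = w[missing] / 3.0f;

   for (int i = 0; i < 4; i++) {
      if (i != missing)
         w[i] += share;
   }
   w[missing] = 0.0f;
}
```

The weights still sum to one afterwards, so the filtered result stays a convex combination of the three texels that actually exist.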
2013-10-23gallium: new, unified pipe_context::set_sampler_views() functionBrian Paul11-76/+67
The new function replaces four old functions: set_fragment/vertex/ geometry/compute_sampler_views(). Note: at this time, it's expected that the 'start' parameter will always be zero. Reviewed-by: Roland Scheidegger <> Reviewed-by: Marek Olšák <> Tested-by: Emil Velikov <>
2013-10-21gallivm: implement seamless cube filteringRoland Scheidegger3-40/+368
For seamless cube filtering it is necessary to determine new faces and new coords per sample. The logic for this is _seriously_ complex (what needs to happen is very "asymmetric" wrt face, x/y under/overflow), further complicated by the fact that if the 4 samples are in a corner (meaning we only have actually 3 samples, and all 3 are on different faces) then falling off the edge is happening _both_ on x and y axis simultaneously. There was a noticeable performance hit in mesa's cubemap demo when seamless filtering was forced on (just below 10 percent or so in a debug build, when disabling all filtering hacks, otherwise it would probably be a bit more) and when always doing the logic, hence use a branch which only does it if any of the pixels in a quad (or in two quads) actually hit this. With that there was no measurable performance hit in the cubemap demo (neither in a debug nor release build), but this will vary (cubemap demo very rarely hits edges). Might also be different on other cpus, as this forces SoA sampling path which potentially can be quite a bit slower. Note that as for corners, this code gets all the 3 samples which actually exist right, and the 4th texel will simply be the same as one of the others, meaning that filter weights will be a bit wrong. This however should be enough for full OpenGL (but not d3d10) compliance. Reviewed-by: Jose Fonseca <> Reviewed-by: Brian Paul <>
2013-10-18translate_sse: Fix generated code argument handling for msabi on x86_64Jon TURNEY1-3/+11
translate_sse.c contains code for msabi on x86_64, but it appears to be untested. Currently arguments 1 and 2 passed to the generated code are moved as 32-bit quantities into the registers used by sysvabi, irrespective of the architecture. Since these may be pointers, they must be moved as 64-bit quantities to avoid truncation. Commit f4dd0991719ef3e2606920c5100b372181c60899 disabled translate_sse.c on MinGW x86_64, I don't know if it was due to this issue, or a different one... Signed-off-by: Jon TURNEY <> Reviewed-by: Brian Paul <>
2013-10-18rtasm: Cygwin uses the msabi calling convention on x86_64Jon TURNEY1-1/+1
Cygwin also uses the msabi calling convention on x86_64, not the sysvabi calling convention. Signed-off-by: Jon TURNEY <> Reviewed-by: Brian Paul <>
2013-10-18rtasm: The heap is NX on 64-bit Cygwin, so use the rtasm_exec_malloc() implementation which uses mmap()Jon TURNEY1-1/+1
The heap is NX on 64-bit Cygwin, so use the rtasm_exec_malloc() implementation which uses mmap() to allocate an anonymous page with execute permission, rather than the one which just uses malloc(). Signed-off-by: Jon TURNEY <> Reviewed-by: Brian Paul <>
2013-10-16Revert "scons: Fix build when rtti is disabled"José Fonseca1-2/+0
This reverts commit 94d05bf87a21bd364e84f699a0064e5fba58a6f9 as it has a few problems: - it breaks windows builds because env[LLVM_CXXFLAGS] is never set there - it is merging not only rtti, but the whole cxxflags (defines etc) which has proven to be a source of troubles (breaks debugging etc.)
2013-10-16cso: fix incorrect sampler view count in cso_restore_sampler_views()Brian Paul1-3/+6
During the recent bind_sampler_states() interface change in gallium we changed the CSO single_sampler_done() function so that if we were decreasing the number of sampler states bound in the driver, we'd null-out the "extra/old" sampler states to unbind them. See commit 1e2fbf265. However, we didn't make the corresponding fix for sampler views. This caused an assertion to fail in the svga driver which checked that the number of sampler views matched the number of sampler states. This patch fixes cso_restore_sampler_views() so that it nulls-out the extra/old sampler views if the number of new views is less than the number of current/old views. Reviewed-by: Jose Fonseca <>
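The restore logic described above can be sketched with a simplified state struct. The types and the name `restore_views` are hypothetical stand-ins, not the cso_context code; the point is that the count handed to the driver is the larger of the old and new counts, with the surplus slots filled with NULL so stale views get unbound.

```c
#include <stddef.h>

#define MAX_VIEWS 8   /* hypothetical limit for this sketch */

struct view_set {
   const void *views[MAX_VIEWS];
   unsigned count;
};

/* Restores `saved` into `current` and fills `to_driver` with the array that
 * would be passed to the driver. Returns how many slots must be set. */
static unsigned restore_views(struct view_set *current,
                              const struct view_set *saved,
                              const void *to_driver[MAX_VIEWS])
{
   unsigned num = current->count > saved->count ? current->count
                                                : saved->count;
   unsigned i;

   for (i = 0; i < saved->count; i++)
      to_driver[i] = saved->views[i];
   for (; i < num; i++)
      to_driver[i] = NULL;            /* unbind the extra/old views */

   for (i = 0; i < MAX_VIEWS; i++)
      current->views[i] = i < saved->count ? saved->views[i] : NULL;
   current->count = saved->count;

   return num;
}
```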
2013-10-15scons: Fix build when rtti is disabledAlexander von Gluck IV1-0/+2
* The rtti fix actually dug up a bug in the scons build scripts. * Autotools took the LLVM cpp and cxx flags, while scons only took the cpp flags. * This grabs the cxx flags and applies them where needed. We may want to make the same change for the llvm cpp flags in scons. * The only linux platform I can find with LLVM no-rtti is Ubuntu. * Fixes bug #70471 Tested-by: Vinson Lee <>
2013-10-15draw: make vs_slot signed.José Fonseca1-2/+4
Otherwise (vs_slot < 0) will never be true. Trivial.
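A small sketch of why the signedness matters. The lookup function and its -1 "not found" convention are assumptions made for illustration; the draw module's actual lookup is not shown in the log. Storing -1 in an unsigned variable yields `UINT_MAX`, so a later `< 0` test can never fire.

```c
#include <limits.h>

/* Hypothetical lookup: index of `wanted` in `semantics`, or -1 if absent. */
static int find_slot(const int *semantics, int count, int wanted)
{
   for (int i = 0; i < count; i++) {
      if (semantics[i] == wanted)
         return i;
   }
   return -1;
}

/* With a signed slot, the "not found" case is detectable. */
static int slot_is_missing(int vs_slot)
{
   return vs_slot < 0;
}

/* What an unsigned slot variable would hold after the same assignment:
 * -1 wraps to UINT_MAX, a large positive value. */
static unsigned as_unsigned_slot(int vs_slot)
{
   return (unsigned)vs_slot;
}
```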
2013-10-14build: remove forced -fno-rttiAlexander von Gluck IV1-6/+0
* As discussed on the mailing list, forced no-rtti breaks C++ public API's such as the Haiku C++ * -fno-rtti *can* be still set however instead of blindly forcing -fno-rtti, we can rely on the llvm-config --cppflags output. If the system llvm is built without rtti (default), the no-rtti flag will be present in llvm-config --cppflags (which we pick up on) If llvm is built with rtti (REQUIRES_RTTI=1), then -fno-rtti is removed from llvm-config --cppflags. * We could selectively add / remove rtti from various components, however mixing rtti and non-rtti code is tricky and could introduce missing symbols. * This needs impact tested. Reviewed-by: Francisco Jerez <>
2013-10-09util: Fix MinGW build.José Fonseca1-1/+1
_GNU_SOURCE appears to not be used reliably. Use _MSC_VER instead so that MSVC alone is affected.
2013-10-10gallivm: kill old per-quad face selection codeRoland Scheidegger1-475/+286
Not used since ages, and it wouldn't work at all with explicit derivatives now (not that it did before as it ignored them but now the code would just use the derivs pre-projected which would be quite random numbers). v2: also get rid of 3 helper functions no longer used. Reviewed-by: Jose Fonseca <>
2013-10-10gallivm: handle explicit derivatives for cubemapsRoland Scheidegger3-56/+235
They need some special handling. Quite complicated. Additionally, use the same code for implicit derivatives too if no_rho_approx and no_quad_lod is set, because it seems while generally it should be ok to use per quad lod for implicit derivatives there's at least some test which insists that in case of cubemaps the shared lod value MUST come from a pixel inside the primitive (due to the derivatives becoming different if a different larger major axis is chosen). v2: based on Brian's feedback, clean up code a bit. And use sign bit of major axis instead of pre-select s/t/r sign for coord mirroring (which should be the same in the end, saves 2 ands). Also fix two bugs with select/mirror of derivatives, the minor axes need to use major axis sign as well (instead of major derivative axis sign), and don't mistakenly use absolute values of major derivative and inverse major values. Reviewed-by: Jose Fonseca <>
2013-10-10gallivm: ignore rho approximation for cube mapsRoland Scheidegger1-30/+20
There's two reasons for this: 1) even when ignoring rho approximation for cube maps, the result is still not correct, but it's better as the max error at edges is now sqrt(2) instead of 2 (which was a full mip level), same as it is for ordinary 2d maps when doing rho approximations (so the error actually goes from factor 2 at edges and sqrt(2) completely inside a face to sqrt(2) at edges and 0 inside a face). 2) I want to repurpose rho_no_approx for cubemaps for fully correct cubemap derivatives (so don't need yet another debug var). Reviewed-by: Jose Fonseca <> Reviewed-by: Brian Paul <>
2013-10-10util/u_math: Fix C++ include of u_math.h on MSVC.José Fonseca1-1/+1
GNU C++ compiler declares the C99 lrint, etc. when _GNU_SOURCE is defined, but MSVC does not. Trivial.
2013-10-09llvmpipe: implement 64 bit mul opcodes in llvmpipeZack Rusin1-0/+60
Both the imul_hi and umul_hi are working with this patch. Signed-off-by: Zack Rusin <> Reviewed-by: José Fonseca <> Reviewed-by: Roland Scheidegger <> Reviewed-by: Brian Paul <>
2013-10-09gallium: Add support for 32x32 muls with 64 bit resultsZack Rusin4-0/+45
The code introduces two new 32bit integer multiplication opcodes which can be used to produce correct 64 bit results. GLSL, OpenCL and D3D10+ require them. We use two separate opcodes, because they match the behavior of GLSL and OpenCL, are a lot easier to add than a single opcode with multiple destinations and because there's not much (any) difference wrt code-generation. Signed-off-by: Zack Rusin <> Reviewed-by: José Fonseca <> Reviewed-by: Roland Scheidegger <> Reviewed-by: Brian Paul <>
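The semantics of the two "high half" multiplies can be sketched in scalar C by widening to 64 bits and keeping the upper 32. The function names mirror the opcode names used in this log but are plain C helpers written for illustration; the signed variant assumes that right-shifting a negative `int64_t` is an arithmetic shift, which holds on mainstream compilers.

```c
#include <stdint.h>

/* Upper 32 bits of the unsigned 32x32 -> 64 bit product. */
static uint32_t umul_hi(uint32_t a, uint32_t b)
{
   return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Upper 32 bits of the signed 32x32 -> 64 bit product.
 * Assumes >> on a negative int64_t is an arithmetic shift. */
static int32_t imul_hi(int32_t a, int32_t b)
{
   return (int32_t)(((int64_t)a * b) >> 32);
}
```

Combined with the ordinary 32-bit multiply, which yields the low half, these give the full 64-bit result in two registers.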
2013-10-09gallivm: support printing of 64 bit integersZack Rusin1-1/+6
Only 8 and 32 bit integers were supported before. Signed-off-by: Zack Rusin <> Reviewed-by: José Fonseca <>
2013-10-04util: when packing depth values, round to nearest.Matthew McClure2-4/+56
This patch adds the lrint, lrintf, llrint, and llrintf rounding utility functions. When packing unorm depth values, we will round to nearest. Reviewed-by: Roland Scheidegger <>
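The difference between truncating and rounding when packing a normalized depth value can be seen in a short sketch. The function names are hypothetical. The patch relies on the lrint family; to keep this example free of a math-library dependency it rounds by adding 0.5 before truncating, which for non-negative inputs differs from `lrint` only on exact ties under the default rounding mode.

```c
#include <stdint.h>

/* Pack z in [0, 1] to 24-bit UNORM by truncation (the old behavior
 * implied by the commit message). */
static uint32_t pack_z24_truncate(double z)
{
   return (uint32_t)(z * 0xffffff);
}

/* Pack z in [0, 1] to 24-bit UNORM, rounding to nearest.
 * Stand-in for an lrint-based implementation. */
static uint32_t pack_z24_nearest(double z)
{
   return (uint32_t)(z * 0xffffff + 0.5);
}
```

With z = 0.5 the scaled value is 8388607.5, so truncation yields 8388607 while rounding yields 8388608.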
2013-10-03cso: make sure all sampler states are set/clearedBrian Paul1-2/+9
2013-10-03vl: remove old bind_fragment_sampler_states() callsBrian Paul7-47/+17
2013-10-03util: remove old bind_fragment_sampler_states() calls from blitter codeBrian Paul1-22/+9
2013-10-03draw: remove use of old bind_fragment_sampler_states()Brian Paul2-82/+13
2013-10-03cso: remove use of old bind_*_sampler_states() functionsBrian Paul1-31/+3
2013-10-03vl: use pipe_context::bind_sampler_states() if non-nullBrian Paul7-8/+49
2013-10-03util: use pipe_context::bind_sampler_states() if non-nullBrian Paul1-6/+22
2013-10-03draw: use pipe_context::bind_sampler_states() if non-nullBrian Paul2-7/+97
2013-10-03cso: use pipe_context::bind_sampler_states() if non-nullBrian Paul1-21/+44
2013-10-03draw: rename bind_sampler_states variablesBrian Paul2-19/+19
Put 'fragment' in the names. In preparation for upcoming function renaming.
2013-09-30util/u_format: Assert that format block size is at least 1 byte.Vinson Lee1-1/+6
The block size for all formats is currently at least 1 byte. Add an assertion for this. This should silence several Coverity "Division or modulo by zero" defects. Signed-off-by: Vinson Lee <> Reviewed-by: Brian Paul <>
2013-09-30draw: Add a null check for draw.Vinson Lee1-1/+1
There is an earlier null check for draw so draw could be null here as well. Fixes "Dereference after null check" defect reported by Coverity. Signed-off-by: Vinson Lee <> Reviewed-by: Brian Paul <>