path: root/src/gallium
Age | Commit message | Author | Files | Lines
2013-08-19 | nv50: allow non-nv12 buffers to be created, just pass them through to vl | Ilia Mirkin | 1 | -5/+1
Since we expose non-NV12 formats as supported when there is no decoder profile selected, make sure that those formats are actually allowed to be allocated. Signed-off-by: Ilia Mirkin <> Tested-by: Emil Velikov <> Cc: "9.2" <> (cherry picked from commit a8346a2f52d08233d376db3aa8205d0b2cc74318)
2013-08-16 | nv30: remove no-longer-used formats from table | Ilia Mirkin | 1 | -3/+0
Commit 14ee790df77 removed the formats from the vtxfmt_table but forgot to also update the info_table. Signed-off-by: Ilia Mirkin <> Cc: "9.2 and 9.1" <> (cherry picked from commit c1a6f59b20dab380b77ad1375062f9987cad9183)
2013-08-15 | radeonsi: Don't leave gaps between position exports from vertex shader | Michel Dänzer | 3 | -59/+83
If the vertex shader exports clip distances but not point size, use position exports 1/2 instead of 2/3 for the clip distances. Fixes geometry corruption in that case. Bugzilla: Cc: Reviewed-by: Tom Stellard <> (cherry picked from commit b00269aa5887b88d2e037d6bfa374779902f8743)
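The export numbering described in this commit message can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical and are not radeonsi functions, and it assumes up to eight clip distances packed four per export, with export 0 always holding the vertex position.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: index of the first position export that carries
 * clip distances. Export 0 is assumed to hold the vertex position. */
static unsigned first_clip_dist_export(bool writes_point_size)
{
    /* Export 1 is only occupied when the shader writes point size;
     * otherwise the clip distances move down to avoid leaving a gap. */
    return writes_point_size ? 2u : 1u;
}

/* Hypothetical helper: export index holding clip distance 'i' (0..7),
 * assuming four clip distances per export. */
static unsigned clip_dist_export(bool writes_point_size, unsigned i)
{
    assert(i < 8);
    return first_clip_dist_export(writes_point_size) + i / 4;
}
```

Without point size the clip distances land in exports 1/2; with point size they stay in exports 2/3.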
2013-08-15 | llvmpipe: fix stencil bug if we have both stencil and depth tests | Roland Scheidegger | 1 | -14/+13
This is a very well hidden bug found by accident (only the fixed glean tstencil2 test so far seems to hit it). We must use new mask with combined s_pass values and orig_mask values for zpass/zfail stencil ops, otherwise both the sfail op and one of zpass/zfail op are applied (probably not hit in most tests because some of the ops tend to be KEEP usually). Note: this is a candidate for the 9.2 branch. Reviewed-by: Zack Rusin <> (cherry picked from commit abdd32dcd5569c7caa393acd21753e03de24047f)
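The mask combination described above might look like the following scalar sketch. It is not the llvmpipe code (which generates LLVM IR over vectors); the struct and function names are hypothetical, and each bit of the 8-bit masks stands for one pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: which pixels receive which stencil op. */
struct stencil_masks {
    uint8_t sfail; /* stencil test failed */
    uint8_t zfail; /* stencil passed, depth failed */
    uint8_t zpass; /* stencil and depth both passed */
};

static struct stencil_masks
select_stencil_ops(uint8_t orig_mask, uint8_t s_pass, uint8_t z_pass)
{
    struct stencil_masks m;
    /* Combine the live-pixel mask with the stencil result. Using only
     * orig_mask for the z ops would let a pixel that failed the stencil
     * test also receive a zfail or zpass op. */
    uint8_t s_pass_mask = orig_mask & s_pass;

    m.sfail = orig_mask & (uint8_t)~s_pass;
    m.zfail = s_pass_mask & (uint8_t)~z_pass;
    m.zpass = s_pass_mask & z_pass;

    /* Every live pixel gets exactly one of the three ops. */
    assert((m.sfail | m.zfail | m.zpass) == orig_mask);
    return m;
}
```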
2013-08-15 | nv30: U8_USCALED only works for size 4 | Ilia Mirkin | 1 | -3/+0
See for a sample program. Changing it to use a vec4 makes it work. Remove the unsupported formats. Signed-off-by: Ilia Mirkin <> Cc: "9.2 and 9.1" <> (cherry picked from commit 14ee790df77c810f187860a8d51096173ff39fcf)
2013-08-15 | draw: always call util_cpu_detect() in draw context creation. | Roland Scheidegger | 1 | -1/+4
Since disabling denorms in draw_vbo() we require the util_cpu_caps to be initialized there. Hence add another util_cpu_detect() call in draw_create_context() which should ensure this. (There is another call in draw_get_option_use_llvm() which only gets called with x86 (not x86_64) but calling it always there wouldn't help since it most likely wouldn't get called when compiling without llvm, so leave it alone there.) This fixes (Because util_cpu_caps wasn't initialized when first calling util_fpstate_get() hence it returning zero, but it would later get initialized by rtasm translate code hence when draw call returned it unmasked all exceptions by calling util_fpstate_set(). This was happening only with DRAW_USE_LLVM=0 or not compiling with llvm, otherwise the llvm init code was calling it on time too.) Reviewed-by: Jose Fonseca <> Reviewed-by: Zack Rusin <> Tested-by: Vinson Lee <>
2013-08-14 | radeon/llvm: Add missing "%s" format string to fprintf. | Jon Severinsson | 1 | -1/+1
This fixes a compilation warning with -Wformat-security. CC: "9.2" <> Reviewed-by: Tom Stellard <> (cherry picked from commit 9298f537a72dc2323898e91c40894f55e3c4754a)
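For context, the pattern that -Wformat-security warns about is passing a non-literal string as the format argument. A minimal sketch of the safe form, using a hypothetical helper name rather than the actual radeon/llvm code:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical logging helper. Writing fprintf(out, text) instead would
 * treat any '%' inside 'text' as a conversion specifier. */
static int log_text(FILE *out, const char *text)
{
    assert(out != NULL && text != NULL);
    /* The literal "%s" format prints 'text' verbatim. */
    return fprintf(out, "%s", text);
}
```

With the explicit format, a message such as "100%d" is printed as-is instead of being interpreted.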
2013-08-13 | r600g/sb: use MULADD workaround on R7xx for MULADD_IEEE | Vadim Girlin | 1 | -1/+2
Looks like the same issue that was seen with MULADD in trans slot on R7xx also affects MULADD_IEEE (maybe all OP3 instructions, and MULADD is just the most frequently used one?). So the workaround is to not allow affected instructions to be placed into the trans slot. Fixes Signed-off-by: Vadim Girlin <> Cc: "9.2" <> (cherry picked from commit 17bb96b03d340c0aee8e1a332fdcd695e9179486)
2013-08-09 | r600g: disable GPUVM by default | Alex Deucher | 1 | -1/+1
Cayman and trinity systems still seem to suffer from stability problems with GPUVM. This also fixes compute on these asics. It can still be enabled for testing by setting env var RADEON_VA=true. Fixes: Signed-off-by: Alex Deucher <> CC: "9.2" <> CC: "9.1" <> Reviewed-by: Christian König <> (cherry picked from commit c88783047e2a0faa39d6f3ac6fbd3f26a480d5d3)
2013-08-07 | r300g/compiler/tests: Pass the required LDFLAGS when building the test program | Tom Stellard | 1 | -1/+2
CC: "9.2 <>" (cherry picked from commit d0c13fba172c56c3670bc8bebf453d98e455d482)
2013-08-07 | r300g/compiler/tests: Fix segfault | Tom Stellard | 3 | -4/+4
CC: "9.2" <> (cherry picked from commit d691ba4d9412b68dd56a300549bafc733e1bb7ee)
2013-08-06 | nv50: handle pure integer vertex attributes | Emil Velikov | 2 | -2/+14
And as a side effect fix a crash in the following piglit test: general/attribs GL3 Signed-off-by: Emil Velikov <> Cc: "9.2 and 9.1" (cherry picked from commit 07c8f7a6f8dfe724c1ae92ec45dd04532b6fd453)
2013-08-06 | st/dri: add a new driconf option disable_shader_bit_encoding for Unigine | Marek Olšák | 4 | -1/+6
Now Unigine Heaven 3.0 finally works with r600g. Reviewed-by: Kenneth Graunke <> Reviewed-by: Brian Paul <> (cherry picked from commit 7568a89500c35f14cbd397f87c77acc915afc672)
2013-08-06 | mesa,glsl,st/dri: add a new driconf option force_glsl_version for Unigine | Marek Olšák | 4 | -7/+12
See documentation in mtypes.h. Reviewed-by: Kenneth Graunke <> Reviewed-by: Brian Paul <> Reviewed-by: Ian Romanick <> (cherry picked from commit 0f6a7cb00c86fbdb415b01450bb1ece8cfe1e31d)
2013-08-06 | st/dri: remove more unused driconf options | Marek Olšák | 1 | -6/+1
vblank_mode is read by dri_util.c and falls under the "dri2" driver name, which is not connected to the actual Mesa/Gallium driver in any way. Reviewed-by: Brian Paul <> (cherry picked from commit 772070527f6a6db72505575d6571470280a131ab)
2013-08-06 | st/dri: implement the driconf option force_s3tc_enable properly | Marek Olšák | 5 | -12/+23
Reviewed-by: Brian Paul <> (cherry picked from commit 83dbe61ea4308638f1c041d2f550f0f719e36967)
2013-08-06 | driconf: remove the unused option allow_large_textures | Marek Olšák | 1 | -2/+1
Reviewed-by: Kenneth Graunke <> Reviewed-by: Brian Paul <> Reviewed-by: Ian Romanick <> (cherry picked from commit f27f3a4b15449e9ba3c0ee4e01b9db753e48e55f)
2013-08-06 | st/dri: support the driconf option disable_blend_func_extended | Marek Olšák | 4 | -3/+8
This is needed for Unigine. Reviewed-by: Kenneth Graunke <> Reviewed-by: Brian Paul <> (cherry picked from commit 2acc27cc6de5cae395d19017daf86ddd8de704cf)
2013-08-06 | st/osmesa: initialize disable_glsl_line_continuations | Marek Olšák | 1 | -0/+1
Reviewed-by: Kenneth Graunke <> Reviewed-by: Brian Paul <> (cherry picked from commit 71e0b5d688e8442c4c19d905db84caad94314d5e)
2013-08-06 | radeonsi: Number of SGPRs retrieved from LLVM already includes VCC | Michel Dänzer | 1 | -8/+8
Fixes spurious 'Assertion `num_sgprs <= 104' failed.' with shaders using all 104 SGPRs. Cc: Reviewed-by: Christian König <> (cherry picked from commit 46b6f79fea1042604ebb61a8214188fb807316ff)
2013-08-05 | nvc0: properly align NVE4_COMPUTE_MP_TEMP_SIZE | Samuel Pitoiset | 2 | -2/+3
MP_TEMP_SIZE must be aligned to 0x8000, while TEMP_SIZE on NVE4_3D must be aligned to 0x20000, so perform both alignments to be sure we allocate enough space (actually the bo will most likely use 128 KiB pages and not aligning to that would be a waste anyway). Cc: "9.2" (cherry picked from commit ef6d5ee9f34bce89c8bb8ff001be4c70a2a5421d)
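The double alignment mentioned above can be sketched like this. The function names are hypothetical and this is not the nvc0 driver code; only the two alignment values, 0x8000 and 0x20000, come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Round 'value' up to the next multiple of a power-of-two 'alignment'. */
static uint64_t align_pot(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: a size that satisfies both requirements. */
static uint64_t temp_size_for_both(uint64_t size)
{
    size = align_pot(size, 0x8000);   /* MP_TEMP_SIZE requirement */
    size = align_pot(size, 0x20000);  /* TEMP_SIZE on NVE4_3D requirement */
    return size;
}
```

Since 0x20000 (128 KiB) is a multiple of 0x8000, the second step alone already implies the first; performing both simply mirrors the wording of the commit message.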
2013-08-05 | gallium/postprocessing: convert blits to pipe->blit | Marek Olšák | 5 | -25/+54
PP saves current states to cso_context and then util_blit_pixels does the same. cso_context doesn't like that and the original state is not correctly restored. Cc: Reviewed-by: Brian Paul <> (cherry picked from commit 4c89ec1f69c0cba995cb4aa939469ead82c6a8ec)
2013-08-05 | gallium/postprocessing: fix shader parsing | Marek Olšák | 1 | -2/+2
tokens was converted to a pointer, which made the Elements macro return 1. Broken by e87fc11cac696881469a57955af2ac7b4929a2c7. Cc: Reviewed-by: Kenneth Graunke <> Reviewed-by: Brian Paul <> (cherry picked from commit c84e8d039ec9c7532b25757012aa3828f4f8a70d)
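The pitfall behind this fix is generic C behaviour: an element-count macro based on sizeof only works on true arrays. The sketch below uses a stand-in macro named ELEMENTS and a hypothetical table, not the postprocessing code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an element-count macro; only valid for real arrays. */
#define ELEMENTS(x) (sizeof(x) / sizeof((x)[0]))

static const unsigned table[6] = { 1, 2, 3, 4, 5, 6 };

/* Applied to the array itself, the macro yields the element count. */
static size_t count_from_array(void)
{
    return ELEMENTS(table);
}

/* Applied to a pointer, it yields sizeof(pointer) / sizeof(element),
 * which has nothing to do with the number of elements. */
static size_t count_from_pointer(void)
{
    const unsigned *p = table;
    return ELEMENTS(p);
}
```

The exact value in the pointer case depends on pointer and element size (the commit reports 1); either way it is not the array length, and compilers may warn about the division.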
2013-08-05 | Revert "r300g: Give CLIP_DISABLE another try" | Marek Olšák | 2 | -3/+2
This reverts commit e866bd1adea2c3b4971ad68e69c644752f2ab7b6. Cc: (cherry picked from commit 4dfe1a0df56d084b6a29fe423afe0535abec29e9)
2013-08-05 | gallium/vl: add prime support | Dave Airlie | 1 | -1/+19
This fixes the dri2 opening code to check whether DRI_PRIME is set and to pick the correct drm device path to open. Together with a change to libvdpau, this at least allows vdpauinfo to work. Martin Peres tested with nouveau, and there seems to be a further issue with final displaying (it only works sometimes), but this patch is at least necessary to help debug further. Signed-off-by: Dave Airlie <> Cc: Reviewed-by: Christian König <> Bugzilla: Tested-by: Armin K. <> (cherry picked from commit 19338157c97becac1e61cc51dc0904ddfab8e9da)
2013-08-05 | clover: Respect kernel argument alignment restrictions. | Francisco Jerez | 2 | -2/+19
Cc: Reviewed-by: Tom Stellard <> (cherry picked from commit df530829f757a8968389427eb26f45a0d46623fa)
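Respecting an argument's alignment when packing it into an input buffer usually means padding the write offset first. The following is a rough sketch of that idea under stated assumptions: clover itself is C++ and its data structures differ, and the struct and function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size kernel input buffer. */
struct arg_buffer {
    uint8_t data[64];
    size_t size; /* bytes used so far */
};

/* Append one argument at the next offset that is a multiple of
 * 'alignment' (a power of two) and return that offset. */
static size_t append_arg(struct arg_buffer *buf, const void *value,
                         size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    size_t offset = (buf->size + alignment - 1) & ~(alignment - 1);
    assert(offset + size <= sizeof(buf->data));

    memset(buf->data + buf->size, 0, offset - buf->size); /* zero padding */
    memcpy(buf->data + offset, value, size);
    buf->size = offset + size;
    return offset;
}
```

A 1-byte argument followed by a 4-byte argument with 4-byte alignment ends up at offsets 0 and 4, with three bytes of padding in between.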
2013-08-05 | clover: Extend kernel arguments for differing host and device data types. | Francisco Jerez | 2 | -4/+56
Loosely based on a similar patch by Tom Stellard. Cc: Reviewed-by: Tom Stellard <> (cherry picked from commit f64c0ca692d3e8c78dd9ae1f015f58f1dfc1c760)
2013-08-05 | clover: Byte-swap kernel arguments when host and device endianness differ. | Francisco Jerez | 1 | -37/+65
Cc: Reviewed-by: Tom Stellard <> (cherry picked from commit 829caf410e2c2c6f79902199da5a7900abc16129)
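As an illustration of the byte-swapping idea, here is a minimal sketch for a single scalar argument. It is not clover's implementation (which is C++); the function names are hypothetical, and vector arguments would presumably need the swap applied per component rather than to the whole value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Reverse the byte order of a scalar in place. */
static void byteswap_in_place(unsigned char *bytes, size_t size)
{
    for (size_t i = 0; i < size / 2; i++) {
        unsigned char tmp = bytes[i];
        bytes[i] = bytes[size - 1 - i];
        bytes[size - 1 - i] = tmp;
    }
}

/* Hypothetical helper: copy a scalar argument into the device input
 * buffer, swapping bytes only when host and device endianness differ. */
static void marshal_scalar(unsigned char *dst, const void *src, size_t size,
                           bool host_is_little, bool device_is_little)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, size);
    if (host_is_little != device_is_little)
        byteswap_in_place(dst, size);
}
```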
2013-08-05 | clover: Add kernel argument fields to allow differing host/target data types. | Francisco Jerez | 1 | -2/+23
Loosely based on a similar patch by Tom Stellard. Cc: Reviewed-by: Tom Stellard <> (cherry picked from commit 2265b40e377cc2c9d1091498df2aede5df2ff684)
2013-08-05 | clover: Pass corresponding module::argument to kernel::argument::bind(). | Francisco Jerez | 2 | -84/+61
And remove size information from most kernel::argument derived classes; it's no longer going to be necessary. Cc: Reviewed-by: Tom Stellard <> (cherry picked from commit a3dcab43c6b6fed2f35aa0e802be6398985f100c)
2013-08-05 | clover: Return correct value for CL_DEVICE_ENDIAN_LITTLE | Tom Stellard | 3 | -1/+8
Query the driver using PIPE_CAP_ENDIANNESS rather than always returning true. Cc: Reviewed-by: Francisco Jerez <> (cherry picked from commit 8c9d3c62f60a2819948bdfb005600cdc10aa2547)
2013-08-05 | gallium: Add PIPE_CAP_ENDIANNESS | Tom Stellard | 14 | -1/+38
Cc: [ Francisco Jerez: Fix "PIPE_ENDIAN_SMALL" in the documentation, define PIPE_ENDIAN_NATIVE. ] (cherry picked from commit 4e90bc9a12bea93c6b5522abe8151a8cfe1d6d1d)
2013-08-03 | nvc0: force use of correct firmware file | Maarten Lankhorst | 1 | -1/+1
Signed-off-by: Maarten Lankhorst <> (cherry picked from commit e847b5ae066bf9a209dad482fcc664f944983633)
2013-08-03 | nv50: fix some h264 interlaced decoding on vp2 | Ilia Mirkin | 2 | -7/+8
Some videos specify mb_adaptive_frame_field_flag instead of field_pic_flag. This implies that the pic height needs to be halved, and this field needs to be passed to the VP engine. Cc: "9.2" Signed-off-by: Ilia Mirkin <> (cherry picked from commit 8edb79f1ef95581c20ed0c3dc49aabe99d7f072a)
2013-07-25 | nv50,nvc0: s/uint16/uint32 for constant buffer offset | Christoph Bumiller | 2 | -2/+2
Looks like a thinko, "Hey, constant buffers can be at most 64 KiB in size, offset can't be larger." But it can, of course. I think piglit lacks a test for UBO and BindBufferRange that tests if it actually works.
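The truncation at issue is easy to reproduce in isolation: the bound range may be limited to 64 KiB, but its offset into a larger buffer is not. A small sketch with hypothetical function names, not the nv50/nvc0 code:

```c
#include <assert.h>
#include <stdint.h>

/* Storing the offset in 16 bits silently drops everything above 0xFFFF,
 * even though the range size itself fits the 64 KiB limit. */
static uint16_t store_offset_16(uint32_t offset, uint32_t size)
{
    assert(size <= 0x10000);
    return (uint16_t)offset;
}

/* A 32-bit field keeps the offset intact. */
static uint32_t store_offset_32(uint32_t offset, uint32_t size)
{
    assert(size <= 0x10000);
    return offset;
}
```

Binding 256 bytes at offset 0x10100 keeps the offset with the 32-bit field, while the 16-bit field ends up with 0x0100.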
2013-07-18 | llvmpipe: clamp inputs for srgb render buffers | Roland Scheidegger | 1 | -0/+35
Usually with fixed point renderbuffers clamping is done as part of conversion. However, since we blend in float format, we essentially skip all conversion steps pre-blend but since this is still a fixed point renderbuffer we must still clamp the inputs in this case. Makes no difference for piglit though. Obviously we could skip this if fragment color clamping is enabled, but a) this is deprecated in OpenGL (d3d never had it) and b) we don't support it natively so it gets baked into the shader. Also add some comment about logic ops being broken for srgb, luckily no test tries to do that as there's no easy fix... Reviewed-by: Jose Fonseca <> Reviewed-by: Zack Rusin <>
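To illustrate why the pre-blend clamp matters when blending happens in float, here is a single-channel sketch. The function names are hypothetical, the blend equation (SRC_ALPHA / ONE_MINUS_SRC_ALPHA) is chosen only as an example, and the later conversion to sRGB is left out.

```c
#include <assert.h>

/* Clamp x into [lo, hi]. */
static float clampf(float x, float lo, float hi)
{
    assert(lo <= hi);
    if (x < lo)
        return lo;
    if (x > hi)
        return hi;
    return x;
}

/* Blend one channel in float, clamping the fragment inputs to [0, 1]
 * first, as a fixed-point render target would require. */
static float blend_clamped(float src, float src_alpha, float dst)
{
    src = clampf(src, 0.0f, 1.0f);
    src_alpha = clampf(src_alpha, 0.0f, 1.0f);
    return src * src_alpha + dst * (1.0f - src_alpha);
}
```

With src = 2.0, src_alpha = 1.5 and dst = 0.5 the clamped blend yields 1.0, whereas blending the raw values would give 2.75 before any final conversion.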
2013-07-18 | llvmpipe: fix blending with SRC_ALPHA_SATURATE with some formats without alpha | Roland Scheidegger | 2 | -8/+26
We were fixing up the blend factor to ZERO, however this only works correctly with fixed point render buffers where the input values are clamped to 0/1 (because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped inputs). Haven't seen any failure anywhere due to that with fixed point SNORM buffers (which clamp inputs to -1/1) but it should apply there as well (snorm blending is rare, even opengl 4.3 doesn't require snorm rendertargets at all, d3d10 requires them but they are not blendable). Doesn't look like piglit hits this though (some internal testing hits the float case at least). (With legacy OpenGL we could theoretically still use the fixup to zero if the fragment color clamp is enabled, but we can't detect that easily since we don't support native clamping hence it gets baked into the shader.) Reviewed-by: Jose Fonseca <> Reviewed-by: Zack Rusin <>
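The arithmetic behind this can be checked directly. For a format without alpha, destination alpha reads as 1, so the factor min(As, 1-Ad) becomes min(As, 0). A small sketch with a hypothetical function name:

```c
#include <assert.h>

/* SRC_ALPHA_SATURATE factor as given in the commit message:
 * min(As, 1 - Ad). */
static float src_alpha_saturate(float src_a, float dst_a)
{
    float inv_dst_a = 1.0f - dst_a;
    return src_a < inv_dst_a ? src_a : inv_dst_a;
}
```

With dst_a = 1 and a clamped source alpha such as 0.25 the factor is 0, so replacing it with ZERO is safe. With an unclamped source alpha such as -0.5 the factor is -0.5, which is why the fixup does not hold for float render buffers.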
2013-07-18 | r600g: use WAIT_3D_IDLE before using CP DMA | Marek Olšák | 2 | -0/+2
I broke this with 7948ed1250cae78ae1b22dbce4ab23aceacc6159 for r700 at least.
2013-07-18 | r300g: make use of gallium's os_get_process_name() | Jonathan Gray | 1 | -1/+6
Lets the code compile on non Linux systems. Signed-off-by: Jonathan Gray <> Signed-off-by: Marek Olšák <>
2013-07-18 | nv50: H.264/MPEG2 decoding support via VP2, available on NV84-NV96, NVA0 | Ilia Mirkin | 11 | -3/+1815
Adds H.264 and MPEG2 codec support via VP2, using firmware from the blob. Acceleration is supported at the bitstream level for H.264 and IDCT level for MPEG2.
Known issues:
- H.264 interlaced doesn't render properly
- H.264 shows very occasional artifacts on a small fraction of videos
- MPEG2 + VDPAU shows frequent but small artifacts, which aren't there when using XvMC on the same videos
Signed-off-by: Ilia Mirkin <>
2013-07-17 | gallivm: (trivial) simplify lp_build_cos/lp_build_sin a tiny bit | Roland Scheidegger | 1 | -7/+6
Use "or" instead of "add" (this is a classic select sequence, which at least newer llvm versions can actually recognize (3.2+?), and the "add" might prevent that - and we really don't want an add instead of an or with avx if it isn't recognized (even without avx logic ops might be cheaper)). Reviewed-by: Jose Fonseca <>
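The "classic select sequence" referred to here is the bitwise select (a & mask) | (b & ~mask). The scalar sketch below is not the gallivm code (which builds vector LLVM IR); the function names are hypothetical. It shows that "add" gives the same result only because the two masked operands never share a set bit.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise select: take bits of 'a' where mask is 1, bits of 'b' elsewhere. */
static uint32_t select_bits(uint32_t mask, uint32_t a, uint32_t b)
{
    return (a & mask) | (b & ~mask);
}

/* Same result computed with an add. This works because the two operands
 * have disjoint bits, so no carry can occur, but it hides the select
 * pattern from a compiler looking for and/andnot/or. */
static uint32_t select_bits_add(uint32_t mask, uint32_t a, uint32_t b)
{
    assert(((a & mask) & (b & ~mask)) == 0);
    return (a & mask) + (b & ~mask);
}
```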
2013-07-17 | util/u_format_s3tc: handle srgb formats correctly. | Roland Scheidegger | 2 | -185/+254
Instead of just ignoring the srgb/linear conversions, simply call the corresponding conversion functions, for all of pack/unpack/fetch, both for float and unorm8 versions (though some don't make a whole lot of sense, i.e. unorm8/unorm8 srgb/linear combinations). Refactored some functions a bit so don't have to duplicate all the code (there's a slight change for packing dxt1_rgb, as there will now be always 4 components initialized and sent to the external compression function so the same code can be used for all, the quite horrid and ad-hoc interface (by now) should always have worked with that). Fixes llvmpipe/softpipe piglit texwrap GL_EXT_texture_sRGB-s3tc. Reviewed-by: Jose Fonseca <>
2013-07-17 | r600g/sb: improve alu packing on cayman | Vadim Girlin | 2 | -15/+89
Scheduler/register allocator in r600-sb was developed and optimized on evergreen (VLIW-5) hardware, so currently it's not optimal for VLIW-4 chips. This patch should improve performance on cayman gpus due to better alu packing, but also it tends to increase register usage, so overall positive effect on performance has to be proven by real benchmarks yet.
Some results with bfgminer kernel on cayman:
source bytecode: 60 gprs, 3905 alu groups
sbcl before the patch: 45 gprs, 4088 alu groups
sbcl with this patch: 55 gprs, 3474 alu groups
Signed-off-by: Vadim Girlin <>
2013-07-17 | r600g/sb: fix handling of new multislot instructions on cayman | Vadim Girlin | 3 | -5/+6
Ex-scalar instructions that became multislot on cayman do replicate result to all channels - handle them similar to DOT4. Signed-off-by: Vadim Girlin <>
2013-07-17 | r600g/sb: fix debug dump code in scheduler | Vadim Girlin | 1 | -4/+5
Update the stale debug code for other changes related to debug output. Signed-off-by: Vadim Girlin <>
2013-07-17 | r600g/sb: fix initial register allocation | Vadim Girlin | 1 | -0/+1
Mark values that are members of the 'same register' constraint as preallocated in ra_init pass, this will prevent incorrect reallocation in scheduler in some cases. Should fix Signed-off-by: Vadim Girlin <>
2013-07-17 | r600g/sb: move chip & class name functions to sb_context | Vadim Girlin | 4 | -53/+55
Signed-off-by: Vadim Girlin <>
2013-07-17 | r600g/sb: fix handling of PS in source bytecode on cayman | Vadim Girlin | 1 | -0/+5
Actually PS doesn't make sense for cayman and isn't even mentioned in cayman docs, but llvm backend currently uses it in bytecode and, assuming that hw seems to be mostly ok with it, this will allow sb to parse such source bytecode correctly. Signed-off-by: Vadim Girlin <>
2013-07-17 | r600g/sb: Initialize ra_checker member variables. | Vinson Lee | 1 | -1/+1
Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <>
2013-07-17 | gallium/util: use explicitly sized types for {un, }pack_rgba_{s, u}int | Emil Velikov | 2 | -8/+8
Every function but the above four uses explicitly sized types for their src and dst arguments. Even fetch_rgba_{s,u}int follows the convention. Signed-off-by: Emil Velikov <> Signed-off-by: Marek Olšák <>