path: root/src/gallium/auxiliary
Age | Commit message | Author | Files | Lines
2013-09-30 | gallium: include u_surface.h instead of u_rect.h | Brian Paul | 4 | -9/+3
u_rect.h was including u_surface.h just to avoid touching a bunch of other source files after some functions were moved from u_rect.h to u_surface.h. This patch cleans up that hack. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-25 | draw/clip: don't emit so many empty triangles | Zack Rusin | 1 | -0/+39
Compress empty triangles (don't emit more than one in a row) and never emit empty triangles if we already generated a triangle covering a non-null area. We can't skip all null-triangles because c_primitives expects the ones generated from vertices exactly at the clipping plane to be emitted. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
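The emission rule in this message can be sketched as a small filter. This is only an illustration, not the code from the patch: `struct tri_filter`, `tri_filter_should_emit` and `count_emitted` are hypothetical names, and the idea that the state is reset once per clipped primitive is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filter state, not the draw module's real data structure. */
struct tri_filter {
   bool emitted_nonempty;  /* a triangle with non-zero area was emitted */
   bool last_was_empty;    /* the previously emitted triangle was empty */
};

/* Returns true if the triangle should be emitted. */
static bool
tri_filter_should_emit(struct tri_filter *f, bool is_empty)
{
   assert(f != NULL);
   if (!is_empty) {
      f->emitted_nonempty = true;
      f->last_was_empty = false;
      return true;
   }
   /* Drop an empty triangle after any real one, or after another empty one. */
   if (f->emitted_nonempty || f->last_was_empty)
      return false;
   f->last_was_empty = true;
   return true;
}

/* Runs the filter over one primitive's triangles and counts what survives. */
static size_t
count_emitted(const bool *is_empty, size_t n)
{
   struct tri_filter f = { false, false };
   size_t emitted = 0;
   for (size_t i = 0; i < n; i++)
      if (tri_filter_should_emit(&f, is_empty[i]))
         emitted++;
   return emitted;
}
```

With the sequence empty, empty, non-empty, empty, only the first empty triangle and the non-empty one pass the filter.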
2013-09-25 | vl/mpeg12: use new vlc function to search for start codes | Christian König | 1 | -1/+1
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-09-25 | vl/vlc: add fast forward search for byte value | Christian König | 1 | -10/+74
Commonly used to find start codes, and has far less overhead than searching manually. Signed-off-by: Christian König <christian.koenig@amd.com>
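As a rough illustration of why a byte search helps with start codes, the sketch below uses `memchr` to jump to each 0x01 byte and then checks for the two preceding zero bytes of an MPEG-style 00 00 01 prefix. It works on a plain byte buffer rather than the vl_vlc bitstream state, and `find_start_code` is a hypothetical name, not the function added by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first 00 00 01 prefix in buf, or -1 if none. */
static long
find_start_code(const unsigned char *buf, size_t len)
{
   size_t pos = 2;  /* a 0x01 before index 2 cannot end a 3-byte prefix */

   assert(buf != NULL || len == 0);
   while (pos < len) {
      /* Fast-forward to the next 0x01 byte instead of testing every byte. */
      const unsigned char *hit = memchr(buf + pos, 0x01, len - pos);
      if (hit == NULL)
         return -1;
      pos = (size_t)(hit - buf);
      if (buf[pos - 1] == 0x00 && buf[pos - 2] == 0x00)
         return (long)(pos - 2);
      pos++;
   }
   return -1;
}
```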
2013-09-20 | draw: Ensure draw_pt_middle_end::bind_parameters is never NULL. | José Fonseca | 2 | -0/+15
Prevents calling NULL pointer with softpipe in certain cases. Trivial.
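One common way to guarantee that a callback is never NULL is to install a no-op stub. The sketch below shows that pattern with a hypothetical `struct middle_end`; it is not the layout of `draw_pt_middle_end`, and the patch may achieve the same guarantee differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a middle end with one optional hook. */
struct middle_end {
   void (*bind_parameters)(struct middle_end *me);
};

/* Stub used when a middle end has nothing to bind. */
static void
bind_parameters_noop(struct middle_end *me)
{
   (void)me;
}

/* Installs the given hook, falling back to the stub when none is supplied. */
static void
middle_end_init(struct middle_end *me,
                void (*bind)(struct middle_end *me))
{
   assert(me != NULL);
   me->bind_parameters = bind ? bind : bind_parameters_noop;
}
```

Callers can then invoke `me->bind_parameters(me)` unconditionally.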
2013-09-19 | gallivm: adjust wrap mode to CLAMP_TO_EDGE always for cube maps. | Roland Scheidegger | 1 | -3/+7
Technically without seamless filtering enabled GL allows any wrap mode, which made sense when supporting true borders (can get seamless effect with border and CLAMP_TO_BORDER), but gallium doesn't support borders and d3d9 requires wrap modes to be ignored and it's a pain to fix up the sampler state (as it makes it texture dependent). It is difficult to imagine a situation where an app really wants another behavior so just cheat here. (It looks like some graphics hw (intel) actually requires this too hence it should be safe.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-09-18 | util/u_blit: Implement util_blit_pixels via pipe_context::blit. | José Fonseca | 1 | -410/+37
This removes a lot of code, but not everything, as util_blit_pixels_tex is still useful when one needs to override pipe_sampler_view::swizzle_?. Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-18 | util/u_blit: Support blits from cubemaps. | José Fonseca | 2 | -3/+32
By calling util_map_texcoords2d_onto_cubemap. A new parameter for util_blit_pixels_tex is necessary, as pipe_sampler_view::first_layer is always supposed to point to the first face when sampling from cubemaps. Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-18 | gallivm: some bits of seamless cube filtering implementation | Roland Scheidegger | 3 | -14/+29
Simply adjust wrap mode to clamp_to_edge. This is all that's needed for a correct implementation for nearest filtering, and it's way better than using repeat wrap for instance for linear filtering (though obviously this doesn't actually do seamless filtering). v2: fix s/t wrap not r/s... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-09-12 | os: First check for __GLIBC__ and then for PIPE_OS_BSD | Andreas Boll | 1 | -4/+4
Fixes FTBFS on kfreebsd-*. Debian GNU/kFreeBSD doesn't provide getprogname() since it uses stdlib.h from glibc. Instead it provides program_invocation_short_name from glibc. You can find the same order in src/mesa/drivers/dri/common/xmlconfig.c. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Tested-by: Julien Cristau <jcristau@debian.org> Reviewed-by: Brian Paul <brianp@vmware.com>
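The ordering matters because a kFreeBSD system has a BSD kernel but glibc as its C library. A minimal sketch of the check order follows; it uses compiler-predefined macros in place of Mesa's `PIPE_OS_BSD`, and `process_name` is a hypothetical helper, so treat it as an approximation of the idea rather than the patched code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* exposes program_invocation_short_name on glibc */
#endif
#include <assert.h>
#include <errno.h>   /* program_invocation_short_name (glibc) */
#include <stdlib.h>  /* getprogname() on the BSDs */

static const char *
process_name(void)
{
#if defined(__GLIBC__)
   /* Checked first: glibc-based systems may still run a BSD kernel. */
   return program_invocation_short_name;
#elif defined(__FreeBSD__) || defined(__NetBSD__) || \
      defined(__OpenBSD__) || defined(__APPLE__)
   return getprogname();
#else
   /* Assumption for this sketch: no portable way to query the name. */
   return "unknown";
#endif
}
```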
2013-09-12 | llvmpipe: Remove the special path for TGSI_OPCODE_EXP. | José Fonseca | 3 | -72/+30
It was wrong for EXP.y, as we clamped the source before computing the fractional part, and this opcode should be rarely used, so it's not worth the hassle.
2013-09-10 | util: Fix unmatched parenthesis. | Vinson Lee | 1 | -1/+1
Fixes MSVC build error introduced with commit 923d3467147dd301d94ed3e6b41295fb2bcd6f47. src\gallium\auxiliary\util\u_cpu_detect.c(286) : fatal error C1012: unmatched parenthesis : missing '(' Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-09-10 | util: don't use _fxsave() with MSVC 2010 or older | Brian Paul | 1 | -1/+4
Also update the _MSC_VER comments in p_config.h. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-06 | gallivm: support indirect registers on both dimensions | Zack Rusin | 3 | -8/+22
We support indirect addressing only on the vertex index, but some shaders also use indirect addressing on attributes. This patch adds support for indirect addressing on both dimensions inside gs arrays. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-08-31 | draw: fix segfaults with aaline and aapoint stages disabled | Marek Olšák | 1 | -2/+4
There are drivers not using these optional stages. Broken by a3ae5dc7dd5c2f8893f86a920247e690e550ebd4. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-30 | draw: fix PIPE_MAX_SAMPLER/PIPE_MAX_SHADER_SAMPLER_VIEWS issues | Roland Scheidegger | 2 | -6/+6
pstipple/aaline stages used PIPE_MAX_SAMPLER instead of PIPE_MAX_SHADER_SAMPLER_VIEWS when dealing with sampler views. Now these stages can't actually handle sampler_unit != texture_unit anyway (they cannot work with d3d10 shaders at all due to using tex not sample opcodes as "mixed mode" shaders are impossible) but this leads to crashes if a driver just installs these stages and then more than PIPE_MAX_SAMPLER views are set even if the stages aren't even used. Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-30 | gallivm: handle unbound textures in texture sampling / texture queries | Roland Scheidegger | 1 | -0/+26
Turns out we don't need to do much extra work for detecting this case, since we are guaranteed to get an empty static texture state in this case, hence just rely on format being 0 and return all zero then. Previously we needed dummy textures (would just have crashed on format being 0 otherwise) which cannot return the correct result for size queries and when sampling textures with wrap modes using border. As a bonus should hugely increase performance when sampling unbound textures - too bad it isn't a useful feature :-). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-30 | softpipe: handle NULL sampler views for texture sampling / queries | Roland Scheidegger | 1 | -0/+1
Instead of crashing just return all zero. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>
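The behavior described here, returning zeros when no view is bound, might look roughly like the following. `struct sampler_view` and `fetch_texel` are simplified, hypothetical stand-ins and do not mirror softpipe's actual sampling code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-texel "texture", only for illustration. */
struct sampler_view {
   float texel[4];
};

/* Writes the sampled color to rgba; an unbound (NULL) view yields zeros. */
static void
fetch_texel(const struct sampler_view *view, float rgba[4])
{
   assert(rgba != NULL);
   for (int i = 0; i < 4; i++)
      rgba[i] = view ? view->texel[i] : 0.0f;
}
```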
2013-08-30 | gallivm: (trivial) don't pass sampler_unit variable down to filtering funcs | Roland Scheidegger | 1 | -36/+21
The only reason this was needed was because the fetch texel function had to get the (dynamic) border color, but this is now done much earlier. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-30 | gallivm: don't use AoS path if min/mag filter are different with multiple lods | Roland Scheidegger | 1 | -1/+6
Instead of enhancing the AoS path so it can deal with it, just use SoA. Fixing AoS path wouldn't be all that difficult (use all the same logic as SoA) but considered not worth it for now. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-30 | gallivm: support per-pixel min/mag filter in SoA path | Roland Scheidegger | 1 | -43/+243
Since we can have per-pixel lod we should also honor the filter per-pixel (in fact we didn't honor it per quad either in the multiple quad case). Do this by running the linear path and simply beating the weights into shape (the sample with the higher weight is the one which should have been chosen with nearest filtering hence adjust filter weight to 1.0/0.0 based on that). If all pixels use nearest filter (either min or mag) then still run just a nearest filter as this is way cheaper (probably around 4 times faster for 2d, more for 3d case) and it should be relatively rare that pixels really need different filtering. OTOH if all pixels would require linear don't do anything special since the linear path with filter adjustments shouldn't really be all that much more expensive than ordinary linear, and we think it's rare that min/mag filters are configured differently so there doesn't seem much value in trying to optimize this further. This does not yet fix the AoS path (though currently AoS is only used for single quads hence it could be considered less broken, just never honoring per-pixel filter decision but doing it per quad). v2: simplify code a bit (unify min linear and min nearest cases) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
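The trick of running the linear path with adjusted weights can be shown in scalar form for a 1D filter. This is a sketch only: the real code operates on LLVM vector values, `filter_1d` is a hypothetical name, and sending an exact 0.5 tie to the first sample is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Blends two neighboring samples. weight is the linear weight of s1.
 * For a pixel that wants nearest filtering, the weight is snapped to
 * 0.0 or 1.0 so the same linear blend returns the closer sample.
 */
static float
filter_1d(float s0, float s1, float weight, bool nearest)
{
   assert(weight >= 0.0f && weight <= 1.0f);
   if (nearest)
      weight = weight > 0.5f ? 1.0f : 0.0f;
   return s0 + weight * (s1 - s0);
}
```

For example, `filter_1d(10.0f, 20.0f, 0.25f, false)` blends to 12.5, while the same call with `nearest` set returns 10.0.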
2013-08-30 | gallivm: don't calculate square root of rho if we use accurate rho method | Roland Scheidegger | 1 | -39/+74
While a sqrt here and there shouldn't hurt much (depending on the cpu) it is possible to completely omit it since rho is only used for calculating lod and there log2(x) == 0.5*log2(x^2). Depending on the exact path taken for calculating lod this means we get a simple mul instead of sqrt (in case of nearest mip filter in fact we don't need to replace the sqrt with something else at all), only in some not very useful path this doesn't work (combined brilinear calculation of int level and fractional lod, accurate rho calc but brilinear filtering seems odd). Apart from being faster as an added bonus this should increase our crappy fractional accuracy of lod, since fast_log2 is only good for ~3bits and this should increase accuracy by one bit (though not used if dimension is just one as we'd need an extra mul there as we never had the squared rho in the first place). v2: use separate ilog2_sqrt function if we have squared rho. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
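The identity log2(x) == 0.5*log2(x^2) for x > 0 means the level can be derived from the squared rho directly. An integer analogue is sketched below: the floor of log2 of a square root equals the floor of log2 of the input, halved. The helper names are hypothetical and this is not the gallivm `ilog2_sqrt` code, which works on float vectors.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(x)) for x > 0, by counting shifts. */
static unsigned
floor_log2_u32(uint32_t x)
{
   unsigned n = 0;
   assert(x > 0);
   while (x >>= 1)
      n++;
   return n;
}

/* floor(log2(sqrt(x_squared))) without ever taking the square root. */
static unsigned
floor_log2_of_sqrt_u32(uint32_t x_squared)
{
   return floor_log2_u32(x_squared) >> 1;
}
```

For a squared rho of 64 (rho = 8) this yields 3, the same level a sqrt followed by log2 would give.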
2013-08-30 | gallivm: refactor num_lods handling | Roland Scheidegger | 4 | -131/+169
This is just preparation for per-pixel (or per-quad in case of multiple quads) min/mag filter since some assumptions about number of miplevels being equal to number of lods no longer hold true. This change does not change behavior yet (though theoretically when forcing per-element path it might be slower with different min/mag filter since the code will respect this setting even when there's no mip maps now in this case, so some lod calcs will be done per-element just ultimately still the same filter used for all pixels). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-29 | draw: fix point/line/triangle determination in draw_need_pipeline() | Brian Paul | 1 | -25/+6
The previous point/line/triangle() functions didn't handle GS primitives. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-27 | draw: clean up setting stream out information a bit | Roland Scheidegger | 2 | -5/+0
In particular no one is interested in the vertex count, so drop that, and also drop the duplicated num_primitives_generated / so.primitives_storage_needed variables in drivers. I am unable for now to figure out if primitives_storage_needed in SO stats (used for d3d10) should increase if SO is disabled, though the equivalent num_primitives_generated used for OpenGL definitely should increase. In any case we were only counting when SO is active both in softpipe and llvmpipe anyway so don't pretend there's an independent num_primitives_generated counter which would count always. (This means the PIPE_QUERY_PRIMITIVES_GENERATED count will still be wrong just as before, should eventually fix this by doing either separate counting for this query or adjust the code so it always counts this even if SO is inactive depending on what's correct for d3d10.) Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-27 | tgsi_build: fix order of arguments for ind register build | Dave Airlie | 1 | -1/+1
This was broken when arrayid was added. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-08-27 | tgsi: finish declaration parsing for arrays. | Dave Airlie | 1 | -1/+31
I previously fixed this partly in 9e8400f4c95bde1f955c7977066583b507159a10, however I didn't go far enough in testing it, now when I parse a TGSI shader with arrays in it my iterator can see the ArrayID set to the proper value. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-08-23 | gallivm: fix min/mag switchover point for nearest/none mip filter | Roland Scheidegger | 5 | -66/+81
Previously, the min/mag switchover point when using nearest/none mip filter was effectively -0.5 which can't be right. Looks like new OpenGL thinks it's ok if it's always 0.0 (older versions required 0.5 in some cases), let's hope everybody else thinks that's fine too. Refactor this slightly and get the per-quad/per-pixel min/mag decision values further down to sampling, though still only the first component is used yet. While here also fix code trying to skip lod bias application etc. when mipfilter is none, as this is still needed for determining min/mag filter. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
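With the switchover fixed at 0.0, the per-pixel choice reduces to a comparison of the lod against zero. The sketch below assumes, following the OpenGL rule that a lod less than or equal to the constant selects magnification, that an lod of exactly 0.0 picks the mag filter; the enum and function names are hypothetical.

```c
#include <assert.h>

enum tex_filter { TEX_FILTER_NEAREST, TEX_FILTER_LINEAR };

/* Picks the mag filter for lod <= 0.0 and the min filter otherwise. */
static enum tex_filter
select_filter(float lod, enum tex_filter min_filter,
              enum tex_filter mag_filter)
{
   return lod <= 0.0f ? mag_filter : min_filter;
}
```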
2013-08-22 | gallivm: do per-element lod for lod bias and explicit derivs too | Roland Scheidegger | 2 | -31/+74
Except for explicit derivs with cube maps which are very bogus anyway. Just like explicit lod this is only used if no_quad_lod is set in GALLIVM_DEBUG env var. Minification is terrible on cpus which don't support true vector shifts (but should work correctly). Cannot do the min/mag filter decision (if they are different) per pixel though, only selecting different mip levels works. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-22 | gallivm: (trivial) fix int/uint border color clamping | Roland Scheidegger | 1 | -2/+2
Just a copy & paste error. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=68409. Note that the test passing before probably simply means it doesn't verify clamping of the border color itself as required by the OpenGL spec. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-22 | gallivm: (trivial) fix linear aos sampling of 3d compressed formats | Roland Scheidegger | 1 | -2/+2
block size depth is always 1 even for compressed formats (unless someone invents true 3d compressed formats at least which we can't represent). Nearest (and soa) path had it right. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-22 | gallium: Support PIPE_FORMAT_R10G10B10A2_UINT. | José Fonseca | 2 | -0/+2
Same as PIPE_FORMAT_B10G10R10A2_UINT but without the swizzling. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-21 | gallivm: unify sin and cos implementation | Roland Scheidegger | 2 | -255/+53
The (complicated!) math is all identical, there's just minimal differences how sign bit is calculated plus there's an additional subtraction for the argument going into the polynomial for cos. The logic stays 100% the same (with a small exception, sign bit calculation for sin is minimally simplified, applying sign mask after xoring the arguments instead of applying it to each argument). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-21 | gallivm: add comment for bogus min/mag filter selection with nearest mip filter | Roland Scheidegger | 3 | -2/+10
Detected this hunting some other bug, not sure if it really needs fixing but it is definitely wrong. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-21 | gallivm: fix rho calculation for 1d case | Roland Scheidegger | 1 | -1/+1
Was using wrong (undefined) vector element (the elements are at 0/2 position, not 0/1). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-20 | util: add avx2 and xop detection to cpu detection code | Roland Scheidegger | 3 | -2/+59
Going to need this soon (not going to bother with avx2 intrinsics at this time but don't want to do workarounds for true vector shifts if llvm itself can use them just fine and won't need the gazillion instruction emulation). Not really tested other than my cpu returns 0 for these features... (I have no idea if llvm actually would emit avx2/xop instructions either...) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
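For reference, the two feature bits can be read with the GCC/Clang `<cpuid.h>` helpers: AVX2 is reported in CPUID leaf 7 (subleaf 0), EBX bit 5, and XOP in extended leaf 0x80000001, ECX bit 11. The sketch below is not the u_cpu_detect.c code; `struct cpu_caps` and `detect_cpu_caps` are hypothetical names, and a complete check would also confirm that the OS saves the YMM register state, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

struct cpu_caps {
   bool has_avx2;
   bool has_xop;
};

/* Reads raw CPUID feature bits; reports false on non-x86 builds. */
static struct cpu_caps
detect_cpu_caps(void)
{
   struct cpu_caps caps = { false, false };
#if defined(__x86_64__) || defined(__i386__)
   unsigned eax, ebx, ecx, edx;

   /* Leaf 7, subleaf 0: EBX bit 5 reports AVX2. */
   if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
      caps.has_avx2 = (ebx >> 5) & 1;

   /* Extended leaf 0x80000001: ECX bit 11 reports XOP. */
   if (__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
      caps.has_xop = (ecx >> 11) & 1;
#endif
   return caps;
}
```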
2013-08-20 | gallivm: fix bogus aos path detection | Roland Scheidegger | 1 | -5/+11
Need to check the wrap mode of the actually used coords not a fixed 2. While checking more than necessary would only potentially disable aos and not cause any harm I'm pretty sure for 3d textures it could have caused assertion failures (if s,t coords have simple filter and r not). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-20 | gallivm: do clamping of border color correctly for all formats | Roland Scheidegger | 2 | -46/+256
Turns out it is actually very complicated to figure out what a format really is wrt range, as using channel information for determining unorm/snorm etc. doesn't work for a bunch of cases - namely compressed, subsampled, other. Also while here add clamping for uint/sint as well - d3d10 doesn't actually need this (can only use ld with these formats hence no border) and we could do this outside the shader for GL easily (due to the fixed texture/sampler relation), but do it here too just so I can forget about it. v2: move border color clamping out of fetch texel. Also change it to clamp the whole border vector at once (and use vectorized load of border color), which saves a couple of instructions - needs some different handling of mixed signed/unsigned formats so skip the per channel stuff and just derive this from first channel except for special formats. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
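For the simple cases, the clamp range follows from the channel type: unorm formats hold [0, 1] and snorm formats hold [-1, 1]. The scalar sketch below covers only those two plus an unclamped float case; the enum and function are hypothetical, and the compressed, subsampled and integer cases the message describes are deliberately left out.

```c
#include <assert.h>

enum chan_kind { CHAN_UNORM, CHAN_SNORM, CHAN_FLOAT };

static float
clampf(float v, float lo, float hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamps one border color channel to the range its format can represent. */
static float
clamp_border_channel(float v, enum chan_kind kind)
{
   switch (kind) {
   case CHAN_UNORM:
      return clampf(v, 0.0f, 1.0f);
   case CHAN_SNORM:
      return clampf(v, -1.0f, 1.0f);
   default:
      /* Assumption for this sketch: float channels are left unclamped. */
      return v;
   }
}
```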
2013-08-20 | gallivm: implement better control of per-quad/per-element/scalar lod | Roland Scheidegger | 7 | -51/+145
There's a new debug value used to disable per-quad lod optimizations in fragment shader (ignored for vs/gs as the results are just too wrong typically). Also trying to detect if a supplied lod value is really a scalar (if it's coming from immediate or constant file) in which case sampler code can use this to stay on per-quad-lod path (in fact for explicit lod could simplify even further and use same lod for both quads in the avx case but this is not implemented yet). Still need to actually implement per-element lod bias (and derivatives), and need to handle per-element lod in size queries. v2: fix comments, prettify. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-20 | build: fix out-of-tree builds in gallium/auxiliary | Ross Burton | 1 | -0/+4
The rules were writing files to e.g. util/u_indices_gen.py, but in an out-of-tree build this directory doesn't exist in the build directory. So, create the directories just in case. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ross Burton <ross.burton@intel.com>
2013-08-19 | vl/buffers: consistent use on VL_MAX_SURFACES | Emil Velikov | 1 | -3/+3
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 | vl/idct: cleanup all idct buffers | Emil Velikov | 1 | -1/+1
Code should loop through and cleanup the three (VL_NUM_COMPONENTS) idct buffers, rather than doing the first one three times. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 | vl/buffer: add sanity check after CALLOC_STRUCT | Emil Velikov | 1 | -0/+2
Check if we have successfully allocated memory. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
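The pattern is the usual allocate-then-check sequence. In the sketch below plain `calloc` stands in for Mesa's `CALLOC_STRUCT` macro, and `struct video_buffer` and `video_buffer_create` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer type, only for illustration. */
struct video_buffer {
   unsigned width;
   unsigned height;
};

/* Returns a zero-initialized buffer, or NULL if the allocation failed. */
static struct video_buffer *
video_buffer_create(unsigned width, unsigned height)
{
   struct video_buffer *buf = calloc(1, sizeof(*buf));

   if (buf == NULL)
      return NULL;  /* bail out before touching any field */

   buf->width = width;
   buf->height = height;
   return buf;
}
```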
2013-08-19 | vdpau/vl 422 chroma width/height mix up | Andy Furniss | 2 | -3/+3
I was looking into some minor 422 issues/discrepancies I noticed long ago using vdpau on my rv790. I noticed that there is code that is halving height rather than width - 422 is full height AFAIK. Making the changes below doesn't actually make any noticeable difference to what I was looking into. Maybe there are more but here's three I've found so far. Reviewed-by: Christian König <christian.koenig@amd.com>
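The subsampling rule at issue: 4:2:0 halves the chroma planes in both directions, 4:2:2 halves only the width, and 4:4:4 keeps full resolution. A small sketch follows, with hypothetical enum and function names and even luma dimensions assumed.

```c
#include <assert.h>
#include <stddef.h>

enum chroma_format { CHROMA_420, CHROMA_422, CHROMA_444 };

/* Computes chroma plane dimensions from the luma plane dimensions. */
static void
chroma_plane_size(enum chroma_format fmt,
                  unsigned luma_w, unsigned luma_h,
                  unsigned *chroma_w, unsigned *chroma_h)
{
   assert(chroma_w != NULL && chroma_h != NULL);
   *chroma_w = luma_w;
   *chroma_h = luma_h;

   if (fmt == CHROMA_420 || fmt == CHROMA_422)
      *chroma_w /= 2;  /* both formats halve the width */
   if (fmt == CHROMA_420)
      *chroma_h /= 2;  /* only 4:2:0 halves the height */
}
```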
2013-08-19 | vl: add entrypoint to is_video_format_supported | Christian König | 2 | -2/+4
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 | vl: add entrypoint to get_video_param | Christian König | 3 | -2/+6
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 | vl: rename pipe_video_decoder to pipe_video_codec | Christian König | 9 | -32/+32
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 | vl: rename enum pipe_video_codec to pipe_video_format | Christian König | 4 | -11/+11
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 | vl: use a template for create_video_decoder | Christian König | 4 | -41/+19
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-15 | draw: handle nan clipdistance | Zack Rusin | 5 | -4/+48
If clipdistance for one of the vertices is nan (or inf) then the entire primitive should be discarded. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
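The discard test itself is a scan over the primitive's per-vertex clip distances. The sketch below uses the C99 `isnan`/`isinf` macros; `prim_has_bad_clipdist` is a hypothetical helper, and the draw module's real check is integrated into its clipping code rather than a standalone function like this.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if any vertex's clip distance is NaN or infinite,
 * in which case the whole primitive should be discarded. */
static bool
prim_has_bad_clipdist(const float *clipdist, size_t num_verts)
{
   assert(clipdist != NULL || num_verts == 0);
   for (size_t i = 0; i < num_verts; i++)
      if (isnan(clipdist[i]) || isinf(clipdist[i]))
         return true;
   return false;
}
```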