mesa/mesa - The Mesa 3D Graphics Library (mirrored from https://gitlab.freedesktop.org/mesa/mesa)

Age	Commit message	Author	Files	Lines
2019-01-01	nv30: disable rendering to 3D textures	Ilia Mirkin	1	-0/+6
	There's no way to tell the 3D engine about swizzling on such textures. While rendering to NPOT ones may be possible, there's no great way to expose that in gallium, nor would there be any practical benefit. Fixes the non-compressed-format "copyteximage 3D" failures. Something odd is going on with the compressed formats. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30	nv30: fix some s3tc layout issues	Ilia Mirkin	2	-7/+26
	s3tc layouts are a bit finicky - they're packed, but not swizzled. Adjust logic to allow for that case:
	- Don't set a uniform pitch for POT-sized compressed textures
	- Adjust define_rect API to be less confused about block sizes
	- Only mark a texture as linear if it has a uniform pitch set
	This has been tested to fix xonotic (as well as the s3tc-* piglits) on nv3x and keeps it working on nv4x. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30	nv30: use correct helper to get blocks in y direction	Ilia Mirkin	1	-1/+1
	This doesn't matter since all compressed formats supported by this hardware use square blocks, but best to use the correct helper. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
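For readers unfamiliar with block-compressed layouts, here is a minimal sketch of why separate x and y helpers exist. The `demo_*` names are hypothetical and are not Mesa's actual helpers, and the non-square block in the usage note is an invented case, since the commit says this hardware only supports square blocks.

```c
#include <stdint.h>

/* Hypothetical block description, not Mesa's util_format_description. */
struct demo_block {
   uint32_t width;  /* texels per block in x */
   uint32_t height; /* texels per block in y */
};

/* Blocks needed to cover `texels` texels, rounding up. */
static uint32_t demo_div_round_up(uint32_t texels, uint32_t block_size)
{
   return (texels + block_size - 1) / block_size;
}

/* Block count along x uses the block width. */
static uint32_t demo_nblocks_x(struct demo_block b, uint32_t width)
{
   return demo_div_round_up(width, b.width);
}

/* Block count along y uses the block height. */
static uint32_t demo_nblocks_y(struct demo_block b, uint32_t height)
{
   return demo_div_round_up(height, b.height);
}
```

With a square 4x4 block both helpers agree for equal extents, which is why the mix-up was harmless here. With a hypothetical 8x4 block they would diverge.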
2018-12-30	nv30: add support for multi-layer transfers	Ilia Mirkin	1	-4/+35
	This logic mirrors what we do on nv50. The relatively new texture_subdata callback can cause this to happen with 3D textures, which is triggered at least by xonotic, and probably many piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30	nv30: fix rare issue with fp unbinding not finding the bufctx	Ilia Mirkin	1	-1/+1
	If the last-active context gets deleted, the pushbuf doesn't have a bufctx to reference. Then there could be a sequence of binds which would trigger a reset on that bin before validation was done. Instead we just pass in the bufctx in question directly. All other instances of PUSH_RESET happen strictly after a validation is run. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-30	nv30: avoid setting user_priv without setting cur_ctx	Ilia Mirkin	1	-3/+1
	The whole user_priv thing is a mess, but as long as it's there, it basically has to map 1:1 to the cur_ctx. Unfortunately we were setting user_priv to some context, then that context could get deleted without any draws/validations in it, leading user_priv to become NULL, with cur_ctx still pointing at some old context. Then we wouldn't run the switch logic, which in turn led to a NULL bufctx being dereferenced. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-26	nv50,nvc0: add missing CAPs for unsupported features	Ilia Mirkin	2	-0/+3
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-26	nvc0: enable GL_NV_shader_atomic_float on pre-Maxwell	Ilia Mirkin	1	-0/+2
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-26	nv50/ir: add support for converting ATOMFADD to proper ir	Ilia Mirkin	1	-0/+4
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-14	nvc0: always keep TSC slot 0 bound to fix TXF	Ilia Mirkin	2	-0/+21
	Same as on nv50, the TXF op always uses the TSC bound to slot 0, returning blank values if nothing is bound. An earlier change arranges for the TSC entries list to always have valid data at entry 0, so here we just make use of it. Fixes arb_texture_buffer_object-subdata-sync among others. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-14	nvc0: replace use of explicit default_tsc with entry 0	Ilia Mirkin	6	-22/+25
	This was used for implementing FBFETCH. However that uses TXF, which doesn't do much with a TSC. The only important bit is that sRGB-decoding works as expected, which we can achieve since all samplers we ever generate enable sRGB-decoding. Always point to entry 0 in the TSC table, and ensure that even before it ever gets initialized, the sRGB-decoding enable bit is set. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-09	nv50/ir: fix use-after-free in ConstantFolding::visit	Karol Herbst	1	-33/+49
	opnd() might delete the passed-in instruction, but it's used through i->srcExists() later in visit. v2: use continue instead of return. v3: use brackets for the outer if/else chain. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-09	nouveau: use atomic operations for driver statistics	Karol Herbst	1	-3/+4
	multiple threads can write to those at the same time Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
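A minimal sketch of the idea, assuming C11 `<stdatomic.h>` rather than whatever atomic helpers the driver actually uses. The `demo_*` names and the counter field are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics block; the field name is illustrative only. */
struct demo_stats {
   atomic_uint_fast64_t tex_upload_count;
};

/* Add to a counter without losing updates from concurrent writers.
 * Relaxed ordering is enough for a pure statistic that orders nothing else. */
static void demo_stat_add(atomic_uint_fast64_t *counter, uint64_t amount)
{
   atomic_fetch_add_explicit(counter, amount, memory_order_relaxed);
}

/* Read the current value of a counter. */
static uint64_t demo_stat_read(atomic_uint_fast64_t *counter)
{
   return atomic_load_explicit(counter, memory_order_relaxed);
}
```

A plain `counter += amount` is a read-modify-write sequence, so two threads can both read the old value and one increment is lost. The atomic add makes that sequence indivisible.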
2018-12-09	nv50/ir: initialize relDegree staticly	Karol Herbst	1	-7/+16
	this race condition is pretty harmless, but also pretty trivial to fix Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-03	nouveau: set texture upload budget	Ilia Mirkin	3	-3/+6
	It doesn't seem like the exact number has too much effect on the performance in "teximage". However setting it to just about anything prevents some OOMs from getting hit. These values are not well-tuned, but don't seem too bad. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-03	nv50,nvc0: add explicit handling of PIPE_CAP_MAX_VERTEX_ELEMENT_SRC_OFFSET	Ilia Mirkin	2	-0/+4
	Since the max attrib stride is 2048, the max src offset makes sense as 2047. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-03	nv50: always keep TSC slot 0 bound	Ilia Mirkin	3	-0/+31
	All TXF operations implicitly use sampler 0, and fail if it's not bound to anything. This does not happen in LINKED_TSC mode, but we don't currently use this. We ensure that TSC entry at id 0 has the SRGB conversion bit enabled (and all samplers we normally generate will too). Then when the TSC at slot 0 (not to be confused with entry 0 in the global TSC table) is unbound, we bind it to entry 0. This way, TXF operations are not dependent on there being a regular sampler bound there. Fixes arb_texture_buffer_object-subdata-sync among others. (TBO's are particularly susceptible to this as they don't bind a sampler.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-02	nv50,nvc0: Fix gallium nine regression regarding sampler bindings	Karol Herbst	2	-16/+12
	The new approach is that samplers don't get unbound even if they won't be used in a draw and we should just leave them be as well. Fixes a regression in multiple windows games using gallium nine and nouveau. v2: adjust num_samplers to keep track of the highest sampler bound v3: rework how to set the new value of num_samplers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106577 Fixes: 4d6fab245eec3880e2a59424a579851f44857ce8 "cso: don't track the number of sampler states bound" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
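The bookkeeping described in v2 might look roughly like the sketch below. This is an assumption-laden illustration, not the driver's code: `demo_ctx`, `demo_bind_samplers` and `DEMO_MAX_SAMPLERS` are hypothetical names.

```c
#include <stddef.h>

#define DEMO_MAX_SAMPLERS 16

/* Hypothetical context holding bound sampler states. */
struct demo_ctx {
   const void *samplers[DEMO_MAX_SAMPLERS];
   unsigned num_samplers; /* highest bound slot + 1 */
};

/* Bind `count` samplers starting at `start`. Slots outside that range are
 * left untouched, so earlier bindings survive even if unused by a draw. */
static void demo_bind_samplers(struct demo_ctx *ctx, unsigned start,
                               unsigned count, const void *const *states)
{
   unsigned i, highest = 0;

   for (i = 0; i < count && start + i < DEMO_MAX_SAMPLERS; i++)
      ctx->samplers[start + i] = states ? states[i] : NULL;

   /* Recompute the high-water mark from what is actually bound. */
   for (i = 0; i < DEMO_MAX_SAMPLERS; i++)
      if (ctx->samplers[i])
         highest = i + 1;
   ctx->num_samplers = highest;
}
```

Binding one sampler at slot 0 after two were bound at slots 2 and 3 leaves `num_samplers` at 4, since the later slots stay bound.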
2018-11-24	nv50/ir: remove dnz flag when converting MAD to ADD due to optimizations	Ilia Mirkin	1	-0/+3
	dnz flag only applies for multiplications (e.g. to make 0 * Infinity become 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz flag no longer makes sense, and upsets the GM107 emitter (since it looks at the ftz and dnz flags together). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>
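A small model of the flag's meaning, as described in the message above. The `demo_*` types are hypothetical stand-ins for the IR, and `demo_mul` only models the "zero times anything is zero" behaviour on the host, not the hardware.

```c
#include <math.h>
#include <stdbool.h>

enum demo_op { DEMO_OP_MAD, DEMO_OP_ADD };

/* Hypothetical instruction record carrying only what the sketch needs. */
struct demo_insn {
   enum demo_op op;
   bool dnz; /* multiplication-only modifier */
};

/* Multiply with optional dnz semantics: with dnz set, a zero operand
 * forces a zero result, so 0 * Infinity yields 0 rather than NaN. */
static float demo_mul(float a, float b, bool dnz)
{
   if (dnz && (a == 0.0f || b == 0.0f))
      return 0.0f;
   return a * b;
}

/* When a MAD is folded into an ADD there is no multiply left, so the
 * flag is cleared along with the opcode change. */
static void demo_mad_to_add(struct demo_insn *insn)
{
   insn->op = DEMO_OP_ADD;
   insn->dnz = false;
}
```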
2018-11-16	nv50/ir/ra: enforce max register requirement, and change spill order	Ilia Mirkin	4	-16/+26
	On nv50, certain operations must happen on regs below 64, due to encoding requirements. First of all, we add infrastructure to enforce this. Secondly we change the spill order to first spill RIG nodes that are unconstrained, followed by ones that are. This makes the gamecube logo shadertoy compile properly. Curiously, if we adjust the spill order so that we first spill the constrained RIG nodes instead, the RA also succeeds. However it seems more logical to first spill the unconstrained ones. While we're at it, drop the nv50 max register to reserve r127 as the zero register of last resort (r63 is preferred). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Karol Herbst <kherbst@redhat.com>
2018-11-16	nv50/ir/ra: improve condition for short regs, unify with cond for 16-bit	Ilia Mirkin	1	-7/+7
	Instead of the size restriction existing in two places, and potentially being applied twice, we move this together. Ops with 16-bit register addresses can only take a short reg, and ops with immediates can only take a short reg. Of course we leave the immediate 0 in place since we know that it will be replaced by r63/r127 down the line, so don't treat zeroes as an immediate. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-16	nv50/ir: delete MINMAX instruction that is no longer in the BB	Ilia Mirkin	1	-1/+1
	We removed the op from the BB, but it was still listed in its sources' uses. This could trip up some logic down the line which analyzes all the uses of an l-value, e.g. spilling. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-11-07	gm107/ir: fix compile time warning in getTEXSMask	Karol Herbst	1	-0/+1
	In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)': warning: control reaches end of non-void function [-Wreturn-type] Reported-by: Moiman@freenode Fixes: f821e80213e38e93f96255b3deacb737a600ed40 "gm107/ir: use scalar tex instructions where possible" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06	gm107/ir: use scalar tex instructions where possible	Karol Herbst	2	-3/+317
	TEXS, TLD4 and TLD4S are variants of tex instructions which are more scalar, which gives RA more freedom and is less likely to insert silly MOVs to satisfy quad registers.
	shader-db changes:
	total instructions in shared programs : 7687265 -> 7614782 (-0.94%)
	total gprs used in shared programs    : 803620 -> 798045 (-0.69%)
	total shared used in shared programs  : 639636 -> 639636 (0.00%)
	total local used in shared programs   : 24648 -> 24648 (0.00%)
	total bytes used in shared programs   : 82103400 -> 81330696 (-0.94%)
	          local  shared  gpr   inst   bytes
	helped    0      0       3648  10647  10647
	hurt      0      0       464   205    205
	Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06	nv50/ir: add scalar field to TexInstructions	Karol Herbst	2	-1/+6
	Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06	nv50/ra: add condenseDef overloads for partial condenses	Karol Herbst	1	-8/+21
	Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-11-06	nv50/ir: print color masks of tex instructions	Karol Herbst	1	-4/+33
	v2: print the mask for TXG as well; make the printed mask look more like a mask. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-30	nouveau: remove unused class member	Eric Engestrom	1	-1/+0
	Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-26	util: Change remaining uint32 cache ids to sha1	David McFarland	1	-14/+15
	After discussion with Timothy Arceri. disk_cache_get_function_identifier was using only the first byte of the sha1 build-id. Replace disk_cache_get_function_identifier with implementation from radv_get_build_id. Instead of writing a uint32_t it now writes to a mesa_sha1. All drivers using disk_cache_get_function_identifier are updated accordingly. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Fixes: 83ea8dd99bb1 ("util: add disk_cache_get_function_identifier()")
2018-10-25	nvc0: increase NOUVEAU_TRANSFER_PUSHBUF_THRESHOLD to 1024 on Kepler+	Rhys Perry	4	-3/+11
	Gives a +3.89% to +5.27% FPS improvement with Hitman and +2.73% to +2.82% FPS improvement with Dirt Rally on my GTX 1060. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-20	nv50/ir: fix ConstantFolding::createMul for 64 bit muls	Karol Herbst	1	-1/+1
	Fixes: 2f52925f5c60c72c9389bfdc122c3d5f8e15b25f "nv50/ir: move a * b -> a << log2(b) code into createMul()" Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>
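The commit message does not say what went wrong for 64-bit operands, so the following only illustrates the transformation named in the Fixes line (a * b -> a << log2(b)) carried out at 64-bit width. All `demo_*` helpers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if exactly one bit is set. */
static bool demo_is_pow2_u64(uint64_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Index of the highest set bit; the caller guarantees v != 0. */
static unsigned demo_log2_u64(uint64_t v)
{
   unsigned n = 0;
   while (v >>= 1)
      n++;
   return n;
}

/* Multiply by a known constant, using a shift when it is a power of two.
 * Both the test and the log2 are done on the full 64-bit value. */
static uint64_t demo_mul_by_const(uint64_t a, uint64_t b)
{
   if (demo_is_pow2_u64(b))
      return a << demo_log2_u64(b);
   return a * b;
}
```

The point of doing every step at 64 bits is that a constant such as 1 << 40 has no set bits in its low 32-bit half, so any helper that only looks at 32 bits would misjudge it.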
2018-10-09	nvc0: fix blitting red to srgb8_alpha	Ilia Mirkin	1	-0/+4
	For some reason the 2d engine can't handle this. Red formats get special treatment there, so perhaps related. Fixes dEQP-GLES3 tests of the form: dEQP-GLES3.functional.fbo.blit.conversion.r{8,16f,32f}_to_srgb8_alpha8 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org
2018-10-09	nv50,nvc0: guard against zero-size blits	Ilia Mirkin	2	-0/+14
	The current state tracker can generate these sometimes. Fixing this is more involved, and due to some integer math we can generate divisions-by-zero. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org
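A minimal sketch of such a guard, assuming the scale factors are computed by integer division as the message suggests. `demo_box` and `demo_blit_scale` are hypothetical and the 16.16 fixed-point format is chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical blit rectangle. */
struct demo_box {
   int x, y, width, height;
};

/* Compute 16.16 fixed-point scale factors for a blit.
 * Returns false, leaving the outputs untouched, when either rectangle is
 * empty, which would otherwise lead to a division by zero below. */
static bool demo_blit_scale(const struct demo_box *src,
                            const struct demo_box *dst,
                            int64_t *dudx, int64_t *dvdy)
{
   if (!src->width || !src->height || !dst->width || !dst->height)
      return false;

   *dudx = (int64_t)src->width * 65536 / dst->width;
   *dvdy = (int64_t)src->height * 65536 / dst->height;
   return true;
}
```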
2018-10-09	nv50,nvc0: mark RGBX_UINT formats as renderable	Ilia Mirkin	1	-4/+4
	This helps st/mesa avoid some (apparently) buggy fallbacks. Specifically the CopyTexSubImage fallback tries to read texture A as RGBA_FLOAT and write back that data into the target format, which fails for integer formats which have no appropriate logic to do the conversion. Since integer formats don't blend, there's no harm in the fact that the "A" component gets written anyways. Fixes, among others: https://www.khronos.org/registry/webgl/sdk/tests/conformance2/textures/canvas/tex-2d-rgb8ui-rgb_integer-unsigned_byte.html Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org
2018-10-03	nouveau: use build-id when available for disk cache	Timothy Arceri	1	-7/+7
	Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-23	nv50/ir: fix link-time build failure	Rhys Perry	1	-1/+1
	Seems this fixes linking problems that occur in some situations. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-22	nvc0: fix bindless multisampled images on Maxwell+	Rhys Perry	3	-5/+45
	NVC0_CB_AUX_BINDLESS_INFO isn't written to on Maxwell+ and it's too small anyway. With these changes, TXQ is used to determine the number of samples and the coordinate adjustment information looked up in a small array in the driver constant buffer. v2: rework to use TXQ and a small array instead of a larger array with an entry for each texture v3: get rid of the small array and calculate the adjustments in the shader Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: c2ae9b40527 ('nvc0: implement multisampled images on Maxwell+') Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-22	nvc0: warn about changing NVC0_CB_AUX_MP_INFO and NVC0_CB_AUX_DRAW_INFO	Rhys Perry	1	-2/+6
	Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-22	nvc0: Update counter reading shaders to new NVC0_CB_AUX_MP_INFO	Rhys Perry	1	-18/+18
	Fixes: 66ca7e400b8 ('nvc0: add support for programmable sample locations') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-13	nvir: Always split 64-bit IMAD/IMUL operations	Pierre Moreau	1	-1/+1
	Those operations do not map to actual hardware instructions, therefore those should always be lowered to 32-bit instructions. Fixes: 009c54aa7af "nv50/ir: Split 64-bit integer MAD/MUL operations" Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>
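As an illustration of what lowering a 64-bit multiply to 32-bit pieces means arithmetically, here is a host-side sketch. It is not the compiler's lowering pass; `demo_mul_wide` stands in for a 32x32 multiply that yields both halves of the product, and all names are hypothetical.

```c
#include <stdint.h>

/* Models a 32x32 -> 64 multiply (low and high halves of the product). */
static uint64_t demo_mul_wide(uint32_t a, uint32_t b)
{
   return (uint64_t)a * b;
}

/* 64-bit multiply built only from 32-bit operands:
 *   a * b mod 2^64 = a_lo*b_lo + ((a_lo*b_hi + a_hi*b_lo) mod 2^32) * 2^32
 * The a_hi*b_hi term is shifted out entirely and never computed. */
static uint64_t demo_mul64_split(uint64_t a, uint64_t b)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

   uint64_t lo_lo = demo_mul_wide(a_lo, b_lo);
   uint32_t cross = a_lo * b_hi + a_hi * b_lo; /* wraps mod 2^32 on purpose */

   uint32_t res_lo = (uint32_t)lo_lo;
   uint32_t res_hi = (uint32_t)(lo_lo >> 32) + cross;
   return ((uint64_t)res_hi << 32) | res_lo;
}
```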
2018-09-11	nv50,nvc0: warn on not-explicitly-handled caps	Ilia Mirkin	2	-14/+26
	Not handling caps explicitly means that we're likely getting incorrect values -- these need to be reviewed and set appropriately. While we're at it, add in some missing caps, and set all the subpixel stuff to 8 as that seems to be what the blob reports. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-09-07	gallium: add PIPE_CAP_MAX_TEXTURE_UPLOAD_MEMORY_BUDGET	Marek Olšák	3	-0/+3

2018-09-06	gallium: enable GL_AMD_depth_clamp_separate on r600, radeonsi	Marek Olšák	3	-0/+3

2018-09-06	gallium: split depth_clip into depth_clip_near & depth_clip_far	Marek Olšák	3	-3/+3
	for AMD_depth_clamp_separate.
2018-09-04	gallium: Add a helper for implementing PIPE_CAP_* default values.	Eric Anholt	3	-9/+9
	One of the pains of implementing a gallium driver is filling in a million pipe caps you don't know about yet when you're just starting out. One of the pains of working on gallium is copy-and-pasting your new PIPE_CAP into each driver. We can fix both of these by having each driver call into the default helper from their default case, so that both sides can ignore each other until they need to. v2: fix i915g build, revert swr change to avoid breaking scons build (https://travis-ci.org/anholt/mesa/jobs/419739857) v3: Rebase on 3 new gallium caps. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Cc: Bruce Cherniak <bruce.cherniak@intel.com> Cc: George Kyriazis <george.kyriazis@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org>
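The pattern described above, reduced to a sketch. The enum values and function names are hypothetical and do not reproduce the real gallium helper or its cap list; only the shape (driver switch falling through to a shared default helper) is taken from the message.

```c
/* Hypothetical capability enum; real drivers switch on PIPE_CAP_* values. */
enum demo_cap {
   DEMO_CAP_NPOT_TEXTURES,
   DEMO_CAP_MAX_VIEWPORTS,
   DEMO_CAP_SOMETHING_NEW, /* added later, unknown to the driver */
};

/* Shared helper: one place that knows a sane default for every cap. */
static int demo_cap_defaults(enum demo_cap cap)
{
   switch (cap) {
   case DEMO_CAP_MAX_VIEWPORTS:
      return 1;
   default:
      return 0;
   }
}

/* Driver callback: handles what it knows, defers everything else. */
static int demo_driver_get_param(enum demo_cap cap)
{
   switch (cap) {
   case DEMO_CAP_NPOT_TEXTURES:
      return 1;
   default:
      return demo_cap_defaults(cap);
   }
}
```

Adding a new cap then means touching the enum and the default helper only. Drivers that care add their own case later.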
2018-08-29	nv50: bump compat glsl level to same as core	Ilia Mirkin	1	-1/+1
	Passes the compat piglits. I'm sure that there will be odd issues that aren't caught by them, but at least it should basically work. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-08-29	nvc0: bump compat GLSL version to match core	Ilia Mirkin	1	-1/+1
	This passes the handful of tests in piglit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-08-29	nv50/ir: silence partitionLoadStore() unused function warning	Rhys Kidd	1	-2/+2
	Move this now-unused function into the existing comment block, which was its only prior use. ../../../../../src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp:2645:1: warning: unused function 'partitionLoadStore' [-Wunused-function] partitionLoadStore(uint8_t comp[2], uint8_t size[2], uint8_t mask) Fixes: 86e4440361 ("nouveau: codegen: Disable more old resource handling code") Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-08-27	nv50/ir,nvc0: use constant buffers for compute when possible on Kepler+	Rhys Perry	2	-10/+36
	Gives a +7.79% increase in FPS with Hitman on lowest quality settings on my GTX 1060.
	total instructions in shared programs : 5787979 -> 5748677 (-0.68%)
	total gprs used in shared programs    : 669901 -> 669373 (-0.08%)
	total shared used in shared programs  : 548832 -> 548832 (0.00%)
	total local used in shared programs   : 21068 -> 21064 (-0.02%)
	          local  shared  gpr  inst  bytes
	helped    1      0       152  274   274
	hurt      0      0       0    0     0
	Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27	nv50/ir: optimize multiplication by 16-bit immediates into two xmads	Rhys Perry	1	-0/+10
	Rather than the usual three that would be created.
	total instructions in shared programs : 5796385 -> 5786560 (-0.17%)
	total gprs used in shared programs    : 670103 -> 669968 (-0.02%)
	total shared used in shared programs  : 548832 -> 548832 (0.00%)
	total local used in shared programs   : 21164 -> 21068 (-0.45%)
	          local  shared  gpr  inst  bytes
	helped    1      0       64   1040  1040
	hurt      0      0       27   0     0
	Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
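The arithmetic behind needing only two multiply-adds can be sketched as follows. `demo_xmad` is a deliberately simplified, hypothetical model of a 16x16+32 multiply-add with an optional 16-bit left shift of the product. It is an assumption for illustration and not a description of the real instruction's modes or of the code the compiler emits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model: 16-bit x 16-bit product, optionally shifted left by 16,
 * plus a 32-bit addend, all modulo 2^32. */
static uint32_t demo_xmad(uint16_t a, uint16_t b, uint32_t c, bool shift16)
{
   uint32_t product = (uint32_t)a * b;
   if (shift16)
      product <<= 16;
   return product + c;
}

/* 32-bit multiply by a 16-bit immediate in two steps:
 *   a * imm = a_lo * imm + ((a_hi * imm) << 16)   (mod 2^32)
 * A general 32x32 multiply also needs the a_lo * b_hi term, which
 * vanishes here because the immediate has no high half. */
static uint32_t demo_mul_imm16(uint32_t a, uint16_t imm)
{
   uint32_t low = demo_xmad((uint16_t)(a & 0xffff), imm, 0, false);
   return demo_xmad((uint16_t)(a >> 16), imm, low, true);
}
```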