Age | Commit message | Author | Files | Lines
2013-07-15i965: Refer people to brw_tex_layout.c rather than the BSpec.Kenneth Graunke1-2/+2
brw_tex_layout.c sets up the align_w/h fields, and has all the appropriate spec references already. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Remove old BSpec reference from BLORP's 3DSTATE_WM/PS packets.Kenneth Graunke2-5/+5
The Sandybridge code had a citation for the range of the "Maximum Number of Threads" field, and the Ivybridge code just mentioned the "BSpec" in general. That's documented in the obvious place, so people can find it without a spec reference. The real value of the comment is to say "we tried zero, and it exploded, so program it to a valid number even if pixel shading is off." Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Cite the Ivybridge PRM for 3DSTATE_URB_* programming.Kenneth Graunke1-2/+3
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Update workaround flush comments for Gen6 3DSTATE_VS.Kenneth Graunke2-2/+6
Unfortunately, the workaround text never made it into the Sandybridge PRM, so we still have to refer to the BSpec. It also wasn't obvious why we needed this workaround at all, since we don't currently do VS passthrough - but BLORP can turn off the VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Cite the Ivybridge PRM for VS PIPE_CONTROL workarounds.Kenneth Graunke1-2/+2
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Cite the Sandybridge PRM for Gen7 stencil pitch requirements.Kenneth Graunke1-9/+5
Sadly, the Ivybridge PRM can't be cited, as it is missing the relevant text for some reason. However, the Sandybridge PRM has the text Chad originally quoted, and the modern BSpec has the same text. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Cite the Ivybridge PRM for multisample surface format notes.Kenneth Graunke1-13/+9
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Delete "the data cache is the sampler cache" comments on Gen7+.Kenneth Graunke1-12/+0
I cut and pasted these comments from the Gen4 code during Ivybridge enabling, and didn't understand what they meant at the time. The data cache is NOT the same as the sampler cache on Ivybridge. The sampler cache has L1 and L2 caches in addition to the L3 cache, while data port messages to the "data cache" hit L3 directly. This means that the sampler domain is technically wrong, but we stopped caring about read/write domains quite a while ago. The kernel just flushes all the caches at the end of each batchbuffer, and our render to texture code flushes the sampler caches when necessary. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Cite the 965 PRM for "the data cache is the sampler cache".Kenneth Graunke1-3/+3
Presumably, this comment exists to justify the usage of I915_GEM_DOMAIN_SAMPLER for this relocation. At one point, this was necessary to ensure that the right flushing was done to keep caches coherent. These days, the kernel just flushes everything, so I don't think it matters. Still, the comment is interesting, so leave it in place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Cite the Ivybridge PRM for DP message descriptor fields.Kenneth Graunke1-3/+3
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Cite the Ivybridge PRM for why the fake MRF range is what it is.Kenneth Graunke1-1/+1
The exact text is in the public docs, so we should cite those. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15i965: Cite the Ivybridge PRM for SFID enum values.Kenneth Graunke1-2/+1
The Ivybridge PRM adds new SFIDs and lists them in a different volume than Sandybridge, so it's worth adding a reference. I also removed the BSpec reference, as the section it referred to was moved somewhere, and I couldn't find it. This leaves one Haswell SFID without a citation, but we can add one once the PRMs are out. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-16llvmpipe: support sRGB framebuffersRoland Scheidegger4-18/+111
Just use the new conversion functions to do the work. The way it's plugged into the blend code is quite hacktastic but follows all the same hacks as used by the packed float format already. Only 4x8bit srgb formats (rgba/rgbx plus swizzle) are supported; 24bit formats never worked anyway in the blend code and are thus disabled, and I don't think anyone is interested in L8/L8A8. Would need even more hacks otherwise. Unless I'm missing something, this is the last feature except MSAA needed for OpenGL 3.0, and for OpenGL 3.1 as well I believe. v2: prettify a bit, use separate function for packing. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-15Revert "r300g: allow HiZ with a 16-bit zbuffer"Marek Olšák1-0/+1
This reverts commit 631c631cbf5b7e84e42a7cfffa1c206d63143370. https://bugs.freedesktop.org/show_bug.cgi?id=66921 Cc: mesa-stable@lists.freedesktop.org
2013-07-15r300g/swtcl: fix a lockup in MSAA resolveMarek Olšák1-0/+7
Cc: mesa-stable@lists.freedesktop.org
2013-07-15r300g/swtcl: fix geometry corruption by uploading indices to a bufferMarek Olšák3-45/+31
The splitting of a draw call into several draw commands was broken, because the split sometimes took place in the middle of a primitive. The splitting was supposed to be dealing with the case when there are more indices than the maximum size of a CS. This commit throws that code away and uses a real index buffer instead. https://bugs.freedesktop.org/show_bug.cgi?id=66558 Cc: mesa-stable@lists.freedesktop.org
2013-07-15glsl: Reject C-style initializers with unknown types.Matt Turner1-0/+5
_mesa_ast_set_aggregate_type walks through declarations initialized with C-style aggregate initializers and stops when it runs out of LHS declarations or RHS expressions. In the example vec4 v = {{{1, 2, 3, 4}}}; _mesa_ast_set_aggregate_type would not recurse into the subexpressions (since vec4s do not contain types that can be initialized with an aggregate initializer) to set their <constructor_type>s. Later in ::hir we would dereference the NULL pointer and segfault. If <constructor_type> is NULL in ::hir we know that the LHS and RHS were unbalanced and the code is illegal. Arrays, structs, and matrices were unaffected. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-15glsl: Rework builtin_variables.cpp to reduce code duplication.Paul Berry1-761/+368
Previously, we had a separate function for setting up the built-in variables for each combination of shader stage and GLSL version (e.g. generate_110_vs_variables to generate the built-in variables for GLSL 1.10 vertex shaders). The functions called each other in ad-hoc ways, leading to unexpected inconsistencies (for example, generate_120_fs_variables was called for GLSL versions 1.20 and above, but generate_130_fs_variables was called only for GLSL version 1.30). In addition, it led to a lot of code duplication, since many varyings had to be duplicated in both the FS and VS code paths. With the advent of geometry shaders (and later, tessellation control and tessellation evaluation shaders), this code duplication was going to get a lot worse. So this patch reworks things so that instead of having a separate function for each shader type and GLSL version, we have a function for constants, one for uniforms, one for varyings, and one for the special variables that are specific to each shader type. In addition, we use a class, builtin_variable_generator, to keep track of the instruction exec_list, the GLSL parse state, commonly-used types, and a few other variables, so that we don't have to pass them around as function arguments. This makes the code a lot more compact. Where it was feasible to do so without introducing compilation errors, I've also gone ahead and introduced the variables needed for {ARB,EXT}_geometry_shader4 style geometry shaders. This patch takes care of everything except the GS variable gl_VerticesIn, the FS variable gl_PrimitiveID, and GLSL 1.50 style geometry shader inputs (using the gl_in interface block). Those remaining features will be added later. I've also made a slight nomenclature change: previously we used the word "deprecated" to refer to variables which are marked in GLSL 1.40 as requiring the ARB_compatibility extension, and are marked in GLSL 1.50 onward as requiring the compatibility profile. 
This was misleading, since not all deprecated variables require the compatibility profile (for example gl_FragData and gl_FragColor, which have been deprecated since GLSL 1.30, but do not require the compatibility profile until GLSL 4.20). We now consistently use the word "compatibility" to refer to these variables. This patch doesn't introduce any functional changes (since geometry shaders haven't been enabled yet). Reviewed-by: Matt Turner <mattst88@gmail.com> v2: Rename "typ" -> "type". Add blank line between inline functions and declarations in builtin_variable_generator class. Use the standard comment "/* FALLTHROUGH */" for compatibility with static code analysis tools. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15glsl: Fix lower_named_interface_blocks to account for dereferences of consts.Paul Berry1-0/+2
In certain rare cases (such as those involving dereference of a literal constant array of structs), flatten_named_interface_blocks_declarations's rvalue visitor may be invoked on an ir_dereference_record whose variable_referenced() method returns NULL. Check for this case to avoid a segfault. Prevents crashes in piglit tests {vs,fs}-deref-literal-array-of-structs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-07-15glsl: Don't allow vertex shader input arrays until GLSL 1.50.Paul Berry1-1/+1
Vertex shader inputs are not allowed to be arrays until GLSL 1.50. We were accidentally enabling them for GLSL 1.40 (although we haven't written any tests for them, so it's not clear whether they actually work). NOTE: although this is a simple bug fix, it probably isn't sensible to cherry-pick it to stable release branches, since its only effect is to cause incorrectly-written shaders to fail to compile. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
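The version gate described in this commit can be sketched as a single predicate. This is only an illustration, not Mesa's code: the function name `vs_input_arrays_allowed` is hypothetical, and it assumes GLSL versions are encoded as integers (1.40 as 140, 1.50 as 150).

```c
#include <stdbool.h>

/* Hypothetical helper, not a Mesa function.
 * Assumes GLSL versions are encoded as integers (1.40 -> 140). */
static bool
vs_input_arrays_allowed(unsigned glsl_version)
{
   /* Vertex shader inputs may be arrays only from GLSL 1.50 onward. */
   return glsl_version >= 150;
}
```

With this check, a GLSL 1.40 shader that declares an array-typed vertex input is rejected, which matches the note that the only visible effect is on incorrectly-written shaders.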
2013-07-14i965: Gen4/5: use IEEE floating point mode for GLSL shaders.Chris Forbes2-2/+17
Fixes isinf(), isnan() from GLSL 1.30 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-14i965/vs: Gen4/5: enable front colors if back colors are writtenChris Forbes1-0/+6
Fixes undefined results if a back color is written, but the corresponding front color is not, and only backfacing primitives are drawn. Results are still undefined if a frontfacing primitive is drawn, but that's OK. The other reasonable way to fix this would have been to just pick the one color slot that was populated, but that dilutes the value of the tests. On Gen6+, the fixed function clipper and triangle setup already take care of this. Fixes 11 piglits: spec/glsl-1.10/execution/interpolation/interpolation-none-gl_Back*Color-* NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
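A rough sketch of the idea, under stated assumptions: the output slots are modeled as a bitmask, and the `SLOT_*` names and `enable_front_for_back` are hypothetical stand-ins rather than the driver's real enums or functions.

```c
#include <stdint.h>

/* Hypothetical output-slot bits; the real driver uses its own enums. */
enum {
   SLOT_COL0 = 1u << 0,   /* front primary color */
   SLOT_COL1 = 1u << 1,   /* front secondary color */
   SLOT_BFC0 = 1u << 2,   /* back primary color */
   SLOT_BFC1 = 1u << 3    /* back secondary color */
};

/* If a back color is written, also mark the matching front color as
 * written so that both slots of the pair exist. */
static uint32_t
enable_front_for_back(uint32_t outputs_written)
{
   if (outputs_written & SLOT_BFC0)
      outputs_written |= SLOT_COL0;
   if (outputs_written & SLOT_BFC1)
      outputs_written |= SLOT_COL1;
   return outputs_written;
}
```

The front color's contents remain undefined in this sketch; only its slot is enabled, which is consistent with the commit's remark that drawing a frontfacing primitive still gives undefined results.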
2013-07-14gallivm: (trivial) use constant instead of exp2f() functionRoland Scheidegger1-2/+3
Some compilers lack exp2f() and, as far as I can tell, exp2() (with doubles) as well. A workaround would not be hard (just replace it with pow), but since the function is only used with a constant, use the precalculated constant instead.
2013-07-14ilo: skip 3DSTATE_INDEX_BUFFER when possibleChia-I Wu4-59/+77
When only the offset to the index buffer is changed, we can skip the 3DSTATE_INDEX_BUFFER if we always use 0 for the offset, and add (offset / index_size) to Start Vertex Location in 3DPRIMITIVE.
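The arithmetic can be sketched as follows. The function and parameter names are hypothetical, and the sketch assumes the byte offset is a multiple of the index size.

```c
#include <stdint.h>

/* Hypothetical helper: fold an index-buffer byte offset into the
 * Start Vertex Location, so the buffer itself can stay at offset 0.
 * Assumes ib_offset_bytes is a multiple of index_size. */
static uint32_t
start_vertex_location(uint32_t start, uint32_t ib_offset_bytes,
                      uint32_t index_size)
{
   return start + ib_offset_bytes / index_size;
}
```

For example, a 64-byte offset into a buffer of 16-bit indices corresponds to skipping 32 indices.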
2013-07-13gallivm: handle srgb-to-linear and linear-to-srgb conversionsRoland Scheidegger6-7/+332
srgb-to-linear is using a 3rd degree polynomial for now, which should be _just_ good enough. Reverse is using some rational polynomials and is quite accurate, though not hooked into llvmpipe's blend code yet and hence unused (untested). Using a table might also be an option (for srgb-to-linear especially). This does not enable any new features yet because EXT_texture_srgb was already supported via util_format fallbacks, but performance was lacking, probably due to the external function call (the table used by the util_format_srgb code may not be all that much slower on its own).
Some performance figures (taken from modified gloss, replaced both base and sphere texture to use GL_SRGB instead of GL_RGB, measured on 1GHz Sandy Bridge, the numbers aren't terribly accurate):
normal gloss, aos, 8-wide: 47 fps
normal gloss, aos, 4-wide: 48 fps
normal gloss, forced to soa, 8-wide: 48 fps
normal gloss, forced to soa, 4-wide: 47 fps
patched gloss, old code, soa, 8-wide: 21 fps
patched gloss, old code, soa, 4-wide: 24 fps
patched gloss, new code, soa, 8-wide: 41 fps
patched gloss, new code, soa, 4-wide: 38 fps
So there's a performance hit but it seems acceptable, certainly better than using the fallback. Note the new code only works for 4x8bit srgb formats; others (L8/L8A8) will continue to use the old util_format fallback, because I can't be bothered to write code for formats no one uses anyway (as decoding is done as part of lp_build_unpack_rgba_soa, which can only handle block type width of 32). Compressed srgb formats should get their own path though eventually (it is going to be expensive in any case: first decompress, then convert). No piglit regressions.
v2: use lp_build_polynomial instead of ad-hoc polynomial construction; also, since both linear to srgb functions are kept for now, make sure both are compiled (since they share quite some code, just integrate them into the same function).
v3: formatting fixes and bugfix in the complicated (disabled) linear-to-srgb path.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-13gallivm: better support for fast rsqrtRoland Scheidegger2-16/+63
We had to disable fast rsqrt before because it wasn't precise enough etc. However in situations when we know we're not going to need more precision we can still use a fast rsqrt (which can be several times faster than the quite expensive sqrt). Hence introduce a new helper which does exactly that - it is probably not useful calling it in some situations if there's no fast rsqrt available so make it queryable if it's available too. v2: use fast_rsqrt consistently instead of rsqrt_fast, fix indentation, let rsqrt use fast_rsqrt. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12configure.ac: better detection of LLVM versionKlemens Baum1-15/+26
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-13r600g/sb: Initialize ra_constraint::cost.Vinson Lee1-1/+1
Fixes "Uninitialized scalar field" reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-12glsl: Initialize ast_aggregate_initializer::constructor_type.Vinson Lee1-1/+2
Fixes "Uninitialized pointer field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12glsl: Make gl_TexCoord compatibility-onlyPaul Berry1-26/+30
gl_TexCoord was deprecated in GLSL 1.30. In GLSL 1.40 it was marked as ARB_compatibility-only, and in GLSL 1.50 and above it was marked as only appearing in the compatibility profile. It has never appeared in GLSL ES. However, Mesa erroneously included it in all desktop versions of GLSL, even versions 1.40 and 1.50 (which do not currently support the compatibility profile). This patch makes gl_TexCoord available in the compatibility profile (and GLSL versions 1.30 and prior) only. NOTE: although this is a simple bug fix, it probably isn't sensible to cherry-pick it to stable release branches, since its only effect is to cause incorrectly-written shaders to fail to compile. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
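The availability rule stated here, for desktop GLSL only, can be written as a small predicate. The name `tex_coord_available` and the integer version encoding (1.30 as 130) are assumptions made for this sketch and are not taken from Mesa.

```c
#include <stdbool.h>

/* Hypothetical helper for desktop GLSL only (gl_TexCoord never
 * appears in GLSL ES). Versions are encoded as integers (1.30 -> 130). */
static bool
tex_coord_available(unsigned glsl_version, bool compat_profile)
{
   /* Available in the compatibility profile, or in GLSL 1.30 and prior. */
   return compat_profile || glsl_version <= 130;
}
```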
2013-07-12glsl ES: Fix magnitude of gl_MaxVertexUniformVectors.Paul Berry1-1/+1
Previously, we set it equal to MaxVertexUniformComponents. It should be MaxVertexUniformComponents / 4. NOTE: This is a candidate for the stable branches. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
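The corrected relationship is a division by four, since one uniform vector holds four components. A minimal sketch, with a hypothetical function name:

```c
/* Hypothetical helper: one vec4 uniform vector holds four components. */
static unsigned
max_vertex_uniform_vectors(unsigned max_vertex_uniform_components)
{
   return max_vertex_uniform_components / 4;
}
```

A limit of 4096 components therefore corresponds to 1024 vectors; the old behavior would have reported 4096.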
2013-07-13winsys/radeon: allow a NULL cs pointer in radeon_bo_map to fix a segfaultMarek Olšák1-9/+11
The original idea was that cs=NULL should be allowed here, but we never used NULL until 862f69fbe1e54e0e9a3c439450a14f. This fixes a segfault in CoreBreach.
2013-07-13ilo: move a sanity check into its assert()Chia-I Wu1-5/+2
The compiler does not know that ilo_3d_pipeline_estimate_size() is pure, so it cannot eliminate the call from gen6_pipeline_end() in a release build. Move the call into the assert().
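The pattern can be illustrated with a self-contained sketch. `estimate_size` and `end_batch` are hypothetical stand-ins, not the ilo functions, and the size formula is invented purely for the example.

```c
#include <assert.h>

/* Hypothetical stand-in for a size estimator the compiler cannot
 * prove to be free of side effects. */
static int
estimate_size(int num_commands)
{
   return num_commands * 4;
}

static int
end_batch(int used, int num_commands)
{
   /* The call lives inside assert(). When NDEBUG is defined, assert()
    * expands to nothing, so the call disappears from release builds. */
   assert(used <= estimate_size(num_commands));
   return used;
}
```

Had the result been stored in a local variable first and only then passed to assert(), the call would still be emitted in a build with NDEBUG defined, even though its result is never used.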
2013-07-13ilo: mark some states dirty when they are really changedChia-I Wu1-0/+16
The checks may seem redundant because cso_context handles them, but util_blitter does not have access to cso_context.
2013-07-13ilo: clean up ilo_blitter_pipe_begin()Chia-I Wu3-27/+39
Document why certain states need to be saved, and fix a bug when blitting with scissor enabled.
2013-07-12r600g: don't use the CB/DB CP COHER logic on r6xxAlex Deucher1-2/+10
There are hw bugs. Flush and inv event is sufficient. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66837 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-12configure: Avoid use of AC_CHECK_FILE for cross compilingJonathan Liu1-6/+6
The AC_CHECK_FILE macro can't be used for cross compiling as it will result in "error: cannot check for file existence when cross compiling". Replace it with the AS_IF macro. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jonathan Liu <net147@gmail.com>
2013-07-12nv30: fix KILL_IF breakageBrian Paul1-1/+1
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66858
2013-07-11gallium: fixup definitions of the rsq and sqrtZack Rusin4-18/+15
The GLSL spec says that rsq is undefined for src<=0, but the D3D10 spec says it needs to be a NaN, so let's stop taking an absolute value of the source, which completely breaks that behavior. For the GL program we can simply insert an extra abs instruction, which produces the desired behavior there. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-12util/u_format: Comment out half float denormal test case.José Fonseca1-0/+5
So that lp_test_format doesn't fail until we decide what should be done.
2013-07-12gallivm: Eliminate redundant lp_build_select calls.José Fonseca1-12/+2
lp_build_cmp already returns 0 / ~0, so the lp_build_select call is unnecessary. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
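Why the select is redundant can be shown with a scalar model of one lane. The helpers below are hypothetical plain-C analogues written for this sketch; they are not the gallivm functions, which build LLVM IR instead.

```c
#include <stdint.h>

/* Scalar model of a comparison that yields an all-ones or all-zeros
 * mask (hypothetical analogue, one lane only). */
static uint32_t
cmp_less(float a, float b)
{
   return a < b ? ~UINT32_C(0) : UINT32_C(0);
}

/* Bitwise select: bits of a where the mask is set, bits of b elsewhere. */
static uint32_t
select_bits(uint32_t mask, uint32_t a, uint32_t b)
{
   return (mask & a) | (~mask & b);
}
```

Selecting between ~0 and 0 with such a mask gives back the mask itself, `select_bits(m, ~0, 0) == m`, so the extra select adds nothing.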
2013-07-12tgsi: rename the TGSI fragment kill opcodesBrian Paul32-110/+109
TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional kill (if any src component < 0). The latter was unconditional kill. At one time KILP was supposed to work with NV-style condition codes/predicates but we never had that in TGSI. This patch renames both opcodes:
TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0)
TGSI_OPCODE_KILP -> KILL (unconditional kill)
Note: I didn't just transpose the opcode names to help ensure that I didn't miss updating any code anywhere. I believe I've updated all the relevant code and comments but I'm not 100% sure that some drivers had this right in the first place. For example, the radeon driver might have llvm.AMDGPU.kill and llvm.AMDGPU.kilp mixed up. Driver authors should review their code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
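The semantics of the two renamed opcodes, as described in this message, can be modeled in plain C for a single fragment. The function names are hypothetical and this is not the TGSI interpreter's code.

```c
#include <stdbool.h>

/* KILL_IF: discard the fragment if any of the four source
 * components is negative. */
static bool
kill_if(const float src[4])
{
   for (int i = 0; i < 4; i++) {
      if (src[i] < 0.0f)
         return true;
   }
   return false;
}

/* KILL: discard the fragment unconditionally. */
static bool
kill_always(void)
{
   return true;
}
```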
2013-07-12tgsi: fix-up KILP commentsBrian Paul4-10/+9
KILP is really unconditional fragment kill. We've had KIL and KILP transposed forever. I'll fix that next. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12tgsi: exec TGSI_OPCODE_SQRT as a scalar instruction, not vectorBrian Paul1-1/+1
To align with the docs and the state tracker. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12tgsi: use X component of the second operand in exec_scalar_binary()Brian Paul1-1/+1
The code happened to work in the past since the (scalar) src args effectively always have a swizzle of .xxxx, .yyyy, .zzzz, or .wwww so whether you grab the X or Y component doesn't really matter. Just fixing the code to make it look right. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-12mesa: update glext.h to version 20130708Brian Paul7-20/+22
This update fixes the problem with duplicated typedefs for GLclampf and GLclampd in the previous version. It also changes some parameter types for glDebugMessageCallbackARB() and glTransformFeedbackVaryingsEXT(). Note we should someday update the glapi-gen code so that it understands void pointer parameters. Currently, the Python code only understands "GLvoid *" but not "void *". Luckily, the compilers don't seem to complain about mixing GLvoid and void.
2013-07-12mesa: fix Address Sanitizer (ASan) issue in _mesa_add_parameter()Brian Paul1-1/+15
If the size argument isn't a multiple of four, we would have read/copied uninitialized memory. Fixes an issue reported by Myles C. Maxfield <myles.maxfield@gmail.com>
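The commit message does not say how the fix works. One way to avoid this class of bug is sketched below: read only the values that exist, and zero-fill the rest of the four-component slot. The function `copy_padded` is hypothetical and should not be read as the actual change to _mesa_add_parameter().

```c
#include <string.h>

/* Hypothetical helper: copy `size` floats into storage padded to a
 * multiple of four, reading only `size` values from the source and
 * zero-filling the padding. Returns the padded element count.
 * dst must have room for the padded count. */
static unsigned
copy_padded(float *dst, const float *src, unsigned size)
{
   unsigned padded = (size + 3u) & ~3u;   /* round up to a multiple of 4 */

   memcpy(dst, src, size * sizeof(float));
   memset(dst + size, 0, (padded - size) * sizeof(float));
   return padded;
}
```

Copying `padded` elements straight from the source would instead read up to three floats past the end of the caller's data, which is the kind of access Address Sanitizer reports.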
2013-07-12mesa: simplify some _mesa_IsEnabled() queriesBrian Paul1-10/+11
No need to test array->Enabled != 0 since the Enabled field can only be 0 or 1. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-12os: add os_get_process_name() functionBrian Paul3-0/+133
v2: explicitly test for BSD/APPLE, #warning for unexpected environments.
2013-07-12mesa: whitespace, formatting, 80-column wrappingBrian Paul1-12/+18