Age  Commit message  Author  Files  Lines
2014-05-26  Revert "i965: Don't make instructions with a null dest a barrier to scheduling."  Matt Turner  1  -8/+4
This reverts commit 42a26cb5e441a01d5288b299980f23affaad53fe. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78648
2014-05-26  Revert "i965/fs: Simplify interference scan in register coalescing."  Matt Turner  1  -9/+13
This reverts commit 5ff1e446d44bb9d50f84883c7058635cb070e069. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77704
2014-05-26  Revert "i965/fs: Give up in interference check if we see a WHILE."  Matt Turner  1  -1/+1
This reverts commit 55de1c035cbca2b7087b3aa21a8c3dfc900a4ad9. Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-05-26  Revert "i965/fs: Reduce restrictions on interference in register coalescing."  Matt Turner  1  -0/+13
This reverts commit f770123f58b46459e8dbd27525162ee8ba89f30b. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78692
2014-05-26  nvc0: revert mistaken logic to collapse color outputs to the beginning  Ilia Mirkin  1  -9/+4
In commit af38ef907, I added a "fix" to color outputs not being assigned correctly when sample mask was being output. This was totally wrong -- the color indices (i.e. "si" values) were the ones that were wrong. Undo that hunk. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-26  mesa/st: fix color outputs in presence of sample mask output  Ilia Mirkin  1  -13/+17
Commit c5d822dad90 added support for sample mask incorrectly. It became treated as a color output, and messed up the color output indices. Revert the hunk that did that, and add explicit support just like for depth/stencil writes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>
2014-05-26  freedreno/a3xx: texture fixes  Rob Clark  1  -1/+3
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-26  freedreno: update generated headers  Rob Clark  4  -5/+7
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-26  freedreno: few caps fixes  Rob Clark  2  -4/+8
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-25  mesa/x86: Fix build with clang <= 3.3.  Vinson Lee  1  -0/+2
clang <= 3.3 cpuid.h does not define constants for feature bits. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79095 Signed-off-by: Vinson Lee <vlee@freedesktop.org>
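A minimal sketch of the kind of fallback this implies. The commit message does not say which constants are missing, so `bit_SSE4_1` is an assumption used for illustration (SSE4.1 is reported in bit 19 of ECX for CPUID leaf 1), and `has_sse4_1` is a hypothetical helper, not a Mesa function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: an older cpuid.h may not provide this macro, so define it
 * only when it is absent. SSE4.1 is bit 19 of ECX for CPUID leaf 1. */
#ifndef bit_SSE4_1
#define bit_SSE4_1 (1u << 19)
#endif

/* Hypothetical helper: test the SSE4.1 bit in an ECX value that the
 * caller has already obtained from CPUID leaf 1. */
static bool has_sse4_1(uint32_t ecx)
{
    return (ecx & bit_SSE4_1) != 0;
}
```

Guarding the define with `#ifndef` keeps the code building unchanged on compilers whose header already supplies the constant.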
2014-05-25  i965: Don't treat HW_REGs as barriers if they're immediates.  Matt Turner  1  -4/+12
We had a handful of cases where we'd used brw_imm_*() to generate an immediate, rather than fs_reg(). We shouldn't do that but we shouldn't limit scheduling flexibility on account of immediate arguments either. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
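The rule described above can be sketched as a small predicate. Everything here is hypothetical and simplified for illustration: the enum, the struct, and the function name are not the driver's real types, and the real scheduler tracks far more state.

```c
#include <stdbool.h>

/* Hypothetical, simplified register files; not the driver's real enum. */
enum reg_file { FILE_GRF, FILE_IMM, FILE_HW_REG };

/* Hypothetical source operand: a HW_REG may wrap an immediate value. */
struct src_reg {
    enum reg_file file;
    bool hw_reg_is_immediate; /* only meaningful when file == FILE_HW_REG */
};

/* A source forces a scheduling barrier only if it is a HW_REG that is
 * not merely carrying an immediate. */
static bool src_is_scheduling_barrier(const struct src_reg *src)
{
    return src->file == FILE_HW_REG && !src->hw_reg_is_immediate;
}
```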
2014-05-25  i965/fs: Don't use brw_imm_* unnecessarily.  Matt Turner  2  -5/+5
Using brw_imm_* creates a source with file=HW_REG, and the scheduler inserts barrier dependencies when it sees HW_REG. None of these are hardware-registers in the sense that they're special and scheduling shouldn't touch them. A few of the modified cases already have HW_REGs for other sources, so it won't allow extra flexibility in some cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-05-25  automake: correctly append the version-script  Emil Velikov  6  -25/+38
Turns out that the AC conditional did not include the version-scripts as expected. Rather it truncated the remaining linker flags. Cc: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2014-05-25  targets/libgl-xlib: hide all the exported symbol mayhem  Emil Velikov  2  -0/+12
Leave only the gl/glx and mangled gl symbols. XMesa* was never an official interface and the only user of it was mesa-demos, while they were still in the same repo as mesa. v2: Conditionally use the version-script. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-25  targets/osmesa: include mangled gl symbols  Emil Velikov  1  -0/+1
Missed out with commit d4c3968c25885f6eb53dee4cc0c60d8d3f8fec32 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-25  targets/xa: limit the amount of exported symbols  Emil Velikov  2  -0/+43
In the presence of LLVM the final library exports every symbol from the llvm namespace. Resolve this by using a version script (w/o the version/name tag). Considering that there are only ~35 symbols, explicitly list them to minimize the chances of rogue symbols sneaking in. v2: Conditionally include the version-script. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> (v1) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-25  dri_util: keep __dri2ConfigOptions symbol private  Emil Velikov  1  -1/+1
The symbol was added with commit 45e2b51c853(DRI2/GLX: check for vblank_mode in DRI2 GLX code) but was never used as such according to git log. Possibly it was marked as public due to confusion with __driConfigOptions which was used for dri1 drivers. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-25  targets/opencl: Fix (static) linking with LLVM (v2)  Kai Wasserbäch  1  -0/+7
Without this, I get linking failures (static linking). The static linking is sort of required for me, because otherwise Steam and applications using the Steam runtime regularly fail because my LLVM was compiled and linked against a newer libgcc_s, libstdc++, etc. and uses features from those newer versions. And instead of Steam just not starting, my X starts crashing, whenever libGL fails to load a (32 bit) driver. Since I hate crashes of X and I don't think Valve/Steam will behave like a proper distribution soon (rebuilds versus current Debian Testing, since they base their Steam OS off that), I need a radeonsi which carries its own LLVM within and doesn't care about what the runtime sets. This means linking Mesa statically. v1 → v2: Move logic to configure.ac Acked-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2014-05-25  glx: do not leak dri3Display  Emil Velikov  1  -0/+4
v2: Do not wrap the code in ifdef HAVE_DRI3 (suggested by Keith) Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-25  gallium/egl: st_profiles are build time decision, treat them as such  Emil Velikov  9  -55/+28
The profiles are present depending on the defines at build time. Drop the extra functions and feed the defines directly into the state-tracker at build time. v2: Drop unused variable i. Acked-by: Chia-I Wu <olvaffe@gmail.com> (v1) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-25  dri_util: set implemented version of the DRI_CORE extension  Emil Velikov  1  -1/+1
... rather than the one defined in our internal interface (dri_interface.h) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-05-25  i965/fs: Don't modify ann_count if not debugging.  Matt Turner  2  -2/+8
If we make ann_count non-zero, annotation_finalize() won't bail. Not modifying it seems to make the code more clear than would modifying annotation_finalize().
2014-05-24  Revert "i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6"  Matt Turner  1  -4/+7
This reverts commit a6860100b87415ab510d0d210cabfeeccebc9a0a. Why this code didn't work in all circumstances is unknown and without a working Ironlake simulator (which uses a different AUB format) we'll probably never know, short of a lot of experimentation, and spending a bunch of time to try to optimize a few instructions on Ironlake is not time well spent. Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a dependence between the otherwise independent per-component calculations. Not using the accumulator, even if it means an extra instruction per component might be preferable. We don't know, we don't have data, and we don't have the necessary register on Ironlake for shader_time to tell us. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77707 Acked-by: Kenneth Graunke <kenneth@whitecape.org>
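For reference, GLSL defines `mix(x, y, a)` as `x * (1 - a) + y * a`. The sketch below, in plain C rather than driver code, shows the per-component form without a shared accumulator: each component uses only its own temporaries, so the four calculations stay independent. The function name `lrp_vec4` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical illustration of mix(x, y, a) = x * (1 - a) + y * a,
 * evaluated independently for each of the four components. */
static void lrp_vec4(float dst[4], const float x[4],
                     const float y[4], const float a[4])
{
    for (size_t i = 0; i < 4; i++) {
        float y_times_a = y[i] * a[i];     /* first multiply */
        float one_minus_a = 1.0f - a[i];   /* add with negated a */
        dst[i] = x[i] * one_minus_a + y_times_a;
    }
}
```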
2014-05-24  Revert "i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6"  Matt Turner  1  -6/+10
This reverts commit 2dfbbeca50b95ccdd714d9baa4411c779f6a20d9 with the comment about MAC and implicit accumulator removed. Why this code didn't work in all circumstances is unknown and without a working Ironlake simulator (which uses a different AUB format) we'll probably never know, short of a lot of experimentation, and spending a bunch of time to try to optimize a few instructions on Ironlake is not time well spent. Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a dependence between the otherwise independent per-component calculations. Not using the accumulator, even if it means an extra instruction per component might be preferable. We don't know, we don't have data, and we don't have the necessary register on Ironlake for shader_time to tell us. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77703 Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-05-24  i965: Remove useless typo'd debugging messages.  Matt Turner  1  -6/+0
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-05-24  i965: Move brw_land_fwd_jump() to compilation unit of its use.  Matt Turner  3  -23/+16
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-05-24  i965/fs: Use next_insn_offset rather than nr_insn.  Matt Turner  2  -4/+4
Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-05-24  i965: Emit 0.0:F sources with type VF instead.  Matt Turner  1  -0/+16
Number of compacted instructions: 817752 -> 827404 (1.18%) Reviewed-by: Eric Anholt <eric@anholt.net>
2014-05-24  i965: Emit ARF:UD for non-present src1 on Gen6+.  Matt Turner  1  -2/+26
Enables the next commits to compact more instructions. Reviewed-by: Eric Anholt <eric@anholt.net>
2014-05-24  i965: Support compacted instructions with immediate sources.  Matt Turner  1  -20/+63
Note the weirdness with src1 subregs. The compacted immediate fields are uncompacted to bits [127:96] and the high five bits of the subreg mapping maps to bits [100:96]. Number of compacted instructions: 790085 -> 817752 (3.50%) Reviewed-by: Eric Anholt <eric@anholt.net>
2014-05-24  i965: Use next_offset() in instruction compaction code.  Matt Turner  1  -17/+3
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-05-24  i965: Move next_offset() to brw_eu.h for use elsewhere.  Matt Turner  2  -11/+12
Also perform arithmetic on char* rather than void* since the latter is a GNU C extension not available in C++. Reviewed-by: Eric Anholt <eric@anholt.net>
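A minimal sketch of the portability point. `byte_at` is a hypothetical helper, not the driver's `next_offset()`; it only shows the cast that makes byte-offset arithmetic valid in both C and C++.

```c
#include <stddef.h>

/* Hypothetical helper: return the address `offset` bytes into `store`.
 * Arithmetic directly on void * is a GNU C extension and is rejected
 * by C++, so cast to char * first; the addition is then in bytes. */
static void *byte_at(void *store, size_t offset)
{
    return (char *)store + offset;
}
```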
2014-05-24  i965: Rename next_ip() -> next_offset().  Matt Turner  1  -30/+33
That we were comparing its return value with offsets should have been a clue. :) Make it take a void *store in preparation for making the function useful elsewhere. Reviewed-by: Eric Anholt <eric@anholt.net>
2014-05-24  i965: Print disassembly after compaction.  Matt Turner  9  -283/+198
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-05-24  i965/fs: Make patch_discard_jumps_to_fb_writes return bool.  Matt Turner  3  -6/+8
... to tell us whether it emitted any code. Will be used to determine whether we need to skip an annotation for it. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2014-05-24  i965: Add annotation data structure and support code.  Matt Turner  11  -9/+183
Will be used to print disassembly after jump targets are set and instructions are compacted, while still retaining higher-level IR annotations and basic block information. An array of 'struct annotation' will live alongside the generated assembly. The generators will populate the array with their IR annotations, and basic block pointers if the instructions began or ended a basic block. We'll then update the instruction offset when we compact instructions and then, using the annotations, print the disassembly. Reviewed-by: Eric Anholt <eric@anholt.net>
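The idea can be sketched roughly as follows. The field names and the helper `annotation_shift_after` are assumptions made for illustration and do not reproduce the layout of the real 'struct annotation'. The sketch assumes compaction shrinks one instruction and every annotation that starts after it must move back by the bytes saved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified annotation record kept next to the assembly. */
struct annotation {
    int offset;          /* byte offset of the first instruction covered */
    const char *ir_note; /* higher-level IR text, or NULL if none */
    bool starts_block;   /* instruction begins a basic block */
    bool ends_block;     /* instruction ends a basic block */
};

/* Hypothetical helper: after the instruction at `compacted_offset`
 * shrinks by `bytes_saved`, move every later annotation back. */
static void annotation_shift_after(struct annotation *ann, size_t count,
                                   int compacted_offset, int bytes_saved)
{
    for (size_t i = 0; i < count; i++) {
        if (ann[i].offset > compacted_offset)
            ann[i].offset -= bytes_saved;
    }
}
```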
2014-05-24  i965/fs+blorp: Remove left over dump_file arguments.  Matt Turner  5  -19/+15
Were used by the blorp unit test programs. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2014-05-24  i965/fs: Don't hardcode DEBUG_WM in generic fs code.  Matt Turner  6  -27/+25
Similar to Paul's commit e9fa3a944 except brw_fs_generator's debug_flag is for DEBUG_WM and DEBUG_BLORP. Reviewed-by: Eric Anholt <eric@anholt.net>
2014-05-24  i965: Pass in start_offset to brw_compact_instructions().  Matt Turner  8  -17/+17
Lets us avoid recompacting the SIMD8 instructions when we compact the SIMD16 program. Reviewed-by: Eric Anholt <eric@anholt.net>
2014-05-24  i965: Delete unused brw_blorp_blit_test_compile().  Matt Turner  1  -11/+0
2014-05-24  i965/cfg: Make DO instruction begin a basic block.  Matt Turner  1  -9/+12
The DO instruction doesn't exist on Gen6+. Since before this commit, DO always ended a basic block, if it also happened to start one (e.g., a while loop inside an if statement) the block containing only the DO would actually contain no hardware instructions. Pre-Gen6's WHILE instruction jumps to the instruction following the DO, so strictly speaking we won't be modeling that properly, but I claim there is actually no functional difference. This will simplify an upcoming change where we want to mark the first hardware instruction in the loop as beginning a block, and the last instruction before the loop as ending one. Reviewed-by: Eric Anholt <eric@anholt.net>
2014-05-24  darwin: Guard Core Profile usage behind a testing envvar  Jeremy Huddleston Sequoia  1  -10/+20
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2014-05-24  darwin: Write errors in choosing the pixel format to the crash log  Jeremy Huddleston Sequoia  1  -2/+16
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2014-05-23  nv50: count wrapped textures towards the tex_obj count  Joakim Sindholt  1  -0/+2
But don't count their size towards the allocated memory, since that belongs to whoever created it. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23  nvc0: assert that we have vertex elements state  Christoph Bumiller  1  -0/+1
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23  nvc0: use PRIxPTR for sizeof()  Christoph Bumiller  1  -1/+1
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23  nv50,nvc0: allow 15,16,30 bpp display formats  Christoph Bumiller  1  -4/+4
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23  nv50,nvc0: handle guard band defines  Christoph Bumiller  2  -4/+16
[imirkin: moved default case out of switch] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23  nv50/ir/tgsi: optimize KIL  Christoph Bumiller  1  -0/+5
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-05-23  nv50/ir: fix lowering of predicated instructions (without defs)  Christoph Bumiller  1  -1/+4
Note that predicated instructions with defs are still not supported because transformation to SSA doesn't handle them yet. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>