summaryrefslogtreecommitdiff
path: root/src/mesa/drivers
AgeCommit message (Collapse)AuthorFilesLines
2013-08-12i965/fs: Explicitly disallow CSE on predicated instructions.Kenneth Graunke1-1/+3
The existing inst->is_partial_write() already disallows predicated instructions, so this has no functional change. However, it's worth doing explicitly since the CSE pass does not consider the flag register. This means it could blindly factor out operations that use the same sources, but which have different condition codes set. This prevents a regression in the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12i965/fs: Log a performance warning if skipping 16-wide due to pulls.Kenneth Graunke1-7/+11
Usually, the driver creates both 8-wide and 16-wide variants of every fragment shader. When 16-wide compilation fails, it logs a performance warning explaining why only an 8-wide program exists. However, when there are pull parameters, the driver won't even bother trying the 16-wide compile (since it would fail). In this case, it failed to emit a performance warning, leaving no explanation for the missing 16-wide program. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-10i965: add missing BRW_NEW_INTERPOLATION_MAP to state dumpChris Forbes1-0/+1
Makes this flag appear in the output for INTEL_DEBUG=state Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-10i965: Add a new debug mode for the VUE mapChris Forbes3-0/+29
INTEL_DEBUG=vue now emits a listing of each slot in the VUE map, and the corresponding interpolation mode. V2: Fix whitespace issues. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-08i965: Remember to call intel_prepare_render() before blitting.Kenneth Graunke1-0/+5
Otherwise, blits to the window system buffer may cause crashes, since dst_irb->mt may be NULL. This code is lifted straight out of brw_blorp_framebuffer()'s try_blorp_blit() helper. Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <idr@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-06i965: Add #defines for the MI_LOAD_REGISTER_MEM command.Kenneth Graunke1-0/+4
This command reads a value from memory and writes it to a register (the opposite of MI_STORE_REGISTER_MEM). It's only available on Gen7+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-06i965: Initialize the intel_context::bufmgr pointer earlier.Kenneth Graunke1-2/+1
This prevents a crash in a future patch. _mesa_initialize_context() creates a default transform feedback object by calling the NewTransformFeedbackObject() driver hook. Eventually, we'll want to subclass that and allocate a buffer object. This means passing brw->bufmgr to drm_intel_alloc_bo(), and crashing if it isn't initialized yet. The buffer manager is actually already initialized; we just hadn't copied the pointer from intel_screen to intel_context quite early enough. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-06i965: Tidy preprocessor macros for SO_PRIM_STORAGE_NEEDED registers.Kenneth Graunke1-5/+2
Gen7+ supports four transform feedback streams. Using a function-like macro makes it easy to access them by stream number or loop over them. "GEN7_" prefixes are more common than "_IVB" suffixes, so use that. Gen6 only supports a single stream, so the single #define should be fine. However, SO_NUM_PRIM_STORAGE_NEEDED was a poor name. For one, the word "NUM" doesn't appear in the actual name of the register. It's also confusingly generic, as it doesn't exist on Gen7+. Add a "GEN6_" prefix for clarity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-06i965: Tidy preprocessor macros for SO_NUM_PRIMS_WRITTEN registers.Kenneth Graunke2-7/+4
Gen7+ supports four transform feedback streams. Using a function-like macro makes it easy to access them by stream number or loop over them. "GEN7_" prefixes are more common than "_IVB" suffixes, so we use that. Gen6 only supports a single stream, so the single #define should be fine. However, SO_NUM_PRIMS_WRITTEN was confusingly generic, as it doesn't exist on Gen7+. Add a "GEN6_" prefix for clarity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-06i965: Don't allocate curbe buffers on Gen6+.Kenneth Graunke1-2/+4
These are only used on Gen4-5. Why waste the 8kB of space? Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
2013-08-04intel_fbo: remove unused intel_renderbuffer hiz functionsJordan Justen2-26/+0
We are now using functions that operate on the renderbuffer attachment to handle layered rendering. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-04i965 clear/draw: set renderbuffer attachment as needing depth resolveJordan Justen2-2/+4
Previously we would mark a renderbuffer as needing a depth resolve. But, to support layered rendering, we need to look at the attachment instead, since the attachment knows if layered rendering is being used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-04i965: add intel_renderbuffer_att_set_needs_depth_resolveJordan Justen2-0/+18
This function is needed to support layered rendering. With layered rendering, the attachment stores the state of whether layered rendering is being used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-04i965: add intel_miptree_set_all_slices_need_depth_resolveJordan Justen2-0/+16
This function marks all slices of a renderbuffer at a particular level as needing a depth resolve. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-04i965 gen7: don't set FORCE_ZERO_RTAINDEX for layered renderingJordan Justen1-1/+1
When layered rendering is being used, we should not set FORCE_ZERO_RTAINDEX in the clip state to allow render target array values other than zero to be used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04hsw hiz: Remove x/y offset restriction for hizJordan Justen1-24/+0
This restriction was related to programming the offset fields of the depth buffer packet. We are now setting these offsets to 0 now, so this restriction should no longer be required. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04gen7 depth surface: program 3DSTATE_DEPTH_BUFFER to top of surfaceJordan Justen4-66/+45
Previously we would always find the 2D sub-surface of interest, and then program the surface to this location. Now we always program the 3DSTATE_DEPTH_BUFFER at the start of the surface. To select the lod/slice, we utilize the lod & minimum array element fields. As part of this change, we must revert 1f112ccf: Revert "i965/gen7: Align all depth miplevels to 8 in the X direction." We also must disable brw_workaround_depthstencil_alignment for gen >= 7. Now the hardware will handle alignment when rendering to additional slices/LODs. v2: * Merge with recent MOCS changes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04gen7 fbo: make unmatched depth/stencil configs return unsupportedJordan Justen1-0/+16
For gen >= 7, we will use the lod/minimum-array-element fields to support layered rendering. This means that we must restrict the depth & stencil attachments to match in various more retrictive ways. (Now the width, height, depth, LOD and layer must match) The reason width, height, and depth must match is that the hardware has a single set of width, height, and depth settings (in 3DSTATE_DEPTH_BUFFER) that affect both the depth and stencil buffers. Since these controls determine the miptree layout, they need to be set correctly in order for lod and minimum-array-element to work properly. So the only way rendering can work is if the width, height, and depth match. In the future, if this restriction proves to be a problem (say because some crucial client application relies on rendering to different levels/layers of stencil and depth buffers), then we can always work around the restriction by copying depth and/or stencil data to a temporary buffer prior to rendering (much in the same way that brw_workaround_depthstencil_alignment() does today for gen < 7), but hopefully that won't be necessary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04hsw hiz: Add new size restrictions for miplevels > 0Jordan Justen1-3/+13
When performing hiz ops, we must ensure that the region sizes have an 8 aligned width and 4 aligned height. We can tweak the size for blorp hiz operations at LOD 0, but for the others we can't. Therefore, we disable hiz for these miplevels if they don't meet the size alignment requirements. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04gen7 blorp depth: calculate base surface width/heightJordan Justen1-0/+13
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04gen7 depth surface: calculate minimum array element being renderedJordan Justen2-0/+17
In layered rendering this will be 0. Otherwise it will be the selected slice. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04gen7 depth surface: calculate LOD being rendered toJordan Justen2-0/+6
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04gen7 depth surface: calculate depth (array size) for depth surfaceJordan Justen2-0/+5
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Note: Cube maps are treated as 2D arrays with 6 times as many array elements as the cube map array would have. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04gen7 depth surface: calculate more specific surface typeJordan Justen2-0/+47
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Note: Cube maps are treated as 2D arrays with 6 times as many array elements as the cube map array would have. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04i965: init global state first in brw_workaround_depthstencil_alignmentJordan Justen1-5/+14
In a future pass this will allow us to exit-early from this routine to disable it for gen >= 7. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-02i965: Initialize the maximum number of GS threads on Haswell.Kenneth Graunke1-0/+3
We'll need proper values for max_gs_threads when we eventually support geometry shaders. Also, we initialize it for every other platform. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-02i965: enable image external sampling for imported dma-buffersTopi Pohjolainen2-0/+8
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-02intel: restrict dma-buf-import images to external sampling onlyTopi Pohjolainen4-1/+27
Memory originating outside mesa stack is meant to be for reading only. In addition, the restrictions imposed by the image external extension should apply. For example, users shouldn't be allowed to generare mip-trees based on these images. v2 (Chad): document using full extension names, fix the comment style itself and emit description of error Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-02dri: propagate extra dma_buf import attributes to the driversTopi Pohjolainen2-2/+53
v2: do not break ABI, but instead introduce new entry point for dma buffers and bump up the dri-interface version to eight v3 (Chad): allow the hook to specify an error originating from the driver. For now only unsupported format is considered. I thought about rejecting the hints also as they are addressing only YUV sampling which is not supported at the moment but then thought against it as the spec is not saying one way or the other. v4 (Eric, Chad): restrict to rgb formatted only v5: rebased on top of i915/i965 split v6 (Chad): document using full extension name Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-02intel: set dri image dimensions even when creating out of primesTopi Pohjolainen1-0/+2
Otherwise 'intel_set_texture_image_region()' won't have enough details to work with. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-02intel: refactor planar format lookupTopi Pohjolainen1-13/+18
v2 (Eric): refactor both occurences, not just one v3 (Chad): replace 0 by NULL Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-02intel: do not create renderbuffers out of planar imagesTopi Pohjolainen1-0/+7
v2 (Chad): emit 'GL_INVALID_OPERATION' and description of error Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-02intel: allow packed prime buffers to be treated normallyTopi Pohjolainen1-1/+5
v2: - fix earlier rebase error breaking bisect (loaderPriv -> loaderPrivate) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-01mesa: Refactor copying of linked program data.Paul Berry1-4/+2
This patch creates a single function to copy the the UsesClipDistance flag from gl_shader_program.Vert to gl_vertex_program. Previously this logic was duplicated in the i965-specific function brw_link_shader() and the core mesa function _mesa_ir_link_shader(). This logic will have to be expanded to support geometry shaders, and I don't want to have to update it in two separate places. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01glsl: add ir_emit_vertex and ir_end_primitive instruction typesBryan Cain4-0/+28
These correspond to the EmitVertex and EndPrimitive functions in GLSL. v2 (Paul Berry <stereotype441@gmail.com>): Add stub implementations of new pure visitor functions to i965's vec4_visitor and fs_visitor classes. v3 (Paul Berry <stereotype441@gmail.com>): Rename classes to be more consistent with the names used in the GL spec. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01main: Allow for the possibility of GL 3.2 without ARB_geometry_shader4.Paul Berry1-1/+1
Previously, we assumed that the only way Mesa would expose geometry shader support was via the ARB_geometry_shader4 extension. But this extension has some extra complications over GL 3.2 (interactions with compatibility-only features, and link-time initialization of the constant gl_VerticesIn). So we want to allow for the possibility of supporting GL 3.2 (with GLSL 1.50 style geometry shaders) even if ctx->Extensions.ARB_geometry_shader4 is false. This patch adds a new function, _mesa_has_geometry_shaders(), which returns true if either ARB_geometry_shader4 is supported or the GL version is at least 3.2 desktop. Since compute_version() only enables GL 3.2 functionality when GLSL 1.50 support is present, a sufficient way for a back-end to advertise geometry shader support is to set ctx->Const.GLSLVersion >= 150. v2: Remove unnecessary ctx->Const.GeometryShaders150 constant. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01glsl: Change do_set_program_inouts' is_fragment_shader arg to shader_type.Paul Berry1-2/+1
This will allow us to add geometry shader support without having to add another boolean argument. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01i965: Delete the BATCH_LOCALS macro.Kenneth Graunke2-6/+0
This hasn't done anything in a long time, and it's only used in a couple places...which means we couldn't use it without doing a bunch of work anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-01i965 Gen4/5: clip: Don't mangle flat varyingsChris Forbes1-21/+32
This patch ensures that integers will pass through unscathed. Doing (useless) computations on them is risky, especially when their bit patterns correspond to values like inf or nan. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01i965 Gen4/5: clip: Add support for noperspective varyingsChris Forbes3-10/+113
Adds support for interpolating noperspective varyings linearly in screen space when clipping. Based on Olivier Galibert's patch from last year: http://lists.freedesktop.org/archives/mesa-dev/2012-July/024341.html At this point all -fixed and -vertex interpolation tests work. V5: Add brw_clip_compile.has_noperspective_shading rather than another key flag. V6: Real bools. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01i965 Gen4/5: clip: correctly handle flat varyingsChris Forbes6-54/+30
Previously we only gave special treatment to the builtin color varyings. This patch adds support for arbitrary flat-shaded varyings, which is required for GLSL 1.30. Based on Olivier Galibert's patch from last year: http://lists.freedesktop.org/archives/mesa-dev/2012-July/024340.html V5: Move key.do_flat_shading to brw_clip_compile.has_flat_shading V6: Real bools. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01i965 Gen4/5: Generalize SF interpolation setup for GLSL1.3Chris Forbes4-70/+85
Previously the SF only handled the builtin color varying specially. This patch generalizes that support to cover user-defined varyings, driven by the interpolation mode array set up alongside the VUE map. Based on the following patches from Olivier Galibert: - http://lists.freedesktop.org/archives/mesa-dev/2012-July/024335.html - http://lists.freedesktop.org/archives/mesa-dev/2012-July/024339.html With this patch, all the GLSL 1.3 interpolation tests that do not clip (spec/glsl-1.30/execution/interpolation/*-none.shader_test) pass. V5: Move key.do_flat_shading to brw_sf_compile.has_flat_shading; drop vestigial hunks. V6: Real bools. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01i965: Add helper functions for interpolation mapChris Forbes1-0/+18
V6: real bools Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01i965 Gen4/5: Introduce 'interpolation map' alongside the VUE mapChris Forbes9-2/+120
The interpolation map (in brw->interpolation_mode) is a new auxiliary structure alongside the post-GS VUE map, which describes the interpolation modes for each VUE slot, for use by the clip and SF stages. This patch introduces a new state atom to compute the interpolation map, and adjusts the program keys for the clip and SF stages, but it is not actually used yet. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> V3: Updated for vue_map changes, intel -> brw merge, etc. (Chris Forbes) V4: Compute interpolation map as a new state atom rather than tacking it on the front of the clip setup V5: Rework commit message, make interpolation_mode_map a struct. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-31i965/vs: Put lod parameter in the correct place for Gen4Chris Forbes1-1/+1
This was never visible before due to the bogus sampler state pointer. Fixes remaining vertex texturing breakage on Gen4. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org
2013-07-31i965/vs: set up sampler state pointer for Gen4/5.Chris Forbes1-6/+21
Fixes broken filter and lod selection for vertex texturing. (txs/txf only worked properly because they ignore the sampler state completely) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org
2013-07-30st/dri: add a new driconf option disable_shader_bit_encoding for UnigineMarek Olšák2-0/+11
Now Unigine Heaven 3.0 finally works with r600g. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30mesa,glsl,st/dri: add a new driconf option force_glsl_version for UnigineMarek Olšák2-0/+39
See documentation in mtypes.h. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30driconf: enable app-specific workarounds for all driversMarek Olšák2-2/+6
They were only enabled for i965. Note that drirc must be installed in /etc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30driconf: remove the unused option allow_large_texturesMarek Olšák1-9/+0
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>