summaryrefslogtreecommitdiff
path: root/src/mesa/drivers/dri/i965/brw_state_upload.c
AgeCommit message (Collapse)AuthorFilesLines
2016-10-05i965: Eliminate brw->tes.prog_data pointer.Kenneth Graunke1-1/+0
Just say no to: - brw->tes.base.prog_data = &brw->tes.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_tes_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
2016-10-05i965: Eliminate brw->tcs.prog_data pointer.Kenneth Graunke1-1/+0
Just say no to: - brw->tcs.base.prog_data = &brw->tcs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_tcs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
2016-10-05i965: Eliminate brw->vs.prog_data pointer.Kenneth Graunke1-3/+6
Just say no to: - brw->vs.base.prog_data = &brw->vs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_vs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
2016-10-03i965: Only emit 1 viewport when possible.Kenneth Graunke1-0/+11
In core profile, we support up to 16 viewports. However, in the majority of cases, only 1 of them is actually used - we only need the others if the last shader stage prior to the rasterizer writes gl_ViewportIndex. Processing all 16 viewports adds additional CPU overhead, which hurts CPU-intensive workloads such as Glamor. This meant that switching to core profile actually penalized Glamor to an extent, which is unfortunate. This patch tracks the number of relevant viewports, switching between 1 and ctx->Const.MaxViewports if gl_ViewportIndex is written. A new BRW_NEW_VIEWPORT_COUNT flag tracks this. This could mean re-emitting viewport state when switching, but hopefully this is offset by doing 1/16th of the work in the common case. The new flag is also lighter weight than BRW_NEW_VUE_MAP_GEOM_OUT, which we were using in one case. According to Eric Anholt, x11perf -copypixwin10 performance improves by 11.5094% +/- 3.10841% (n=10) on his Skylake. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-27i965: create populate key functions for tcs and tesTimothy Arceri1-16/+2
These will be used by the on disk shader cache. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-31i965: Merge gen7_clip_state atom into gen6_clip_state atom.Kenneth Graunke1-1/+1
The original motivation was that gen6_clip_state ignored _NEW_POLYGON as it didn't care about early culling. The only other change was that Gen6 ignored BRW_NEW_TES_PROG_DATA as it doesn't have tessellation shaders, but listening to this is harmless as it'll never be signalled. Now that we've added _NEW_POLYGON for is_drawing_lines/points, we can merge the two as the distinction is meaningless. This actually fixes a bug, though: Gen8+ was using the gen6_clip_state atom because it doesn't care about early culling, but it also needs BRW_NEW_TES_PROG_DATA, which was missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-08-25i965: Upload surface state for non-coherent framebuffer fetch.Francisco Jerez1-0/+4
This iterates over the list of attached render buffers and binds appropriate surface state structures to the binding table block allocated for shader framebuffer read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-14i965: fix compiler warnings for 32bit buildTimothy Arceri1-1/+1
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-06-23i965: Combine 3DSTATE_STREAMOUT emitters and genX_sol_state atoms.Kenneth Graunke1-1/+1
They're basically the same. Let's avoid the code duplication. v2: Fix SO_BUFFER_ENABLE stuff to only happen on Gen < 8 (caught by Jason Ekstrand). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-16i965: Send the minimal number of STATE_BASE_ADDRESS packets.Kenneth Graunke1-12/+2
STATE_BASE_ADDRESS stalls the whole pipeline, and the documentation cautions us to emit it as little as possible for better performance. We recently put some hacks in BLORP to try and avoid emitting it if it was already set correctly. However, this wasn't quite minimal: if BLORP is the first operation (i.e. glClear()), then it would emit it, and subsequent draw calls would emit it again. This caused a small drop in performance in GPUTest Triangle when switching from Meta to BLORP. Unlike most packets, STATE_BASE_ADDRESS isn't influenced by GL state: it needs to be emitted once per batch, before most other commands, or whenever we change the program cache BO. It's also valid in both the 3D and compute pipelines, which makes it even more unique. This patch removes it from the atom mechanism and instead directly calls it as part of every draw, compute dispatch, or BLORP operation. We introduce a new flag indicating that STATE_BASE_ADDRESS has already been emitted this batch, and if so, skip doing it again. When we make a new program cache BO, we simply reset the flag, so the next operation will emit it again. When we flush/reset the batch, we reset the flag. This guarantees that we'll emit STATE_BASE_ADDRESS only when we have to. It's also less code than the old atom mechanism. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-16i965: Combine Gen4-7 and Gen8+ state base address emitters.Kenneth Graunke1-2/+2
We're about to start calling it directly, and this means the callers won't have to think about generations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-16i965: Move Gen4-5 programs to brw_upload_programs() too.Kenneth Graunke1-5/+6
This way all the programs are in one place again, and it also should make some future STATE_BASE_ADDRESS related changes possible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-23i965: Introduce state flag for blorpTopi Pohjolainen1-0/+1
In the past, BLORP has clobbered all BRW_NEW_* state flags, to trigger re-emission of the entire 3D pipeline on the next draw. However, there are some packets BLORP simply leaves alone, so there's no need to re-emit them. Trying to reduce the set of dirty bits flagged after BLORP runs is tricky. Instead, we introduce a BRW_NEW_BLORP flag. This should be set on any atom which emits a packet that BLORP also emits. When BLORP runs, it will flag BRW_NEW_BLORP, causing those packets to get re-emitted. This also makes it easy to avoid re-emitting specific atoms - we can simply drop the BRW_NEW_BLORP flag on those. To start, we assume that all packets need to be re-emitted. This is the safest approach and closest to the existing code's behavior. Many of these are obviously not required, and can be dropped in subsequent patches. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-11i965: Split brw_upload_texture_surfaces into compute/render atoms.Kenneth Graunke1-2/+2
When uploading state for the compute pipeline, we don't want to look at VS/TCS/TES/GS/FS programs, as they might be stale, and aren't relevant anyway. Likewise, the render pipeline shouldn't look at CS. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-19i965: Implement compute sampler state atom.Francisco Jerez1-0/+2
Fixes a number of GLES31 CTS failures and hangs on various hardware: ES31-CTS.texture_gather.plain-gather-depth-2d ES31-CTS.texture_gather.plain-gather-depth-2darray ES31-CTS.texture_gather.plain-gather-depth-cube ES31-CTS.texture_gather.offset-gather-depth-2d ES31-CTS.texture_gather.offset-gather-depth-2darray ES31-CTS.layout_binding.sampler2D_layout_binding_texture_ComputeShader ES31-CTS.layout_binding.sampler2DArray_layout_binding_texture_ComputeShader ES31-CTS.explicit_uniform_location.uniform-loc-types-samplers ES31-CTS.compute_shader.resources-texture Some of them were actually passing by luck on some generations even though we weren't uploading sampler state tables explicitly for the compute stage, most likely because they relied on the cached sampler state left from previous rendering to be close enough. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92589 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93312 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93325 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93407 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93725 Reported-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-14i965: Add state bit to trigger re-emission of color calculator state.Francisco Jerez1-0/+1
This will be used on Gen8+ to make sure that the color calculator state pointers are re-emitted when switching back to the 3D pipeline after some GPGPU workload due to a hardware workaround. There are other state bits already defined that could be used to achieve the same effect but they all cause a ton of unrelated state to be re-emitted (e.g. BRW_NEW_STATE_BASE_ADDRESS), so just define a new one, state bits are cheap. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-12-28i965: Add the TCS/TES state upload atoms to the gen7_atoms list.Kenneth Graunke1-0/+14
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-22i965: Handle mix-and-match TCS/TES with separate shader objects.Kenneth Graunke1-9/+29
GL_ARB_separate_shader_objects allows the application to mix-and-match TCS and TES programs separately. This means that the interface between the two stages isn't known until the final SSO pipeline is in place. This isn't a great match for our hardware: the TCS and TES have to agree on the Patch URB entry layout. Since we store data as per-patch slots followed by per-vertex slots, changing the number of per-patch slots can significantly alter the layout. This can easily happen with SSO. To handle this, we store the [Patch]OutputsWritten and [Patch]InputsRead bitfields in the TCS/TES program keys, introducing program recompiles. brw_upload_programs() decides the layout for both TCS and TES, and passes it to brw_upload_tcs/tes(), which store it in the key. When creating the NIR for a shader specialization, we override nir->info.inputs_read (and friends) to the program key's values. Since everything uses those, no further compiler changes are needed. This also replaces the hack in brw_create_nir(). To avoid recompiles, brw_precompile_tes() looks to see if there's a TCS in the linked shader. If so, it accounts for the TCS outputs, just as brw_upload_programs() would. This eliminates all recompiles in the non-SSO case. In the SSO case, there should only be recompiles when using a TCS and TES that have different input/output interfaces. Fixes Piglit's mix-and-match-tcs-tes test. v2: Pull the brw_upload_programs code into a brw_upload_tess_programs() helper function (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-22i965: Upload HS push constants whenever default tess. levels change.Kenneth Graunke1-0/+2
When using tessellation on OpenGL without a TCS, default values for gl_TessLevelOuter/gl_TessLevelInner are provided via the API. Core Mesa will flag ctx->DriverFlags.NewDefaultTessLevels whenever those values change. We add a corresponding BRW_NEW_DEFAULT_TESS_LEVELS flag and hook it up to HS push constants (which will be used to upload these default values to the autogenerated TCS). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-22i965: Consolidate BRW_NEW_TESS_{CTRL,EVAL}_PROGRAM flags.Kenneth Graunke1-4/+3
For several reasons, I don't think it's particularly useful to have separate flags: 1. Most of the time, tessellation shaders are paired, so both will be replaced at the same time. 2. The data layout is tightly coupled. Both need to agree on the number of per-patch slots in the VUE map. Even adding extra TCS outputs that aren't read by the TES will trigger the need for recompiles. 3. The TCS is optional from an API perspective, but required by the hardware whenever tessellation is enabled. So, atoms that deal with the TCS must check brw->tess_eval_program (BRW_NEW_TESS_EVAL_PROGRAM?) rather than brw->tess_ctrl_program to tell whether tessellation is enabled. So, not only is it unlikely to be useful, it's a bit confusing to get right. Simply using one flag for both simplifies this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-22i965: Only call brw_upload_tcs/tes_prog when using tessellation.Kenneth Graunke1-2/+9
If there's no evaluation shader, tessellation is disabled. The upload functions would just bail. Instead, don't bother calling them. This will simplify the optional-TCS case a bit, as brw_upload_tcs can assume that we're doing tessellation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-22i965: Add tessellation control shaders.Kenneth Graunke1-0/+1
The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-22i965: Add tessellation evaluation shadersKenneth Graunke1-0/+3
The TES is essentially a post-tessellator VS, which has access to the entire TCS output patch, and a special gl_TessCoord input. Otherwise, they're very straightforward. This patch implements SIMD8 tessellation evaluation shaders for Gen8+. The tessellator can generate a lot of geometry, so operating in SIMD8 mode (8 vertices per thread) is more efficient than SIMD4x2 mode (only 2 vertices per thread). I have another patch which implements SIMD4x2 mode for older hardware (or via an environment variable override). We currently handle all inputs via the pull model. v2: Improve comments (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-11i965: Add tessellation shader push constant support.Kenneth Graunke1-0/+2
Based on a patch by Chris Forbes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-11i965: Add tessellation shader sampler support.Kenneth Graunke1-0/+2
Based on code by Chris Forbes and Fabian Bieler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-11i965: Add tessellation shader surface support.Kenneth Graunke1-0/+12
This is brw_gs_surface_state.c copy and pasted twice with search and replace. brw_binding_table.c code is similarly copy and pasted. v2: Drop dword_pitch related fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-12-09i965: Hook up L3 partitioning state atom.Francisco Jerez1-0/+4
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-12-09i965: Define and use REG_MASK macro to make masked MMIO writes slightly more ↵Francisco Jerez1-1/+1
readable. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-12-09i965: Define state flag to signal that the URB size has been altered.Francisco Jerez1-0/+1
This will make sure that we recalculate the URB layout anytime the URB size is modified by the L3 partitioning code. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-12-07i965: Add state bits for tess stagesChris Forbes1-0/+7
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-12-07i965: Add backend structures for tess stagesChris Forbes1-0/+8
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-12-07i965: Create new files for HS/DS/TE state upload code.Kenneth Graunke1-1/+6
For now, this just splits the existing code to disable these stages into separate atoms/files. We can then replace it with real code. v2: Bump the render atoms in this patch so it compiles (in my branch, I'd bumped it in an earlier patch). 61 seems to be the minimum that works, which doesn't match the old value + the number of atoms I added in this patch, so apparently we had some slop before. v3: Actually disable the DS unit on Gen8+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-11i965: Combine BRW_NEW_*_BINDING_TABLE dirty bits.Kenneth Graunke1-3/+1
A while back, we moved to directly emitting the Gen7+ state when constructing the binding tables. These flags are only used on Gen4-6, which emit all the binding table pointers at once. We gain nothing by having separate flags, so combine them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-11-01i965: Setup pull constant state for compute programsJordan Justen1-0/+2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-09-30i965/cs: Upload UBO/SSBO surfacesJordan Justen1-0/+2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-09-29i965/cs: Setup surface binding for gl_NumWorkGroupsJordan Justen1-0/+3
This will only be setup when the prog_data uses_num_work_groups boolean is set. At this point nothing will set uses_num_work_groups, but soon code will set it when emitting code for the intrinsic that loads gl_NumWorkGroups. We can't emit this surface information earlier at the start of the DispatchCompute* call because we may not have generated the program yet. Until we generate the program, we don't know if the gl_NumWorkGroups variable is accessed. We also can't emit the surface as part of the brw_cs_state atom, because we might not need the surface if gl_NumWorkGroups is not used by the program. Lastly, we cannot emit the surface later (after state upload) in the DispatchCompute* call, because it needs to be run before the brw_cs_state atom is emitted, since it changes the surface state. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-26i965: Simplify handling of VUE map changes.Kenneth Graunke1-1/+15
The old code was disasterously complex - spread across multiple atoms which may not even run, inspecting the dirty bits to try and decide whether it was necessary to do checks...storing VS information in brw_context...extra flagging... This code tripped me and Carl up very badly when working on the shader cache code. It's very fragile and hard to maintain. Now that geometry shaders only depend on their inputs and don't have to worry about the VS VUE map, we can dramatically simplify this: just compute the VUE map coming out of the geometry shader stage in brw_upload_programs. If it changes, flag it. Done. v2: Also check vue_map.separable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-25i965: Implement DriverFlags.NewShaderStorageBufferIago Toral Quiroga1-0/+1
We use the same dirty state for SSBOs and UBOs because they share the same infrastructure. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-10i965/cs: Emit texture surfaces to enable CS samplingJordan Justen1-0/+2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-10i965: Resolve GCC sign-compare warning.Rhys Kidd1-1/+1
mesa/src/mesa/drivers/dri/i965/brw_eu_compact.c: In function 'set_3src_control_index': mesa/src/mesa/drivers/dri/i965/brw_eu_compact.c:805:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < ARRAY_SIZE(gen8_3src_control_index_table); i++) { ^ mesa/src/mesa/drivers/dri/i965/brw_eu_compact.c: In function 'set_3src_source_index': mesa/src/mesa/drivers/dri/i965/brw_eu_compact.c:839:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < ARRAY_SIZE(gen8_3src_source_index_table); i++) { ^ mesa/src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_sampler_state': mesa/src/mesa/drivers/dri/i965/brw_state_dump.c:382:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < size / 16; i++) { ^ mesa/src/mesa/drivers/dri/i965/brw_state_upload.c: In function 'brw_pipeline_state_finished': mesa/src/mesa/drivers/dri/i965/brw_state_upload.c:801:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (i != pipeline) { ^ mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function 'intel_gen7_hiz_buf_create': mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1544:47: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int level = mt->first_level; level <= mt->last_level; ++level) { ^ mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function 'intel_gen8_hiz_buf_create': mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1638:44: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int level = mt->first_level; level <= mt->last_level; ++level) { ^ mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function 'intel_miptree_alloc_hiz': mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1771:44: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int level = mt->first_level; level <= mt->last_level; ++level) { ^ mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1775:33: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int layer = 0; layer < mt->level[level].depth; ++layer) { ^ Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-02i965/cs: Setup push constant data for uniformsJordan Justen1-0/+2
brw_upload_cs_push_constants was based on gen6_upload_push_constants. v2: * Add FINISHME comments about more efficient ways to push uniforms Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-08-11i965: Hook up image state upload.Francisco Jerez1-0/+12
v2: Add CS support. Move the image_params array back to brw_stage_prog_data. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2015-07-18i965: Enable hardware-generated binding tables on render path.Abdiel Janulgue1-0/+4
This patch implements the binding table enable command which is also used to allocate a binding table pool where where hardware-generated binding table entries are flushed into. Each binding table offset in the binding table pool is unique per each shader stage that are enabled within a batch. Also insert the required brw_tracked_state objects to enable hw-generated binding tables in normal render path. v2: - Use MOCS in binding table pool alloc for GEN8 - Fix spurious offset when allocating binding table pool entry and start from zero instead. v3: - Include GEN8 fix for spurious offset above. v4: - Fixup wrong packet length in enable/disable hw-binding table for GEN8 (Ville). - Don't invoke HW-binding table disable command when we dont have resource streamer (Chris). v5: - Reorder the state cache invalidate flush so it happens in-between enabling hw-generated binding tables and the previous sw-binding table GPU state (Chris). v6: - Do the same fix in v5 for gen7_disable_hw_binding_tables(). - Adhere to coding guidelines and make comments more informative. Cc: kenneth@whitecape.org Cc: syrjala@sci.fi Cc: chris@chris-wilson.co.uk Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2015-06-24i965: Rename intel_emit* to reflect their new location in brw_pipe_controlChris Wilson1-2/+2
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-17i965: Use _mesa_geometric_ functions appropriatelyKevin Rogovin1-2/+4
Change references to gl_framebuffer::Width, Height, MaxNumLayers and Visual::samples to use the _mesa_geometry_ convenience functions for those places where the geometry of the gl_framebuffer is needed (in contrast to the geometry of the intersection of the attachments of the gl_framebuffer). This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments on Gen7 and higher in i965. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-05-02i965: Upload atomic buffer state for compute shadersJordan Justen1-0/+2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02i965/state: Emit pipeline select when changing pipelinesJordan Justen1-0/+5
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02i965/cs: Emit state base addressJordan Justen1-0/+2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02i965/cs: Upload brw_cs_stateJordan Justen1-0/+2
v3: * Add defines. Misc cleanup suggestions. (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02i965/cs: Emit compute shader code and upload programsJordan Justen1-0/+3
v2: * Don't bother checking for 'gen > 5' (krh) * Populate sampler data in key (krh) v3: * Drop no8 support, and simplify code in several places (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>