This page has Mesa implementation notes for the [[ARB_compute_shader|https://www.opengl.org/registry/specs/ARB/compute_shader.txt]] extension.

# i965 Status

## Started / In-progress

* GLES 3.1 conformance test suite results (ES31-CTS.compute_shader*):
  * Ivy Bridge, Haswell: All 36 tests passing
  * Broadwell: 28 of 36 tests passing
* Piglit tests

## Not Started

* Always be able to produce SIMD16 code. SIMD32.
  * curro is working on SIMD16
  * See MAX_COMPUTE_WORK_GROUP_INVOCATIONS note below

## Sent for Code Review

## Complete

* Multiple pipelines - [[upstream 20ef23b2|http://cgit.freedesktop.org/mesa/mesa/log/?id=20ef23b2]]
* Basic compute program generations - [[upstream 5328ffbe|http://cgit.freedesktop.org/mesa/mesa/log/?id=5328ffbe]]
* atomic counters - [[upstream eeee212e|http://cgit.freedesktop.org/mesa/mesa/log/?id=eeee212e]]
* Texture sampling - [[upstream b01d047|http://cgit.freedesktop.org/mesa/mesa/log/?id=b01d047]]
* barrier()
  * part1 - [[upstream f0e77239|http://cgit.freedesktop.org/mesa/mesa/log/?id=f0e77239]]
  * part2 - [[upstream 34cff76|http://cgit.freedesktop.org/mesa/mesa/log/?id=34cff76]]
* uniforms - [[upstream 06ada49|http://cgit.freedesktop.org/mesa/mesa/commit/?id=06ada49]]
* DispatchCompute - [[upstream 013031b2|http://cgit.freedesktop.org/mesa/mesa/log/?id=013031b2]]
* gl_LocalInvocationID - [[upstream 49f999b|http://cgit.freedesktop.org/mesa/mesa/commit/?id=49f999b]]
* gl_WorkGroupID - [[upstream c5743a5|http://cgit.freedesktop.org/mesa/mesa/commit/?id=c5743a5]]
* gl_GlobalInvocationID - [[upstream 2b6cc03|http://cgit.freedesktop.org/mesa/mesa/commit/?id=2b6cc03]]
* gl_LocalInvocationIndex - [[upstream c4cf824|http://cgit.freedesktop.org/mesa/mesa/commit/?id=c4cf824]]
* DispatchComputeIndirect - [[upstream ebbe6cd|http://cgit.freedesktop.org/mesa/mesa/commit/?id=ebbe6cd]]
* gl_NumWorkGroups - [[upstream 681b4ba|http://cgit.freedesktop.org/mesa/mesa/log/?qt=range&q=681b4ba~9..681b4ba]]
* SSBO - [[upstream 7b39114|http://cgit.freedesktop.org/mesa/mesa/log/?qt=range&q=7b39114~..7b39114]]
* New GLSL memory barrier functions
  * groupMemoryBarrier(), memoryBarrierShared(), memoryBarrierBuffer(), memoryBarrierImage(), memoryBarrierAtomicCounter()
  * [[upstream 51694072|http://cgit.freedesktop.org/mesa/mesa/log/?qt=range&q=51694072~4..51694072]]
* L3 for SLM (shared variables) - [[upstream 228d5a3|http://cgit.freedesktop.org/mesa/mesa/log/?qt=range&q=228d5a3~16..228d5a3]]
* Shared variable support
  * parsing - [[upstream fb3da129|http://cgit.freedesktop.org/mesa/mesa/log/?qt=range&q=fb3da129~5..fb3da129]]
  * lowering and atomics - [[upstream e288b4a|http://cgit.freedesktop.org/mesa/mesa/log/?qt=range&q=e288b4a~25..e288b4a]]
* Enable extension - [[upstream d04612b|http://cgit.freedesktop.org/mesa/mesa/log/?qt=range&q=d04612b~..d04612b]]
* Update docs - [[upstream 83e8e07|http://cgit.freedesktop.org/mesa/mesa/log/?qt=range&q=83e8e07~2..83e8e07]]
* Hardware: Ivy Bridge, Haswell, Broadwell

## Issues

* OpenGL 4.3 requires 1024 for MAX_COMPUTE_WORK_GROUP_INVOCATIONS
  * For devices with 64 threads per subslice, this requires that SIMD16 never fails
    * Unlike FS, we can't just fall back to only providing a SIMD8 program (if the local work group size is too big)
  * For Gen8 and many mobile SKUs this requires SIMD32
  * Note: OpenGL ES 3.1 only requires 128, which SIMD8 can always handle

## Git Repositories

* [[http://cgit.freedesktop.org/~jljusten/mesa/log/?h=cs]]
* [[git://people.freedesktop.org/~jljusten/mesa]] cs
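
The MAX_COMPUTE_WORK_GROUP_INVOCATIONS constraint listed under Issues comes down to simple arithmetic, sketched below in Python. This is only an illustration, not code from the i965 driver. It assumes that every invocation of a work group must fit within a single subslice, so the number of hardware threads needed is the local work group size divided by the SIMD width, rounded up. The function name `min_simd_width` and the thread count of 56 used in the usage note are hypothetical.

```python
# Illustrative sketch only; not taken from the Mesa i965 driver.
# Assumption: a whole work group must fit in one subslice, so it needs
# ceil(local_size / simd_width) hardware threads out of those available.

SIMD_WIDTHS = (8, 16, 32)


def min_simd_width(local_size, threads_per_subslice):
    """Return the narrowest SIMD width that fits local_size invocations
    into threads_per_subslice hardware threads, or None if none fits."""
    for width in SIMD_WIDTHS:
        threads_needed = -(-local_size // width)  # ceiling division
        if threads_needed <= threads_per_subslice:
            return width
    return None
```

With 64 threads per subslice, `min_simd_width(1024, 64)` returns 16, which is why SIMD16 compilation must never fail to meet the OpenGL 4.3 limit. With a hypothetical smaller count such as 56 threads, `min_simd_width(1024, 56)` returns 32, while `min_simd_width(128, 56)` returns 8 for the OpenGL ES 3.1 limit.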