summaryrefslogtreecommitdiff
path: root/src/compiler/glsl/builtin_functions.cpp
AgeCommit message (Collapse)AuthorFilesLines
2017-05-04glsl: rename image_* qualifiers to memory_*Samuel Pitoiset1-15/+15
It doesn't make sense to prefix them with 'image' because they are called "Memory Qualifiers" and they can be applied to members of storage buffer blocks. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-28glsl: implement arb_shader_ballot builtins using intrinsicsNicolai Hähnle1-3/+83
2017-04-28glsl: implement arb_shader_group_vote builtins via intrinsicsNicolai Hähnle1-6/+32
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-21glsl: make use of glsl_type::is_double()Samuel Pitoiset1-6/+6
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-11glsl: use the BA1 macro for textureQueryLevels()Samuel Pitoiset1-32/+33
For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-11glsl: use the BA1 macro for textureSamples()Samuel Pitoiset1-9/+10
For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-11glsl: use the BA1 macro for textureCubeArrayShadow()Samuel Pitoiset1-5/+6
For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-05glsl: add ARB_shader_ballot builtin functionsNicolai Hähnle1-0/+77
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31glsl: use -O1 optimization for builtin_functions.cpp with MinGWBrian Paul1-0/+20
Some versions of MinGW-w64 such as 5.3.1 and 6.2.0 produce bad code with -O2 or -O3 causing a random driver crash when running programs that use GLSL. Most Mesa demos in the glsl/ directory trigger the bug, but not the fragcoord.c test. Use a #pragma to force -O1 for this file for later MinGW versions. Luckily, this is basically one-time setup code. I suspect the bug is related to the sheer size of this file. This should let us move to newer versions of MinGW-w64 for Mesa. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31glsl: fix clockARB builtin functionNicolai Hähnle1-1/+1
The underlying intrinsic is defined to always have a uvec2 return type. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-09glsl: builtin: always return clones of the builtinsLionel Landwerlin1-5/+17
Builtins are created once and allocated using their own private ralloc context. When reparenting IR that includes builtins, we might be steal bits of builtins. This is problematic because these builtins might now be freed when the shader that includes then last is disposed. This might also lead to inconsistent ralloc trees/lists if shaders are created on multiple threads. Rather than including builtins directly into a shader's IR, we should include clones of them in the ralloc context of the shader that requires them. This fixes double free issues we've been seeing when running shader-db on a big multicore (72 threads) server. v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better reflect how this function is used. (Ken) v3: Rename ctx to mem_ctx (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-06glsl: correct compute shader checks for memoryBarrier functionsMarc Di Luzio1-6/+12
As per the spec - "The functions memoryBarrierShared() and groupMemoryBarrier() are available only in compute shaders; the other functions are available in all shader types." Conform to this by adding another delegate to check for compute shader support instead of only whether the current stage is compute This allows some fragment shaders in Dirt Rally to compile Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-31glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).Francisco Jerez1-1/+21
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31glsl: Rewrite atan2 implementation to fix accuracy and handling of ↵Francisco Jerez1-36/+60
zero/infinity. This addresses several issues of the current atan2 implementation: - Negative zero (and negative denorms which end up getting flushed to zero) isn't handled correctly by the current implementation. The reason is that it does 'y >= 0' and 'x < 0' comparisons to decide on which side of the branch cut the argument is, which causes us to return incorrect results (off by up to 2π) for very small negative values. - There is a serious precision problem for x values of large enough magnitude introduced by the floating point division operation being implemented as a mul+rcp sequence. This can lead to the quotient getting flushed to zero in some cases introducing an error of over 8e6 ULP in the result -- Or in the most catastrophic case will cause us to return NaN instead of the correct value ±π/2 for y=±∞ and x very large. We can fix this easily by scaling down both arguments when the absolute value of the denominator goes above certain threshold. The error of this atan2 implementation remains below 25 ULP in most of its domain except for a neighborhood of y=0 where it reaches a maximum error of about 180 ULP. - It emits a bunch of instructions including no less than three if-else branches per scalar component that don't seem to get optimized out later on. This implementation uses about 13% less instructions on Intel SKL hardware and doesn't emit any control flow instructions. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-01-20glsl: Add "built-in" functions to do 64%64 => 64 modulusIan Romanick1-0/+8
These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! v2: Use function inlining. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20glsl: Add "built-in" functions to do 64/64 => 64 divisionIan Romanick1-0/+8
These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! v2: Use function inlining. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20glsl: Add "built-in" function for 64-bit integer sign()Ian Romanick1-0/+4
These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20glsl: Add "built-in" functions to do 64x64 => 64 multiplicationIan Romanick1-0/+9
These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20glsl: Move builtin_function related prototypes to a separate fileIan Romanick1-0/+1
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20glsl: Add interaction between ARB_gpu_shader_int64 and ARB_shader_clockIan Romanick1-1/+19
If ARB_gpu_shader_int64 is supported, ARB_shader_clock also adds clockARB() that returns a uint64_t. Rather than add new opcodes and intrinsics for this, just wrap the existing intrinsic with a packUint2x32. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20glsl: Add 64-bit integer functionsDave Airlie1-3/+174
These are all the allowed 64-bit functions from ARB_gpu_shader_int64 spec. v2: restrict int64/double functions better. v3 (idr): Delete spurious blank lines. Suggested by Matt. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-09glsl: Do not allow scalar types in vector relational functionsBoyan Ding1-19/+10
According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector Relational Functions", functions of this type do not operate on scalar types, so remove scalar types from signature definitions to make the behavior consistent with glslangValidator and other drivers. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
2016-12-12treewide: s/comparitor/comparator/Ilia Mirkin1-6/+6
git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g' Just happened to notice this in a patch that was sent and included one of the tokens in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-12-09glsl: Use a simpler formula for tanhJason Ekstrand1-8/+10
The formula we have used in the past is a trivial reduction from the definition by simply multiplying both the numerator and denominator of the formula by 2. However, multiplying by e^x, you can further reduce it. This allows us to get rid of one side of the clamp and two of exponential functions which should make it faster. The new formula still passes the dEQP precision tests for tanh so it should be fine. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-12-09compiler/glsl: fix precision problem of tanhHaixia Shi1-2/+10
Clamp input scalar value to range [-10, +10] to avoid precision problems when the absolute value of input is too large. Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test failures. v2: added more explanation in the comment. v3: fixed a typo in the comment. Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0" <mesa-dev@lists.freedesktop.org>
2016-10-16glsl: Disable textureOffset(sampler2DArrayShadow, ...) in GLSL ES.Kenneth Graunke1-1/+7
This has apparently never existed in GLSL ES. Fixes dEQP-GLES3.functional.shaders.texture_functions.invalid .textureoffset_sampler2darrayshadow_vec4_ivec2_vertex and .textureoffset_sampler2darrayshadow_vec4_ivec2_fragment Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98244 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-10-04glsl: Kill __intrinsic_atomic_subIan Romanick1-8/+46
Just generate an __intrinsic_atomic_add with a negated parameter. Some background on the non-obvious reasons for the the big change to builtin_builder::call()... this is cribbed from some discussion with Ilia on mesa-dev. Why change builtin_builder::call() to allow taking dereferences and create them here rather than just feeding in the ir_variables directly? The problem is the neg_data ir_variable node would have to be in two lists at the same time: the instruction stream and parameters. The ir_variable node is automatically added to the instruction stream by the call to make_temp. Restructuring the code so that the ir_variables could be in parameters then move them to the instruction stream would have been pretty terrible. ir_call in the instruction stream has an exec_list that contains ir_dereference_variable nodes. The builtin_builder::call method previously took an exec_list of ir_variables and created a list of ir_dereference_variable. All of the original users of that method wanted to make a function call using exactly the set of parameters passed to the built-in function (i.e., call __intrinsic_atomic_add using the parameters to atomicAdd). For these users, the list of ir_variables already existed: the list of parameters in the built-in function signature. This new caller doesn't do that. It wants to call a function with a parameter from the function and a value calculated in the function. So, I changed builtin_builder::call to take a list that could either be a list of ir_variable or a list of ir_dereference_variable. In the former case it behaves just as it previously did. In the latter case, it uses (and removes from the input list) the ir_dereference_variable nodes instead of creating new ones. text data bss dec hex filename 6036395 283160 28608 6348163 60dd83 lib64/i965_dri.so before 6036923 283160 28608 6348691 60df93 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-10-04glsl: Remove ir_function_signature::_is_intrinsic fieldIan Romanick1-2/+0
text data bss dec hex filename 6036491 283160 28608 6348259 60dde3 lib64/i965_dri.so before 6036395 283160 28608 6348163 60dd83 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-10-04glsl: Add ir_function_signature::is_intrinsic() methodIan Romanick1-2/+2
This necessetated renaming the is_intrinsic field to _is_intrinsic. The next commit will remove the field. text data bss dec hex filename 6036507 283160 28608 6348275 60ddf3 lib64/i965_dri.so before 6036491 283160 28608 6348259 60dde3 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-10-04glsl: Track a unique intrinsic ID with each intrinsic functionIan Romanick1-72/+136
text data bss dec hex filename 6037483 283160 28608 6349251 60e1c3 lib64/i965_dri.so before 6038043 283160 28608 6349811 60e3f3 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-09-23glsl: Delete ftransform support from builtin_functions.cpp.Kenneth Graunke1-26/+4
This is now handled directly by ast_function.cpp. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by; Ian Romanick <ian.d.romanick@intel.com>
2016-08-28mesa: add EXT_texture_cube_map_array supportIlia Mirkin1-0/+2
This is identical to OES_texture_cube_map_array support. dEQP has tests which use this extension. Also it is part of AEP. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-08-26mesa: Add support for OES_texture_cube_map_arrayIan Romanick1-10/+19
This has a separate enable flag because this extension also requires OES_geometry_shader. It is possible that some drivers may support OpenGL ES 3.1 and ARB_texture_cube_map but not support OES_geometry_shader. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-26glsl: Add and use has_texture_cube_map_array helperIan Romanick1-4/+2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-19MESA_shader_integer_functions: Expose new built-in functionsIan Romanick1-11/+20
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-06-30glsl/main: remove unused params and make function staticTimothy Arceri1-1/+1
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-06-30glsl: pass symbols rather than shader to _mesa_get_main_function_signature()Timothy Arceri1-2/+2
This will allow us to split gl_shader into two different structs, one for shader objects and one for linked shaders. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-06-16mesa/glsl: stop using GL shader type internallyTimothy Arceri1-1/+1
Instead use the internal gl_shader_stage enum everywhere. This makes things more consistent and gets rid of unnecessary conversions. Ideally it would be nice to remove the Type field from gl_shader altogether but currently it is used to differentiate between gl_shader and gl_shader_program in the ShaderObjects hash table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-06mesa: hook up core bits of GL_ARB_shader_group_voteIlia Mirkin1-0/+22
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-13glsl: make sure that textureProj(bias) variants are only exposed in fsIlia Mirkin1-37/+37
Many were already marked as fs_only, but not all. This fixes the remaining ir_txb entries. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-21glsl: add forgotten textureOffset function for sampler2DArrayShadowRoland Scheidegger1-0/+7
This was part of EXT_gpu_shader4 - as such it should have been supported by glsl 130. It was however forgotten, and not added until glsl 430 - with the wrong syntax no less (glsl 430 mentions it was overlooked). glsl 440 (but revision 8 only) fixed this finally for good. At least nvidia supports this with just version glsl version 1.30 as well (the spec doesn't explicitly say it should be supported retroactively), so just add this to the other glsl 130 textureOffset functions. Passes a (hacked) piglit tex-miplevel-selection test (2DArrayShadow textureOffset -auto) with llvmpipe. v2: fix up comment (by Ian), add testing to commit message. Reviewed-by: Dave Airlie <airlied@gmail.com>
2016-04-03glsl: add ARB_ES3_1_compatibility supportIlia Mirkin1-0/+2
Oddly a bunch of the features it adds are actually from ESSL 3.20. But the spec is quite clear, oh well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-30mesa: add GL_OES_shader_multisample_interpolation supportIlia Mirkin1-5/+7
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-28glsl: add OES_texture_buffer and EXT_texture_buffer supportIlia Mirkin1-12/+14
Expose the samplerBuffer/imageBuffer types, and allow the various functions to operate on them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-10mesa: add GL_ARB_shader_atomic_counter_ops supportIlia Mirkin1-0/+110
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-03glsl: Improve the accuracy of the acos() approximation.Francisco Jerez1-1/+1
The adjusted polynomial coefficients come from the numerical minimization of the L2 norm of the relative error. The old coefficients would give a maximum relative error of about 15000 ULP in the neighborhood around acos(x) = 0, the new ones give a relative error bounded by less than 2000 ULP in the same neighborhood. Fixes four dEQP subtests: dEQP-GLES31.functional.shaders.builtin_functions.precision.acos. highp_compute.{scalar,vec2,vec3,vec4} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-03glsl: Parameterize asin_expr() on the fit coefficients.Kenneth Graunke1-6/+6
This will allow us to share the implementation while using different polynomials for asin() and acos(). Francisco Jerez did this in the SPIR-V front-end; I'm merely porting his idea to the GLSL world. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-02-27mesa: add GL_OES_gpu_shader5 and GL_EXT_gpu_shader5 supportIlia Mirkin1-47/+58
The two extensions are identical, and are largely taking bits of already existing desktop functionality. We continue to do a poor job of supporting the 'precise' keyword, just like we do on desktop. This passes the relevant dEQP tests that I could find. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-02-22glsl: Implement the required built-in functions when OES_shader_image_atomic ↵Francisco Jerez1-18/+43
is enabled. This is basically just the same atomic functions exposed by ARB_shader_image_load_store, with one exception: "highp float imageAtomicExchange( coherent IMAGE_PARAMS, float data);" There's no float atomic exchange overload in the original ARB_shader_image_load_store or GL 4.2, so this seems like new functionality that requires specific back-end support and a separate availability condition in the built-in function generator. v2: Move image availability predicate logic into a separate static function for clarity. Had to pull out the image_function_flags enum from the builtin_builder class for that to be possible. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-13glsl/types: Rename sampler_type to sampled_typeJason Ekstrand1-2/+2
It's a bit more descriptive since it is the base type that you get when you sample from it. Also, the next commit adds a bare "sampler" type and we need glsl_type::sampler_type available for a public static member. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>