author | Tor Lillqvist <tml@collabora.com> | 2015-09-10 21:58:28 +0300 |
---|---|---|
committer | Andras Timar <andras.timar@collabora.com> | 2015-09-18 10:10:51 +0200 |
commit | d58e3c7215b0420f5cbb08c3a31bc84efed9e85a (patch) | |
tree | febec710746d4de1553d7336f27e0260af1ecbb4 /opencl | |
parent | 754e009b73896614a094ff9658454937ac3ad845 (diff) |
Split formula group for OpenCL up into smaller bits
Will make it less demanding on low-end hardware, where the device
driver is unresponsive for too long when an OpenCL kernel handling lots
of data is executing. This makes Windows restart it, which is
problematic.
I tried several approaches to splitting, both at higher levels in sc
and at the lowest level just before creating and executing the OpenCL
kernel(s). This seems to be the most minimal and local approach. Doing
it at the lower level would have required too much poking into our
obscure OpenCL code, like passing an offset parameter to every kernel.
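As a rough illustration of what splitting a formula group into smaller bits can look like, here is a minimal sketch. It is not the code from this commit: the helper name `splitIntoSegments` and the idea of a fixed maximum segment length are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not LibreOffice code: split the row range [0, nRows)
// into consecutive (start, length) segments of at most nMaxRows rows each.
std::vector<std::pair<std::size_t, std::size_t>>
splitIntoSegments(std::size_t nRows, std::size_t nMaxRows)
{
    std::vector<std::pair<std::size_t, std::size_t>> aSegments;
    if (nMaxRows == 0)
        return aSegments; // a zero segment size cannot cover any rows

    for (std::size_t nStart = 0; nStart < nRows; nStart += nMaxRows)
    {
        // The last segment may be shorter than nMaxRows.
        aSegments.emplace_back(nStart, std::min(nMaxRows, nRows - nStart));
    }
    return aSegments;
}
```

Each segment could then be handed to the existing OpenCL path on its own, so that no single kernel invocation has to process the whole group.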
Use a simple heuristic to find out whether to split. On the
problematic low-end devices, CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT is
4, while for more performant devices it is 1 or 8.
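The heuristic described above can be sketched as a single predicate. This is an illustration only: the function name `shouldSplitFormulaGroup` is hypothetical, and the treatment of 0 (the value the field is initialized to before the device query) as "do not split" is an assumption of this sketch.

```cpp
#include <cstdint>

// Hypothetical helper, not LibreOffice code. Per the commit message, a
// CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT of 4 marks the problematic low-end
// devices, while more performant devices report 1 or 8.
// Assumption: any other value, including 0, means "do not split".
bool shouldSplitFormulaGroup(std::uint32_t nPreferredVectorWidthFloat)
{
    return nPreferredVectorWidthFloat == 4;
}
```

A caller would pass the value stored in `gpuInfo->mnPreferredVectorWidthFloat` and split the formula group only when the predicate returns true.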
Change-Id: If16d152710057b34d09ef0203960e1fbb9ac067f
Reviewed-on: https://gerrit.libreoffice.org/18613
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
Tested-by: Michael Meeks <michael.meeks@collabora.com>
Diffstat (limited to 'opencl')
-rw-r--r-- | opencl/source/openclwrapper.cxx | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/opencl/source/openclwrapper.cxx b/opencl/source/openclwrapper.cxx
index 5574d2c3fa09..9d03a2780220 100644
--- a/opencl/source/openclwrapper.cxx
+++ b/opencl/source/openclwrapper.cxx
@@ -501,6 +501,11 @@ bool initOpenCLRunEnv( GPUEnv *gpuInfo )
     gpuInfo->mnKhrFp64Flag = bKhrFp64;
     gpuInfo->mnAmdFp64Flag = bAmdFp64;
 
+    gpuInfo->mnPreferredVectorWidthFloat = 0;
+
+    clGetDeviceInfo(gpuInfo->mpArryDevsID[0], CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT, sizeof(cl_uint),
+                    &gpuInfo->mnPreferredVectorWidthFloat, NULL);
+
     return false;
 }