path: root/Software/Beignet/Backend.mdwn
author Ruiling Song <> 2014-06-19 15:20:54 +0800
committer Zhigang Gong <> 2014-06-25 22:16:32 -0700
commit 5ccc7bb53d3e8274dc49c4b52f6eca04b5a49eee (patch)
tree 00fb667dd172dcfc131bc13e3914824e5b0df29e /Software/Beignet/Backend.mdwn
parent acb7d3f99947799ca7d8e3991c15d5a9e5790ee2 (diff)
update Software on environment variables.
Signed-off-by: Ruiling Song <>
Reviewed-by: Zhigang Gong <>
Diffstat (limited to 'Software/Beignet/Backend.mdwn')
1 files changed, 40 insertions, 2 deletions
diff --git a/Software/Beignet/Backend.mdwn b/Software/Beignet/Backend.mdwn
index 99d678e7..be6081b2 100644
--- a/Software/Beignet/Backend.mdwn
+++ b/Software/Beignet/Backend.mdwn
@@ -30,7 +30,17 @@ Various environment variables
Environment variables are used all over the code. Most important ones are:
-- `OCL_SIMD_WIDTH` `(8 or 16)`. Change the number of lanes per hardware thread
+- `OCL_STRICT_CONFORMANCE` `(0 or 1)`. Gen does not provide native
+ high-precision math instructions that comply with the OpenCL spec, so we
+ provide a software implementation to meet the precision requirement. The
+ software version is slower than the native version supported by Gen
+ hardware, and most graphics applications do not need this precision, so
+ the default value is 0. With the default, OpenCL apps do not pay the
+ performance penalty of the high-precision math functions.
+- `OCL_SIMD_WIDTH` `(8 or 16)`. Select the number of lanes per hardware thread.
+ Normally you do not need to set it; we select a suitable SIMD width for
+ a given kernel. Default value is 16.
- `OCL_OUTPUT_GEN_IR` `(0 or 1)`. Output Gen IR (scalar intermediate
representation) code
@@ -42,7 +52,35 @@ Environment variables are used all over the code. Most important ones are:
- `OCL_OUTPUT_ASM` `(0 or 1)`. Output Gen ISA
-- `OCL_OUTPUT_REG_ALLOC` `(0 or 1)`. Output Gen register allocations
+- `OCL_OUTPUT_REG_ALLOC` `(0 or 1)`. Output Gen register allocations, including
+ the virtual-to-physical register mapping and live ranges.
+- `OCL_OUTPUT_BUILD_LOG` `(0 or 1)`. Output error messages, if any, produced
+ while compiling and linking CL kernels.
+- `OCL_OUTPUT_CFG` `(0 or 1)`. Output the control flow graph as a .dot file.
+- `OCL_OUTPUT_CFG_ONLY` `(0 or 1)`. Output the control flow graph as a .dot
+ file, but without the instructions in each BasicBlock.
+- `OCL_PRE_ALLOC_INSN_SCHEDULE` `(0 or 1)`. The instruction scheduler in
+ Beignet is currently split into two passes: before and after register
+ allocation. The pre-alloc scheduler tends to decrease register pressure.
+ This variable disables/enables the pre-alloc scheduler. This pass is
+ currently disabled because of some bugs.
+- `OCL_POST_ALLOC_INSN_SCHEDULE` `(0 or 1)`. Disable/enable the post-alloc
+ instruction scheduler. The post-alloc scheduler tends to reduce instruction
+ latency. It is currently enabled by default.
+- `OCL_SIMD16_SPILL_THRESHOLD` `(0 to 256)`. Tune how many registers can be
+ spilled under SIMD16. Default value is 16. We found that spilling too many
+ registers under SIMD16 performs worse than falling back to SIMD8 mode, so
+ this variable limits the number of spilled registers under SIMD16.
+- `OCL_USE_PCH` `(0 or 1)`. The default value is 1. If it is enabled, we use
+ a precompiled header file which includes all basic OCL headers. This
+ reduces compile time.
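
The variables listed above are read from the process environment, so one way to experiment with them is to build an environment for a child process. The following is only a minimal sketch and is not part of Beignet: the helper name `beignet_env` and the table `ALLOWED` are hypothetical, and the value ranges are copied from the list above rather than taken from the Beignet source.

```python
import os

# Value ranges as documented in the list above (not read from Beignet itself).
BOOL = range(0, 2)
ALLOWED = {
    "OCL_STRICT_CONFORMANCE": BOOL,
    "OCL_SIMD_WIDTH": (8, 16),
    "OCL_OUTPUT_GEN_IR": BOOL,
    "OCL_OUTPUT_ASM": BOOL,
    "OCL_OUTPUT_REG_ALLOC": BOOL,
    "OCL_OUTPUT_BUILD_LOG": BOOL,
    "OCL_OUTPUT_CFG": BOOL,
    "OCL_OUTPUT_CFG_ONLY": BOOL,
    "OCL_PRE_ALLOC_INSN_SCHEDULE": BOOL,
    "OCL_POST_ALLOC_INSN_SCHEDULE": BOOL,
    "OCL_SIMD16_SPILL_THRESHOLD": range(0, 257),
    "OCL_USE_PCH": BOOL,
}


def beignet_env(base=None, **settings):
    """Return a copy of `base` (default: os.environ) with the given
    variables set, rejecting unknown names and out-of-range values."""
    env = dict(os.environ if base is None else base)
    for name, value in settings.items():
        if name not in ALLOWED:
            raise KeyError(f"unknown variable: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"{name}={value!r} is outside the documented range")
        # Environment values must be strings.
        env[name] = str(int(value))
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching an OpenCL application, for example `beignet_env(OCL_STRICT_CONFORMANCE=1, OCL_OUTPUT_ASM=1)` to request the high-precision math path and a Gen ISA dump for that run only.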
Implementation details