author     Thomas Hellstrom <thellstrom@vmware.com>   2018-04-11 09:33:05 +0200
committer  Thomas Hellstrom <thellstrom@vmware.com>   2018-04-11 09:33:05 +0200
commit     eebaa7f86212d7ebab3c87aae1f9d68cade1b49e (patch)
tree       38b556f915ea612375202be6ec434f9f75c98b6a /README
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Diffstat (limited to 'README')
-rw-r--r--   README   59
1 file changed, 59 insertions, 0 deletions
@@ -0,0 +1,59 @@
+The ww_mutex_test kernel module
+
+Compile using
+make
+make install
+
+Start a test sequence using
+sudo modprobe wtest
+
+After the test sequence, use
+sudo rmmod wtest
+
+The test sequence simulates a number of GPU command submissions taking
+ww_mutex locks. The module maintains a number of global locks, and each thread
+is required to lock a local number of locks that are randomly chosen from
+the set of global locks. Typically threads will sometimes try to acquire
+the same lock, and the ww mutex rollback will be initiated. Defines to
+tune this behaviour (ww_mutex_test.h):
+#define WWT_NUM_LOCKS 100000 /* Number of global locks */
+#define WWT_NUM_T_LOCKS 800  /* Number of locks per thread out of the global set */
+#define WWT_NUM_THREADS 16   /* Number of locking threads to fire off */
+
+Each thread performs a number of simulated command submissions with the same
+locks. Each command submission consists of
+*) Taking the locks
+*) Busy-waiting for a while (mimicking the time used to submit GPU commands)
+*) Releasing the locks.
+The busy wait makes it more difficult to judge how much time was spent on
+the actual locking, but on the other hand gives more real-world-like
+results for the number of rollbacks. Related defines:
+#define WWT_NUM_SUB 10000 /* Number of command submissions */
+#define WWT_CS_UDELAY 000 /* Command submission udelay */
+
+The results can be viewed as starting and ending times for each thread in
+"dmesg". Each thread also prints the number of rollbacks it had to do.
+There are two ways to get zero rollbacks: One is to fire off the threads
+sequentially, in which case there will be no contention. The other is to
+make sure there are no common locks between threads. Be careful with the latter
+option: there must be enough global locks to accommodate the requests of
+all threads, otherwise module loading may lock up.
+Related defines:
+#define WWT_NO_SHARED  /* No shared mutexes - no rollbacks */
+#define WWT_SEQUENTIAL /* Fire off locking threads sequentially */
+
+The module can either use the kernel's built-in ww mutex implementation or a
+drop-in replacement implementation. The drop-in replacement implements a
+choice of algorithms: Wait-Die and Wound-Wait. It is also possible to batch
+mutex locks and unlocks, significantly reducing the number of locked CPU cycles.
+Note that the drop-in replacement manipulates locking state under a class-global
+spinlock instead of using the built-in atomic operations. This is
+slightly slower when the global spinlock is uncontended, and
+significantly slower when it is contended, but
+it allows batching locks and unlocks in a single global spinlock
+critical section.
+
+Related defines:
+#define WW_BUILTIN      /* Use kernel builtin ww mutexes */
+#define WW_WAITDIE true /* Use wait-die, not wound-wait */
+#define WW_BATCHING     /* Batch locks and unlocks */
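
For orientation, below is a minimal sketch of what one simulated command
submission could look like when built on the kernel's built-in ww_mutex API
(the WW_BUILTIN case). It is not taken from the module's source: struct
wwt_thread, wwt_lock_all() and wwt_submit() are hypothetical names invented
for this example; only the WWT_* defines correspond to ww_mutex_test.h as
described in the README above.

#include <linux/ww_mutex.h>
#include <linux/delay.h>

static DEFINE_WW_CLASS(wwt_class);

struct wwt_thread {
	struct ww_mutex **locks;  /* WWT_NUM_T_LOCKS entries picked from the global set */
	int num_locks;
	int rollbacks;
};

/* Lock all of a thread's locks, rolling back on -EDEADLK. */
static void wwt_lock_all(struct wwt_thread *t, struct ww_acquire_ctx *ctx)
{
	int i, contended = -1;

retry:
	for (i = 0; i < t->num_locks; i++) {
		int ret;

		if (i == contended) {
			contended = -1;
			continue;	/* already held from the slow path below */
		}
		ret = ww_mutex_lock(t->locks[i], ctx);
		if (ret == -EDEADLK) {
			int j;

			/* Rollback: drop everything acquired so far ... */
			for (j = 0; j < i; j++)
				ww_mutex_unlock(t->locks[j]);
			if (contended > i)
				ww_mutex_unlock(t->locks[contended]);
			t->rollbacks++;
			/* ... sleep on the contended lock, then retry in order. */
			ww_mutex_lock_slow(t->locks[i], ctx);
			contended = i;
			goto retry;
		}
	}
	ww_acquire_done(ctx);
}

/* One simulated command submission: lock, pretend to submit, unlock. */
static void wwt_submit(struct wwt_thread *t)
{
	struct ww_acquire_ctx ctx;
	int i;

	ww_acquire_init(&ctx, &wwt_class);
	wwt_lock_all(t, &ctx);

	udelay(WWT_CS_UDELAY);	/* busy-wait mimicking GPU command submission */

	for (i = 0; i < t->num_locks; i++)
		ww_mutex_unlock(t->locks[i]);
	ww_acquire_fini(&ctx);
}

With the drop-in replacement and WW_BATCHING, the per-lock ww_mutex_lock()/
ww_mutex_unlock() calls above would instead be batched inside one class-global
spinlock critical section, as described in the README; that API is internal to
the module and not shown here.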