author     Thomas Hellstrom <thellstrom@vmware.com>   2018-04-11 09:33:05 +0200
committer  Thomas Hellstrom <thellstrom@vmware.com>   2018-04-11 09:33:05 +0200
commit     eebaa7f86212d7ebab3c87aae1f9d68cade1b49e (patch)
tree       38b556f915ea612375202be6ec434f9f75c98b6a /README
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Diffstat (limited to 'README')
-rw-r--r--   README   59
1 file changed, 59 insertions, 0 deletions
@@ -0,0 +1,59 @@
+The ww_mutex_test kernel module
+
+Compile using
+make
+make install
+
+Start a test sequence using
+sudo modprobe wtest
+
+After the test sequence, use
+sudo rmmod wtest
+
+The test sequence simulates a number of GPU command submissions taking
+ww_mutex locks. The module maintains a number of global locks, and each thread
+is required to lock a local number of locks that are randomly chosen from
+the set of global locks. Typically threads will sometimes try to acquire
+the same lock, and the ww mutex rollback will be initiated. Defines to
+tune this behaviour (ww_mutex_test.h):
+#define WWT_NUM_LOCKS 100000 /* Number of global locks */
+#define WWT_NUM_T_LOCKS 800  /* Number of locks per thread out of the global set */
+#define WWT_NUM_THREADS 16   /* Number of locking threads to fire off */
+
+Each thread performs a number of simulated command submissions with the same
+locks. Each command submission consists of
+*) Taking the locks
+*) Busy-waiting for a while (mimicking the time used to submit GPU commands)
+*) Releasing the locks.
+The busy wait makes it more difficult to judge how much time was spent on
+the actual locking, but on the other hand gives more real-world-like
+results for the number of rollbacks. Related defines:
+#define WWT_NUM_SUB 10000 /* Number of command submissions */
+#define WWT_CS_UDELAY 000 /* Command submission udelay */
+
+The results can be viewed as starting and ending times for each thread in
+"dmesg". Each thread also prints the number of rollbacks it had to do.
+There are two ways to get zero rollbacks: One is to fire off the threads
+sequentially, in which case there will be no contention. The other is to
+make sure there are no common locks between threads. Be careful with the latter
+option: there must be enough global locks to accommodate the requests of
+all threads, otherwise module loading may lock up.
+Related defines:
+#define WWT_NO_SHARED  /* No shared mutexes - no rollbacks */
+#define WWT_SEQUENTIAL /* Fire off locking threads sequentially */
+
+The module can either use the kernel's built-in ww mutex implementation or a
+drop-in replacement implementation. The drop-in replacement implements a
+choice of algorithms: Wait-Die and Wound-Wait. It is also possible to batch
+mutex locks and unlocks, significantly reducing the number of locked CPU cycles.
+Note that the drop-in replacement manipulates locking state under a class-global
+spinlock instead of using the built-in atomic operations. This is
+slightly slower when the global spinlock is uncontended, and
+significantly slower when it is contended, but
+it allows batching locks and unlocks in a single global spinlock
+critical section.
+
+Related defines:
+#define WW_BUILTIN      /* Use kernel builtin ww mutexes */
+#define WW_WAITDIE true /* Use wait-die, not wound-wait */
+#define WW_BATCHING     /* Batch locks and unlocks */
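
For orientation, below is a minimal sketch of what one simulated command
submission could look like when built on the kernel's built-in ww_mutex API
(the WW_BUILTIN case). It is not taken from the module's source: struct
wwt_thread, wwt_lock_all() and wwt_submit() are hypothetical names invented
for this example; only the WWT_* defines correspond to ww_mutex_test.h as
described in the README above.

#include <linux/ww_mutex.h>
#include <linux/delay.h>

static DEFINE_WW_CLASS(wwt_class);

struct wwt_thread {
	struct ww_mutex **locks;  /* WWT_NUM_T_LOCKS entries picked from the global set */
	int num_locks;
	int rollbacks;
};

/* Lock all of a thread's locks, rolling back on -EDEADLK. */
static void wwt_lock_all(struct wwt_thread *t, struct ww_acquire_ctx *ctx)
{
	int i, contended = -1;

retry:
	for (i = 0; i < t->num_locks; i++) {
		int ret;

		if (i == contended) {
			contended = -1;
			continue;	/* already held from the slow path below */
		}
		ret = ww_mutex_lock(t->locks[i], ctx);
		if (ret == -EDEADLK) {
			int j;

			/* Rollback: drop everything acquired so far ... */
			for (j = 0; j < i; j++)
				ww_mutex_unlock(t->locks[j]);
			if (contended > i)
				ww_mutex_unlock(t->locks[contended]);
			t->rollbacks++;
			/* ... sleep on the contended lock, then retry in order. */
			ww_mutex_lock_slow(t->locks[i], ctx);
			contended = i;
			goto retry;
		}
	}
	ww_acquire_done(ctx);
}

/* One simulated command submission: lock, pretend to submit, unlock. */
static void wwt_submit(struct wwt_thread *t)
{
	struct ww_acquire_ctx ctx;
	int i;

	ww_acquire_init(&ctx, &wwt_class);
	wwt_lock_all(t, &ctx);

	udelay(WWT_CS_UDELAY);	/* busy-wait mimicking GPU command submission */

	for (i = 0; i < t->num_locks; i++)
		ww_mutex_unlock(t->locks[i]);
	ww_acquire_fini(&ctx);
}

With the drop-in replacement and WW_BATCHING, the per-lock ww_mutex_lock()/
ww_mutex_unlock() calls above would instead be batched inside one class-global
spinlock critical section, as described in the README; that API is internal to
the module and not shown here.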