Age | Commit message (Collapse) | Author | Files | Lines |
|
Replace the open-coded ioctls with the thin gem wrappers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
intel_residency has a cairo dependency through igt_fb.c. Remove it
if ANDROID_HAS_CAIRO is not defined.
Signed-off-by: Derek Morton <derek.j.morton@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
|
|
Fixup some fallout from the connector probing changes so testdisplay -m
will pick up newly hotplugged displays correctly.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
If we have to share the GTT with others, we cannot rely on being able to
fill it and have to factor in some slack for others.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As well as ensuring the kernel doesn't simply crash when asked to do
lots of objects, check it actually aligns them.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Daniel has suggested that I put vc4 testing into igt, since it's got
the piglit integration and KMS coverage already. This gets the core
building so that I can start writing tests.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Stone <daniels@collabora.com>
|
|
Some bits can't be built on non-x86 architectures, mostly because they
require x86-specific assembly primitives. Disable these by default on
non-x86 architectures.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
We can move it from softpin test into lib, and since softpin support is
highly unlikely to go away in-between getparam ioctl calls, let's just
do a single call and store the value.
v2: rebase
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
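
The "single call and store the value" idea can be sketched as below. This is a minimal illustration under stated assumptions, not the library code: query_softpin() is a hypothetical stand-in for the getparam ioctl, and has_softpin_cached() is an invented name.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the getparam ioctl; counts how often it runs. */
static int query_calls;

static int query_softpin(int fd)
{
	(void)fd;
	query_calls++;
	return 1; /* assumption: pretend the kernel reports softpin support */
}

/* Query once, then answer every later caller from the stored value. */
static bool has_softpin_cached(int fd)
{
	static int cached = -1; /* -1 means "not queried yet" */

	if (cached < 0)
		cached = query_softpin(fd);

	return cached > 0;
}
```

Note that the stored value ignores the fd, so the sketch assumes every caller is talking to the same device.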
|
|
No functional changes.
While I'm here, let's also rename gem_uses_aliasing_ppgtt (since it's
being used to indicate if we are using ANY kind of ppgtt) and introduce
gem_uses_full_ppgtt to drop some unnecessary code from tests that were
previously calling getparam directly instead of using ioctl wrapper.
v2: drop gem_uses_full_48b_ppgtt since it's no longer used anywhere,
s/48b/64b (Chris)
v3: rebase
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
References: https://bugs.freedesktop.org/show_bug.cgi?id=93849
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
After the recent discussions regarding the effects of the vblank
disabling policies on PC state residencies, I started running some
experiments to reevaluate some non-intuitive conclusions I had
reached. In order to help me do this, I decided to write this tool.
The idea is very simple: the tool puts the system in a screen-on idle
state, checks which PC state residency is the deepest we can reach,
measures its residency, then does some not-so-idle tests and measures
the residencies. You can use the tool to compare different kernel
trees and you can also use the tool to compare enabled vs disabled
features.
It's obvious that these cases do not represent real-world use cases of
our driver, but they are already enough to highlight differences
between the many patches I wrote. I was even able to catch a bug in
one of my patches by spotting an unexpected regression in the
residencies.
I've been using this tool for FBC, but I expect it to also be useful
for PSR, DRRS and similar features. I've been measuring the effects of
different optimizations I wrote, and I've also been measuring the FBC
vs no-FBC cases.
It is also important to highlight that if your system is not properly
configured for efficient power savings the tool may not be able to
show differences between the results. On my Broadwell machine, for
example, if I don't run "powertop --auto-tune" before running the
tool, I get PC2 as the deepest state, and 90%+ residency for every
workload. After properly configuring the machine, I get PC7 as the
deepest state, which is expected.
So far I only tested this tool on BDW and SKL, and it may hit some
unexpected assertions for older platforms.
I only implemented the cases that are immediately useful for me, but
we may also expand the tool in the future. We can add more important
workloads. We can add support for screen-off cases, so we can compare
the effects of runtime PM and other screen-off features. There's a lot
we can do, but none of this is on my current priority list.
And remember: /usr/bin/paste is your friend when comparing results.
v2:
- Be more idle at setup_idle().
- Improve printing for /usr/bin/paste usage.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
|
|
If we don't close the handle from the last pass, we don't free up the
previous pass's vma immediately, changing the hole allocation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Snooped objects are not allowed to abut uncached objects on older gen
(!llc and global GTT) or else the GPU may hang if it prefetches across a
page boundary into a different memory type (i.e. CS reading from snoop).
The kernel should be checking the alignment rules as normal.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If we use a MAP_SHARED mmapping for our backing storage for userptr,
then it will be inherited across the fork with the same address, ideal
for continuity testing of children.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
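
The difference between a shared and a private (copy-on-write) mapping across fork() can be seen with plain POSIX calls. The sketch below is not the userptr test itself; child_write_visible() is a hypothetical helper that maps one anonymous page, lets a forked child write to it, and reports whether the parent sees that write.

```c
#define _GNU_SOURCE
#include <stdbool.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one anonymous page with the given sharing flag (MAP_SHARED or
 * MAP_PRIVATE), let a forked child write to it, and report whether the
 * parent observes the child's write through its own pointer.
 */
static bool child_write_visible(int sharing)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile int *ptr;
	bool seen;
	pid_t pid;

	ptr = mmap(NULL, page, PROT_READ | PROT_WRITE,
		   sharing | MAP_ANONYMOUS, -1, 0);
	if (ptr == MAP_FAILED)
		return false;

	*ptr = 0;

	pid = fork();
	if (pid < 0) {
		munmap((void *)ptr, page);
		return false;
	}
	if (pid == 0) {
		/* The child inherits the mapping at the same virtual address. */
		*ptr = 0xc0ffee;
		_exit(0);
	}

	waitpid(pid, NULL, 0);
	seen = (*ptr == 0xc0ffee);
	munmap((void *)ptr, page);
	return seen;
}
```

Calling child_write_visible(MAP_SHARED) should report the write as visible, while child_write_visible(MAP_PRIVATE) should not, because the child then writes to its own copy of the page.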
|
|
We have to avoid the COW alias for the intel_bufmgr and intel_batch
cache as the child may close the object (in its local cache) leaving an
alias in the parent cache pointing to a stale object.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The issue here is that the pointer inherited by the child is
copied-on-write, i.e. the pointer is private to each process, but the
handle is shared. This means that writes and reads in the child are
going to a different set of pages than the GPU's object - the test is
simply broken. To overcome this we would need to mmap the shared buffer
into the child.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The current forked modes recreate their handles in the children and just
look at any complications arising from contention. This mode looks at
inheriting the fd+handles from the parent into the child and seeing if
we can use them within the child.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
After setting the flag for NORELOC (to avoid having to pay the cost of
validating the relocations on every pass), we need to make sure that
we set EXEC_OBJECT_WRITE so that we do track the outstanding writes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
It's broken, avoid at all costs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we have the same function in a few places to read the
debugfs/i915_ring_missed_irq file, move it to the core.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We need both a secure batch and to flag it to use the virtual GTT
address.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
For the older gen, MI_STORE_DATA_IMM is a privileged command so we need
to set the "secure" batch flag, and we also need to instruct the command
to use the GTT virtual address.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Since we need a lot of memory, trim off the less significant digits for
easier human consumption.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The objective of this test is to check how the driver handles a full
ring. To that end we need only submit enough work to fill the ring by
submitting work faster than the GPU can execute it. If we are more
careful in our batch construction, we can feed them much faster and
achieve the same results much quicker.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We can fit a few more objects in at high alignment, so do so.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
With softpin we can explicitly manage the layout of the objects to be
executed, deliberately forcing the reuse of active pages in an attempt
to spot misbehaviour in the CS TLBs. Being explicit allows us to
eliminate a lot of the CPU overhead between execbuf, hopefully
increasing the likelihood of a conflict.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Before gen4, MI_STORE_DWORD was just 3 dwords long (cmd, offset, value).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
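
A sketch of how the command length could vary by generation. Only the 3-dword pre-gen4 layout (cmd, offset, value) comes from the message above; the opcode encoding, the 4-dword layouts for gen4+ and gen8+, and the emit_store_dword() helper are assumptions for illustration and should be checked against the real headers.

```c
#include <stdint.h>

/* Assumed encoding: opcode 0x20, length field preset for a 4-dword command. */
#define SKETCH_MI_STORE_DWORD_IMM ((0x20u << 23) | 2)

/*
 * Write a store-dword command into batch[] (at least 4 dwords) and
 * return the number of dwords emitted.
 */
static int emit_store_dword(uint32_t *batch, int gen,
			    uint64_t offset, uint32_t value)
{
	int i = 0;

	batch[i] = SKETCH_MI_STORE_DWORD_IMM;
	if (gen >= 8) {
		/* assumed: 64-bit address split over two dwords */
		batch[++i] = (uint32_t)offset;
		batch[++i] = (uint32_t)(offset >> 32);
	} else if (gen >= 4) {
		/* assumed: one reserved dword before the address */
		batch[++i] = 0;
		batch[++i] = (uint32_t)offset;
	} else {
		/* before gen4: cmd, offset, value, so one dword shorter */
		batch[i]--;
		batch[++i] = (uint32_t)offset;
	}
	batch[++i] = value;

	return i + 1;
}
```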
|
|
Some potential callers want to inject a hang into a particular context,
some want to trigger an actual ban and others may or may not want to
capture the associated error state. Expand the hang injection interface
to suit all.
v2: Disable the new kernel API, but push to provide a missing piece of
infrastructure to unbreak compilation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As the hang injection now itself checks for validity before use, the
tests don't need to do so themselves. Except in certain situations! If
the test forks, it should do requirement checks before the fork (so that
we don't anger the igt gods) and if the test plays around with i915.reset
then it needs to do an early igt_require_hang_ring() that is not
affected by the changes to i915.reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If we move the igt_require() into the hang injector, this makes simple
test cases even more convenient. More complex test cases can always do
their own precursory check before setting up the test.
However, this does embed the assumption that the first context we are
called from is safe (i.e. no i915.enable_hangcheck/i915.reset
interference).
v2: A couple of environment variables to skip hang testing or to force
hang injection even if the GPU cannot be reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The range we chose to overwrite in the target had an off-by-one error
that could cause it to compute a size that went past the end of the
buffer (and so trigger EINVAL). Fortuitously with our seed this did not
occur. Whilst changing the range calculation, update the error logging
to include the range information.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
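
For illustration, one way to pick a random range that can never run past the end of the buffer. This is a generic sketch, not the test's actual calculation; pick_range() and its parameters are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Pick a random [offset, offset + len) range that always fits inside a
 * buffer of `size` bytes. `size` must be non-zero.
 */
static void pick_range(uint32_t size, uint32_t *offset, uint32_t *len)
{
	/* offset lies in 0 .. size - 1 */
	*offset = (uint32_t)rand() % size;

	/* len lies in 1 .. size - offset, so offset + len <= size */
	*len = 1 + (uint32_t)rand() % (size - *offset);
}
```

The bound on len is where an off-by-one is easy to make: taking the remainder modulo size - offset + 1 here would allow a range that ends one byte past the buffer.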
|
|
For softpinning, we do not require either userptr or extended ppgtt, so
remove those requirements and make the tests work universally. (Certain
ABI tests require large GTT, or per-process GTT.)
In the process, make the tests more extensive - validate overlapping
handling more carefully, explicitly test no-relocation support, validate
more ABI handling. And for fun, cause a kernel GPF.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Allow both parts (single, many) to be run independently.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In order to actually use the high space we need to set the can-use-48bit
flag.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Compute the largest alignment for the largest number of objects we can
create, then try an execbuf with them.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The test just aims to execute batches on alternating rings with a write
target such that every batch must be executed after the previous
completes. This stresses the inter-ring synchronisation, which is
interrupt driven if the gpu does not support semaphores, and so is a
good stress test for detecting "missed interrupt syndrome". Make that
detection explicit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Whilst still keeping the runtime down, extend the pipeline slightly.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Make the behaviour of the test more explicit with respect to handle
management, mmap and domain handling.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We can trade off the explicit sync (presumably to avoid some resource
starvation issue?) with the implicit sync of having to perform a
relocation. Using an implicit sync helps stress core kernel code,
besides being much faster!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Similar to the cpu mmap vs gtt mmap coherency test.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The idea is to check partial cacheline reads/writes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
gem_concurrent_blit tries to ensure that it doesn't try and run a test
that would grind the system to a halt, i.e. unexpectedly cause swap
thrashing. It currently calls intel_require_memory(), but outside of
the subtest (as the tests use fork, it cannot do requirement testing
within the test children) - but intel_require_memory() calls
igt_require() and triggers an abort. Wrapping that initial require
within an igt_fixture() stops the abort(), but also prevents any further
testing.
This patch restructures the requirement checking to ordinary conditions,
which though allowing the test to run, also prevents listing of subtests
on machines which cannot handle them.
|
|
A very basic test of functionality, execute a nop and wait for it to
complete. It should be very effective at stimulating the "missed
interrupt syndrome" on all devices.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Remove one assumption from the test and make the domain management
explicit - when we write through the CPU to construct the batch, mark
it as having been written.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Exercise the busy-ioctl and verify it reports the right active engines
using the execbuffer notation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|