Age | Commit message | Author | Files | Lines |
|
The graphics context is not created until there is an object that references the engine.
Creating a compute class object has no side effect other than PGRAPH context creation.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Hackish: close your eyes and cover your ears.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Add an nvkm_vma when a buffer is mmap'd through the device file so that the
same CPU virtual address can be used on the GPU too.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Channel indirect buffer execution.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Channel infrastructure.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Allow mmap of the compote device file to access memory objects. Each
memory object is given a unique range inside the compote device
file.
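As an illustration, userspace access could look like the sketch below; the
device node path is an assumption for illustration, and the object offset is
assumed to come from the allocation ioctl introduced further down this log:

  #include <fcntl.h>
  #include <sys/mman.h>

  /* hypothetical sketch: /dev/compote and object_offset (returned by
   * the allocation ioctl) are assumptions, not part of this patch */
  int fd = open("/dev/compote", O_RDWR);
  void *ptr = mmap(NULL, object_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, object_offset);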
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Add a memory allocation ioctl. Very basic and simple linear allocation
inside the GART.
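A minimal sketch of how such an ioctl is typically driven from userspace;
the request name and argument struct below are hypothetical, since the
patch does not spell out the uAPI here:

  #include <stdint.h>
  #include <sys/ioctl.h>

  /* hypothetical uAPI, for illustration only */
  struct compote_alloc {
          uint64_t size;    /* in:  bytes to allocate inside the GART */
          uint64_t offset;  /* out: offset to pass to mmap() */
  };

  struct compote_alloc args = { .size = object_size };
  int ret = ioctl(fd, COMPOTE_IOCTL_ALLOC, &args); /* hypothetical request */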
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Starting with Kepler GPUs we can do unified memory for compute. With
Pascal we can even transparently share the same virtual address space
on the GPU as on the CPU. Compote is an attempt to prototype a new
set of APIs for userspace to leverage those features.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Need this as we cannot look up a regular process vma from the page table
synchronization callback.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
This allows creating a partial mapping of a bo (nvkm_vma).
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
This allows creating a bo vma (nvkm_vma) at a fixed offset inside a vm.
Useful when we want to force the same virtual address on the CPU and GPU.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
The CPU process address space on 64-bit architectures is 47 bits or bigger,
hence we need to convert nvkm_mm to use 64-bit integers to support the
bigger address space.
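As a quick sanity check: assuming nvkm_mm tracks offsets as 32-bit block
counts at 4 KiB granularity (an assumption for illustration), it tops out
at 2^32 * 2^12 = 2^44 bytes = 16 TiB, while a 47-bit address space spans
2^47 bytes = 128 TiB, so 32-bit integers cannot cover it.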
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
We previously required each VMM user to allocate their own page directory
and fill in the instance block themselves.
It makes more sense to handle this in a common location.
WIP: gf100 chicken-and-egg
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Removes the need to expose internals outside of MMU.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
MMU notifiers can sleep, but in try_to_unmap_one() we call
mmu_notifier_invalidate_page() under the page table lock.
Let's instead use mmu_notifier_invalidate_range() outside the
page_vma_mapped_walk() loop.
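A minimal sketch of the shape of the fix (heavily elided; the real
try_to_unmap_one() does much more per pte):

  struct page_vma_mapped_walk pvmw = {
          .page = page, .vma = vma, .address = address,
  };

  while (page_vma_mapped_walk(&pvmw)) {
          /* the pte is torn down under the page table lock; no
           * sleeping mmu notifier callback may be invoked here */
          ptep_clear_flush(vma, pvmw.address, pvmw.pte);
  }
  /* one (possibly sleeping) notifier call after the lock is dropped */
  mmu_notifier_invalidate_range(vma->vm_mm, address, address + PAGE_SIZE);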
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
|
|
MMU notifiers can sleep, but in page_mkclean_one() we call
mmu_notifier_invalidate_page() under the page table lock.
Let's instead use mmu_notifier_invalidate_range() outside the
page_vma_mapped_walk() loop, as in the try_to_unmap_one() fix above.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
|
|
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Add fake device memory.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Just a dummy driver for test purposes.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Unlike unaddressable memory, coherent device memory has a real
resource associated with it on the system (as the CPU can address
it). Add a new helper to hotplug such memory within the HMM
framework.
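A sketch of the intended driver-side use, assuming the new
hmm_devmem_add_resource() entry point (the ops, device and resource all
come from the driver; error handling elided):

  /* register an already CPU-addressable (coherent) memory range
   * with HMM; my_devmem_ops is a driver-provided placeholder */
  struct hmm_devmem *devmem;

  devmem = hmm_devmem_add_resource(&my_devmem_ops, device, res);
  if (IS_ERR(devmem))
          return PTR_ERR(devmem);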
Changed since v2:
- s/host/public
Changed since v1:
- s/public/host
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
|
|
Platforms with an advanced system bus (like CAPI or CCIX) allow device
memory to be accessible from the CPU in a cache-coherent fashion. Add
a new type of ZONE_DEVICE to represent such memory. The use cases
are the same as for un-addressable device memory, but without all the
corner cases.
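For instance, a minimal sketch of how core code can tell the two
ZONE_DEVICE flavours apart (helper names as used by this series):

  if (is_zone_device_page(page)) {
          if (is_device_private_page(page)) {
                  /* unaddressable: a CPU access would fault */
          } else if (is_device_public_page(page)) {
                  /* coherent: the CPU can access it directly */
          }
  }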
Changed since v4:
- added memory cgroup change to this patch
Changed since v3:
- s/host/public (going back)
Changed since v2:
- s/public/host
- add proper include in migrate.c and drop useless #if/#endif
Changed since v1:
- Kconfig and #if/#else cleanup
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
|
|
This allows the caller of migrate_vma() to allocate a new page for an empty
CPU page table entry (pte_none or backed by the zero page). This is only
for anonymous memory and it won't allow a new page to be instantiated if
userfaultfd is armed.
This is useful for device drivers that want to migrate a range of virtual
addresses and would rather allocate new memory than have to fault later
on.
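In the driver's alloc_and_copy() callback this shows up as a source entry
that is migratable but carries no valid pfn; a sketch, assuming the
MIGRATE_PFN_VALID/MIGRATE_PFN_MIGRATE encoding used by migrate_vma():

  /* inside alloc_and_copy(); src[i] describes the i-th pte */
  if (!(src[i] & MIGRATE_PFN_MIGRATE))
          continue;               /* this entry cannot be migrated */
  if (!(src[i] & MIGRATE_PFN_VALID)) {
          /* empty pte (or zero page): the driver may still supply
           * a freshly allocated, zeroed destination page in dst[i] */
  }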
Changed since v3:
- support zero pfn entry
- improve commit message
Changed since v2:
- differentiate between empty CPU page table entry and non empty
- improve code comments explaining how this works
Changed since v1:
- 5 level page table fix
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
Allow unmapping and restoring the special swap entries of un-addressable
ZONE_DEVICE memory.
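The restore side, as a sketch (using the device private swap entry helpers
from this series; the surrounding logic is elided):

  swp_entry_t entry = pte_to_swp_entry(*ptep);

  if (is_device_private_entry(entry)) {
          /* the pte encodes an un-addressable device page; recover
           * the struct page so it can be unmapped or put back */
          struct page *page = device_private_entry_to_page(entry);
  }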
Changed since v2:
- unconditionally allow device private memory to be migrated (it cannot
be pinned, so it is pointless to check the reference count).
Changed since v1:
- s/device unaddressable/device private/
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
|
|
The common case for migration of a virtual address range is that pages are
mapped only once, inside the vma in which migration is taking place. Because
we already walk the CPU page table for that range, we can directly do the
unmap there and set up the special migration swap entries.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Evgeny Baskakov <ebaskakov@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Signed-off-by: Sherry Cheung <SCheung@nvidia.com>
Signed-off-by: Subhash Gutti <sgutti@nvidia.com>
|
|
This patch adds new memory migration helpers, which migrate memory
backing a range of virtual addresses of a process to different memory
(which can be allocated through a special allocator). It differs from
NUMA migration by working on a range of virtual addresses, and thus by
doing migration in chunks that can be large enough to use a DMA engine
or a special copy-offloading engine.
Expected users are anyone with heterogeneous memory where different
memories have different characteristics (latency, bandwidth, ...). As
an example, IBM platforms with a CAPI bus can make use of this feature
to migrate between regular memory and CAPI device memory. New CPU
architectures with a pool of high-performance memory not managed as a
cache but presented as regular memory (while being faster and with
lower latency than DDR) will also be prime users of this patch.
Migration to private device memory will be useful for devices that have
a large pool of such memory, like GPUs; NVidia plans to use HMM for that.
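The resulting entry point, as a sketch (signatures per this series; the
callbacks and pfn arrays are supplied by the driver, and the my_* names
are driver-provided placeholders):

  static const struct migrate_vma_ops my_migrate_ops = {
          .alloc_and_copy   = my_alloc_and_copy,
          .finalize_and_map = my_finalize_and_map,
  };

  /* migrate the pages backing [start, end) in one call; src and dst
   * hold one migrate pfn entry per page in the range */
  ret = migrate_vma(&my_migrate_ops, vma, start, end, src, dst, private);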
Changed since v4:
- split THP instead of skipping them
Changes since v3:
- Rebase
Changes since v2:
- dropped the HMM prefix and HMM-specific code
Changes since v1:
- typos fix
- split early unmap optimization for page with single mapping
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Evgeny Baskakov <ebaskakov@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Signed-off-by: Sherry Cheung <SCheung@nvidia.com>
Signed-off-by: Subhash Gutti <sgutti@nvidia.com>
|
|
Introduce a new migration mode that allows offloading the copy to
a device DMA engine. This changes the workflow of migration, and
not every address_space migratepage callback can support this, so
it needs to be tested in those cases.
This is intended to be used by migrate_vma(), which itself is used
for things like HMM (see include/linux/hmm.h).
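A migratepage callback that cannot support the new mode is expected to
reject it; a minimal sketch (my_migratepage is a placeholder name):

  static int my_migratepage(struct address_space *mapping,
                            struct page *newpage, struct page *page,
                            enum migrate_mode mode)
  {
          /* the caller offloads the copy (e.g. to a DMA engine); a
           * callback that must copy itself cannot honour this mode */
          if (mode == MIGRATE_SYNC_NO_COPY)
                  return -EINVAL;
          return migrate_page(mapping, newpage, page, mode);
  }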
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
|
|
This introduces a dummy HMM device class so device drivers can use it to
create an hmm_device for the sole purpose of registering device memory.
It is useful to device drivers that want to manage multiple physical
device memories under the same struct device umbrella.
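Usage sketch, assuming the hmm_device_new()/hmm_device_put() pair
introduced here (my_drvdata is a placeholder):

  /* one umbrella struct device for several device memories */
  struct hmm_device *hdev = hmm_device_new(my_drvdata);
  if (IS_ERR(hdev))
          return PTR_ERR(hdev);
  /* register device memory against &hdev->device, then drop the
   * reference when the driver is done with it */
  hmm_device_put(hdev);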
Changed since v2:
- use device_initcall() and drop everything that is module specific
Changed since v1:
- Improve commit message
- Add drvdata parameter to set on struct device
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Evgeny Baskakov <ebaskakov@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Signed-off-by: Sherry Cheung <SCheung@nvidia.com>
Signed-off-by: Subhash Gutti <sgutti@nvidia.com>
|
|
This introduces a simple struct and associated helpers for device drivers
to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
will find an unused physical address range and trigger memory hotplug for
it, which allocates and initializes struct pages for the device memory.
Device drivers should use this helper during device initialization to
hotplug the device memory. They should only need to remove the memory
once the device is going offline (shutdown or hotremove). There should
not be any userspace API to hotplug memory, except maybe for a host
device driver to allow adding more memory to a guest device driver.
The device's memory is managed by the device driver and HMM only provides
helpers to that effect.
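A sketch of the intended use at device initialization time (assuming the
hmm_devmem_add() helper and hmm_devmem_ops of this series; the my_* fault
and free callbacks are driver-provided placeholders):

  static const struct hmm_devmem_ops my_devmem_ops = {
          .free  = my_devmem_free,
          .fault = my_devmem_fault,
  };

  /* HMM picks an unused physical range and hotplugs size bytes of
   * un-addressable device memory, creating struct pages for it */
  devmem = hmm_devmem_add(&my_devmem_ops, &pdev->dev, size);
  if (IS_ERR(devmem))
          return PTR_ERR(devmem);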
Changed since v6:
- fix start address calculation (Balbir Singh)
- more comments and updated commit message.
Changed since v5:
- kernel configuration simplification
- remove now unuse device driver helper
Changed since v4:
- enable device_private_key static key when adding device memory
Changed since v3:
- s/device unaddressable/device private/
Changed since v2:
- s/SECTION_SIZE/PA_SECTION_SIZE
Changed since v1:
- change to adapt to new add_pages() helper
- make this x86-64 only for now
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Evgeny Baskakov <ebaskakov@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Signed-off-by: Sherry Cheung <SCheung@nvidia.com>
Signed-off-by: Subhash Gutti <sgutti@nvidia.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
|
|
HMM pages (private or public device pages) are ZONE_DEVICE pages and
thus need special handling when it comes to lru or refcount. This
patch makes sure that memcontrol properly handles them when it faces
them. Those pages are used like regular pages in a process address
space, either as anonymous pages or as file-backed pages. So from the
memcg point of view we want to handle them like regular pages, for now
at least.
Changed since v3:
- remove public support and move those chunk to separate patch
Changed since v2:
- s/host/public
Changed since v1:
- s/public/host
- add comments explaining how device memory behave and why
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: cgroups@vger.kernel.org
|
|
HMM pages (private or public device pages) are ZONE_DEVICE pages and
thus you cannot use the page->lru field of those pages. This patch
rearranges the uncharge path to allow a single page to be uncharged
without modifying the lru field of the struct page.
There is no change to the memcontrol logic; it is the same as it was
before this patch.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: cgroups@vger.kernel.org
|
|
A ZONE_DEVICE page that reaches a refcount of 1 is free, i.e. it no longer
has any user. For device private pages this is important to catch, and
thus we need to special-case put_page() for them.
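A sketch of the resulting put_page() shape (per the changelog below, the
slow path lives out of line in put_zone_device_private_page() behind the
device_private_key static key):

  static inline void put_page(struct page *page)
  {
          page = compound_head(page);

          /* device private pages are free once their refcount drops
           * to 1, so they must not take the normal refcount-0 path */
          if (static_branch_unlikely(&device_private_key) &&
              unlikely(is_device_private_page(page))) {
                  put_zone_device_private_page(page);
                  return;
          }

          if (put_page_testzero(page))
                  __put_page(page);
  }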
Changed since v3:
- clear page mapping field
Changed since v2:
- clear page active and waiters
Changed since v1:
- use static key to disable special code path in put_page() by
default
- uninline put_zone_device_private_page()
- fix build issues with some kernel config related to header
inter-dependency
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
|