Age  Commit message  Author  Files  Lines
2017-09-05  drm/nouveau/compote: create a compute object to force graphic context creation  [hmm-nouveau]  Jérôme Glisse  (2 files, -0/+12)
The graphic context is not created until there is an object that references the engine. Creating a compute class object has no side effect other than pgraph context creation. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-09-05  drm/nouveau/compote: add page fault handler  Jérôme Glisse  (1 file, -1/+122)
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-09-05  drm/nouveau/compote: add helper to map nvkm_vma  Jérôme Glisse  (2 files, -0/+34)
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-09-05  drm/nouveau/compote: add GPU page fault handler  Jérôme Glisse  (4 files, -0/+185)
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-09-05  drm/nouveau/compote: add HMM mirror support  Jérôme Glisse  (4 files, -0/+126)
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-09-05  drm/nouveau/compote: allow an nvkm_vma not bound to a nouveau_bo  Jérôme Glisse  (1 file, -0/+4)
Hackish: close your eyes and cover your ears. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-29  drm/nouveau/compote: add buffer vma on mmap through device file  Jérôme Glisse  (1 file, -0/+29)
Add an nvkm_vma when a buffer is mmaped through the device file so that the same CPU virtual address can be used on the GPU too. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-29  drm/nouveau/compote: add channel indirect buffer execute ioctl  Jérôme Glisse  (4 files, -13/+78)
Channel indirect buffer execution. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-29  drm/nouveau/compote: add channel support  Jérôme Glisse  (5 files, -0/+189)
Channel infrastructure. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-29  drm/nouveau/compote: memory object mmap support  Jérôme Glisse  (4 files, -2/+133)
Allow mmap of the compote device file to access memory objects. Each memory object is given a unique range inside the compote device file. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
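A minimal userspace-side sketch of how such a mapping could be consumed, assuming a compote device node and a per-object offset obtained from the allocation ioctl; the path, names and offset convention below are assumptions for illustration, not the real ABI:

    /* Illustrative only: the log states just that each memory object owns a
     * unique range inside the compote device file and can be mmaped. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stddef.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void *compote_map_object(int fd, uint64_t obj_offset, size_t size)
    {
        /* mmap the object's range; the driver backs the mapping with the
         * corresponding memory object. */
        void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, (off_t)obj_offset);
        return ptr == MAP_FAILED ? NULL : ptr;
    }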
2017-08-29  drm/nouveau/compote: memory allocation ioctl  Jérôme Glisse  (6 files, -1/+234)
Add memory allocation ioctl. Very basic and simple linear allocation inside GART. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-29  drm/nouveau/compote: GPU compute on top of nouveau  Jérôme Glisse  (8 files, -0/+418)
Starting with Kepler GPUs we can do unified memory for compute. With Pascal we can even transparently share the same virtual address space on the GPU as on the CPU. Compote is an attempt to prototype a new set of APIs for userspace to leverage those features. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-29  drm/nouveau/core/mm: allow finding an nvkm_vma from an offset  Jérôme Glisse  (4 files, -0/+43)
Need this as we cannot look up a regular process vma from the page table synchronization callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-23  drm/nouveau/core/mm: allow partial mapping of a bo (buffer object)  Jérôme Glisse  (5 files, -11/+24)
This allows creating a partial mapping of a bo (nvkm_vma). Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-23  drm/nouveau/core/mm: allow creating a bo vma at a fixed offset inside a vm  Jérôme Glisse  (6 files, -0/+129)
This allows creating a bo vma (nvkm_vma) at a fixed offset inside a vm. Useful when we want to force the same virtual address on the CPU and the GPU. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-23  drm/nouveau/core/mm: convert to u64 to support bigger address space  Jérôme Glisse  (2 files, -25/+25)
The CPU process address space on 64-bit architectures is 47 bits or bigger, hence we need to convert nvkm_mm to use 64-bit integers to support the bigger address space. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
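For scale, a back-of-the-envelope check (not taken from the patch) of why 32-bit fields are not enough once GPU mappings mirror a 47-bit CPU address space:

    /* 2^47 bytes = 128 TiB, i.e. 2^35 4 KiB pages; 2^35 already overflows a
     * u32 (max 2^32 - 1), so offsets and lengths need to be 64-bit. */
    #include <stdint.h>

    #define VA_BITS    47
    #define PAGE_SHIFT 12

    static const uint64_t va_bytes = 1ULL << VA_BITS;                /* 128 TiB */
    static const uint64_t va_pages = 1ULL << (VA_BITS - PAGE_SHIFT); /* 2^35    */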
2017-08-09  fault/gp100: initial implementation of MaxwellFaultBufferA  Ben Skeggs  (11 files, -0/+371)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mc/gp100-: handle replayable fault interrupt  Ben Skeggs  (3 files, -2/+22)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  core: define engine for handling replayable faults  Ben Skeggs  (7 files, -0/+12)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/gp100: allow gcc/tex to generate replayable faults  Ben Skeggs  (3 files, -0/+22)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  WIP: mmu: handle instance block setup  Ben Skeggs  (23 files, -180/+167)
We previously required each VMM user to allocate their own page directory and fill in the instance block themselves. It makes more sense to handle this in a common location. WIP: gf100 chicken-and-egg. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/gf100-: implement vmm on top of new base  Ben Skeggs  (9 files, -49/+185)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/nv50,g84: implement vmm on top of new base  Ben Skeggs  (7 files, -8/+158)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/nv44: implement vmm on top of new base  Ben Skeggs  (5 files, -9/+67)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/nv41: implement vmm on top of new base  Ben Skeggs  (4 files, -8/+67)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/nv04: implement vmm on top of new base  Ben Skeggs  (6 files, -17/+116)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu: implement base for new vm management  Ben Skeggs  (7 files, -20/+188)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/gp100: fork from gf100  Ben Skeggs  (4 files, -6/+40)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/g84: fork from nv50  Ben Skeggs  (4 files, -13/+47)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/gf100: allow implementation to be subclassed  Ben Skeggs  (2 files, -3/+45)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu/nv50: allow implementation to be subclassed  Ben Skeggs  (2 files, -3/+45)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mmu: automatically handle "un-bootstrapping" of vmm  Ben Skeggs  (4 files, -8/+7)
Removes the need to expose internals outside of MMU. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  falcon: use a more reasonable msgqueue timeout value  Ben Skeggs  (1 file, -1/+1)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-08-09  mm/rmap: try_to_unmap_one() do not call mmu_notifier under ptl  Jérôme Glisse  (1 file, -15/+21)
MMU notifiers can sleep, but in try_to_unmap_one() we call mmu_notifier_invalidate_page() under the page table lock. Let's instead use mmu_notifier_invalidate_range() outside the page_vma_mapped_walk() loop. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
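A simplified sketch of the shape of the fix (not the verbatim patch): remember the range that was touched while the page table lock is held, and call the possibly sleeping notifier once, after the walk loop:

    #include <linux/mm.h>
    #include <linux/rmap.h>
    #include <linux/mmu_notifier.h>

    static void example_unmap_range(struct page *page, struct vm_area_struct *vma,
                                    unsigned long address)
    {
        struct page_vma_mapped_walk pvmw = {
            .page = page,
            .vma = vma,
            .address = address,
        };
        unsigned long start = address, end = address;

        while (page_vma_mapped_walk(&pvmw)) {
            /* ... clear the pte and install the migration/swap entry under
             * pvmw.ptl, exactly as before ... */
            end = pvmw.address + PAGE_SIZE;
        }

        /* The page table lock is no longer held here, so a notifier that may
         * sleep (e.g. one updating device page tables) is safe to call. */
        if (end != start)
            mmu_notifier_invalidate_range(vma->vm_mm, start, end);
    }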
2017-08-09  rmap: do not call mmu_notifier_invalidate_page() under ptl  Kirill A. Shutemov  (1 file, -8/+13)
MMU notifiers can sleep, but in page_mkclean_one() we call mmu_notifier_invalidate_page() under the page table lock. Let's instead use mmu_notifier_invalidate_range() outside the page_vma_mapped_walk() loop. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
2017-08-09  hmm/dummy: show how to allocate a device page on migration of an empty entry  Jérôme Glisse  (1 file, -4/+9)
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-09  hmm/dmirror: dummy mirror support for fake device memory  Jérôme Glisse  (2 files, -0/+371)
Add fake device memory. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-09  hmm/dmirror: dummy mirror driver for testing and showcasing the HMM  Jérôme Glisse  (4 files, -0/+902)
Just a dummy driver for test purposes. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-09  mm/hmm: add new helper to hotplug CDM memory region v3  Jérôme Glisse  (2 files, -5/+86)
Unlike unaddressable memory, coherent device memory has a real resource associated with it on the system (as the CPU can address it). Add a new helper to hotplug such memory within the HMM framework.
Changed since v2:
- s/host/public
Changed since v1:
- s/public/host
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
2017-08-09  mm/device-public-memory: device memory cache coherent with CPU v5  Jérôme Glisse  (14 files, -47/+159)
Platforms with an advanced system bus (like CAPI or CCIX) allow device memory to be accessible from the CPU in a cache coherent fashion. Add a new type of ZONE_DEVICE to represent such memory. The use cases are the same as for un-addressable device memory, but without all the corner cases.
Changed since v4:
- added memory cgroup change to this patch
Changed since v3:
- s/public/public (going back)
Changed since v2:
- s/public/public
- add proper include in migrate.c and drop useless #if/#endif
Changed since v1:
- Kconfig and #if/#else cleanup
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
2017-08-09  mm/migrate: allow migrate_vma() to alloc new page on empty entry v4  Jérôme Glisse  (2 files, -9/+205)
This allows the caller of migrate_vma() to allocate a new page for an empty CPU page table entry (pte_none or backed by the zero page). This is only for anonymous memory, and it won't allow a new page to be instantiated if userfaultfd is armed. This is useful to device drivers that want to migrate a range of virtual addresses and would rather allocate new memory than have to fault later on.
Changed since v3:
- support zero pfn entry
- improve commit message
Changed since v2:
- differentiate between empty CPU page table entry and non empty
- improve code comments explaining how this works
Changed since v1:
- 5 level page table fix
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2017-08-09  mm/migrate: support un-addressable ZONE_DEVICE page in migration v3  Jérôme Glisse  (4 files, -30/+164)
Allow unmapping and restoring the special swap entry of un-addressable ZONE_DEVICE memory.
Changed since v2:
- unconditionally allow device private memory to be migrated (it cannot be pinned, so it is pointless to check the reference count)
Changed since v1:
- s/device unaddressable/device private/
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
2017-08-09  mm/migrate: migrate_vma() unmap page from vma while collecting pages  Jérôme Glisse  (1 file, -29/+112)
The common case for migration of a virtual address range is that pages are mapped only once, inside the vma in which migration is taking place. Because we already walk the CPU page table for that range, we can directly do the unmap there and set up the special migration swap entry. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Evgeny Baskakov <ebaskakov@nvidia.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com> Signed-off-by: Sherry Cheung <SCheung@nvidia.com> Signed-off-by: Subhash Gutti <sgutti@nvidia.com>
2017-08-09  mm/migrate: new memory migration helper for use with device memory v5  Jérôme Glisse  (2 files, -0/+596)
This patch adds a new memory migration helper, which migrates memory backing a range of virtual addresses of a process to different memory (which can be allocated through a special allocator). It differs from numa migration by working on a range of virtual addresses, and thus by doing migration in chunks that can be large enough to use a DMA engine or a special copy offloading engine. Expected users are anyone with heterogeneous memory where different memories have different characteristics (latency, bandwidth, ...). As an example, IBM platforms with a CAPI bus can make use of this feature to migrate between regular memory and CAPI device memory. New CPU architectures with a pool of high performance memory not managed as cache but presented as regular memory (while being faster and with lower latency than DDR) will also be prime users of this patch. Migration to private device memory will be useful for devices that have a large pool of such memory, like GPUs; NVidia plans to use HMM for that.
Changed since v4:
- split THP instead of skipping them
Changes since v3:
- Rebase
Changes since v2:
- dropped HMM prefix and HMM specific code
Changes since v1:
- typos fix
- split early unmap optimization for page with single mapping
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Evgeny Baskakov <ebaskakov@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Signed-off-by: Sherry Cheung <SCheung@nvidia.com>
Signed-off-by: Subhash Gutti <sgutti@nvidia.com>
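For orientation, the interface proposed by this series is, roughly, a pair of driver callbacks around a single entry point; the shapes below follow the description above and the 4.14-era proposal, but treat them as an approximation rather than the final signatures:

    /* Approximate shape of the proposed helper: the core collects and unmaps
     * the pages, the driver allocates destination pages and performs the copy
     * (possibly with a DMA engine) in alloc_and_copy(), then the core remaps
     * whatever migrated successfully and finalize_and_map() lets the driver
     * update its own page tables. */
    struct migrate_vma_ops {
        void (*alloc_and_copy)(struct vm_area_struct *vma,
                               const unsigned long *src,
                               unsigned long *dst,
                               unsigned long start,
                               unsigned long end,
                               void *private);
        void (*finalize_and_map)(struct vm_area_struct *vma,
                                 const unsigned long *src,
                                 const unsigned long *dst,
                                 unsigned long start,
                                 unsigned long end,
                                 void *private);
    };

    int migrate_vma(const struct migrate_vma_ops *ops,
                    struct vm_area_struct *vma,
                    unsigned long start,
                    unsigned long end,
                    unsigned long *src,   /* one entry per page in [start, end) */
                    unsigned long *dst,   /* filled in by alloc_and_copy()      */
                    void *private);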
2017-08-09  mm/migrate: new migrate mode MIGRATE_SYNC_NO_COPY  Jérôme Glisse  (9 files, -15/+86)
Introduce a new migration mode that allows offloading the copy to a device DMA engine. This changes the workflow of migration, and not all address_space migratepage callbacks can support it, so it needs to be tested in those cases. This is intended to be used by migrate_vma(), which itself is used for things like HMM (see include/linux/hmm.h). Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
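As a hedged illustration of the "not all migratepage callbacks can support this" point, a callback that still copies page contents with the CPU would simply refuse the new mode, along these lines (a sketch, not a hunk from the patch):

    #include <linux/fs.h>
    #include <linux/migrate.h>

    static int example_migratepage(struct address_space *mapping,
                                   struct page *newpage, struct page *page,
                                   enum migrate_mode mode)
    {
        /* This callback cannot let the caller do the copy via DMA; it must
         * copy the page itself, so the new mode is not supported here. */
        if (mode == MIGRATE_SYNC_NO_COPY)
            return -EINVAL;

        /* Default path: move the mapping and copy the page with the CPU. */
        return migrate_page(mapping, newpage, page, mode);
    }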
2017-08-09  mm/hmm/devmem: dummy HMM device for ZONE_DEVICE memory v3  Jérôme Glisse  (2 files, -1/+102)
This introduces a dummy HMM device class so device drivers can use it to create an hmm_device for the sole purpose of registering device memory. It is useful to device drivers that want to manage multiple physical device memories under the same struct device umbrella.
Changed since v2:
- use device_initcall() and drop everything that is module specific
Changed since v1:
- Improve commit message
- Add drvdata parameter to set on struct device
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Evgeny Baskakov <ebaskakov@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Signed-off-by: Sherry Cheung <SCheung@nvidia.com>
Signed-off-by: Subhash Gutti <sgutti@nvidia.com>
2017-08-09  mm/hmm/devmem: device memory hotplug using ZONE_DEVICE v7  Jérôme Glisse  (2 files, -1/+532)
This introduces a simple struct and associated helpers for device drivers to use when hotplugging un-addressable device memory as ZONE_DEVICE. It will find an unused physical address range and trigger memory hotplug for it, which allocates and initializes struct pages for the device memory. Device drivers should use this helper during device initialization to hotplug the device memory. They should only need to remove the memory once the device is going offline (shutdown or hotremove). There should not be any userspace API to hotplug memory, except maybe for a host device driver to allow adding more memory to a guest device driver. The device's memory is managed by the device driver and HMM only provides helpers to that effect.
Changed since v6:
- fix start address calculation (Balbir Singh)
- more comments and updated commit message
Changed since v5:
- kernel configuration simplification
- remove now unused device driver helper
Changed since v4:
- enable device_private_key static key when adding device memory
Changed since v3:
- s/device unaddressable/device private/
Changed since v2:
- s/SECTION_SIZE/PA_SECTION_SIZE
Changed since v1:
- change to adapt to new add_pages() helper
- make this x86-64 only for now
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Evgeny Baskakov <ebaskakov@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Signed-off-by: Sherry Cheung <SCheung@nvidia.com>
Signed-off-by: Subhash Gutti <sgutti@nvidia.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
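A hedged driver-side sketch of how the helper is meant to be used; hmm_devmem_add() and the free/fault callbacks follow the series' naming, but the exact signatures here are an approximation of the proposal, not a quote from it:

    /* Sketch: register `size` bytes of un-addressable device memory at device
     * init time.  The helper finds an unused physical range, hotplugs it as
     * ZONE_DEVICE and allocates struct pages for it. */
    #include <linux/device.h>
    #include <linux/err.h>
    #include <linux/hmm.h>

    /* Assumed to provide .free (page refcount dropped to 1, memory is free
     * again) and .fault (CPU touched an un-addressable page, migrate it back)
     * callbacks as described by the series. */
    extern const struct hmm_devmem_ops example_devmem_ops;

    static struct hmm_devmem *example_register_memory(struct device *dev,
                                                      unsigned long size)
    {
        struct hmm_devmem *devmem;

        devmem = hmm_devmem_add(&example_devmem_ops, dev, size);
        if (IS_ERR(devmem))
            return devmem;

        /* devmem now describes the hotplugged range of device pages. */
        return devmem;
    }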
2017-08-09  mm/memcontrol: support MEMORY_DEVICE_PRIVATE v4  Jérôme Glisse  (2 files, -4/+49)
HMM pages (private or public device pages) are ZONE_DEVICE pages and thus need special handling when it comes to lru or refcount. This patch makes sure that memcontrol properly handles those when it faces them. Those pages are used like regular pages in a process address space, either as anonymous pages or as file backed pages. So from the memcg point of view we want to handle them like regular pages, for now at least.
Changed since v3:
- remove public support and move those chunks to separate patch
Changed since v2:
- s/host/public
Changed since v1:
- s/public/host
- add comments explaining how device memory behaves and why
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: cgroups@vger.kernel.org
2017-08-09  mm/memcontrol: allow uncharging a page without using the page->lru field  Jérôme Glisse  (1 file, -76/+92)
HMM pages (private or public device pages) are ZONE_DEVICE pages and thus you cannot use the page->lru field of those pages. This patch re-arranges the uncharge code to allow a single page to be uncharged without modifying the lru field of the struct page. There is no change to the memcontrol logic; it is the same as it was before this patch. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: cgroups@vger.kernel.org
2017-08-09  mm/ZONE_DEVICE: special case put_page() for device private pages v4  Jérôme Glisse  (4 files, -10/+67)
A ZONE_DEVICE page that reaches a refcount of 1 is free, i.e. it no longer has any user. For device private pages this is important to catch, and thus we need to special case put_page() for it.
Changed since v3:
- clear page mapping field
Changed since v2:
- clear page active and waiters
Changed since v1:
- use static key to disable special code path in put_page() by default
- uninline put_zone_device_private_page()
- fix build issues with some kernel config related to header inter-dependency
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
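The rough shape the special case takes, reconstructed from the changelog above (device_private_key and put_zone_device_private_page() are the names it mentions); a simplified sketch, not the verbatim hunk:

    #include <linux/mm.h>
    #include <linux/memremap.h>

    /* Both declared by the series: the static key keeps the extra branch
     * disabled until device private memory is actually hotplugged. */
    extern struct static_key_false device_private_key;
    void put_zone_device_private_page(struct page *page);

    static inline void example_put_page(struct page *page)
    {
        page = compound_head(page);

        /*
         * A device private ZONE_DEVICE page that drops to a refcount of 1 has
         * no user left; hand it back to the driver instead of going through
         * the normal free path.
         */
        if (static_branch_unlikely(&device_private_key) &&
            unlikely(is_device_private_page(page))) {
            put_zone_device_private_page(page);
            return;
        }

        if (put_page_testzero(page))
            __put_page(page);
    }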