summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-01-30fs: Don't need to put list_lru into its own cachelineWaiman Long1-4/+5
The list_lru structure is essentially just a pointer to a table of per-node LRU lists. Even if CONFIG_MEMCG_KMEM is defined, the list field is just used for LRU list registration and shrinker_id is set at initialization. Those fields won't need to be touched that often. So there is no point to make the list_lru structures to sit in their own cachelines. Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-30fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb()Waiman Long1-5/+1
The nr_dentry_unused per-cpu counter tracks dentries in both the LRU lists and the shrink lists where the DCACHE_LRU_LIST bit is set. The shrink_dcache_sb() function moves dentries from the LRU list to a shrink list and subtracts the dentry count from nr_dentry_unused. This is incorrect as the nr_dentry_unused count will also be decremented in shrink_dentry_list() via d_shrink_del(). To fix this double decrement, the decrement in the shrink_dcache_sb() function is taken out. Fixes: 4e717f5c1083 ("list_lru: remove special case function list_lru_dispose_all." Cc: stable@kernel.org Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-30cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVMJosh Poimboeuf6-35/+8
With the following commit: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") ... the hotplug code attempted to detect when SMT was disabled by BIOS, in which case it reported SMT as permanently disabled. However, that code broke a virt hotplug scenario, where the guest is booted with only primary CPU threads, and a sibling is brought online later. The problem is that there doesn't seem to be a way to reliably distinguish between the HW "SMT disabled by BIOS" case and the virt "sibling not yet brought online" case. So the above-mentioned commit was a bit misguided, as it permanently disabled SMT for both cases, preventing future virt sibling hotplugs. Going back and reviewing the original problems which were attempted to be solved by that commit, when SMT was disabled in BIOS: 1) /sys/devices/system/cpu/smt/control showed "on" instead of "notsupported"; and 2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning. I'd propose that we instead consider #1 above to not actually be a problem. Because, at least in the virt case, it's possible that SMT wasn't disabled by BIOS and a sibling thread could be brought online later. So it makes sense to just always default the smt control to "on" to allow for that possibility (assuming cpuid indicates that the CPU supports SMT). The real problem is #2, which has a simple fix: change vmx_vm_init() to query the actual current SMT state -- i.e., whether any siblings are currently online -- instead of looking at the SMT "control" sysfs value. So fix it by: a) reverting the original "fix" and its followup fix: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") bc2d8d262cba ("cpu/hotplug: Fix SMT supported evaluation") and b) changing vmx_vm_init() to query the actual current SMT state -- instead of the sysfs control value -- to determine whether the L1TF warning is needed. This also requires the 'sched_smt_present' variable to exported, instead of 'cpu_smt_control'. Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Joe Mario <jmario@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com
2019-01-30drm/amdgpu: Transfer fences to dmabuf importerChris Wilson1-8/+51
amdgpu only uses shared-fences internally, but dmabuf importers rely on implicit write hazard tracking via the reservation_object.fence_excl. For example, the importer use the write hazard for timing a page flip to only occur after the exporter has finished flushing its write into the surface. As such, on exporting a dmabuf, we must either flush all outstanding fences (for we do not know which are writes and should have been exclusive) or alternatively create a new exclusive fence that is the composite of all the existing shared fences, and so will only be signaled when all earlier fences are signaled (ensuring that we can not be signaled before the completion of any earlier write). v2: reservation_object is already locked by amdgpu_bo_reserve() v3: Replace looping with get_fences_rcu and special case the promotion of a single shared fence directly to an exclusive fence, bypassing the fence array. v4: Drop the fence array ref after assigning to reservation_object Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107341 Testcase: igt/amd_prime/amd-to-i915 References: 8e94a46c1770 ("drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Reviewed-by: "Christian König" <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-30Merge tag 'iommu-fixes-v5.0-rc4' of ↵Linus Torvalds3-7/+18
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "A few more fixes this time: - Two patches to fix the error path of the map_sg implementation of the AMD IOMMU driver. - Also a missing IOTLB flush is fixed in the AMD IOMMU driver. - Memory leak fix for the Intel IOMMU driver. - Fix a regression in the Mediatek IOMMU driver which caused device initialization to fail (seen as broken HDMI output)" * tag 'iommu-fixes-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix IOMMU page flush when detach device from a domain iommu/mediatek: Use correct fwspec in mtk_iommu_add_device() iommu/vt-d: Fix memory leak in intel_iommu_put_resv_regions() iommu/amd: Unmap all mapped pages in error path of map_sg iommu/amd: Call free_iova_fast with pfn in map_sg
2019-01-30Merge tag 'gpio-v5.0-3' of ↵Linus Torvalds5-17/+41
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Here is a bunch of GPIO fixes for the v5.0 series. I was helped out by Bartosz in collecting these fixes, for which I am very grateful, the biggest achievement in GPIO right now is work distribution. There is one serious core fix (timestamping) and a bunch of driver fixes: - Fix timestamps on nested IRQs - Handle IRQs properly in multiple instances of PCF857x - Use the right data register and IRQ type setting in the Spreadtrum GPIO driver - Let the value argument work properly when setting direction in the Altera GPIO driver - Mask interrupts properly in the vf610 driver" * tag 'gpio-v5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: vf610: Mask all GPIO interrupts gpio: altera-a10sr: Set proper output level for direction_output gpio: sprd: Fix incorrect irq type setting for the async EIC gpio: sprd: Fix the incorrect data register gpiolib: fix line event timestamps for nested irqs gpio: pcf857x: Fix interrupts on multiple instances
2019-01-30btrfs: On error always free subvol_name in btrfs_mountEric W. Biederman1-0/+3
The subvol_name is allocated in btrfs_parse_subvol_options and is consumed and freed in mount_subvol. Add a free to the error paths that don't call mount_subvol so that it is guaranteed that subvol_name is freed when an error happens. Fixes: 312c89fbca06 ("btrfs: cleanup btrfs_mount() using btrfs_mount_root()") Cc: stable@vger.kernel.org # v4.19+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-01-30btrfs: clean up pending block groups when transaction commit abortsDavid Sterba1-0/+16
The fstests generic/475 stresses transaction aborts and can reveal space accounting or use-after-free bugs regarding block goups. In this case the pending block groups that remain linked to the structures after transaction commit aborts in the middle. The corrupted slabs lead to failures in following tests, eg. generic/476 [ 8172.752887] BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 [ 8172.755799] #PF error: [normal kernel read fault] [ 8172.757571] PGD 661ae067 P4D 661ae067 PUD 3db8e067 PMD 0 [ 8172.759000] Oops: 0000 [#1] PREEMPT SMP [ 8172.760209] CPU: 0 PID: 39 Comm: kswapd0 Tainted: G W 5.0.0-rc2-default #408 [ 8172.762495] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014 [ 8172.765772] RIP: 0010:shrink_page_list+0x2f9/0xe90 [ 8172.770453] RSP: 0018:ffff967f00663b18 EFLAGS: 00010287 [ 8172.771184] RAX: 0000000000000000 RBX: ffff967f00663c20 RCX: 0000000000000000 [ 8172.772850] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8c0620ab20e0 [ 8172.774629] RBP: ffff967f00663dd8 R08: 0000000000000000 R09: 0000000000000000 [ 8172.776094] R10: ffff8c0620ab22f8 R11: ffff8c063f772688 R12: ffff967f00663b78 [ 8172.777533] R13: ffff8c063f625600 R14: ffff8c063f625608 R15: dead000000000200 [ 8172.778886] FS: 0000000000000000(0000) GS:ffff8c063d400000(0000) knlGS:0000000000000000 [ 8172.780545] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8172.781787] CR2: 0000000000000058 CR3: 000000004e962000 CR4: 00000000000006f0 [ 8172.783547] Call Trace: [ 8172.784112] shrink_inactive_list+0x194/0x410 [ 8172.784747] shrink_node_memcg.constprop.85+0x3a5/0x6a0 [ 8172.785472] shrink_node+0x62/0x1e0 [ 8172.786011] balance_pgdat+0x216/0x460 [ 8172.786577] kswapd+0xe3/0x4a0 [ 8172.787085] ? finish_wait+0x80/0x80 [ 8172.787795] ? balance_pgdat+0x460/0x460 [ 8172.788799] kthread+0x116/0x130 [ 8172.789640] ? kthread_create_on_node+0x60/0x60 [ 8172.790323] ret_from_fork+0x24/0x30 [ 8172.794253] CR2: 0000000000000058 or accounting errors at umount time: [ 8159.537251] WARNING: CPU: 2 PID: 19031 at fs/btrfs/extent-tree.c:5987 btrfs_free_block_groups+0x3d5/0x410 [btrfs] [ 8159.543325] CPU: 2 PID: 19031 Comm: umount Tainted: G W 5.0.0-rc2-default #408 [ 8159.545472] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014 [ 8159.548155] RIP: 0010:btrfs_free_block_groups+0x3d5/0x410 [btrfs] [ 8159.554030] RSP: 0018:ffff967f079cbde8 EFLAGS: 00010206 [ 8159.555144] RAX: 0000000001000000 RBX: ffff8c06366cf800 RCX: 0000000000000000 [ 8159.556730] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffff8c06255ad800 [ 8159.558279] RBP: ffff8c0637ac0000 R08: 0000000000000001 R09: 0000000000000000 [ 8159.559797] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8c0637ac0108 [ 8159.561296] R13: ffff8c0637ac0158 R14: 0000000000000000 R15: dead000000000100 [ 8159.562852] FS: 00007f7f693b9fc0(0000) GS:ffff8c063d800000(0000) knlGS:0000000000000000 [ 8159.564839] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8159.566160] CR2: 00007f7f68fab7b0 CR3: 000000000aec7000 CR4: 00000000000006e0 [ 8159.567898] Call Trace: [ 8159.568597] close_ctree+0x17f/0x350 [btrfs] [ 8159.569628] generic_shutdown_super+0x64/0x100 [ 8159.570808] kill_anon_super+0x14/0x30 [ 8159.571857] btrfs_kill_super+0x12/0xa0 [btrfs] [ 8159.573063] deactivate_locked_super+0x29/0x60 [ 8159.574234] cleanup_mnt+0x3b/0x70 [ 8159.575176] task_work_run+0x98/0xc0 [ 8159.576177] exit_to_usermode_loop+0x83/0x90 [ 8159.577315] do_syscall_64+0x15b/0x180 [ 8159.578339] entry_SYSCALL_64_after_hwframe+0x49/0xbe This fix is based on 2 Josef's patches that used sideefects of btrfs_create_pending_block_groups, this fix introduces the helper that does what we need. CC: stable@vger.kernel.org # 4.4+ CC: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-01-30btrfs: fix potential oops in device_list_addAl Viro1-2/+2
alloc_fs_devices() can return ERR_PTR(-ENOMEM), so dereferencing its result before the check for IS_ERR() is a bad idea. Fixes: d1a63002829a4 ("btrfs: add members to fs_devices to track fsid changes") Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-01-30blk-mq: fix a hung issue when fsyncJianchao Wang1-1/+1
Florian reported a io hung issue when fsync(). It should be triggered by following race condition. data + post flush a flush blk_flush_complete_seq case REQ_FSEQ_DATA blk_flush_queue_rq issued to driver blk_mq_dispatch_rq_list try to issue a flush req failed due to NON-NCQ command .queue_rq return BLK_STS_DEV_RESOURCE request completion req->end_io // doesn't check RESTART mq_flush_data_end_io case REQ_FSEQ_POSTFLUSH blk_kick_flush do nothing because previous flush has not been completed blk_mq_run_hw_queue insert rq to hctx->dispatch due to RESTART is still set, do nothing To fix this, replace the blk_mq_run_hw_queue in mq_flush_data_end_io with blk_mq_sched_restart to check and clear the RESTART flag. Fixes: bd166ef1 (blk-mq-sched: add framework for MQ capable IO schedulers) Reported-by: Florian Stecker <m19@florianstecker.de> Tested-by: Florian Stecker <m19@florianstecker.de> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-30block: pass no-op callback to INIT_WORK().Tetsuo Handa1-1/+5
syzbot is hitting flush_work() warning caused by commit 4d43d395fed12463 ("workqueue: Try to catch flush_work() without INIT_WORK().") [1]. Although that commit did not expect INIT_WORK(NULL) case, calling flush_work() without setting a valid callback should be avoided anyway. Fix this problem by setting a no-op callback instead of NULL. [1] https://syzkaller.appspot.com/bug?id=e390366bc48bc82a7c668326e0663be3b91cbd29 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-and-tested-by: syzbot <syzbot+ba2a929dcf8e704c180e@syzkaller.appspotmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-30MAINTAINERS: Add Andy and Darren as arch/x86/platform/ reviewersBorislav Petkov1-0/+9
... so that they can get CCed on platform patches. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20190128113619.19025-1-bp@alien8.de
2019-01-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds50-195/+605
Pull networking fixes from David Miller: 1) Need to save away the IV across tls async operations, from Dave Watson. 2) Upon successful packet processing, we should liberate the SKB with dev_consume_skb{_irq}(). From Yang Wei. 3) Only apply RX hang workaround on effected macb chips, from Harini Katakam. 4) Dummy netdev need a proper namespace assigned to them, from Josh Elsasser. 5) Some paths of nft_compat run lockless now, and thus we need to use a proper refcnt_t. From Florian Westphal. 6) Avoid deadlock in mlx5 by doing IRQ locking, from Moni Shoua. 7) netrom does not refcount sockets properly wrt. timers, fix that by using the sock timer API. From Cong Wang. 8) Fix locking of inexact inserts of xfrm policies, from Florian Westphal. 9) Missing xfrm hash generation bump, also from Florian. 10) Missing of_node_put() in hns driver, from Yonglong Liu. 11) Fix DN_IFREQ_SIZE, from Johannes Berg. 12) ip6mr notifier is invoked during traversal of wrong table, from Nir Dotan. 13) TX promisc settings not performed correctly in qed, from Manish Chopra. 14) Fix OOB access in vhost, from Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) MAINTAINERS: Add entry for XDP (eXpress Data Path) net: set default network namespace in init_dummy_netdev() net: b44: replace dev_kfree_skb_xxx by dev_consume_skb_xxx for drop profiles net: caif: call dev_consume_skb_any when skb xmit done net: 8139cp: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles net: macb: Apply RXUBR workaround only to versions with errata net: ti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles net: amd8111e: replace dev_kfree_skb_irq by dev_consume_skb_irq net: alteon: replace dev_kfree_skb_irq by dev_consume_skb_irq net: tls: Fix deadlock in free_resources tx net: tls: Save iv in tls_rec for async crypto requests vhost: fix OOB in get_rx_bufs() qed: Fix stack out of bounds bug qed: Fix system crash in ll2 xmit qed: Fix VF probe failure while FLR qed: Fix LACP pdu drops for VFs qed: Fix bug in tx promiscuous mode settings net: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles netfilter: ipt_CLUSTERIP: fix warning unused variable cn ...
2019-01-29CIFS: Do not consider -ENODATA as stat failure for readsPavel Shilovsky1-1/+1
When doing reads beyound the end of a file the server returns error STATUS_END_OF_FILE error which is mapped to -ENODATA. Currently we report it as a failure which confuses read stats. Change it to not consider -ENODATA as failure for stat purposes. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-01-29CIFS: Do not count -ENODATA as failure for query directoryPavel Shilovsky1-2/+2
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-01-29CIFS: Fix trace command logging for SMB2 reads and writesPavel Shilovsky1-16/+30
Currently we log success once we send an async IO request to the server. Instead we need to analyse a response and then log success or failure for a particular command. Also fix argument list for read logging. Cc: <stable@vger.kernel.org> # 4.18 Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-29CIFS: Fix possible oops and memory leaks in async IOPavel Shilovsky1-3/+8
Allocation of a page array for non-cached IO was separated from allocation of rdata and wdata structures and this introduced memory leaks and a possible null pointer dereference. This patch fixes these problems. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-29cifs: limit amount of data we request for xattrs to CIFSMaxBufSizeRonnie Sahlberg2-3/+16
minus the various headers and blobs that will be part of the reply. or else we might trigger a session reconnect. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-01-29cifs: fix computation for MAX_SMB2_HDR_SIZERonnie Sahlberg1-2/+2
The size of the fixed part of the create response is 88 bytes not 56. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-01-29NFS: Fix up return value on fatal errors in nfs_page_async_flush()Trond Myklebust1-4/+5
Ensure that we return the fatal error value that caused us to exit nfs_page_async_flush(). Fixes: c373fff7bd25 ("NFSv4: Don't special case "launder"") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.12+ Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-01-29x86/speculation: Remove redundant arch_smt_update() invocationZhenzhong Duan1-4/+1
With commit a74cfffb03b7 ("x86/speculation: Rework SMT state change"), arch_smt_update() is invoked from each individual CPU hotplug function. Therefore the extra arch_smt_update() call in the sysfs SMT control is redundant. Fixes: a74cfffb03b7 ("x86/speculation: Rework SMT state change") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <konrad.wilk@oracle.com> Cc: <dwmw@amazon.co.uk> Cc: <bp@suse.de> Cc: <srinivas.eeda@oracle.com> Cc: <peterz@infradead.org> Cc: <hpa@zytor.com> Link: https://lkml.kernel.org/r/e2e064f2-e8ef-42ca-bf4f-76b612964752@default
2019-01-29x86/fault: Fix sign-extend unintended sign extensionColin Ian King1-1/+1
show_ldttss() shifts desc.base2 by 24 bit, but base2 is 8 bits of a bitfield in a u16. Due to the really great idea of integer promotion in C99 base2 is promoted to an int, because that's the standard defined behaviour when all values which can be represented by base2 fit into an int. Now if bit 7 is set in desc.base2 the result of the shift left by 24 makes the resulting integer negative and the following conversion to unsigned long legitmately sign extends first causing the upper bits 32 bits to be set in the result. Fix this by casting desc.base2 to unsigned long before the shift. Detected by CoverityScan, CID#1475635 ("Unintended sign extension") [ tglx: Reworded the changelog a bit as I actually had to lookup the standard (again) to decode the original one. ] Fixes: a1a371c468f7 ("x86/fault: Decode page fault OOPSes better") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: kernel-janitors@vger.kernel.org Link: https://lkml.kernel.org/r/20181222191116.21831-1-colin.king@canonical.com
2019-01-29x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning ↵Wei Huang2-1/+9
to long mode In some old AMD KVM implementation, guest's EFER.LME bit is cleared by KVM when the hypervsior detects that the guest sets CR0.PG to 0. This causes the guest OS to reboot when it tries to return from 32-bit trampoline code because the CPU is in incorrect state: CR4.PAE=1, CR0.PG=1, CS.L=1, but EFER.LME=0. As a precaution, set EFER.LME=1 as part of long mode activation procedure. This extra step won't cause any harm when Linux is booted on a bare-metal machine. Signed-off-by: Wei Huang <wei@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: bp@alien8.de Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20190104054411.12489-1-wei@redhat.com
2019-01-29IB/uverbs: Fix OOPs in uverbs_user_mmap_disassociateYishai Hadas1-5/+13
The vma->vm_mm can become impossible to get before rdma_umap_close() is called, in this case we must not try to get an mm that is already undergoing process exit. In this case there is no need to wait for anything as the VMA will be destroyed by another thread soon and is already effectively 'unreachable' by userspace. BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 800000012bc50067 P4D 800000012bc50067 PUD 129db5067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 2050 Comm: bash Tainted: G W OE 4.20.0-rc6+ #3 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__rb_erase_color+0xb9/0x280 Code: 84 17 01 00 00 48 3b 68 10 0f 84 15 01 00 00 48 89 58 08 48 89 de 48 89 ef 4c 89 e3 e8 90 84 22 00 e9 60 ff ff ff 48 8b 5d 10 <f6> 03 01 0f 84 9c 00 00 00 48 8b 43 10 48 85 c0 74 09 f6 00 01 0f RSP: 0018:ffffbecfc090bab8 EFLAGS: 00010246 RAX: ffff97616346cf30 RBX: 0000000000000000 RCX: 0000000000000101 RDX: 0000000000000000 RSI: ffff97623b6ca828 RDI: ffff97621ef10828 RBP: ffff97621ef10828 R08: ffff97621ef10828 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff97623b6ca838 R13: ffffffffbb3fef50 R14: ffff97623b6ca828 R15: 0000000000000000 FS: 00007f7a5c31d740(0000) GS:ffff97623bb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000011255a000 CR4: 00000000000006e0 Call Trace: unlink_file_vma+0x3b/0x50 free_pgtables+0xa1/0x110 exit_mmap+0xca/0x1a0 ? mlx5_ib_dealloc_pd+0x28/0x30 [mlx5_ib] mmput+0x54/0x140 uverbs_user_mmap_disassociate+0xcc/0x160 [ib_uverbs] uverbs_destroy_ufile_hw+0xf7/0x120 [ib_uverbs] ib_uverbs_remove_one+0xea/0x240 [ib_uverbs] ib_unregister_device+0xfb/0x200 [ib_core] mlx5_ib_remove+0x51/0xe0 [mlx5_ib] mlx5_remove_device+0xc1/0xd0 [mlx5_core] mlx5_unregister_device+0x3d/0xb0 [mlx5_core] remove_one+0x2a/0x90 [mlx5_core] pci_device_remove+0x3b/0xc0 device_release_driver_internal+0x16d/0x240 unbind_store+0xb2/0x100 kernfs_fop_write+0x102/0x180 __vfs_write+0x36/0x1a0 ? __alloc_fd+0xa9/0x170 ? set_close_on_exec+0x49/0x70 vfs_write+0xad/0x1a0 ksys_write+0x52/0xc0 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Cc: <stable@vger.kernel.org> # 4.19 Fixes: 5f9794dc94f5 ("RDMA/ucontext: Add a core API for mmaping driver IO memory") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-29MAINTAINERS: Add entry for XDP (eXpress Data Path)Jesper Dangaard Brouer1-0/+18
Add multiple people as maintainers for XDP, sorted alphabetically. XDP is also tied to driver level support and code, but we cannot add all drivers to the list. Instead K: and N: match on 'xdp' in hope to catch some of those changes in drivers. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29net: set default network namespace in init_dummy_netdev()Josh Elsasser1-0/+3
Assign a default net namespace to netdevs created by init_dummy_netdev(). Fixes a NULL pointer dereference caused by busy-polling a socket bound to an iwlwifi wireless device, which bumps the per-net BUSYPOLLRXPACKETS stat if napi_poll() received packets: BUG: unable to handle kernel NULL pointer dereference at 0000000000000190 IP: napi_busy_loop+0xd6/0x200 Call Trace: sock_poll+0x5e/0x80 do_sys_poll+0x324/0x5a0 SyS_poll+0x6c/0xf0 do_syscall_64+0x6b/0x1f0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: 7db6b048da3b ("net: Commonize busy polling code to focus on napi_id instead of socket") Signed-off-by: Josh Elsasser <jelsasser@appneta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29net: b44: replace dev_kfree_skb_xxx by dev_consume_skb_xxx for drop profilesYang Wei1-2/+2
The skb should be freed by dev_consume_skb_any() in b44_start_xmit() when bounce_skb is used. The skb is be replaced by bounce_skb, so the original skb should be consumed(not drop). dev_consume_skb_irq() should be called in b44_tx() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29net: caif: call dev_consume_skb_any when skb xmit doneYang Wei1-4/+1
The skb shouled be consumed when xmit done, it makes drop profiles (dropwatch, perf) more friendly. dev_kfree_skb_irq()/kfree_skb() shouled be replaced by dev_consume_skb_any(), it makes code cleaner. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29net: 8139cp: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profilesYang Wei1-1/+1
dev_consume_skb_irq() should be called in cp_tx() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29net: macb: Apply RXUBR workaround only to versions with errataHarini Katakam2-11/+20
The interrupt handler contains a workaround for RX hang applicable to Zynq and AT91RM9200 only. Subsequent versions do not need this workaround. This workaround unnecessarily resets RX whenever RX used bit read is observed, which can be often under heavy traffic. There is no other action performed on RX UBR interrupt. Hence introduce a CAPS mask; enable this interrupt and workaround only on affected versions. Signed-off-by: Harini Katakam <harini.katakam@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29drm/amd/powerplay: Fix missing break in switchGustavo A. R. Silva1-0/+1
Add missing break statement in order to prevent the code from falling through to the default case. The resoning for this is that pclk_vol_table is an automatic variable. So, it makes no sense to update it just before falling through to the default case and return -EINVAL. This bug was found thanks to the ongoing efforts to enabling -Wimplicit-fallthrough. Fixes: cd70f3d6e3fa ("drm/amd/powerplay: PP/DAL interface changes for dynamic clock switch") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-29drm/radeon: check if device is root before getting pci speed capsAlex Deucher2-4/+6
Check if the device is root rather before attempting to see what speeds the pcie port supports. Fixes a crash with pci passthrough in a VM. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=109366 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-29drm/amdgpu: Add missing power attribute to APU checkAlex Deucher1-1/+2
Add missing power_average to visible check for power attributes for APUs. Was missed before. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-29x86/cpu: Add Atom Tremont (Jacobsville)Kan Liang1-1/+2
Add the Atom Tremont model number to the Intel family list. [ Tony: Also update comment at head of file to say "_X" suffix is also used for microserver parts. ] Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Aristeu Rozanski <aris@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Megha Dey <megha.dey@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190125195902.17109-4-tony.luck@intel.com
2019-01-29ALSA: hda/realtek - Fixed hp_pin no valueKailang Yang1-33/+45
Fix hp_pin always no value. [More notes on the changes: The hp_pin value that is referred in alc294_hp_init() is always zero at the moment the function gets called, hence this is actually useless as in the current code. And, this kind of init sequence should be called from the codec init callback, instead of the parser function. So, the first fix in this patch to move the call call into its own init_hook. OTOH, this function is needed to be called only once after the boot, and it'd take too long for invoking at each resume (where the init callback gets called). So we add a new flag and invoke this only once as an additional fix. The one case is still not covered, though: S4 resume. But this change itself won't lead to any regression in that regard, so we leave S4 issue as is for now and fix it later. -- tiwai ] Fixes: bde1a7459623 ("ALSA: hda/realtek - Fixed headphone issue for ALC700") Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-01-29Merge branch 'md-fixes' of https://github.com/liu-song-6/linux into for-linusJens Axboe2-13/+28
Pull MD fix from Song. * 'md-fixes' of https://github.com/liu-song-6/linux: md/raid5: fix 'out of memory' during raid cache recovery
2019-01-29drm/vmwgfx: unwind spaghetti code in vmw_dma_select_modeChristoph Hellwig1-19/+6
Just use a simple if/else chain to select the DMA mode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2019-01-29drm/vmwgfx: fix the check when to use dma_alloc_coherentChristoph Hellwig1-7/+4
Since Linux 4.21 we merged the swiotlb ops into the DMA direct ops, so they would always have a the sync_single methods. But late in the cicle we also removed the direct ops entirely, so we'd see NULL DMA ops. Switch vmw_dma_select_mode to only detect swiotlb presence using swiotlb_nr_tbl() instead. Fixes: 55897af630 ("dma-direct: merge swiotlb_dma_ops into the dma_direct code") Fixes: 356da6d0cd ("dma-mapping: bypass indirect calls for dma-direct") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2019-01-29drm/vmwgfx: remove CONFIG_INTEL_IOMMU ifdefs v2Christoph Hellwig1-17/+4
intel_iommu_enabled is defined as always false for !CONFIG_INTEL_IOMMU, so remove the ifdefs around it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2019-01-29drm/vmwgfx: remove CONFIG_X86 ifdefsChristoph Hellwig1-6/+0
The driver depends on CONFIG_X86 so these are dead code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2019-01-29platform/x86: Fix unmet dependency warning for SAMSUNG_Q10Sinan Kaya1-0/+1
Add BACKLIGHT_LCD_SUPPORT for SAMSUNG_Q10 to fix the warning: unmet direct dependencies detected for BACKLIGHT_CLASS_DEVICE. SAMSUNG_Q10 selects BACKLIGHT_CLASS_DEVICE but BACKLIGHT_CLASS_DEVICE depends on BACKLIGHT_LCD_SUPPORT. Copy BACKLIGHT_LCD_SUPPORT dependency into SAMSUNG_Q10 to fix: WARNING: unmet direct dependencies detected for BACKLIGHT_CLASS_DEVICE Depends on [n]: HAS_IOMEM [=y] && BACKLIGHT_LCD_SUPPORT [=n] Selected by [y]: - SAMSUNG_Q10 [=y] && X86 [=y] && X86_PLATFORM_DEVICES [=y] && ACPI [=y] Signed-off-by: Sinan Kaya <okaya@kernel.org> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-01-29platform/x86: Fix unmet dependency warning for ACPI_CMPCSinan Kaya1-0/+1
Add BACKLIGHT_LCD_SUPPORT for ACPI_CMPC to fix the warning: unmet direct dependencies detected for BACKLIGHT_CLASS_DEVICE. ACPI_CMPC selects BACKLIGHT_CLASS_DEVICE but BACKLIGHT_CLASS_DEVICE depends on BACKLIGHT_LCD_SUPPORT. Copy BACKLIGHT_LCD_SUPPORT dependency into ACPI_CMPC to fix WARNING: unmet direct dependencies detected for BACKLIGHT_CLASS_DEVICE Depends on [n]: HAS_IOMEM [=y] && BACKLIGHT_LCD_SUPPORT [=n] Selected by [y]: - ACPI_CMPC [=y] && X86 [=y] && X86_PLATFORM_DEVICES [=y] && ACPI [=y] && INPUT [=y] && (RFKILL [=n] || RFKILL [=n]=n) Signed-off-by: Sinan Kaya <okaya@kernel.org> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-01-29mfd: Fix unmet dependency warning for MFD_TPS68470Sinan Kaya1-1/+1
After commit 5d32a66541c4 ("PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set") dependencies on CONFIG_PCI that previously were satisfied implicitly through dependencies on CONFIG_ACPI have to be specified directly. WARNING: unmet direct dependencies detected for I2C_DESIGNWARE_PLATFORM Depends on [n]: I2C [=y] && HAS_IOMEM [=y] && (ACPI [=y] && COMMON_CLK [=n] || !ACPI [=y]) Selected by [y]: - MFD_TPS68470 [=y] && HAS_IOMEM [=y] && ACPI [=y] && I2C [=y]=y MFD_TPS68470 is an ACPI only device and selects I2C_DESIGNWARE_PLATFORM. I2C_DESIGNWARE_PLATFORM does not have any configuration today for ACPI support without CONFIG_PCI set. For sake of a quick fix this introduces a new mandatory dependency to the driver which may survive without it. Otherwise we need to revisit the driver architecture to address this properly. Fixes: 5d32a66541c46 ("PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set") Signed-off-by: Sinan Kaya <okaya@kernel.org> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-01-28net: ti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profilesYang Wei1-1/+1
dev_consume_skb_irq() should be called in cpmac_end_xmit() when xmit done. It makes drop profiles more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profilesYang Wei1-1/+1
dev_consume_skb_irq() should be called in bmac_txdma_intr() when xmit done. It makes drop profiles more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28net: amd8111e: replace dev_kfree_skb_irq by dev_consume_skb_irqYang Wei1-1/+1
dev_consume_skb_irq() should be called in amd8111e_tx() when xmit done. It makes drop profiles more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28net: alteon: replace dev_kfree_skb_irq by dev_consume_skb_irqYang Wei1-1/+1
dev_consume_skb_irq() should be called in ace_tx_int() when xmit done. It makes drop profiles more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28net: tls: Fix deadlock in free_resources txDave Watson1-0/+2
If there are outstanding async tx requests (when crypto returns EINPROGRESS), there is a potential deadlock: the tx work acquires the lock, while we cancel_delayed_work_sync() while holding the lock. Drop the lock while waiting for the work to complete. Fixes: a42055e8d2c30 ("Add support for async encryption of records...") Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28net: tls: Save iv in tls_rec for async crypto requestsDave Watson2-1/+5
aead_request_set_crypt takes an iv pointer, and we change the iv soon after setting it. Some async crypto algorithms don't save the iv, so we need to save it in the tls_rec for async requests. Found by hardcoding x64 aesni to use async crypto manager (to test the async codepath), however I don't think this combination can happen in the wild. Presumably other hardware offloads will need this fix, but there have been no user reports. Fixes: a42055e8d2c30 ("Add support for async encryption of records...") Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28vhost: fix OOB in get_rx_bufs()Jason Wang5-7/+11
After batched used ring updating was introduced in commit e2b3b35eb989 ("vhost_net: batch used ring update in rx"). We tend to batch heads in vq->heads for more than one packet. But the quota passed to get_rx_bufs() was not correctly limited, which can result a OOB write in vq->heads. headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx, vhost_len, &in, vq_log, &log, likely(mergeable) ? UIO_MAXIOV : 1); UIO_MAXIOV was still used which is wrong since we could have batched used in vq->heads, this will cause OOB if the next buffer needs more than 960 (1024 (UIO_MAXIOV) - 64 (VHOST_NET_BATCH)) heads after we've batched 64 (VHOST_NET_BATCH) heads: Acked-by: Stefan Hajnoczi <stefanha@redhat.com> ============================================================================= BUG kmalloc-8k (Tainted: G B ): Redzone overwritten ----------------------------------------------------------------------------- INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc INFO: Allocated in alloc_pd+0x22/0x60 age=3933677 cpu=2 pid=2674 kmem_cache_alloc_trace+0xbb/0x140 alloc_pd+0x22/0x60 gen8_ppgtt_create+0x11d/0x5f0 i915_ppgtt_create+0x16/0x80 i915_gem_create_context+0x248/0x390 i915_gem_context_create_ioctl+0x4b/0xe0 drm_ioctl_kernel+0xa5/0xf0 drm_ioctl+0x2ed/0x3a0 do_vfs_ioctl+0x9f/0x620 ksys_ioctl+0x6b/0x80 __x64_sys_ioctl+0x11/0x20 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x (null) flags=0x200000000010201 INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b Fixing this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for vhost-net. This is done through set the limitation through vhost_dev_init(), then set_owner can allocate the number of iov in a per device manner. This fixes CVE-2018-16880. Fixes: e2b3b35eb989 ("vhost_net: batch used ring update in rx") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>